This week the Perma team is releasing new software that is a building block for any individual or organization creating a web archive. Scoop is a highly tunable single-page capture library that prioritizes fidelity and provenance, drawing on our decade of experience archiving citations for law journals and courts.
When designing Scoop, we focused on making high-quality, signed web captures that you can take with you and host anywhere you want, while still being able to verify where they came from.
Why does that matter? Because we want to update how people talk to each other, and convince each other, about what content has been on the web.
We’ve all seen them: the contextless but authentic-looking screenshot tweeted out to thousands of followers and proliferated throughout different networks, often jumping platforms.
Media literacy and years of experience seeing photoshopped or faked content could help you parse what is real and what is fake, but the more time we spend on the internet and seek our news there, the more we will fall victim to inauthentic web content. Given the state of information on the web, this is likely to only get worse, not better.
Our IT departments and savvy-minded friends will give us tips on how to avoid phishing scams: call the friend directly, log into your bank account via the online portal instead of clicking that link, or otherwise meet the information at its source to guarantee you’re not being duped. How do we validate authenticity, though, when the thing we’re seeking is fragile: a dynamic web page, vulnerable to link rot?
We trust things that we know represent reality as much as possible, and we trust things that we know the origin of. Basically, we trust witnesses.
Here’s the thing: running a web archive up to this point has been so complex that it is necessarily centralized. Witnessing what is on the web has come down to just a few centralized archives that are trusted to maintain their collections, whether it is the indispensable Internet Archive or our own Perma.cc.
Even the most established archives can potentially be manipulated, and no one archive can serve all the needs our users have for web witnessing. With tools like Scoop, and others pioneered by our friends at the Webrecorder project, we don’t have to rely on a handful of central archives. Advances in web technology are making it more plausible to decentralize the means of web archiving throughout the entire pipeline, from creation to storage and playback. But what about that trust factor?
Scoop is a highly tunable single-page capture engine that is compatible with recently crafted .wacz signing standards. By default, extensive provenance information is included for traceability and transparency. Additionally, as a guiding light in our design, we captured the web under a “no alterations” principle, prioritizing an “as is” state over potentially smoother playbacks to strengthen the value of the record’s testimony.
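To make the idea of a signed capture with provenance a little more concrete, here is a minimal conceptual sketch. It is not Scoop's API and it does not follow the .wacz signing standards: the function names, the record fields, and the shared demo key are all assumptions made for illustration, and an HMAC stands in for the certificate-based signatures a real archive would use. What it shows is the general principle that a capture's digest and its provenance are signed together, so that changing either one is detectable.

```python
import hashlib
import hmac
import json

# Hypothetical key, for illustration only. Real signing relies on
# public-key certificates, not a secret shared with the verifier.
DEMO_KEY = b"demo-key-not-for-real-use"


def make_record(url: str, captured_at: str, payload: bytes) -> dict:
    """Bundle a capture's digest with provenance fields and sign the bundle."""
    provenance = {
        "url": url,
        "captured_at": captured_at,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    # Serialize deterministically so signer and verifier hash the same bytes.
    message = json.dumps(provenance, sort_keys=True).encode("utf-8")
    signature = hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()
    return {"provenance": provenance, "signature": signature}


def verify_record(record: dict, payload: bytes) -> bool:
    """Check that the provenance is unaltered and matches the payload."""
    message = json.dumps(record["provenance"], sort_keys=True).encode("utf-8")
    expected = hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, record["signature"])
    digest = hashlib.sha256(payload).hexdigest()
    payload_ok = digest == record["provenance"]["sha256"]
    return signature_ok and payload_ok
```

With this sketch, a record created from a page's bytes verifies against those same bytes, and stops verifying if the bytes or any provenance field are edited afterward. That is the property that lets a capture travel away from the archive that made it while remaining checkable.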
Scoop is a library that can be used as a witness. Learn more about the specs on GitHub, and keep an eye out for stories about how this technology can be used and deep dives about its capabilities.
Perma Links for sources in this blog post:
- https://www.theatlantic.com/technology/archive/2021/09/eric-schmidt-artificial-intelligence-misinformation/620218/ archived at https://perma.cc/3VSA-S6MX
- What the ephemerality of the Web means for your hyperlinks archived at https://perma.cc/TYW6-FQ5F
- Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations archived at https://perma.cc/D29D-MV4L
- Rewriting History: Manipulating the Archived Web from the Present archived at https://perma.cc/K853-FF3V