The Open Source Tool That Has Preserved 150,000 Pieces of Online Evidence

Pulse Reporter
Last updated: August 13, 2025 10:17 am

Bellingcat’s Auto Archiver is a tool aimed at preserving online digital content before it can be modified, deleted or taken down. Publicly launched in 2022, it has preserved over 150,000 web pages and social media posts to date. The Auto Archiver has been used by Bellingcat’s journalists to preserve information on dozens of fast-paced events such as the Jan. 6 riots (when we first used the tool internally), as well as to gather digital evidence for our Justice and Accountability project and to monitor Civilian Harm in Ukraine.

The Auto Archiver has also been adopted by both large newsrooms and NGOs. It has been used by individual researchers, journalists, activists, archivists, academics and developers as well. With interest in the tool strong, we’ve worked hard to add to and improve it over time. But we’ve used the past few months to take a step back and build a new and more robust ecosystem to further help individual organisations and researchers use and benefit from it.

Our aim has been to make it more reliable and even easier to use for more people. Today, we’re happy to announce an updated version of the Auto Archiver which includes many new features like:

  • Detailed documentation for all features and configurations
  • A user-friendly interface designed for teams using a shared instance
  • A new modular structure that improves the startup speed and reliability of the tool
  • New features like chain of custody, perceptual hashing for deduplication, and techniques to avoid anti-bot measures and captchas on websites
  • A user-friendly tool to configure the Auto Archiver, without the need to edit configuration text files
Screenshot of the new documentation website for the Auto Archiver
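
To give a feel for what perceptual hashing for deduplication means, here is a minimal sketch of one common variant, the average hash. It is a generic illustration, not the Auto Archiver’s own implementation: the function names are hypothetical, and the images are assumed to have already been shrunk to a tiny grayscale grid (real implementations typically use something like 8x8 pixels).

```python
from typing import List

Pixels = List[List[int]]  # grayscale image as rows of 0-255 values (assumed pre-shrunk)


def average_hash(pixels: Pixels) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Pixels, b: Pixels, max_distance: int = 2) -> bool:
    """Treat two equally sized images as duplicates if their hashes are close."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


original = [[10, 200], [220, 15]]
recompressed = [[12, 198], [225, 11]]  # same picture, slightly altered values
different = [[200, 10], [15, 220]]

print(is_near_duplicate(original, recompressed))  # True
print(is_near_duplicate(original, different))     # False
```

Unlike a cryptographic hash, this kind of hash changes only slightly when an image is recompressed or resized, which is what makes it useful for spotting copies of the same media.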

For an in-depth look at the changes made in this stable version of the Auto Archiver, see the What Changed, What Stays section further down in this article.

Automated Archiving and Collaboration: When to Use This Tool?

The latest version of the Auto Archiver has an easy-to-use web interface and a simplified installation process that makes it more straightforward to set up than before. However, some technical skills are still required for this initial process, and there are other tools available that could meet many of your archiving needs.

If all you need is to archive a few unauthenticated URLs, we recommend using the Wayback Machine or Archive.today. Alternatively, WebRecorder’s browser extension ArchiveWebPage can create a replayable archive of a website you visit, even for content behind login walls. For batch processing, the Wayback Machine has a bulk upload service that accepts Google Sheets. If you separately need to record all your browser interactions and store content along the way, there are paid options like Hunchly. Finally, if all you are interested in are videos and you are comfortable with the command line, yt-dlp will probably be enough to download these, even in bulk.
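
For the bulk video case, a short sketch of how a yt-dlp batch download could be scripted from Python. The file and folder names (`urls.txt`, `videos`, `done.txt`) and the helper function are illustrative assumptions; the flags used are standard yt-dlp options.

```python
import shlex
from typing import List


def build_ytdlp_command(batch_file: str, output_dir: str) -> List[str]:
    """Build a yt-dlp invocation that downloads every URL listed in batch_file."""
    return [
        "yt-dlp",
        "--batch-file", batch_file,                       # text file with one URL per line
        "--download-archive", f"{output_dir}/done.txt",   # remember and skip finished videos
        "--output", f"{output_dir}/%(id)s.%(ext)s",       # name each file after its video id
    ]


command = build_ytdlp_command("urls.txt", "videos")
print(shlex.join(command))

# To actually run it (requires yt-dlp installed and network access):
# import subprocess
# subprocess.run(command, check=True)
```

The download archive file lets the same command be re-run later without fetching videos that were already saved.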

But if you’re hoping to automate your archiving, or archive a large number of URLs in a collaborative setting, then this is where the Auto Archiver really shines. Its modular framework allows you or your team to customise archiving based on your needs, and provides a way to generate metadata that ensures others can trust that your archived content has not been tampered with.
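
The idea behind tamper-evidence metadata can be sketched in a few lines. This is a simplified illustration that assumes SHA-256 as the hash and uses hypothetical function names; it is not a description of the exact metadata the Auto Archiver produces.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(content: bytes) -> dict:
    """Record a content hash and the UTC time at which it was computed."""
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "archived_at": datetime.now(timezone.utc).isoformat(),
    }


def is_unmodified(content: bytes, record: dict) -> bool:
    """Check that content still matches the hash stored at archive time."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


record = fingerprint(b"archived page bytes")
print(is_unmodified(b"archived page bytes", record))   # True
print(is_unmodified(b"archived page bytes!", record))  # False
```

A hash like this only helps if the record itself is kept somewhere others can trust, since anyone who can change both the file and its record can make them match again.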

Learn more about what sites the Auto Archiver can archive here.

The Future of Web Archiving

Archiving the web is hard, especially when logins, captchas, and other bot prevention techniques are in place. We’ll do our best to keep improving our Auto Archiver, but we note that it should be just one of many tools in your researcher’s toolkit. You can find a variety of other useful tools in the Bellingcat Open Source Investigation Toolkit.

Still, if you want to support us on this journey of archiving important information, you can:

  • Download and use this tool
  • Donate directly to Bellingcat
  • Test, give feedback, and develop new features in our GitHub

For newsrooms:
If you work in a newsroom or research team and want to access a demo or get help deploying the Auto Archiver internally, you can reach us at contact-tech@bellingcat.com with the subject “Auto Archiver at [my team/organisation]” and tell us more about your organisation and archiving needs. Building a greater adoption base is the best way to ensure the future of this tool and its versatility.

What Changed, What Stays

Now that we’ve given a broad overview of the tool and its changes, what follows is a deeper look at how different parts of it work and interact. This will likely be of greater benefit for more technical users, and we again stress that successful users of the tool will likely need some technical knowledge to set it up for the first time.

But help is available in our live Auto Archiver Documentation. This is where you’ll always find the latest information on how to install, configure or debug the tool. Even if some parts mentioned in this article change in the coming years, the documentation will be your go-to place for up-to-date instructions.

If you have questions or problems, please open an issue on GitHub. That’s where others will also be going for help, and it constitutes our shared knowledge space.

A New Architecture

Many open source researchers, including at Bellingcat, favour using the Auto Archiver with the Google Sheets integration, which allows users to work collaboratively by adding links to a spreadsheet and letting the Auto Archiver run in the background. However, we’ve now made it simpler to integrate the Auto Archiver into other systems. One such example is ATLOS, a collaborative investigations platform that integrated the Auto Archiver and which has been used by Bellingcat and the Centre for Information Resilience.

Integration is possible via the new modular architecture of the Auto Archiver and can be seen in the two new projects that we recently made public under open source code licenses: the Auto Archiver API and the Auto Archiver Web Interface.

A screen capture of the new Auto Archiver Web Interface showing the Google Spreadsheets management page, where users can enable the Auto Archiver to run periodically on new or existing spreadsheets.

Modules are the building blocks of the archiving pipeline and tell the tool how to run. They detail where to find the URLs, which archiving methods to use, what additional processing to carry out on archived content, and where and how to store it. Each module falls into a specific category:

  1. Feeder modules specify where to read the URLs from. There’s one for Google Sheets, for example.
  2. Extractor modules download media and other metadata from a URL: our most versatile one is the Generic Extractor, which uses yt-dlp to download videos. However, extractors can be tailor-made for specific platforms, like the Telethon Extractor, which requires a Telegram account to download all media and metadata from the messages in public or private chats an account has joined.
  3. Enricher modules increase the value of the archived content with additional information or checks, such as hashing or timestamping the content for future consistency or chain of custody validations.
  4. Formatter modules collect and display the result of the process in a single formatted output. We use the HTML Formatter, as shown in this Bluesky post example.
  5. Storage modules tell the tool where to put the files it downloaded or generated. The simplest option is to store them locally, but to ensure better preservation the best practice is to use cloud storage such as S3 or Google Drive.
  6. Database modules simply indicate where to save a record of this archive, such as whether archival was successful and which methods were used. You can use a CSV file and Google Sheets, for example.
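
The way the six categories fit together can be sketched as a small pipeline. Everything below is a hypothetical stand-in written for illustration (the `Item` class, `run_pipeline`, and the fake modules are not the Auto Archiver’s real classes or API); it only shows the order in which the categories hand work to each other.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Item:
    """One URL being archived, plus whatever the modules attach to it."""
    url: str
    media: bytes = b""
    metadata: Dict[str, str] = field(default_factory=dict)


def run_pipeline(
    feeder: Callable[[], Iterable[str]],
    extractor: Callable[[str], Item],
    enrichers: List[Callable[[Item], None]],
    formatter: Callable[[Item], str],
    storage: Dict[str, str],
    database: List[Dict[str, str]],
) -> None:
    for url in feeder():                  # 1. feeder supplies the URLs
        item = extractor(url)             # 2. extractor downloads media and metadata
        for enrich in enrichers:          # 3. enrichers add hashes, timestamps, ...
            enrich(item)
        report = formatter(item)          # 4. formatter builds a single output
        storage[url] = report             # 5. storage keeps the generated files
        database.append({"url": url, "status": "archived"})  # 6. database logs the result


def list_feeder() -> Iterable[str]:
    return ["https://example.com/post/1"]


def fake_extractor(url: str) -> Item:
    # A real extractor would download content; this one returns placeholder bytes.
    return Item(url=url, media=b"placeholder media bytes")


def hash_enricher(item: Item) -> None:
    item.metadata["sha256"] = hashlib.sha256(item.media).hexdigest()


def text_formatter(item: Item) -> str:
    return f"{item.url} sha256={item.metadata['sha256']}"


storage: Dict[str, str] = {}
database: List[Dict[str, str]] = []
run_pipeline(list_feeder, fake_extractor, [hash_enricher], text_formatter, storage, database)
print(database)
```

Because each stage is just a swappable function here, replacing the in-memory dictionary with cloud storage or the list feeder with a spreadsheet reader would not require touching the other stages, which is the point of the modular design.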

The modules documentation can be found here, and it’s there to help you understand how each module works and is configured. Configuring which modules to use is done via a YAML file. If you are not comfortable with these, we have you covered with a new interface called the configuration editor, where you can visually create or edit your modules configuration. In fact, the first time you run the Auto Archiver, a minimal working YAML configuration file is generated, which you can use right away to read URLs from the command line and store archived content locally.
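
Conceptually, such a configuration lists which modules to run in each category. The sketch below mirrors that idea as a Python dictionary and reports categories left empty; the keys and module names are placeholders chosen for illustration, so refer to the documentation for the real YAML schema.

```python
from typing import Dict, List

# The six module categories described above (key spelling is an assumption).
CATEGORIES = ["feeders", "extractors", "enrichers", "formatters", "storages", "databases"]

# Placeholder module names, loosely modelled on a minimal local setup.
config: Dict[str, List[str]] = {
    "feeders": ["command_line"],
    "extractors": ["generic"],
    "enrichers": ["hash"],
    "formatters": ["html"],
    "storages": ["local"],
    "databases": ["csv"],
}


def missing_categories(cfg: Dict[str, List[str]]) -> List[str]:
    """Return the categories for which no module has been configured."""
    return [name for name in CATEGORIES if not cfg.get(name)]


print(missing_categories(config))                       # []
print(missing_categories({"feeders": ["command_line"]}))
```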

Some platforms rate-limit or outright block IPs based on inauthentic behaviour. One of the techniques we employ to bypass that is sending traffic through a proxy, which you can configure in specific modules like the Generic Extractor. We have been using Oxylabs’ Residential Proxies as part of their Project 4β successfully for over a year, but know that there are several good providers out there.
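
In plain Python, routing requests through a proxy looks roughly like the sketch below, which uses only the standard library. The proxy address and credentials are placeholders, and this is not how the Auto Archiver modules are configured (that is done in their settings); it only illustrates the mechanism.

```python
import urllib.request


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends both HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder address: substitute the host, port and credentials from your provider.
opener = build_proxy_opener("http://user:password@proxy.example.com:8080")

# Requests made with this opener would now go through the proxy, e.g.:
# with opener.open("https://example.com", timeout=30) as response:
#     body = response.read()
```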

If you are a developer, you can design new modules as needed using Python code, and we welcome it if you want to contribute these back to our code. Imagine a Feeder that’s constantly scraping URLs from a Bluesky account, or an Enricher that uses an AI model to detect and blur graphic content. All of that is possible and easy to build in this new architecture.
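
As a rough idea of what the core of such a Feeder might do, the sketch below pulls URLs out of post text and yields each one only once. It is deliberately independent of the real module interface: `url_feeder` and `fake_fetch` are hypothetical names, and the function that actually fetches posts from an account is left as an injected callable.

```python
import re
from typing import Callable, Iterable, Iterator, Set

URL_PATTERN = re.compile(r'https?://[^\s<>"]+')


def url_feeder(fetch_posts: Callable[[], Iterable[str]], seen: Set[str]) -> Iterator[str]:
    """Yield each URL found in the fetched posts exactly once."""
    for text in fetch_posts():
        for url in URL_PATTERN.findall(text):
            url = url.rstrip(".,;:!?)")  # drop punctuation that trails a link in prose
            if url not in seen:
                seen.add(url)
                yield url


def fake_fetch() -> Iterable[str]:
    # Stand-in for a call that would retrieve the latest posts from an account.
    return [
        "Footage here: https://example.com/video/1.",
        "Mirror https://example.com/video/1 and https://example.org/a",
    ]


seen: Set[str] = set()
print(list(url_feeder(fake_fetch, seen)))
# ['https://example.com/video/1', 'https://example.org/a']
```

Keeping the `seen` set outside the function means repeated polling of the same account does not queue the same link for archiving twice.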

We hope you’ll enjoy the updated tool.

Please give us any feedback or suggestions for improvements by contacting us via contact-tech@bellingcat.com.


Bellingcat is a non-profit and the ability to carry out our work is dependent on the kind support of individual donors. If you would like to support our work, you can do so here. You can also subscribe to our Patreon channel here. Subscribe to our Newsletter and follow us on Bluesky here and Instagram here.


