
The Wayback Machine is currently archiving web pages at an incredible rate of 150 TB of data every day, and it is based in a church in San Francisco.
An article from CNN highlighted that this facility, once a Christian Science church, now contains 29 years of web history. The Archive started in 1996, when an entire year’s worth of web pages needed only 2 TB of storage. Today, a single day of data collection adds up to 150 TB, and the archived data as a whole totals a staggering 175 petabytes.
In October, the Archive celebrated reaching a total of one trillion archived web pages. Although the servers in the church are largely symbolic, the vast majority of the original data lives in a warehouse outside San Francisco, with additional backups stored around the world to guard against data loss from disasters.
This approach has been crucial, especially considering that past administrations have erased significant amounts of online government content.
The Internet Archive doesn’t just focus on web pages; it has also digitized 49 million books, 13 million audio recordings, 10 million videos, and 5 million images. However, it recently lost a lawsuit and was required to remove 500,000 books from its library. Founder Brewster Kahle said of the loss that “the world has become stupider” as a consequence of the ruling.
Beyond preserving the past, the Archive is also looking ahead, working on ways to capture how information is presented through modern AI so that it keeps pace with the changing ways people consume news.
If you’re in San Francisco, you can even visit the site and take a free tour of this significant repository on Fridays at 1 PM.
