Internet Archive Hit by Fire


The offices of the Internet Archive project have been badly hit by a fire. Appropriately enough, no data was lost, but staff are appealing for donations to help replace damaged equipment.

The project is best known for its Wayback Machine, a tool that archives copies of web pages over time, aiming to keep track of how both the content and design of sites have evolved. Altogether it has done 12 complete “crawls” of the web and currently has around 2 petabytes of data covering 364 billion pages.

Fortunately, none of this data was lost in the fire in the Richmond area of San Francisco (and, even more fortunately, nobody was hurt). The fire didn’t affect the Internet Archive’s main building, but rather a smaller building next door (to the left in the above picture) where staff scan printed documents and microfilm. These scans form part of the same digital library as the web pages, alongside audio and video recordings.

The project’s Brewster Kahle notes that the incident is, ironically, a good example of the benefits of a digital library. He says that even had the main building been destroyed, no data would have been lost, thanks to multiple offsite backups.

Fortunately, only a small batch of the project’s physical materials, such as books, is taken to the scanning building at a time. Of the material that was lost in the fire, about half had already been scanned. Kahle says staff are still working to assess whether any irreplaceable materials were lost, but the bulk of it was books donated by used book stores.

The real loss in this case is around $600,000 worth of scanners and other “high end digitization equipment”, plus damage to the building itself. Kahle is appealing for donations to the project, which is a registered 501(c)(3) non-profit organization.

(Image credit: Internet Archive)