The Internet Archive is a non-profit organization based in the United States that keeps a variety of old content alive on the web for future access. What kind of content can you dig up, and why should you care? You’re about to find out.
Rescue Old Web Content
In the year 2012, quite a long time ago in Internet-time, I took a class that involved designing a timepiece, and blogging about the design process. Aspiring designers would finish with an online portfolio piece to show future employers. Unfortunately, I naively created my team’s blog on a service called Posterous, which closed its services mere months after. I procrastinated backing up the blog until it was too late – backups were no longer available. It was almost lost forever. Almost.
Of course, as we’re all becoming increasingly aware, anything that ever lives online has exceptional longevity, and can often be revived from the dead. Fortunately for me, my work lived on, in the Internet Archive.
All you have to do to use the Internet Archive's Wayback Machine is type the URL of the page whose old versions you want to see into the field in the Wayback Machine section, then click the Browse History button.
You’ll be presented with a calendar marking all the times the page was crawled and has versions for you to review. Click on a highlighted date to see what the site used to look like.
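If you'd rather skip the form, the same lookup can be expressed as a web address. Below is a minimal sketch, assuming the Wayback Machine's commonly used `web.archive.org/web/<timestamp>/<url>` address pattern, where `*` in place of the timestamp opens the calendar view. The function name `wayback_url` is made up for illustration, and the pattern itself is an assumption that isn't described in this article.

```python
# Sketch only: builds a Wayback Machine address without contacting the site.
# Assumption: the "https://web.archive.org/web/<timestamp>/<url>" pattern,
# with "*" as the timestamp for the calendar of all captures.
from typing import Optional

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Return a Wayback Machine address for page_url.

    timestamp is a string of digits shaped like YYYYMMDDhhmmss, or any
    leading part of it such as "2012" or "20120315". When it is omitted,
    the address points at the calendar of all captures.
    """
    if timestamp is None:
        return f"{WAYBACK_PREFIX}/*/{page_url}"
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. '2012'")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Calendar view for a page, then a capture near a given year.
    print(wayback_url("example.com"))
    print(wayback_url("example.com", "2012"))
```

Pasting the printed address into your browser should land you on the calendar, or on a capture close to the date you asked for, if one exists.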
From the content available at the most recent date, I managed to source all the text and many of the photos, and recreate the simple timepiece blog. If you’ve ever let a site of yours like an old blog die, and wished you could retrieve your writing, I recommend you give it a shot – it’s saved me many times.
Even if you’re not looking for your own content, The Internet Archive is a great place to go to find out what content a broken link used to link to. I’ve used it to find old journal articles while in university. Remember though, if you’re linking to scholarly material found using the Wayback Machine, make sure to cite it properly.
Depending on how long ago the site was crawled, and how large your picture files were, the Wayback Machine may unfortunately include only the text of the site. There's no guarantee that it will work at all, though, because many content types can't be archived by the Wayback Machine.
If this doesn’t let you access the content you’re looking for (perhaps the site content you’re looking for is newer than what is available at the Wayback Machine), check out our other ways to find content from websites that won’t load.
Remove Content from the Wayback Machine
What if you don’t want your website archived? It’s as simple as adding a couple of lines of text to your site’s robots.txt file. For more detailed instructions, check out our previous article on how to get your site removed from places on the web.
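To give a feel for what those couple of lines might look like, here is a rough sketch that checks a sample robots.txt with Python's standard `urllib.robotparser`. The user-agent name `ia_archiver` is an assumption based on the form commonly cited for the Wayback Machine's crawler; it isn't named in this article, and the Internet Archive's handling of robots.txt has changed over the years, so check its current guidance before relying on it. The helper name `is_blocked` is made up for illustration.

```python
# Sketch only: checks what a sample robots.txt says, nothing more.
# Assumption: "ia_archiver" is the user agent to name; verify this against
# the Internet Archive's current documentation.
from urllib.robotparser import RobotFileParser

SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt tells user_agent not to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/"
    # The named crawler is told to stay away; other crawlers are unaffected.
    print(is_blocked(SAMPLE_ROBOTS_TXT, "ia_archiver", page))
    print(is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot", page))
```

Keep in mind that robots.txt is only a request: it tells well-behaved crawlers what you'd like, and it's up to each crawler to honor it.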
Find Out About All Books Ever Published
We’ve touched on Open Library as a place to access free e-books before, but did you know that Open Library is aiming to make a page for every book ever published? It’s a useful resource not just for finding books to read, but for discovering what’s out there. It uses a wiki format, so everyone is welcome to create an account, both to borrow books to read and to edit pages about the books.
Of course, the wiki format means it can get a bit messy at times, with inconsistently used naming conventions and duplicate entries. This means you will sometimes have to decide which entry you want to improve and which to let die.
If you’ve found a book that has copies available for you to read, it will have a little green book “Read” icon beside the title, which launches a web-app e-book reader in your browser that looks like this.
On a side note, it was really lovely to see what the original Alice’s Adventures In Wonderland book looked like with all the page ornaments that we no longer include. All of the pages were as beautiful as the one you see above.
Find Images from Space
The image below is of what the Earth looks like during an eclipse. Amazing, isn’t it? If you’re looking for other high-resolution images of outer space and celestial bodies, you’re in luck, thanks to a partnership between NASA and the Internet Archive.
The Internet Archive and NASA partnered in 2008. NASA’s images available on the Internet Archive are generally free of copyright, provided you cite that they’re from NASA / the Internet Archive or as otherwise specified.
The URL “www.nasaimages.org” used to point to the archive, but at the time of this writing it points to a different organization with no affiliation to either the Internet Archive or NASA, even though it is still referenced on the Internet Archive’s site.
We’ve also got a round-up of 6 other great places to look for space images.
Access Specific Subject Matter Collections
Archive-It allows partnering organizations to collect Internet archives of the subjects that are important to them, that the general public can access. This is a treasure trove if you are conducting research on a specific topic and want to dig into the resources that have already been collected and somewhat sorted.
It’s a distinctly different research angle than what a Google search will turn up, so I encourage citizen historians to give it a try. There are archives for everything from the 2014 Winter Olympics, to the Japan Earthquakes of 2011, to the World Without Oil alternate reality event.
One thing to be mindful of is that because the resources are being collected by an organization, not by an algorithm (which is not exactly neutral but almost certainly less biased), some resources you may expect to find may be left out intentionally as part of the curation process. Either way, it makes for an excellent starting place to see what organizations considered relevant enough to include in the archive.
Hear Live Music Recordings
Ever miss a concert and wish you were there, or want to re-live one you loved? Have the concert experience at home with the Internet Archive’s Live Music Recordings.
I really enjoyed the Jack Johnson Live at Coachella concert recording I found on the site (there are over 100 of his performances!). It was of high quality, and included the audience cheering, but not at the obnoxiously loud levels you get in a YouTube video of a live concert recorded by an audience member. I can’t speak for other recordings, but I’m hopeful.
Note that only the audio is made available from this source, but this is fair, and arguably video would only make the experience worse. Concert video is generally of low or otherwise inconsistent quality, and it takes up much more space.
You might also like our list of places to get free legal concert recordings.
The Internet Archive’s collections may not have the prettiest interface, and there’s a lot of information it just doesn’t provide, but overall it’s a fantastic source for finding old Internet content.
However, don’t forget that some of the best resources for researching are your local libraries, museums, and archives. The humans who work there can be phenomenal researchers, and usually their services are free or at low cost too. Lots of these places are also evolving in recognition of the role that the Internet is playing in helping ordinary people access information, becoming a hub for people to gather and learn together and discuss what they find.
What are your thoughts? Can online sources like the ones above give you all the information you need? Is there anything about these physical institutions that the online world can’t replace? What kinds of innovations in information storage and organization are you looking forward to in the future?
Image Credit: Eclipsed Earth via NASA/Internet Archive