Internet Archive
The Internet Archive is a US non-profit organization and the website it runs. As the name suggests, it is a massive archive of digital text, sound, music, video and even software and webpages. Most of its content (except webpages) is either public domain (e.g. its copyright has expired or it has been created by the US government) or under a free license such as Creative Commons.
All of this means that it holds rare songs, obscure archive footage of nuclear explosions, technical and historical texts on almost anything imaginable, and... Moon landing denial videos. The site also hosts a huge collection of social guidance films from the Prelinger Archives. Of course, the Zeitgeist movie is also on board, in several different variants.[1]
Their headquarters is in a former Christian Science church, which earns them even more coolness points in RationalWiki's books.
The Wayback Machine
The Internet Archive also runs bots that crawl the web and save copies of pages to its web archive.
The Wayback Machine refers to the subsection of the site that serves as an access point to the web archive. Just enter a URL and go. The name is a reference to the fictional Wayback Machine, a time machine used by a slightly obscure cartoon character of note during the McCarthy years.
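For those who would rather not type into the search box, a snapshot can usually be reached by URL as well. The following is a minimal sketch, assuming the commonly seen address pattern `https://web.archive.org/web/<timestamp>/<original URL>` with a 14-digit `YYYYMMDDhhmmss` timestamp; the function name `wayback_url` is made up for illustration, and the pattern is not described in this article.

```python
from datetime import datetime


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for `page_url` near the moment `when`.

    Assumption: snapshots are addressed as /web/<YYYYMMDDhhmmss>/<url>.
    """
    stamp = when.strftime("%Y%m%d%H%M%S")  # 14-digit timestamp
    return f"https://web.archive.org/web/{stamp}/{page_url}"


if __name__ == "__main__":
    print(wayback_url("http://example.com/", datetime(2019, 10, 1)))
```

If no capture exists for the exact timestamp, the site typically serves the nearest one it has.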
Restrictions
The Wayback Machine originally accepted an unlimited number of HTTP requests.
In mid-2019, the Wayback Machine added a cap of 20 HTTP requests per minute per IP address. Once the limit is reached, it returns “429 Too Many Requests”; the block lifts after one minute.
This restriction was tightened to 5 requests per minute in early October 2019, regardless of page size, rendering the service effectively unusable for mass archival without the help of automated scripts and/or proxies or VPNs.
The following new error message appears when the limit is exceeded (quoted exactly from the source, including the “are you” grammatical error):
“Too many requests – Please email info@archive.org if you have questions about why are you being blocked”.
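A script that wants to stay under such a cap has to pace itself. The following is a rough sketch of a client-side throttle, assuming the limit behaves like a sliding window of 5 requests per 60 seconds; how the server actually counts is not documented here, and the class name `MinuteThrottle` and its parameters are invented for illustration.

```python
import time
from collections import deque


class MinuteThrottle:
    """Allow at most `limit` calls per `window` seconds (sliding window).

    Assumption: the defaults mirror the 5-requests-per-minute figure above.
    """

    def __init__(self, limit=5, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self.clock = clock    # injectable so the throttle can be tested
        self.sleep = sleep
        self.sent = deque()   # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds must pass before another request is allowed."""
        now = self.clock()
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def acquire(self):
        """Block until a request may be sent, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self.sleep(delay)
        self.sent.append(self.clock())
```

Calling `acquire()` before every request keeps the script under the assumed limit. It does not help if the server counts differently, so a real script would still need to back off when it receives a 429 response.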
External links
- Internet Archive homepage
- Archive.is is an alternative web archiving service which only archives pages on explicit user request.
- WebCite is a different entity that will archive a publicly accessible copy of a website on demand. WebCite also honors robots.txt.
References
- Zeitgeist (2007). "Zeitgeist - The Movie".