88
42
I want to get all the files for a given website at archive.org. Reasons might include:
- the original author did not archive their own website and it is now offline; I want to make a public cache of it
- I am the original author of some website and lost some content; I want to recover it
- ...
How do I do that?
Keep in mind that the archive.org Wayback Machine is special: links in an archived page do not point to the archive itself, but to the original web page, which might no longer be there. JavaScript is used client-side to rewrite the links, so a trick like a recursive wget won't work.
14
I came across the same issue and wrote a gem for it. To install:
gem install wayback_machine_downloader
Then run wayback_machine_downloader with the base URL of the website you want to retrieve as a parameter:
wayback_machine_downloader http://example.com
More information: https://github.com/hartator/wayback_machine_downloader
– Hartator – 2015-08-10T06:32:40.320
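For readers curious about what a downloader of this kind has to do, here is a minimal Python sketch. It is not the gem's actual code. It assumes the public Wayback CDX endpoint at web.archive.org/cdx/search/cdx for listing captures, and the `id_` URL modifier for fetching a capture without the Wayback link rewriting. The function names (`build_cdx_query`, `latest_captures`, `raw_snapshot_url`, `list_site`) are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the public Wayback CDX API.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(site):
    """Build a CDX query URL that lists successful captures under `site`."""
    params = {
        "url": site.rstrip("/") + "/*",  # wildcard: every path under the site
        "output": "json",                # JSON array of rows, header row first
        "fl": "timestamp,original",      # only the fields used below
        "filter": "statuscode:200",      # skip redirects and errors
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def latest_captures(rows):
    """Map each original URL to its most recent capture timestamp.

    `rows` is the decoded JSON response: a header row followed by data rows.
    Timestamps are fixed-width digit strings, so string comparison orders them.
    """
    if not rows:
        return {}
    header, *data = rows
    ts_i = header.index("timestamp")
    url_i = header.index("original")
    latest = {}
    for row in data:
        ts, url = row[ts_i], row[url_i]
        if ts > latest.get(url, ""):
            latest[url] = ts
    return latest


def raw_snapshot_url(timestamp, original):
    """URL of a capture as originally served (the `id_` modifier disables rewriting)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def list_site(site):
    """Query the CDX API (network access required) and return {url: timestamp}."""
    with urlopen(build_cdx_query(site)) as resp:
        return latest_captures(json.load(resp))


if __name__ == "__main__":
    # Print one download URL per archived file of the site.
    for url, ts in sorted(list_site("example.com").items()):
        print(raw_snapshot_url(ts, url))
```

Each printed URL can then be fetched and saved under a path derived from the original URL. Because the file list comes from the index rather than from links inside the pages, this avoids the link-rewriting problem that defeats a recursive wget.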
Step-by-step help for Windows users new to Ruby (Windows 8.1 64-bit in my case). Here is what I did to make it work: 1) I downloaded the installer from http://rubyinstaller.org/downloads/ and ran "rubyinstaller-2.2.3-x64.exe"; 2) I downloaded the zip file https://github.com/hartator/wayback-machine-downloader/archive/master.zip; 3) I unzipped it on my computer; 4) I searched the Windows Start menu for "Start command prompt with Ruby" (to be continued)
– Erb – 2015-10-02T07:40:28.233