I'm trying to mirror a WordPress site that is no longer updated so that I can remove the PHP backend. I have no desire to worry about updating the site again. I realize this will break dynamic parts such as search and comments, and I'm OK with that loss of functionality. If there's a better way to do this, I'm open to suggestions beyond wget.
I am currently using the following command.
wget -vN --server-response --wait=6 --domains=example.com --exclude-directories=admin --mirror --random-wait=on http://example.com -o ~/exampleFetch.log
The problem is that some pages are not being saved and wget is outputting the following error.
Cannot write to "example.com/archives/2009/09/16/example-post-title" (Not a directory).
This is because the main page links to monthly archive pages, each of which lists all posts for that month. For instance:
example.com/archives/2009/09
is saved as a file locally by wget, because
http://example.com/archives/2009/09/
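does return a sensible page. Once that file exists, wget cannot create a directory of the same name to hold the individual posts.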
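To illustrate why the write fails, here is a minimal sketch that reproduces the same clash without wget. The scratch directory and the function name demo_collision are assumptions made up for illustration; only the example.com/archives/2009/09 path comes from the error above.

```shell
# Hypothetical demo: reproduce the file-versus-directory clash in a scratch dir.
demo_collision() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/example.com/archives/2009"
    # The month listing ends up as a regular file named "09" ...
    : > "$tmp/example.com/archives/2009/09"
    # ... so a directory of the same name cannot be created for the posts.
    if mkdir -p "$tmp/example.com/archives/2009/09/16" 2>/dev/null; then
        echo "created"
    else
        echo "blocked"
    fi
    rm -rf "$tmp"
}
demo_collision
```

The second mkdir fails because a path component that should be a directory already exists as a plain file, which is the situation wget reports.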
Hopefully I'm missing a switch or have misunderstood one. Thanks for your time.
Thanks. This would indeed work, but I would prefer to use it as a last resort.
Ideally, I was hoping that directories like 2005/11/ would be saved as 2005/11/index.html so that subdirectories could be created inside them, though I wasn't quite clear on that in my original post. – Tor – 2011-04-03T19:18:42.840
If that's the case, you'll probably need some bash renaming magic and a few passes, but you can do it manually. Something like: for i in {2005..2011}; do mv "$i" "$i.temp"; mkdir "$i"; mv "$i.temp" "$i/index.html"; done – OmnipotentEntity – 2011-04-03T22:17:38.990
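The loop in the comment above only handles entries named 2005 through 2011 in the current directory. A more general sketch might walk the whole mirror instead. It assumes the colliding archive pages were saved as regular files with purely numeric names (such as archives/2009/09), which may not hold for every site; the function name page_to_index is hypothetical.

```shell
# Hypothetical helper: turn every regular file whose name is all digits
# into a directory that holds the original page as index.html.
page_to_index() {
    # First -name requires a leading digit; the second rejects any
    # name that contains a non-digit character.
    find "$1" -type f -name '[0-9]*' ! -name '*[!0-9]*' -exec sh -c '
        for f in "$@"; do
            mv -- "$f" "$f.temp" &&
            mkdir -- "$f" &&
            mv -- "$f.temp" "$f/index.html"
        done
    ' sh {} +
}
```

It would be called as page_to_index example.com/archives. Pages that failed to download on the first run would presumably still need another wget pass afterwards, and it is worth trying this on a copy of the mirror first.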