How to crawl your own website to save to cache

0

I'm using Squid, a caching proxy, to cache my website. However, it appears that each page has to be accessed at least once before Squid will cache it. My question is: is there a program that will quickly crawl through my website and request every page once so that Squid can cache them?

Cheers

user2028856

Posted 2013-07-17T08:50:11.363

Reputation: 103

Answers

2

You can use wget for that. After setting the http_proxy environment variable to point to your proxy, run it with options similar to the ones below (Linux commands):

export http_proxy=http://127.0.0.1:3128/

wget --cache=off --delete-after -m http://www.mywebsite.org
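
Note: newer wget releases document this switch as --no-cache rather than --cache=off. If your wget rejects the form above, the equivalent invocation (same example proxy and hostname) should be:

wget --no-cache --delete-after -m http://www.mywebsite.org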

Brian

Posted 2013-07-17T08:50:11.363

Reputation: 8 439

For the export http_proxy command, is this a permanent setting that directs all future internet access through the proxy port? Also, what does the second command do exactly? I'm not going to download my website, am I, if I use that command? – user2028856 – 2013-07-17T09:30:19.303

While it is set, most Unix-style commands will use the proxy, yes. The wget command will download your website (-m is for mirror) but delete the local copy afterwards (--delete-after). The end result is that you will have pulled your site through the proxy. The --cache=off option tells wget not to take content from the Squid cache but to fetch it from the web server instead (useful for knocking stale pages out of the Squid cache). – Brian – 2013-07-17T09:42:23.977

@user2028856 export takes effect only for the duration of that command shell, and only within that command shell. Other shells will not be affected unless they are started from within the shell in which you ran export. So just exit the shell or close the terminal/command-prompt window when wget has finished, and everything will be back to normal. – a CVn – 2013-07-17T09:47:36.247
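
A quick way to see that scoping for yourself (a purely illustrative sketch, reusing the example proxy address from the answer):

export http_proxy=http://127.0.0.1:3128/

echo $http_proxy    # prints http://127.0.0.1:3128/ in this shell and in shells started from it

Running the same echo in a separate, already-open terminal prints nothing, because that shell never saw the export.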

I would like to make this setting permanent on that server so that all future outgoing connections have to pass through the proxy. What setting do I need to change for this to happen? – user2028856 – 2013-07-17T09:50:37.383
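
One common way to do that (a sketch, assuming a typical Linux setup; exact file locations vary by distribution) is to add the variable to a file that is read at login, such as ~/.profile for a single user, or /etc/environment for the whole system:

export http_proxy=http://127.0.0.1:3128/    # in ~/.profile (per-user, export syntax)

http_proxy=http://127.0.0.1:3128/           # in /etc/environment (system-wide, plain assignment without "export")

Note that this only affects programs that honour the http_proxy convention; it does not force every outgoing connection through the proxy.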