How can I pre-cache the web pages I visit daily onto my home network storage?

1

I recently "cut the cable" and downgraded from cable internet (40 Mb/s) to DSL (5 Mb/s). It's awful, but I'm stuck with it for a year. What I would like to do is pre-cache, on my NAS, everything on the pages I visit daily and everything linked to from those pages. The first page of HN, for example. I'd like all devices on my network to access the same cache (so no browser add-in solutions, please). I would like the cache to automatically clean out old content (based on age, cache size, whatever). I'm using Tomato on my router.

I'm sure I could figure out how to re-route requests in Tomato with a custom DNS, and it wouldn't be terribly hard to set up a Python job to cache the pages, but it would take me a full day or more.

Others with slow internet must have worked out something similar. I'm just not finding much with the search terms I am using. Anyone know of a tutorial on how to set this up? Anyone have any experience doing something similar? Are there any turn-key solutions (commercial or not) out there?

I realize static pages are getting rarer these days. Maybe this is a fruitless endeavor. A better example would be to pre-cache the Imgur links from Reddit, or something like that.

This probably runs afoul of some sites' terms and conditions, but I'm only planning on making one request a day.

bpowah

Posted 2014-08-20T17:32:23.617

Reputation: 11

Question was closed 2015-02-24T09:31:06.217

Answers

2

Proxy software typically has options for caching results as well. Something like Squid (no affiliation, free/open source) running on the NAS (or perhaps on the same device as Tomato, if it's beefy enough) would work and is pretty much turnkey, although you would need to set up a script to poll the websites that you want cached.

You can use a few wget calls to do that polling, as described at Preload your cache.
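If you would rather do the polling from the Python job mentioned in the question, a rough sketch could look like the following. It fetches each start page through the proxy, then fetches every plain-HTTP link found on it, so the proxy sees (and can cache) those responses. The proxy address, the start page list, and the names `extract_links` and `warm` are all made up for illustration; 3128 is Squid's default port, but adjust it to your setup. Whether a given response is actually stored still depends on the site's cache headers and on the proxy's configuration.

```python
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit

# Assumptions for illustration: the NAS address and the page list are placeholders.
PROXY = "http://192.168.1.10:3128"
START_PAGES = ["http://example.com/"]


class LinkCollector(HTMLParser):
    """Collects absolute http:// links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment.
                absolute = urldefrag(urljoin(self.base_url, value)).url
                # Only plain HTTP is kept; HTTPS is not cacheable by a simple proxy.
                if urlsplit(absolute).scheme == "http" and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique http:// links found in html, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def fetch(opener, url):
    """Download url through the opener and return the body as text."""
    with opener.open(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def warm(start_pages, proxy=PROXY):
    """Request each start page and the pages it links to, one level deep."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    for page in start_pages:
        try:
            html = fetch(opener, page)
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            print(f"skip {page}: {exc}")
            continue
        for link in extract_links(html, page):
            try:
                fetch(opener, link)
            except OSError as exc:
                print(f"skip {link}: {exc}")
```

Calling `warm(START_PAGES)` once a day from cron (or whatever scheduler the NAS offers) would match the one-request-a-day plan in the question. Only `<a href>` links are followed here; images and other page assets would need the same treatment.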

You then either set up all your devices to use that proxy server, or you make Tomato use the new Squid instance as a transparent proxy server. There are some instructions for DD-WRT at Squid Transparent Proxy, and the steps should be similar for Tomato.

I should add that this will not work (without more extensive configuration) for HTTPS sites. HTTPS is designed to resist MITM attacks, and a caching proxy is effectively a MITM: the traffic passes through it encrypted, so it cannot see the content, let alone cache it. You could work around this by adding a certificate to the proxy and then installing that certificate in your web browser, but that would definitely not be turnkey. Note that this IS done by companies that wish to spy on their employees' HTTPS traffic, so that their proxy can view the content.
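To get a feel for how much of your daily list a plain caching proxy could help with, you can sort the URLs by scheme first. This is just a small sketch; the function name `split_by_cacheability` is made up, and "cacheable" here only means "not HTTPS", since the site's own cache headers still decide whether anything is stored.

```python
from urllib.parse import urlsplit


def split_by_cacheability(urls):
    """Split urls into (plain HTTP, HTTPS); other schemes are ignored."""
    cacheable, tunneled = [], []
    for url in urls:
        scheme = urlsplit(url).scheme  # urlsplit lowercases the scheme
        if scheme == "http":
            cacheable.append(url)  # the proxy can see and store these responses
        elif scheme == "https":
            tunneled.append(url)  # encrypted end to end, opaque to the proxy
    return cacheable, tunneled
```

If most of your pages end up in the second list, the certificate workaround described above is the only way the proxy could cache them.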

user2813274

Posted 2014-08-20T17:32:23.617

Reputation: 915