I'm using mod_pagespeed with mod_cache.

When mod_pagespeed is off and mod_cache is off I see the following header:

cache-control:public,max-age=7200,must-revalidate

When mod_pagespeed is on and mod_cache is off, I see the following header on the response:

cache-control:max-age=0, no-cache, must-revalidate

As expected, PageSpeed has rewritten the Cache-Control header.

However, when mod_pagespeed is on and mod_cache is on I see the following:

cache-control:public,max-age=7200,must-revalidate

According to the docs:

"By default, PageSpeed serves all HTML with Cache-Control: no-cache, max-age=0 because the transformations made to the page may not be cacheable for extended periods of time."

Why is the HTML being served as cacheable when both mod_pagespeed and mod_cache are enabled?

DD.

2 Answers

There appears to be a bug when running mod_pagespeed 1.11.33.2-0 with Apache Httpd 2.4.23 running mod_cache.

For some reason mod_pagespeed does not rewrite the cache headers, which leaves the HTML publicly cacheable.

The workaround I used was to have a virtual host on port 81 running as a caching server with PageSpeed turned off.

<VirtualHost *:81>
ProxyPass / ajp://tomcat-ipaddress:8009/
ProxyPassReverse / https://final-hostname/
ModPagespeed off
RemoteIPHeader X-Forwarded-For
CacheEnable disk /
CacheHeader on
</VirtualHost>

On the virtual host for port 443 or 80, you can then proxy to the host on port 81 and leave PageSpeed on.

<VirtualHost _default_:443>
ProxyPass / http://localhost:81/
ProxyPreserveHost On
ModPagespeed on
ProxyPassReverse / https://final-hostname/
</VirtualHost>

DD.

Because by default mod_cache runs in quick handler mode:

http://httpd.apache.org/docs/current/mod/mod_cache.html#cachequickhandler

which means it touches the response "last", after mod_pagespeed has performed its transformations.

Use the

CacheQuickHandler off
AddOutputFilterByType ... 

example to order filters as appropriate.
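
A minimal sketch of what that could look like, assuming a disk cache on / as in the other answer. The filter name MOD_PAGESPEED_OUTPUT_FILTER is taken from the comment thread below, where the asker reports having tried this ordering; the rest follows the pattern of the mod_cache documentation example and is not a tested configuration for this setup.

```apache
# Assumption: disk cache enabled for the whole site, as in the other answer.
CacheEnable disk /

# Run mod_cache as a normal handler so the CACHE filter position is honoured.
CacheQuickHandler off

# For HTML, cache first, then let mod_pagespeed rewrite the response,
# so the unoptimised page is what gets stored.
AddOutputFilterByType CACHE;MOD_PAGESPEED_OUTPUT_FILTER text/html
```

With this ordering the intent is that mod_cache stores the backend response and mod_pagespeed still sets its own Cache-Control header on what is sent to the browser.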

Jonah Benton
  • This doesn't make sense. If pagespeed acts first then it will already have changed the cache-control to no-cache and then mod_cache will not attempt to cache. – DD. Sep 15 '16 at 06:47
  • Pagespeed isn't seeing the request; it was already cached. Try making sure the on-disk cache is deleted, then have both on, then issue a request. – Jonah Benton Sep 15 '16 at 10:28
  • ...how are you making these assumptions? It is seeing the requests as they are being optimised. Also the optimisations are browser specific (e.g. chrome gets webp). The pagespeed optimisations should never be cached as pagespeed is supposed to modify the cache header. – DD. Sep 15 '16 at 11:36
  • no assumptions, lots of experience with mod_cache. Are there instances of the urls being visited in the mod_cache cache? What do logs report about the order in which the filters are being run? – Jonah Benton Sep 15 '16 at 13:00
  • What problem is being solved by using modcache and pagespeed together? – Jonah Benton Sep 15 '16 at 13:24
  • Pagespeed optimizes the page and mod_cache caches the page so you don't have to hit the proxy backend on every request. – DD. Sep 15 '16 at 13:29
  • So there are two possibilities: pagespeed is in front (sees the response last) and for some reason is failing to overwrite the cache header, or mod_cache is in front and is improperly injecting the header, though otherwise allowing the pagespeed changes to go through. – Jonah Benton Sep 15 '16 at 17:00
  • An educated guess based on the available info is the latter rather than the former, so the thing to try is still to disable the cache quick handler and explicitly order the filters: cache, then pagespeed. – Jonah Benton Sep 15 '16 at 18:33
  • Quick handler was always disabled. Apparently the order of modules is non-deterministic: http://www.apachetutor.org/dev/request#d5 – DD. Sep 17 '16 at 07:48
  • That's a poor choice of words by the author. Filter processing order is deterministic, though there are many stages and individual modules may opt in at any stage. Processing is data driven, and any filter may programmatically decide to end processing or to make changes to downstream processing. So order is not necessarily predictable on an individual request basis, because it is data driven, but it is deterministic: requests with the same data characteristics will always be seen by the same code paths in the same order. – Jonah Benton Sep 17 '16 at 14:44
  • Was an attempt made to order the filters cache first, then pagespeed, with AddOutputFilterByType? – Jonah Benton Sep 17 '16 at 14:46
  • Yes: AddOutputFilterByType CACHE;MOD_PAGESPEED_OUTPUT_FILTER text/html – DD. Sep 17 '16 at 16:22