I have a PHP application running on an Apache2 server. I started off with HTML5 Boilerplate, and its .htaccess includes a "1 month" expiry policy for all images. For example:

    ExpiresByType image/jpeg "access plus 1 month"

My application also allows a user to replace an existing image with a new one (under the same name and path). However, once replaced, the user continues to see the old image until the page is force-refreshed (say, Ctrl-F5).

I noticed from the logs that technically, since the image is set to expire 1 month from now, the browser doesn't ask the server for it. Only on a force refresh does the browser ask for it, at which point Apache2 may issue a 304 (if the file never changed) or send the new image with a 200 (if it changed).

What I want is a simple, straightforward mechanism to:

  • Keep the existing setup where the images expire 1 month from now. Browsers must not need to fetch it if they already have it.
  • But if the image is updated by the user, automatically have the browser fetch the new image. There should be no need to tell the user to force a refresh.
  • The same must hold for other users and other browsers: once the image has been replaced, they too should fetch the new one.

As an alternative, I wouldn't mind if I can conditionally set the expiry policy to "immediate" for a particular path only (images in other folders must follow the general expiry policy). I can live with the numerous 304s. I guess.
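A minimal sketch of that alternative, assuming the replaceable images live in a single folder (hypothetically `uploads/`) and that the server honours `.htaccess` overrides for mod_expires: a second `.htaccess` placed in that folder overrides the type-specific rule there, while all other folders keep the one-month policy. Only `image/jpeg` appears above; the PNG line is an assumed extra.

```apache
# uploads/.htaccess  (hypothetical folder holding the replaceable images)
<IfModule mod_expires.c>
    ExpiresActive On
    # Expire immediately, so a cached copy is stale right away and the
    # browser should revalidate it (a 304 if the file is unchanged).
    ExpiresByType image/jpeg "access plus 0 seconds"
    ExpiresByType image/png  "access plus 0 seconds"
</IfModule>
```

With this in place, images outside `uploads/` would still follow the general rule from the top-level `.htaccess`.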

I don't want to simply change the filename to a new one. The filenames are dynamically generated, and there are several - tracking their "new names" would add complexity. (Although if there's no other easy solution, I'll have to explore this.)

ADTC
  • To anyone using `ExpiresByType`, remember to `a2enmod expires` (enable the *expires* module) and restart Apache. – ADTC Apr 01 '16 at 10:07

1 Answer

You're asking for a lot: you explicitly declare that some data is valid for a month, but then want it to stop being valid as soon as you change your mind. In general, you'll have to pick one.

  • If the 304s don't hurt you, shortening the interval might be a solution.
  • Popular mitigations of this problem involve a timestamp parameter on the image URL: don't load logo.png; instead load logo.png?t=20160327083700, where the timestamp is either static throughout your application (and updated at your discretion) or the last change date of the file (which adds some server-side burden).
  • Following your comment, you might also succeed by adding the timestamp elsewhere in the path (so that your image names stay the same): /20160327083700/logo.png. However, read my last paragraph on premature optimization.
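The path variant from the last bullet could be sketched with mod_rewrite. This assumes the module is enabled, the rule sits in the document root's `.htaccess`, and the timestamp is always 14 digits as in the example; the list of extensions is also an assumption. The application would still have to emit the timestamped URL itself. The rule only maps it back to the unchanged file on disk.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # /20160327083700/logo.png is served from /logo.png.
    # This is an internal rewrite, so the browser keeps (and caches)
    # the timestamped URL; a new timestamp is a new cache entry.
    RewriteRule ^[0-9]{14}/(.+\.(?:jpe?g|png|gif))$ $1 [L]
</IfModule>
```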

As long as you tell everybody out there that a resource is valid for a long time, don't be surprised that they take your word for it and reuse it until the time is over.

For the timestamp parameter above, I've used a somewhat readable timestamp (e.g. today's date and time). You probably want to just use an unformatted value such as milliseconds since 1970, or whatever value you can easily get. It just needs to change with every update of the file.

Edit: As you state in the comment, browsers tend to treat the ? in the URL as a reason to bypass the cache and request the resource again early. With that, your best bet might be to shorten the timespan. Especially since you say that you'll survive the 304s: go down to "access plus 1 hour" (or even 10 minutes). This would be the right thing to do anyway: state that content can be cached for 1 month if and only if it can be cached for 1 month.

And, by the way, let me remind you of the old saying "premature optimization is the root of all evil" - I bet that "access plus 1 month" did not come from any performance tuning session, but rather from your guess. And, as you see with this question, this premature optimization actually causes a lot of extra work to resolve the conflicts that it introduced.

Olaf
  • It's funny, I decided to go for the second point, taking the hint from [here](http://stackoverflow.com/a/22429796/1134080). But once I did that, the browser started hitting the server on every page load (not force reload) for any image with the timestamp parameter, despite the 1 month cache setting. The server returns 304 when the image hasn't changed. – ADTC Mar 27 '16 at 07:10
  • Edited. Thinking of it - I might rearrange and make the last paragraph on "premature optimization being the root of all evil" more prominent at the beginning of the answer... (but not now) – Olaf Mar 27 '16 at 07:20
  • FYI, "access plus 1 month" came from HTML5 Boilerplate. Maybe they guessed it, but I didn't. Performance tuning may come later. Anyway, the 304s are better than having the user's browser only show the new image after 10 minutes. I need it reflected as soon as the user uploads a new image, not after 10 minutes. The mtime solution works great for this, albeit with the cache bypass problem and the additional work for the server. – ADTC Mar 27 '16 at 07:40
  • I'd say: Don't worry about additional work for the server unless you've measured that this is a problem. Otherwise you're creating the most complicated scenario and updates will always need to have that (potentially unnecessary) optimization in mind. Make it work first, then tune it *if necessary*. Or, if you really want to go the most complicated route: Add the timestamp only when the file was modified within the last 10 minutes / 1 hour (whatever your cache duration is). But don't tell anyone that I've suggested that. – Olaf Mar 27 '16 at 10:10