151

I recently heard that Nginx has added caching to its reverse proxy feature, but I couldn't find much info about it.

I want to set up Nginx as a caching reverse proxy in front of Apache/Django: to have Nginx proxy requests for some (but not all) dynamic pages to Apache, then cache the generated pages and serve subsequent requests for those pages from cache.

Ideally I'd want to invalidate the cache in two ways:

  1. Set an expiration date on the cached item
  2. Explicitly invalidate a cached item. E.g., if my Django backend has updated certain data, I'd want to tell Nginx to invalidate the cache for the affected pages

Is it possible to configure Nginx to do that? How?

Continuation
  • Not tested, but from https://gumroad.com/l/ngx_purge: "ngx_purge is a pure Lua module for Nginx that allow user to purge object from nginx cache.". – Jaime Hablutzel May 29 '19 at 02:08

13 Answers

99

I don't think that there is a way to explicitly invalidate cached items, but here is an example of how to do the rest. Update: As mentioned by Piotr in another answer, there is a cache purge module that you can use. You can also force a refresh of a cached item using nginx's proxy_cache_bypass - see Cherian's answer for more information.

In this configuration, items that aren't cached will be retrieved from example.net and stored. The cached versions will be served up to future clients until they are no longer valid (60 minutes).

Your Cache-Control and Expires HTTP headers will be honored, so if you want to explicitly set an expiration date, you can do that by setting the correct headers in whatever you are proxying to.

There are lots of parameters that you can tune - see the nginx proxy module documentation for details on the meaning of the different settings/parameters: http://nginx.org/r/proxy_cache_path

http {
  # 8 MB shared-memory zone for cache keys, up to 1000 MB of cached data on disk;
  # items not requested for 600 minutes are evicted regardless of their validity.
  proxy_cache_path  /var/www/cache levels=1:2 keys_zone=my-cache:8m max_size=1000m inactive=600m;
  proxy_temp_path /var/www/cache/tmp;

  server {
    location / {
      proxy_pass http://example.net;
      proxy_cache my-cache;
      # Cache 200/302 responses for 60 minutes and 404s for 1 minute
      proxy_cache_valid  200 302  60m;
      proxy_cache_valid  404      1m;
    }
  }
}
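Conversely, if you want proxy_cache_valid to always decide the lifetime, you can tell nginx to ignore the backend's caching headers with proxy_ignore_headers. A minimal sketch, not part of the original answer, reusing the my-cache zone from above:

location / {
  proxy_pass http://example.net;
  proxy_cache my-cache;
  # Ignore upstream caching headers; proxy_cache_valid below then always applies
  proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
  proxy_cache_valid 200 302 60m;
  proxy_cache_valid 404     1m;
}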
outcassed
49

You can selectively invalidate cached pages through

proxy_cache_bypass       

Say you want to cache a page; set up caching like this:

# Assumes a cache zone (proxy_cache / proxy_cache_valid) is configured for this server as well
location = /pageid {
  proxy_pass http://localhost:82;
  proxy_set_header   Host             $host;
  proxy_set_header   X-Real-IP        $remote_addr;
  proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
  proxy_ignore_headers Set-Cookie Cache-Control;
  # Any request that carries the "secret_header" request header skips the cache,
  # and the fresh response replaces the cached copy
  proxy_cache_bypass        $http_secret_header;
  add_header X-Cache-Status $upstream_cache_status;
}

Now, when you want to invalidate that page and cache it again, make a curl call with the secret header:

curl "www.site.com/pageid" -s -I -H "secret_header:true"

It will invalidate the cached copy and cache the fresh response.

Works from nginx 0.7 onwards.

As an added bonus, the add_header X-Cache-Status line can be used to check whether a page was served from the cache or not.
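For example, a quick check might look like this (just an illustration; HIT, MISS and BYPASS are some of the values $upstream_cache_status can take):

curl -s -I "www.site.com/pageid" | grep X-Cache-Status
# X-Cache-Status: MISS    -> fetched from the backend (and now stored in the cache)
# X-Cache-Status: HIT     -> served from the cache
# X-Cache-Status: BYPASS  -> cache bypassed, e.g. because the secret header was sent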

Cherian
  • This can only update cached pages when the new page is cacheable as well. If you have removed a page (404 or other errors are now served by the backend), or the page now sends a Set-Cookie or a "Cache-Control: private" header, the cached content will not be "invalidated". – rbu Jan 27 '16 at 14:25
38

I suggest you give Varnish a try. Varnish is specifically designed as a reverse proxy cache. It will honor all cache-control headers you send from the origin server, which satisfies your first requirement.

For your second requirement, explicit invalidation: my strong recommendation is to change the URL of the resource you want to invalidate, either by renaming the file or by using some form of query-string cache buster. Varnish does have a PURGE operation that will remove the resource from Varnish's cache, but it will not give you control over any other caches between you and the user. Since you've said you want to explicitly purge a resource, standard HTTP cache-control headers won't help you; in that case the most foolproof way to defeat caching of a resource is to rename it.

Dave Cheney
  • Could you explain what you meant by "renaming the file or using some form of query string cache buster"? I'm not sure I understand why it's not a good idea to use an operation like PURGE. – Continuation Jun 24 '09 at 15:54
  • +1 for varnish. It's always much better to use the right tools for the job. – Tom O'Connor Nov 17 '09 at 11:53
  • @below: There's almost no hope of touching varnish in the arenas of performance and versatility. This is backed by one of the lead FreeBSD kernel developers and a dedicated team based in Europe. Varnish is in production at twitter, heroku and many more. –  May 02 '11 at 12:59
  • You shouldn't rely on `PURGE` because, as Dave said, "it will not give you control over any other caches between you and the user." and "the most foolproof way to defeat the caching of a resource is to rename it." – womble Jul 19 '11 at 19:27
  • The simplest example of a cache-buster is to append a version number in a query string to a static resource, so style.css becomes style.css?123. When you want to push a new version of the file you change the url of the resource to style.css?124 and now the caches will pick it up as an entirely new asset to be cached separately. Apache will serve the file style.css with any query string appended, so no changes to the actual file are required. – chmac Apr 20 '12 at 12:27
  • If possible, it's best to put the cache buster into the filename itself, such as `style.v123.css` because some caches will not cache requests that have a query string. – Noah McIlraith Jul 29 '12 at 15:34
  • the cache-buster string idea is great for css/js/images, but it's problematic for HTML. By way of example, you won't see "cache busters" in SO question URLs. The URL of a question page should not change every time someone posts an answer/comment. – Frank Farmer Oct 11 '12 at 21:28
10

Most caching tools (Citrix, for example) allow a force-refresh (Ctrl+r) to repopulate a cached page.

Here's a trick I found to do something similar in nginx.

server  {
        # Other settings
        proxy_pass_header       Set-Cookie; # I want to cache logged-in users
        proxy_ignore_headers    X-Accel-Redirect X-Accel-Expires Expires Cache-Control;

        # Bypass (and refresh) the cache when the client sends "Cache-Control: max-age=0",
        # which is what a browser force-refresh (Ctrl+r) produces
        set $eac 0;
        if ($http_cache_control ~ "max-age=0") {set $eac 1;}
        proxy_cache_bypass $eac;
}

This assumes that when you do a Ctrl+r in your browser, the request's Cache-Control header contains max-age=0. I know Chrome does this, but I have not tried other browsers. Matching on more header fields is easy: just add more if statements that set the $eac variable to 1, as in the sketch below.
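For example, a hypothetical extra rule (not in the original config) that also bypasses the cache when the client sends a Pragma: no-cache request header:

# Hypothetical addition: also honour "Pragma: no-cache" from the client
if ($http_pragma ~* "no-cache") {set $eac 1;}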

Randy Wallace
8

For invalidating selected pages you can use the "cache_purge" patch for nginx-0.8.x, which does exactly what you want ;)

It's available here.
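For illustration, a rough sketch of how a purge location is typically wired up with this module (hypothetical backend address and zone name; the key given to proxy_cache_purge has to match your proxy_cache_key):

location / {
    proxy_pass        http://localhost:8080;   # hypothetical backend
    proxy_cache       my-cache;
    proxy_cache_key   $uri$is_args$args;
}

# GET /purge/some/page evicts the entry cached for /some/page
location ~ /purge(/.*) {
    allow             127.0.0.1;
    deny              all;
    proxy_cache_purge my-cache $1$is_args$args;
}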

5

Caching is a fairly new feature in nginx (and not so well documented yet), but it is stable enough to be used in production.

SaveTheRbtz
4

I believe NginxHttpProxyModule is capable of caching HTTP requests. Look for the directives starting with:

proxy_cache

Yes, it is possible to control cache behaviour via directives like:

proxy_cache_valid
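A minimal sketch of those two directives together (the same pattern as in the accepted answer; upstream name and zone are placeholders):

location / {
    proxy_pass        http://backend;   # hypothetical upstream
    proxy_cache       my-cache;         # zone declared with proxy_cache_path
    proxy_cache_valid 200 302 10m;
}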
Taras Chuhay
3

Based on the fact that you can't find docs on it, I'd be a little bit wary about relying on it in production. Have you considered Varnish? It's my "nginx of reverse proxies", small, lightweight, doing one job and doing it well.

womble
2

You can control Nginx's cache expiration with multiple directives/parameters:

  • proxy_cache_valid 200 302 10m;
  • adding one of the HTTP headers below (priority is important - check out my blog post):
    • Expires
    • Cache-Control
    • X-Accel-Expires
  • the inactive parameter in the proxy_cache_path directive:

    proxy_cache_path /data/nginx/cache keys_zone=one:10m inactive=60m;

I recommend my blog post if you want to learn more about Nginx caching.

The purging topic is really interesting, since built-in purging exists only in Nginx Plus (Nginx's commercial edition). I really like @randy-wallace's answer. But there are also other possibilities, like the ngx_cache_purge module.

The simplest thing you can do is remove the cached file manually:

  • generate the hash of your cache key (by default $scheme$proxy_host$request_uri):

    echo -n 'httpczerasz.com/time.php' | md5sum
    
  • remove the file from the filesystem (with levels=1:2, the path is derived from the last characters of the hash):

    rm /data/nginx/cache/1/27/2bba799df783554d8402137ca199a271
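
Putting the two steps together, a small sketch (bash; it assumes the default cache key, the /data/nginx/cache path, and a levels=1:2 layout, which is what the directory structure in the rm example above corresponds to):

    KEY="httpczerasz.com/time.php"
    HASH=$(echo -n "$KEY" | md5sum | cut -d' ' -f1)
    # levels=1:2 -> <last character>/<preceding two characters>/<full hash>
    rm "/data/nginx/cache/${HASH: -1}/${HASH: -3:2}/${HASH}"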
    
czerasz
2

If you use ETags in your application and put nginx in front of it, then it will take care of the expiration for you, because if the ETag changes it will invalidate the cache.

Martin Murphy
  • Really? It seems like nginx matches the ETag and never talks to the application to find out if there is an updated ETag. – John Naegle Jun 26 '13 at 18:55
2

For future visitors: the nginx reverse proxy now has caching integrated, and the proxy_cache directive is documented as follows:

Syntax: proxy_cache zone | off;

Default: proxy_cache off;

Context: http, server, location

Defines a shared memory zone used for caching. The same zone can be used in several places. Parameter value can contain variables (1.7.9). The off parameter disables caching inherited from the previous configuration level.

  • Hi Tarik, the question was very specific about what needs to be achieved, and it's a bit beyond 'just enable cache'. – asdmin Mar 01 '19 at 07:28
0

fastcgi_cache_path  /opt/nginx-cache  levels=2:2   keys_zone=img:50m;

    location /img/ {
        fastcgi_pass $backend;
        include fcgi_params;
        fastcgi_intercept_errors off;
        # Cache key and zone for this location
        fastcgi_cache_key $server_addr$request_uri;
        fastcgi_cache img;
        # Cache responses with any status code for 1 minute
        fastcgi_cache_valid any 1m;
        # Strip the backend's Set-Cookie header from responses
        fastcgi_hide_header Set-Cookie;
    }

This creates a cache for the /img/ location, stored in /opt/nginx-cache. Objects are cached for 1 minute.

You can list specific response codes instead of any.

Right now you can't invalidate the cache for selected pages. Maybe it will become possible in 0.8.x.

lexsys
0

There is an nginx plugin called ncache which claims to be "a web cache system base on nginx web server. faster and more efficient than squid."

sajal