3

I am troubleshooting the setup of Varnish 3.x on my Ubuntu server. I'm running Drupal 7 on two sites set up on the box, via name-based vhosts. Before trying to get Varnish to play nice with Drupal, I'm just trying to get Varnish to serve a PNG from cache.

Here are the headers I get from a curl -I request of the PNG file:

HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Sun, 07 Oct 2012 21:18:59 GMT
ETag: "a57c2-3850-4cb7ea73db6c0"
Accept-Ranges: bytes
Content-Length: 14416
Cache-Control: max-age=1209600
Expires: Thu, 25 Oct 2012 22:55:14 GMT
Content-Type: image/png
Accept-Ranges: bytes
Date: Thu, 11 Oct 2012 22:55:14 GMT
X-Varnish: 1766703058
Age: 0
Via: 1.1 varnish
Connection: keep-alive
X-Varnish-Cache: MISS

Here are the headers for the same file, but bypassing Varnish (port 8080):

HTTP/1.1 200 OK
Date: Sat, 13 Oct 2012 13:16:17 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Sun, 07 Oct 2012 21:18:59 GMT
ETag: "a57c2-3850-4cb7ea73db6c0"
Accept-Ranges: bytes
Content-Length: 14416
Cache-Control: max-age=1209600
Expires: Sat, 27 Oct 2012 13:16:17 GMT
Content-Type: image/png

Here is the Varnish VCL file I'm using (It's a default VCL configuration designed for Drupal):

# Default backend definition.  Set this to point to your content
# server.
#
backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

# Respond to incoming requests.
sub vcl_recv {
  # Use anonymous, cached pages if all backends are down.
  if (!req.backend.healthy) {
    unset req.http.Cookie;
  }

  # Allow the backend to serve up stale content if it is responding slowly.
  set req.grace = 6h;

  # Pipe these paths directly to Apache for streaming.
  #if (req.url ~ "^/admin/content/backup_migrate/export") {
  #  return (pipe);
  #}

  # Do not cache these paths.
  if (req.url ~ "^/status\.php$" ||
      req.url ~ "^/update\.php$" ||
      req.url ~ "^/admin$" ||
      req.url ~ "^/admin/.*$" ||
      req.url ~ "^/flag/.*$" ||
      req.url ~ "^.*/ajax/.*$" ||
      req.url ~ "^.*/ahah/.*$") {
       return (pass);
  }

  # Do not allow outside access to cron.php or install.php.
  #if (req.url ~ "^/(cron|install)\.php$" && !client.ip ~ internal) {
    # Have Varnish throw the error directly.
  #  error 404 "Page not found.";
    # Use a custom error page that you've defined in Drupal at the path "404".
    # set req.url = "/404";
  #}

  # Always cache the following file types for all users. This list of extensions
  # appears twice, once here and again in vcl_fetch so make sure you edit both
  # and keep them equal.
  if (req.url ~ "(?i)\.(pdf|asc|dat|txt|doc|xls|ppt|tgz|csv|png|gif|jpeg|jpg|ico|swf|css|js)(\?.*)?$") {
    unset req.http.Cookie;
  }

  # Remove all cookies that Drupal doesn't need to know about. We explicitly 
  # list the ones that Drupal does need, the SESS and NO_CACHE. If, after 
  # running this code we find that either of these two cookies remains, we 
  # will pass as the page cannot be cached.
  if (req.http.Cookie) {
    # 1. Prepend a semi-colon to the cookie string.
    # 2. Remove all spaces that appear after semi-colons.
    # 3. Match the cookies we want to keep, adding the space we removed 
    #    previously back. (\1) is first matching group in the regsuball.
    # 4. Remove all other cookies, identifying them by the fact that they have
    #    no space after the preceding semi-colon.
    # 5. Remove all spaces and semi-colons from the beginning and end of the 
    #    cookie string. 
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");    
    set req.http.Cookie = regsuball(req.http.Cookie, ";(SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

    if (req.http.Cookie == "") {
      # If there are no remaining cookies, remove the cookie header. If there
      # aren't any cookie headers, Varnish's default behavior will be to cache
      # the page.
      unset req.http.Cookie;
    }
    else {
      # If there are any cookies left (a session or NO_CACHE cookie), do not
      # cache the page. Pass it on to Apache directly.
      return (pass);
    }
  }
}

# Set a header to track a cache HIT/MISS.
sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Varnish-Cache = "HIT";
  }
  else {
    set resp.http.X-Varnish-Cache = "MISS";
  }
}

# Code determining what to do when serving items from the Apache servers.
# beresp == Back-end response from the web server.
sub vcl_fetch {
  # We need this to cache 404s, 301s, 500s. Otherwise, depending on backend but 
  # definitely in Drupal's case these responses are not cacheable by default.
  if (beresp.status == 404 || beresp.status == 301 || beresp.status == 500) {
    set beresp.ttl = 10m;
  }

  # Don't allow static files to set cookies. 
  # (?i) denotes case insensitive in PCRE (perl compatible regular expressions).
  # This list of extensions appears twice, once here and again in vcl_recv so 
  # make sure you edit both and keep them equal.
  if (req.url ~ "(?i)\.(pdf|asc|dat|txt|doc|xls|ppt|tgz|csv|png|gif|jpeg|jpg|ico|swf|css|js)(\?.*)?$") {
    unset beresp.http.set-cookie;
  }

  # Allow items to be stale if needed.
  set beresp.grace = 6h;
}

# In the event of an error, show friendlier messages.
sub vcl_error {
  # Redirect to some other URL in the case of a homepage failure.
  #if (req.url ~ "^/?$") {
  #  set obj.status = 302;
  #  set obj.http.Location = "http://backup.example.com/";
  #}

  # Otherwise redirect to the homepage, which will likely be in the cache.
  set obj.http.Content-Type = "text/html; charset=utf-8";
  synthetic {"
<html>
<head>
  <title>Page Unavailable</title>
  <style>
    body { background: #303030; text-align: center; color: white; }
    #page { border: 1px solid #CCC; width: 500px; margin: 100px auto 0; padding: 30px; background: #323232; }
    a, a:link, a:visited { color: #CCC; }
    .error { color: #222; }
  </style>
</head>
<body onload="setTimeout(function() { window.location = '/' }, 5000)">
  <div id="page">
    <h1 class="title">Page Unavailable</h1>
    <p>The page you requested is temporarily unavailable.</p>
    <p>We're redirecting you to the <a href="/">homepage</a> in 5 seconds.</p>
    <div class="error">(Error "} + obj.status + " " + obj.response + {")</div>
  </div>
</body>
</html>
"};
  return (deliver);
}

I'm getting a MISS and age 0 every time. If I'm understanding correctly, this means the file isn't being returned from Varnish's cache. Is there a problem with my Varnish config?

UPDATE:

As suggested, I started with a basic VCL file and I'm still getting misses every time. The VCL config is:

# Default backend definition.  Set this to point to your content
# server.
#
backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

# Respond to incoming requests.
sub vcl_recv {
  unset req.http.Cookie;
}

# Set a header to track a cache HIT/MISS.
sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Varnish-Cache = "HIT";
  }
  else {
    set resp.http.X-Varnish-Cache = "MISS";
  }
}

# Code determining what to do when serving items from the Apache servers.
# beresp == Back-end response from the web server.
sub vcl_fetch {
  unset beresp.http.set-cookie;
}

I am continuing to troubleshoot.

Justin
  • That whole big cookie logic block seems like a likely suspect; can you try removing it (and either strip all cookies to allow the default logic to cache, or put in a `return(lookup);`)? – Shane Madden Oct 12 '12 at 06:24
  • Can't get any Varnish hits even with a basic VCL. I'm digging deeper, because I must be missing something obvious. – Justin Oct 13 '12 at 13:35
  • Can you post a curl -I from the Apache request on port 8080? That will tell us more than the headers of the Varnish'd page. – Karel Oct 13 '12 at 07:39
  • Good catch :) I don't know about that, but my site is now running after reading your sample configuration. Thanks!! – risnandar Jul 27 '13 at 06:59

2 Answers

4

Well, it turned out to be something super simple that I can't believe I missed: I was using Apache Basic Authentication on the site, and it looks like Varnish by default returns a pass when it sees the Authorization header.
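
For anyone wondering where this lives: the pass comes from the built-in vcl_recv that Varnish 3.x runs after your own vcl_recv whenever yours does not return first. Below is a minimal sketch of one way around it, not a definitive fix. It assumes you are comfortable with Varnish serving cached static files to clients that have not authenticated; the shortened extension list is only an illustration. It goes in the vcl_recv of your VCL file, before any other return.

```vcl
# Sketch for Varnish 3.x. The built-in vcl_recv contains this check, which
# is why every request carrying Basic auth credentials was passed:
#
#   if (req.http.Authorization || req.http.Cookie) {
#       /* Not cacheable by default */
#       return (pass);
#   }
#
sub vcl_recv {
  # Assumption: these static files may be cached even though the site is
  # behind Basic auth. The extension list is illustrative only.
  if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|css|js)(\?.*)?$") {
    unset req.http.Cookie;
    # Returning here skips the built-in vcl_recv, so the Authorization
    # header no longer forces a pass. The header is still sent to Apache
    # on a miss, so the backend can authenticate the fetch.
    return (lookup);
  }
}
```

Be aware of the trade-off: once an object is in the cache, Varnish answers hits without consulting Apache, so those files are effectively no longer protected by the password.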

Justin
  • Good catch! What I've always gone with is to include the default logic in my own config file, so that it's visible - then the `return` at the end of each function makes sure that nothing other than the config you've defined will be applied. – Shane Madden Oct 13 '12 at 18:17
  • One side note: even if Varnish handles the basic authorization itself and caches the object, browsers might still use uncached content. – Alex Oct 26 '12 at 10:12
  • I've been banging my head on this all day.. Thank you!! – Nick Rolando Mar 04 '16 at 00:30
  • WHERE DO YOU PUT IT?! – Justin Thomas Aug 24 '16 at 20:53
0

You need to add the following line to the part of your httpd.conf that is responsible for the Drupal virtual host:

Header unset ETag

Also, I suggest you use the Varnish HTTP Accelerator Integration module (drupal.org).

alexus
  • Thanks! I tried that, and confirmed the etag was no longer appearing in the headers but it's still a miss each time. Also, the site does use the Varnish module. – Justin Oct 12 '12 at 02:05
  • @Justin, update your question with the new output. Also, where did you get that Varnish default.vcl file from? I think Shane Madden is right that it has something to do with that part of the Varnish configuration. Start with a little and work your way up to what you have right now; you need to try a few things out, and it's hard to give a correct answer just like this. But the ETag is certainly not something you need to remove for the cache to start working. – alexus Oct 12 '12 at 17:43
  • I got the file from here: https://fourkitchens.atlassian.net/wiki/display/TECH/Configure+Varnish+3+for+Drupal+7 I will start from vanilla and go up. Thanks! – Justin Oct 13 '12 at 13:19