34

We just migrated to Amazon AWS. We currently have an EC2 instance running Nginx in front and Apache in the back end, and it's working well. All sites launch properly and include the Cache-Control header for files that are served from the EC2 instance.

The problem is with ALL static files we placed in Amazon S3 that are being accessed through the CloudFront CDN. We can access the files fine (and there is no issue with CORS), but apparently CloudFront doesn't serve files with a Cache-Control header. We want to take advantage of browser caching.

The way I see it, the EC2 instance doesn't play a role here, as the static files are served directly by S3+CloudFront; the request never reaches the web server on EC2.

I'm at a complete loss.

Question: 1) How do I set the Cache-Control in this case? 2) Is it possible to set the Cache-Control? From S3 or CloudFront?

Note: I've hit a few pages on Google saying you can set the header in S3 for individual objects. That's really not a productive way to do it, especially since in my case we are talking about many objects.

Thanks!

jarvis
  • Please post a URL for an object in S3 and the applicable CloudFront URL. I'd like to see the behavior you describe myself. Alternately post CURLs for both, showing headers. – Tim Apr 14 '16 at 21:09
  • I've been able to add a custom header "Expires: Sun, 15 Oct 2027 13:46:07 GMT" by editing origin in https://console.aws.amazon.com/cloudfront/home. However it doesn't seem to work. How did you do it finally? – Manolo Oct 18 '16 at 16:03

2 Answers

42

I've hit a few pages in Google where you can set the Header in S3 for individual objects. That's really not a productive way to do it specially since in my case we are talking of several objects.

Well, "productive" or not, that is how it actually is designed to work.

CloudFront does not add Cache-Control: headers.

CloudFront passes-through (and also respects, unless otherwise configured) the Cache-Control: headers provided by the origin server, which in this case is S3.

To get Cache-Control: headers provided by S3 when an object is fetched, they must be provided when the object is uploaded into S3, or added to the object's metadata by a subsequent put+copy operation, which can be used to internally copy an object into itself in S3, modifying the metadata in the process. This is what the console does, behind the scenes, if you edit object metadata.
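For illustration, here is a rough sketch of the parameters such a put+copy request might carry. The bucket name, key, and header values are made-up examples, and `buildSelfCopyParams` is a hypothetical helper, not part of any SDK. The field names follow the S3 CopyObject operation; since `MetadataDirective: 'REPLACE'` discards the existing metadata, the sketch assumes you re-supply the content type yourself.

```javascript
'use strict';

// Hypothetical helper: builds the parameters for an S3 CopyObject call that
// copies an object onto itself while replacing its metadata.
function buildSelfCopyParams(bucket, key, cacheControl, contentType) {
    return {
        Bucket: bucket,
        Key: key,
        // Source and destination are the same object.
        CopySource: encodeURI(`${bucket}/${key}`),
        // REPLACE tells S3 to use the metadata given here instead of the source's.
        MetadataDirective: 'REPLACE',
        CacheControl: cacheControl,
        // Existing metadata is not carried over with REPLACE, so the content
        // type is supplied again (assumed known by the caller).
        ContentType: contentType
    };
}

// Example values only; none of these names come from the question.
const params = buildSelfCopyParams(
    'example-bucket', 'css/site.css', 'public, max-age=86400', 'text/css');
console.log(JSON.stringify(params, null, 2));
```

An object like this would then be handed to the SDK's copy call (for example `copyObject` in the AWS SDK for JavaScript v2, or `CopyObjectCommand` in v3), once per object you want to update.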

There is also (in case you are wondering) no global setting in S3 to force all objects in a bucket to return these headers -- it's a per-object attribute.


Update: Lambda@Edge is a new feature in CloudFront that allows you to fire triggers against requests and/or responses, between viewer and cache and/or cache and origin, running code written in Node.js against a simple request/response object structure exposed by CloudFront.

One of the main applications for this feature is manipulating headers... so while the above is still accurate -- CloudFront itself does not add Cache-Control -- it is now possible for a Lambda function to add them to the response that is returned from CloudFront.

This example adds Cache-Control: public, max-age=86400 only if there is no Cache-Control header already present on the response.

Using this code in an Origin Response trigger would cause it to fire every time CloudFront fetches an object from the origin, and modify the response before CloudFront caches it.

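'use strict';

exports.handler = (event, context, callback) => {
    // The response CloudFront received from the origin (S3, in this case).
    const response = event.Records[0].cf.response;

    // Header names are lowercased keys; each value is an array of { key, value }.
    // Only add a default when the origin did not send Cache-Control itself.
    if (!response.headers['cache-control']) {
        response.headers['cache-control'] = [{
            key:   'Cache-Control',
            value: 'public, max-age=86400'
        }];
    }

    // Hand the (possibly modified) response back to CloudFront.
    callback(null, response);
};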
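
If you want to try the logic before deploying, something like the following can exercise it locally. This is only a sketch: the handler is repeated so the snippet runs on its own, and `mockEvent` and `invoke` are hypothetical helpers that build a trimmed-down event containing just the fields the handler reads, not a full CloudFront origin-response event.

```javascript
'use strict';

// Same logic as the handler above, repeated so this sketch is self-contained.
const handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;
    if (!response.headers['cache-control']) {
        response.headers['cache-control'] = [{
            key: 'Cache-Control',
            value: 'public, max-age=86400'
        }];
    }
    callback(null, response);
};

// Hypothetical helper: a minimal stand-in for an origin-response event.
function mockEvent(headers) {
    return { Records: [{ cf: { response: { status: '200', headers } } }] };
}

// Hypothetical helper: calls the handler and returns what it passed to callback.
// The handler calls back synchronously, so result is set before we return.
function invoke(event) {
    let result;
    handler(event, {}, (err, response) => {
        if (err) throw err;
        result = response;
    });
    return result;
}

const withoutHeader = invoke(mockEvent({}));
const withHeader = invoke(mockEvent({
    'cache-control': [{ key: 'Cache-Control', value: 'no-cache' }]
}));

console.log(withoutHeader.headers['cache-control'][0].value); // public, max-age=86400
console.log(withHeader.headers['cache-control'][0].value);    // no-cache
```

The second call shows that a Cache-Control value already set by the origin is left untouched.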

Update (2018-06-20): I recently submitted a feature request to the CloudFront team to allow static origin response headers to be configured as origin attributes, similar to the way static request headers can be added today, but with a twist: each header could be configured to be added conditionally (only if the origin didn't provide that header in the response) or unconditionally (adding the header and overwriting the one from the origin, if present).

With feature requests, you typically don't receive any confirmation of whether they are actually considering implementing the new feature, or even whether they might already have been working on it; it's just announced when they are done. So I have no idea whether this will be implemented. There is an argument to be made that since this capability is already available via Lambda@Edge, there's no need for it in the base functionality. My counter-argument is that the base functionality is not feature-complete without the ability to do simple, static response header manipulation, and that if this is the only reason a trigger is needed, then requiring Lambda triggers is an unnecessary cost, financially and in added latency (even though neither is necessarily an outlandish cost).

Michael - sqlbot
  • https://stackoverflow.com/a/30225271/846727 - TADA!! – Kunal Aug 26 '17 at 04:45
  • 2
    Tada, indeed, @Kunal. That is an example of what I referred to in the answer as *"added to the object's metadata by a subsequent put+copy operation."* Use it with caution, and test, because there are caveats. It will reset all of your datestamps and may have implications for encryption. It may also change object etags from multipart to single part format, which is a different algorithm, and will confuse any system that has stored the etags elsewhere for future integrity checks. If versioning is enabled on the bucket, your storage cost doubles unless you clean up the old versions. – Michael - sqlbot Aug 26 '17 at 19:22
  • 1
    The new Lambda@Edge service now also provides a mechanism that does allow Cache-Control response headers (among others) to be added on-the-fly. I've updated the answer with a working example of how that can be done. – Michael - sqlbot Aug 26 '17 at 19:44
  • @Michael-sqlbot what are the Lambda's required permissions? I am trying to implement this and getting `com.amazonaws.services.cloudfront.model.InvalidLambdaFunctionAssociationException: The function execution role must be assumable with edgelambda.amazonaws.com as well as lambda.amazonaws.com principals. ` – Broshi Sep 13 '17 at 09:12
  • 1
    @Broshi the role's "trust policy" needs to list both the lambda and edgelambda services. Take a look at http://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html#lambda-edge-permissions. – Michael - sqlbot Sep 13 '17 at 09:48
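
For anyone hitting the same error, a trust policy along these lines lists both service principals named in the message. This is a minimal sketch built as a plain object and printed as JSON; it covers only the trust relationship, not the permissions the function itself needs.

```javascript
'use strict';

// Trust policy allowing both Lambda and Lambda@Edge to assume the role.
const trustPolicy = {
    Version: '2012-10-17',
    Statement: [{
        Effect: 'Allow',
        Principal: {
            Service: ['lambda.amazonaws.com', 'edgelambda.amazonaws.com']
        },
        Action: 'sts:AssumeRole'
    }]
};

// Print the JSON document to paste into the role's trust relationship.
console.log(JSON.stringify(trustPolicy, null, 2));
```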
1

Since November 2021, this can be done natively within CloudFront without using a Lambda@Edge function.

  1. Go to CloudFront > Policies > Response headers and click "Create response headers policy"
  2. Enter a name, e.g. "CacheHeaders", and add a custom header, e.g. Cache-Control with the value you want
  3. Once the policy is created, edit the behaviour for your distribution and select the policy under the Response headers policy section
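
If you would rather script this than click through the console, the policy created in the steps above might be described roughly like this. It's a sketch under assumptions: `buildResponseHeadersPolicyConfig` is a hypothetical helper, the max-age value is an example, and the field names reflect my reading of CloudFront's CreateResponseHeadersPolicy API, so check them against the API reference before relying on them.

```javascript
'use strict';

// Hypothetical helper: builds a response headers policy config with a single
// custom Cache-Control header.
function buildResponseHeadersPolicyConfig(name, cacheControl, override) {
    return {
        Name: name,
        CustomHeadersConfig: {
            Quantity: 1,
            Items: [{
                Header: 'Cache-Control',
                Value: cacheControl,
                // false: keep a Cache-Control header sent by the origin;
                // true: replace it with the value above.
                Override: override
            }]
        }
    };
}

// "CacheHeaders" is the example name from step 2; the value is an assumption.
const policyConfig = buildResponseHeadersPolicyConfig(
    'CacheHeaders', 'public, max-age=86400', false);
console.log(JSON.stringify(policyConfig, null, 2));
```

With `Override` set to false, this behaves like the conditional Lambda@Edge approach: objects that already carry their own Cache-Control metadata in S3 keep it.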
Andy