
I have an Amazon S3 bucket that contains thousands of JPGs and similar media assets. (It's the storage for my forums.) I don't host a static site in this bucket (no HTML of any kind) and it's completely open for the public to read (so the forum software can just link to the images).

However I'm paying through the nose on S3 for bandwidth. Someone said I should stick CloudFlare in front of the S3 bucket and it would save a lot of $. Cool! But I don't get the process. CloudFlare seems to want to take over all the DNS for my main site, which is very much not something I want.

I just need https://my-bucket.s3.us-east-1.amazonaws.com to be cached/CDN'd by CloudFlare. And obviously I can't change Amazon's DNS ;-)

I don't think this would be difficult but I can't seem to find a relatively simple explanation of the right way to set this up. (In other words: I am lacking the "big picture"/main steps.)

(This is covered a bit on the CloudFlare site but I'm confused by their instructions and my use case is also different than what they talk about.)

Eric
  • It looks like you found the link to explain this, particularly the Mike Tabor site in the article you referenced. The bucket needs a custom domain to use Cloudflare, so set up the custom domain for your bucket first, then add Cloudflare. The real problem you have is that your links are referencing your bucket's AWS URL. How easy would it be for you to mass update those links in the forums? – drussey Jul 18 '20 at 02:29
  • I can handle the forum part, but you hit on the part where I need help. "The bucket needs a custom domain to use Cloudflare, so set up the custom domain for your bucket first, then add Cloudflare." Right-- how do I do that? Do I make a new domain or use my existing domain (and hand over all DNS for my entire site to CloudFlare)? Does it have to be named anything in particular? Etc. This is where I just need some guidance. Thank you. – Eric Jul 19 '20 at 14:26
  • These are all questions of preference. On free Cloudflare, you will have to use Cloudflare's DNS for the domain that you choose, although you can use the "grey cloud" DNS-only mode to avoid using Cloudflare on any entry you'd like. If you manage your own nameservers or like your current name servers, going with a separate domain just for Cloudflare might be a good idea. It seems like you might also benefit from reading through the getting started basics of Cloudflare: https://support.cloudflare.com/hc/en-us/articles/360027989951-Getting-Started-with-Cloudflare – drussey Jul 19 '20 at 17:42
  • Also, it may be worth it to change your DNS to Cloudflare so that you can add Cloudflare DDoS protection to your forums, especially if you don't currently have some protections in place. – drussey Jul 23 '20 at 14:00
  • Hi @Eric, I am going to be in a similar position to yourself regarding data transfer costs soon and I wonder if you can clear something up for me: S3 transfer costs to the open internet are about $0.09 per GB and CloudFront isn't much different. If you set up CloudFlare you're still going to have to pay for the transfer between S3 and CloudFlare, so are you relying on CloudFlare's caching at their edge locations and the free bandwidth they offer to deliver your savings? – MSOACC Nov 07 '20 at 14:47
  • @Eric another question, when I read through drussey's answer he talks about creating DNS records and so on. I am like you where I cannot change the DNS settings of my current domain. Did you get this set up without transferring your domain to CloudFlare's control? I am surprised by how involved the whole process is. – MSOACC Nov 07 '20 at 14:49
  • @MSOACC and anyone else wondering: yes, you will still incur transfer to Cloudflare, but any cache hit on Cloudflare will save you transfer from S3/CloudFront. It should significantly reduce the bandwidth usage. Sites serving fewer files more frequently will save more bandwidth than sites serving many files less frequently. – drussey Apr 28 '21 at 23:04

1 Answer


Here is a list of steps you will have to take to get Cloudflare to work for your S3 bucket. I will try to elaborate later as necessary, but there are quite a few steps here:

Step 1: Set up your domain-based bucket

Note: This hostname will have to be onboarded into Cloudflare later (in step 3), so choose it based on that. The entire hostname will have to be used to serve the bucket, so don't choose something that is already in use. I will use static.example.com for this.

Go into S3 and create a bucket with the name of this hostname (static.example.com). When creating it, select the existing bucket with your images as the bucket to copy settings from. You will want the new bucket to be in the same region as the existing bucket for Step 2.

Step 1.5 (optional): Test the custom domain without Cloudflare

Upload a test image, image.jpg, to the root of the bucket with publicly viewable permissions.

CNAME the hostname's DNS entry (static.example.com) to the endpoint of your Amazon bucket. This is usually the name of the bucket (in this case, static.example.com) plus the standard S3 URL including the region. For our example, it would be: static.example.com.s3.us-west-2.amazonaws.com, but replace us-west-2 with your actual region.
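As a small illustration of the naming rule above, the CNAME target can be derived from the bucket name and region. This is only a sketch: the helper name `s3_cname_target` is made up for this example, and the hostname and region are the example values used in this answer.

```python
def s3_cname_target(bucket: str, region: str) -> str:
    """Build the regional S3 endpoint hostname for a bucket.

    Follows the pattern described above:
    <bucket name>.s3.<region>.amazonaws.com
    """
    return f"{bucket}.s3.{region}.amazonaws.com"


# Example values from this answer; replace the region with your own.
hostname = "static.example.com"
target = s3_cname_target(hostname, "us-west-2")

# The DNS record to create: name -> target
print(f"{hostname} CNAME {target}")
```

Because the bucket is named after the hostname, the same string appears on both sides of the record.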

Now see if the test resource is available at http://static.example.com/image.jpg. Make sure to use http, because https will not work here. It may take a little while for the DNS change from the above step to propagate.

Step 2: Copy all the resources from the old bucket

You will need to follow this guide to copy all the resources from the old bucket to the new one. The guide is detailed enough to follow for the whole copy, although you may run into issues if you have a very large bucket.

Verify that the expected resources are at http://static.example.com/old_image_path.jpg.
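If you would rather script the copy than follow a guide, a rough sketch with boto3 could look like the following. It assumes boto3 is installed and AWS credentials are configured; `build_copy_requests` and `copy_all` are hypothetical helper names, and `my-bucket` stands in for the old bucket. As far as I know, a server-side copy does not carry over per-object ACLs, so public read access on the new bucket would have to come from its own settings.

```python
def build_copy_requests(keys, source_bucket, dest_bucket):
    """Build one copy_object argument dict per key, keeping the same key path."""
    return [
        {
            "Bucket": dest_bucket,
            "Key": key,
            "CopySource": {"Bucket": source_bucket, "Key": key},
        }
        for key in keys
    ]


def copy_all(source_bucket, dest_bucket):
    """Copy every object from source_bucket to dest_bucket, page by page."""
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket):
        # "Contents" is absent when a page has no objects
        keys = [obj["Key"] for obj in page.get("Contents", [])]
        for request in build_copy_requests(keys, source_bucket, dest_bucket):
            s3.copy_object(**request)


# Example (not run here): copy_all("my-bucket", "static.example.com")
```

This copies objects one at a time, so for a very large bucket it will be slow; it is meant to show the shape of the operation rather than to be a tuned migration tool.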

Step 3: Sign up and setup Cloudflare

This step is to onboard your domain into Cloudflare. Cloudflare has a set of instructions for this. If you have any existing records, make sure they are there in the onboarding list. One entry that should be there is the static.example.com -> static.example.com.s3.us-west-2.amazonaws.com CNAME that was set up in step 1.

Once this is set up, make sure the DNS entry is in "orange cloud" mode, a.k.a. proxied (and cached). Repeat the test request from step 2, and check that the response shows signs of coming from Cloudflare's servers. These include a Server: cloudflare header as well as a cf-cache-status header that indicates whether you are retrieving from Cloudflare's cache and saving bandwidth. This may take some time, as the previous step needed a DNS change to propagate.
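The header check described above can be scripted. Below is a minimal sketch using only the Python standard library; `cloudflare_summary` and `check_url` are names invented for this example, and the sample headers are illustrative rather than captured from a real response.

```python
from urllib.request import Request, urlopen


def cloudflare_summary(headers):
    """Summarize whether a response came through Cloudflare.

    `headers` is a plain dict of response headers; header names are
    compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "via_cloudflare": lowered.get("server", "").lower() == "cloudflare",
        "cache_status": lowered.get("cf-cache-status"),
    }


def check_url(url):
    """Send a HEAD request and summarize the Cloudflare-related headers."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return cloudflare_summary(dict(response.headers.items()))


# Illustrative headers only; a real check would call
# check_url("http://static.example.com/image.jpg")
sample = {"Server": "cloudflare", "CF-Cache-Status": "HIT"}
print(cloudflare_summary(sample))
```

A cache_status of HIT means the image was served from Cloudflare's cache; a MISS on the first request followed by HIT on a repeat is the pattern to look for.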

You should be able to use https in requests at this point. If not, go to the SSL/TLS > Edge Certificates tab of Cloudflare and verify that Universal SSL is enabled, or that another type of certificate is properly configured. Note: Your SSL mode in Cloudflare's SSL/TLS tab must be "Full" or "Flexible". This secures the image between the user's browser and Cloudflare, but not between Cloudflare and AWS, where the connection is not verified. Additionally, all requests covered by this certificate would be subject to a similar man-in-the-middle attack in "Full" or "Flexible" mode. The method outlined here cannot be used to serve the images over https end to end (that is, with "Full (Strict)"). To do this you will need to create a CloudFront distribution.

Step 4: Switch image link references

This is specific to the original question. All image link references will have to switch from the old s3 bucket to the new hostname in Cloudflare, static.example.com.
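How the links get rewritten depends on the forum software, but the substitution itself is simple. Here is a sketch of the idea, assuming the old URL from the question and the example hostname from this answer; `rewrite_links` is a hypothetical helper and would need to be applied to wherever your forum stores post content.

```python
import re

# Old bucket URL from the question and new hostname from this answer.
OLD_HOST = "my-bucket.s3.us-east-1.amazonaws.com"
NEW_HOST = "static.example.com"

# Match both http and https links to the old bucket.
_OLD_URL = re.compile(r"https?://" + re.escape(OLD_HOST) + r"/")


def rewrite_links(text, scheme="https"):
    """Point every link to the old bucket at the new hostname, keeping the path."""
    return _OLD_URL.sub(f"{scheme}://{NEW_HOST}/", text)


post = '<img src="https://my-bucket.s3.us-east-1.amazonaws.com/uploads/cat.jpg">'
print(rewrite_links(post))
```

Object paths stay the same because the new bucket holds the same keys as the old one.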

Step 5: Tweak cache settings

You may want to tweak the cache settings to save bandwidth. There are two main ways to achieve this. Cloudflare will use the Cache-Control header to determine how long to cache. You can mass update the S3 Cache-Control header according to these instructions. Additionally, Cloudflare gives you the option to override this header with the Caching > Configuration > Browser Cache TTL option. This will effectively set the Cache-Control: max-age value to the larger of the two. A longer TTL will keep the images cached longer in the users' browsers, but will also keep images in Cloudflare's edge cache longer and reduce the load on your S3 bucket.
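To make the "larger of the two" rule concrete, here is a small model of it. This only mirrors the behaviour as described in this answer, not Cloudflare's actual implementation, and the function names are invented for the example.

```python
import re


def max_age(cache_control):
    """Extract max-age (in seconds) from a Cache-Control value, or None if absent."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    return int(match.group(1)) if match else None


def effective_browser_ttl(origin_cache_control, cloudflare_ttl):
    """Model the rule above: the larger of the origin max-age and Cloudflare's setting."""
    origin = max_age(origin_cache_control)
    if origin is None:
        return cloudflare_ttl
    return max(origin, cloudflare_ttl)


# Origin says one hour, Cloudflare's Browser Cache TTL is set to four hours.
print(effective_browser_ttl("public, max-age=3600", 14400))
```

So a short max-age on the S3 objects is raised to the Cloudflare setting, while a longer one set on the objects is kept.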

Disclaimer:

Although Cloudflare does offer seemingly free bandwidth, there is a limit to their generosity. Please refer to Section 2.8 of Cloudflare's terms. It seems like the above-mentioned website is probably in violation of these terms, and if Cloudflare decides your usage is too much, it is possible that your site will be removed. At that point you will have to switch to DNS-only mode and go back to paying for S3, or negotiate terms and payment with Cloudflare to continue using their bandwidth.

drussey
  • Another key thing to do is ensure the headers for the objects indicate caching for a long period is acceptable, to avoid pulling data from the bucket that hasn't changed. This article will help https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html – Tim Jul 19 '20 at 20:45
  • Thank you for this excellent answer! I accepted it and also edited it to add a little clarification and fix a couple of small things. It's a great answer. – Eric Jul 23 '20 at 12:20
  • When you say "CNAME the hostname's DNS entry (static.example.com) to..." Is this something that I have to do in my hosting provider's DNS (Route 53 hosted zone, for example)? I'm assuming YES, since your instructions say to set up Cloudflare some time after that step. – cdabel Apr 29 '21 at 06:03
  • @cdabel Yes, this is done in your DNS server, which is usually managed by some hosting provider like Route 53 / AWS. You could also do step 3 first (or have already done this in the past) and make this entry as a "grey cloud" DNS-only entry in Cloudflare. Your goal here is to have the DNS entry for static.example.com resolve to Amazon's S3 servers in the same region as your bucket setup. This part of Step 1 is just testing that the bucket works with your custom domain. You can skip it if you are confident that it is set up correctly. – drussey Apr 29 '21 at 15:57
  • @drussey You've helped me greatly! Any way to get the images to load over SSL? My site is hosted on Lightsail, btw, and now that I have my subdomain pointing to the s3 bucket, I prefer them to get served over SSL. – cdabel May 01 '21 at 09:28
  • @cdabel After you switch over to proxied mode in step 3 you can use https to access the images. Your SSL mode in Cloudflare's SSL/TLS tab must be "Full" or "Flexible". This encrypts the image between the user's browser and Cloudflare, but not between Cloudflare and AWS. The traffic between Cloudflare and AWS should generally be over trustworthy internet backbones, but for all serious production usages this should be considered insecure. I believe the only way to fix this and to use "Full (Strict)" in Cloudflare is to create a CloudFront distribution. I can make a guide later if anyone needs it. – drussey May 03 '21 at 15:53
  • Wow, that's exactly why my images aren't loading over SSL. I am using Full (Strict). btw, I found this guide: https://medium.com/@sambecker/getting-cloudflare-cloudfront-s3-to-cooperate-over-strict-ssl-f70090ebdec – cdabel May 04 '21 at 00:26