I read in the Amazon docs about restricting access to an S3 bucket so that it is reachable only from a particular website. I quote:

Suppose you have a website with domain name (www.example.com or example.com) with links to photos and videos stored in your S3 bucket, examplebucket. By default, all the S3 resources are private, so only the AWS account that created the resources can access them. To allow read access to these objects from your website, you can add a bucket policy that allows s3:GetObject permission with a condition, using the aws:referer key, that the get request must originate from specific webpages.

This is exactly my requirement.

But this doesn't look convincing to me from a security point of view. Can't someone run some JavaScript from the browser while keeping my website open and access these objects? Or is there any other way this can be broken?

In general, can I rest assured that these files will not be accessible in any way other than through the website?

P.S.: I'm not really an expert in security, just trying to confirm this absolutely, because I intuitively felt it's not safe. I am assuming that an intruder has somehow got the list of files (absolute key names) stored in my bucket.

Amit Tomar

1 Answer

Your intuition is correct in the most important way -- this is not, in any meaningful sense, "secure."

But the way it's defeated is not quite what you envision.

When you click a link on a web page, or your browser loads an image embedded in a page, your browser connects to the web server and sends a request. In the request are the HTTP headers, including one called Referer:. While not always present (for various reasons that may be outside the scope of this answer), the accompanying value is the URL of the page you were looking at -- in the same browser tab -- when the link was clicked or the image was loaded.

Creating such a policy in S3 implements access control by string-matching this value, which is submitted by the browser. From a security perspective, that is almost entirely devoid of value.
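To make the string-matching point concrete, here is a rough sketch in Python. The policy dictionary follows the general shape of a referer-conditioned bucket policy, using the placeholder bucket and domains from the question. The `referer_allowed` helper is hypothetical: it only imitates a wildcard match with `fnmatchcase` and is not how S3 evaluates policies internally.

```python
import json
from fnmatch import fnmatchcase

# Referer-conditioned bucket policy, using the placeholder names from the
# question (examplebucket, example.com). The Sid is an arbitrary label.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetFromExampleDotCom",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::examplebucket/*",
            "Condition": {
                "StringLike": {
                    "aws:Referer": [
                        "http://www.example.com/*",
                        "http://example.com/*",
                    ]
                }
            },
        }
    ],
}


def referer_allowed(referer):
    """Hypothetical stand-in for the policy check.

    It is nothing more than a wildcard comparison against whatever
    string the client chose to send in the Referer header.
    """
    if referer is None:
        return False
    patterns = POLICY["Statement"][0]["Condition"]["StringLike"]["aws:Referer"]
    return any(fnmatchcase(referer, pattern) for pattern in patterns)


if __name__ == "__main__":
    print(json.dumps(POLICY, indent=2))
    print(referer_allowed("http://example.com/good-guy.html"))
```

Nothing in the check ties the header to an actual visit to the site; any client that sends a matching string passes.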

So JavaScript in another tab while your page is open isn't really the concern. The concern is a malicious user crafting requests with a forged Referer: header -- easily done with test tools like curl (curl -v http://example-bucket.s3.amazonaws.com/secret-image.jpg -H 'Referer: http://example.com/good-guy.html') or browser plugins. Now the bucket says "oh, you're on the Good Guy page from example.com? Here's that secret image file you asked for."
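The same forgery takes only a few lines in any HTTP client, not just curl. As a sketch using Python's standard library (the helper name `build_forged_request` is made up for illustration; the URL and referer are the example values from the curl command), the header is simply a value the caller sets:

```python
import urllib.request


def build_forged_request(url, referer):
    """Build a GET request carrying an arbitrary, caller-chosen Referer."""
    req = urllib.request.Request(url)
    # The client decides what goes here; the server cannot verify it.
    req.add_header("Referer", referer)
    return req


if __name__ == "__main__":
    req = build_forged_request(
        "http://example-bucket.s3.amazonaws.com/secret-image.jpg",
        "http://example.com/good-guy.html",
    )
    print(req.full_url)
    print(req.get_header("Referer"))
    # Passing req to urllib.request.urlopen() would send it as-is.
```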

So with such an obvious limitation, what is this mechanism actually useful for?

Not securing your content.

It is, however, useful to prevent widespread hotlinking to your content: third-party sites embedding links to your assets, which amounts to theft of your bandwidth. I once encountered a scam site where the creator had googled for pages showing pictures of gift cards. He found such images on one of my sites, but he didn't download them... he embedded links to my images into his scammy HTML. A configuration denying on referer significantly reduces such annoyances... but it is suitable for very little else.

With S3, you can generate signed URLs on the fly (assuming your site is dynamic) and embed them in the HTML. This is perhaps the most effective solution, since you specify the period of time for which the URL is valid, and these URLs are immune to tampering to the point of computational infeasibility. CloudFront, in conjunction with S3, allows you to create signed cookies that accomplish similar results without needing to sign each individual link in each page you render. Either way, individual signed URLs for objects, with a very short expiration time, using HTTPS themselves and embedded in pages delivered over HTTPS, are an effective mechanism of access control, unlike referring-page policies.
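To illustrate why a signed URL resists tampering where a Referer check does not, here is a simplified, standard-library-only sketch of the idea. It is not the signing algorithm S3 or CloudFront actually uses, and `SECRET_KEY`, `sign_url`, and `verify_url` are hypothetical names invented for this example. In practice you would let the AWS SDK build the URL for you, for instance with boto3's `generate_presigned_url`.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret known only to the server; never sent to the browser.
SECRET_KEY = b"server-side-secret"


def _signature(path, expires):
    """HMAC over the path and expiry, so neither can be changed unnoticed."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path, expires_in, now=None):
    """Return path plus Expires and Signature query parameters."""
    now = int(time.time()) if now is None else int(now)
    expires = now + expires_in
    query = urlencode({"Expires": expires, "Signature": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url, now=None):
    """Accept only unexpired URLs whose signature matches path and expiry."""
    now = int(time.time()) if now is None else int(now)
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["Expires"][0])
        signature = query["Signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires)
    return hmac.compare_digest(signature, expected) and now <= expires


if __name__ == "__main__":
    url = sign_url("/secret-image.jpg", expires_in=60)
    print(url)
    print(verify_url(url))
```

Unlike a Referer value, the signature cannot be produced without the server-side secret, and changing the path or the expiry invalidates it.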

  • Thanks for such a detailed answer. I have always heard that forging one HTTP header or another, for one reason or another (User-Agent for scraping?), is done rather simply (just like the one-line example you shared). Do you think the protocol design was not really good, or that it was never designed for basic gatekeeping security anyway and people are just misusing it with a false sense of security? – Amit Tomar Sep 02 '16 at 20:41
  • I would say it's not a weakness in the protocol design, as it does not appear to have been intended as anything but a useful statistics-gathering feature. *"The Referer request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained. This allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance."* -- [RFC 1945 "HTTP/1.0" May, 1996](https://tools.ietf.org/html/rfc1945#section-10.13) – Michael - sqlbot Sep 02 '16 at 21:23