
Let's say I develop an application that:

  1. Allows any user to upload a file, restricted to a whitelist of MIME content types and extensions (Word and PDF).
  2. Serves those files with the allowed extension and content type.

Is this a security risk? Why?

Will any browsers infer the content type from the file bytes (using magic headers) and discard the content type I am specifying in the headers?

What is the solution to make this secure? Should I assert the files being uploaded have magic bytes matching the client provided mime type?

I know this question is similar to "Is it safe to store and replay user-provided mime types?", but I don't think it is a duplicate.

Andy
  • Note that even if you allow only 1 file type (e.g. PDF), attackers can still put malware inside the uploaded files without changing the file type at all. The [Payload Distribution Format](https://www.youtube.com/watch?v=Ktt-D6iBP0c) and Word macros are popular attack vectors for delivering malware. – ChocolateOverflow Mar 16 '21 at 02:07

4 Answers

  1. HTML content-sniffing, as outlined by Krzysztof; primarily but not exclusively IE.

  2. jar: URL vulnerability in older Firefox

  3. Content resembling a JAR file (GIFAR et al) - hosting something that could be interpreted as a Java archive implies XSS because of the different Same Origin Policy enforced by Java.

  4. A file containing something that looks a bit like crossdomain.xml content can be targeted by loadPolicyFile call to get XSS for a Flash caller.

Should I assert the files being uploaded have magic bytes matching the client provided mime type?

Insufficient against chameleons (files equally valid as either type).

What is the solution to make this secure?

The only strong mitigation is to serve your untrusted uploaded content from a different hostname from your main site.

  1. That different hostname can be a subdomain of your main site if you are only worried about direct JavaScript XSS/Same Origin Policy.

  2. It must be a disjoint subdomain if you need to prevent it from reading cookies (such as session IDs). That is, it can be insecure.example.com if you have your main site on www.example.com, but in that case you must not also allow your site to respond on bare ‘example.com’. example.com cookies can't be prevented from inheriting into insecure.example.com on IE. Best practice: redirect all access to example.com to www.example.com.

  3. It must be a completely different domain if you need to prevent cookie forcing: that is, insecure.example.com can write a cookie that will be read by www.example.com, potentially overriding cookies set by www.example.com. This is a less severe vulnerability: the ability to read the cookies typically leads to session hijacking, whereas cookie forcing leads only to denial of service (as www.example.com won't work without its cookies).

  4. It must also be served on a different IP address if you are concerned with preventing cookie stealing through a vulnerability in older Java plugins, or if you have other services running on ports which you would not want applets to be able to access.
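
To make the cookie scoping in items 2 and 3 concrete, here is a minimal sketch of cookie domain matching. It assumes the simplified rule from RFC 6265 (a cookie scoped to a domain is sent to that host and to every subdomain of it) and ignores public-suffix checks and IP-address hosts. The function name `domain_match` and the host `example-usercontent.net` are made up for illustration.

```python
def domain_match(host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match.

    True when host equals cookie_domain or is a subdomain of it.
    Assumption: no public-suffix or IP-address handling.
    """
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().strip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


# Item 2: a cookie scoped to the bare domain is also sent to the upload host.
print(domain_match("insecure.example.com", "example.com"))
# A cookie scoped to www.example.com is not sent to the upload host.
print(domain_match("insecure.example.com", "www.example.com"))
# Item 3: a cookie that insecure.example.com sets for example.com reaches www.
print(domain_match("www.example.com", "example.com"))
# A completely different (hypothetical) domain shares no cookie scope.
print(domain_match("example-usercontent.net", "example.com"))
```

The third case is the cookie forcing scenario: sharing a parent domain is enough for the upload host to plant a cookie that the main site will receive.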

Between them, browsers and plugins have made hosting uploaded files securely a terrible ordeal.

bobince
  • Actually `X-Content-Type-Options: nosniff` would be sufficient. We don't have to do any of the 4 steps you mentioned. – Pacerier Mar 29 '15 at 00:14

Unfortunately, in your example Internet Explorer will still try to detect the MIME type from the first 256 bytes of the file's content (this is called MIME sniffing). Citing from the MSDN documentation on the subject:

2. If the server-provided MIME type is either known or ambiguous (my note: application/pdf is known), the buffer is scanned in an attempt to verify or obtain a MIME type from the actual content. If a positive match is found (one of the hard-coded tests succeeded), this MIME type is immediately returned as the final determination, overriding the server-provided MIME type (this type of behavior is necessary to identify a .gif file being sent as text/html). During scanning, it is determined if the buffer is predominantly text or binary.

I've just confirmed this behaviour by using the following PHP file:

<?php header('Content-Type: application/pdf'); ?>
<html>
<p>Hello, <b>world</b></p>

This displays HTML content in IE8/Win XP SP2.

You can alter this process by specifying the X-Content-Type-Options: nosniff HTTP header in the response, but it is supported only in IE8+. Luckily, other browsers place more trust in the HTTP headers, and their MIME sniffing is much lighter; see e.g. the Firefox docs. There's a very good whitepaper documenting MIME sniffing in various browsers.

If possible, you should also try to check magic bytes from file contents. I would also advise you to use the Content-Disposition: attachment header.
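
As a rough sketch of the magic-byte check, assuming the whitelist is PDF, legacy .doc and .docx: the signatures used are the well-known `%PDF-` header, the OLE2 compound-file header used by .doc, and the ZIP local-file header used by .docx. The names `SIGNATURES` and `magic_matches` are made up for this example, and the check is deliberately strict (signature at offset 0).

```python
# Leading byte signatures for the whitelisted types (illustrative, not exhaustive).
PDF_MAGIC = b"%PDF-"
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .doc container
ZIP_MAGIC = b"PK\x03\x04"                       # .docx is a ZIP archive

SIGNATURES = {
    "application/pdf": (PDF_MAGIC,),
    "application/msword": (OLE2_MAGIC,),
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": (ZIP_MAGIC,),
}


def magic_matches(declared_type: str, head: bytes) -> bool:
    """Return True if the first bytes of the upload fit the declared MIME type.

    Types outside the whitelist are always rejected.
    """
    signatures = SIGNATURES.get(declared_type)
    if signatures is None:
        return False
    return head.startswith(signatures)


print(magic_matches("application/pdf", b"%PDF-1.7\n"))         # a PDF header
print(magic_matches("application/pdf", b"<html><p>Hello</p>"))  # HTML posing as PDF
```

This only rejects crude mismatches. The ZIP signature, for instance, is shared by every ZIP-based format, so a file that is valid as more than one type still passes.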

Glorfindel
Krzysztof Kotowicz
  • I'm not able to reproduce the content sniffing behaviour with the exact PHP file you've just given and IE8/Win XP SP3 via one of Microsoft's [VMs for testing IE](http://www.modern.ie/en-us/virtualization-tools#downloads) - I just get the File Download dialog. Perhaps SP3 changed the content-sniffing behaviour? I don't have any idea how to obtain access to an SP2 machine or VM to test on. By the way, do you know whether the issue you're describing applied only to *known* MIME types or also to unknown ones like `application/json`? – Mark Amery Jan 18 '14 at 18:47
  • Just wanted to say, I think this IE feature is called 'quirks mode'? – deed02392 Mar 15 '21 at 16:40

Content sniffing. Your proposal is not enough: it will be vulnerable to content-sniffing attacks. I've written elsewhere about strategies to prevent content-sniffing attacks. There are a variety of defenses. Here are the main ones:

  • Include a Content-Type: header in the response. Make sure it contains a valid MIME type (avoid invalid MIME types).

  • Include an X-Content-Type-Options: nosniff header in the response. This will turn off IE's content-sniffing algorithms on recent versions of IE.

  • If you don't intend for the content to be viewed in the browser, it can help to set Content-Disposition: attachment, to make the browser treat it as a file download.
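
The three headers above could be assembled roughly as follows. This is a sketch, not tied to any particular web framework: `ALLOWED_TYPES` and `download_headers` are hypothetical names, and the filename is percent-encoded into the `filename*` parameter so that quotes or line breaks in a user-supplied name cannot break out of the header.

```python
from urllib.parse import quote

# Hypothetical whitelist matching the question (Word and PDF).
ALLOWED_TYPES = {"application/pdf", "application/msword"}


def download_headers(mime_type: str, filename: str) -> dict[str, str]:
    """Build response headers for serving an untrusted upload as a download."""
    if mime_type not in ALLOWED_TYPES:
        raise ValueError(f"MIME type not whitelisted: {mime_type}")
    # Percent-encode everything outside the unreserved set for filename*.
    encoded = quote(filename, safe="")
    return {
        "Content-Type": mime_type,
        "X-Content-Type-Options": "nosniff",
        "Content-Disposition": f"attachment; filename*=UTF-8''{encoded}",
    }


for name, value in download_headers("application/pdf", 'report "final".pdf').items():
    print(f"{name}: {value}")
```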

Even these steps are not guaranteed to be enough. For instance, if the user is using IE6, they'll still be vulnerable.

(If this sounds annoying, you sure are right. Blame the Apache folks for including a crummy default configuration that broke web standards, for many years, and ignored pleas to do something about it. Unfortunately, now it is too late: we are stuck with a large deployed base of browsers that do dangerous things.)

Separate domain. A better defense is to host the user-provided content on a separate domain, which is used only for user-uploaded content. That way, a successful content-sniffing attack cannot attack your site's content. One user's upload can still attack other users' uploads, but that may be tolerable.

Check your whitelist. You seem to assume that PDF and Word files are harmless. However, those are two powerful and dangerous file formats. PDF is notorious as an attack vector: malicious PDF files abound, and can successfully penetrate many older PDF viewers. The PDF risk is so high that Chrome takes special precautions before allowing you to download and view a PDF document in your PDF viewer. Word is also a powerful and dangerous file format that can carry attacks. For this reason, I would not consider Word or PDF harmless.

You may be able to redirect users to Google Docs, to view the Word/PDF file in their browser through Google Docs. Google will convert the Word/PDF file to HTML and then send it to the viewer's browser. This may or may not be acceptable in your circumstances.

Scan file uploads for viruses. I recommend that you scan all user-uploaded content for viruses, using some virus or malware scanner. For PDF files, see also How to scan a PDF for malware?. You might want to scan the upload immediately when it is uploaded. You might also consider periodically re-scanning older files (this may catch some malware that wasn't previously detected, as antivirus/malware definitions are updated).

More information. See also Mozilla's checklist for file uploads, in their secure coding standard. It's a pretty good list of good security practices.

Summary. In summary, the most powerful and effective defense you can use is to place the user-uploaded content on a separate domain. Then, as additional protection, you may want to consider the other defenses listed here.


Update: I just learned about one more problem with your scheme. Apparently, Flash ignores the Content-Type header, which could allow loading a malicious SWF, which can then do everything you'd do with an XSS. (Sigh, stupid Flash.) Unfortunately, no amount of whitelisting can stop this attack. Consequently, it appears that the only safe solution is to host the user-uploaded content on a separate domain.

D.W.
  • Do you think that having a whitelist of allowed extensions / content types is necessary for a file sharing service? I think the owner of such a site shouldn't check for viruses or other stuff, as the main aim of such a site is to give the highest availability. Am I right? – Andrei Botalov Feb 23 '12 at 20:08
  • If you allow all content types, you allow attackers to upload a malicious HTML or Flash or Javascript file; browsers will then execute that malicious stuff as though it came from the site itself. That's (almost by design) a self-inflicted XSS vulnerability. Of course, it is up to the site owner what risks they want to defend against. I'm just trying to give some potential defenses for those who *are* concerned about these risks. Those who aren't concerned should, as always, feel free to ignore this and follow whatever they are comfortable with. – D.W. Feb 24 '12 at 05:24
  • The files are put on a separate domain, so the site is not vulnerable to XSS in that way. Do you think that a file sharing service should scan files for viruses/malware and keep a list of allowed content types and extensions? – Andrei Botalov Feb 24 '12 at 06:26
  • @Andrey, even on a separate site, one user's uploaded files can still attack another user's uploads. Whether this is worth worrying about depends upon your situation. As far as whether sites should scan for viruses/malware, that's a subjective question one could argue either way. I suggest that you examine your own circumstances and form your own opinion about the cost-benefit tradeoff. All I'll say is that if you want to do the most possible to protect your users, scanning the files for viruses/malware is one way you might be able to help some users. – D.W. Feb 24 '12 at 06:42

I think the other 2 answers are good. But here's another thing to think about: even if you were completely successful in restricting what types of content you serve, Word docs and PDF docs can still be malicious by themselves.

A PDF doc can exploit any of the hundreds of vulnerabilities that exist in older versions of Adobe Reader. The best mitigation against this is to scan each uploaded file with antivirus/malware detection. (Although that's still a limited mitigation due to the cat-and-mouse nature of virus signatures.)

Mark E. Haase