9

We are building an application where users can upload resumes in our system for our administrators to download. We are having a debate about restricting the content type of the files that can be uploaded.

I'm having a hard time piecing together exactly the security concerns I have with allowing any content type to be uploaded.

Is there a security risk of allowing any content type to be uploaded?

Andy
  • 1
    Similar to http://security.stackexchange.com/questions/8648/is-it-safe-to-store-and-replay-user-provided-mime-types – this.josh Feb 10 '12 at 00:23

2 Answers

9

It must be noted that a file does not have an inherent "content-type" per se. A file is a bunch of bytes, and has a name. When you download a file from a Web server, the server infers a content-type (such as "application/pdf") from whatever clues it can find: mostly the so-called "extension" (the few letters at the end of the file name; e.g. ".pdf" is assumed to indicate a PDF file), and sometimes the file contents themselves. For instance, an HTML file can carry a "meta" tag in its header that declares the content-type and character set (in practice it is mostly the browser, rather than the server, that reads it), like this:

<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
</head>

So you have two operations on your server: upload, and download. Upload is mostly safe: the file comes in and is stored. Download can be a worry: when an administrator downloads a file, they do so by clicking on some link within a Web browser, and the Web server infers a content-type, as described above. The Web browser then uses the content-type to decide what to do with the downloaded file, and this is not necessarily "suggest to the user to save it somewhere". For instance, if someone uploads a .html file, the Web browser will interpret it as HTML, displaying it and possibly executing whatever JavaScript is in it. Furthermore, the file comes from your own server, so chances are that the administrator's Web browser will trust that file by default. Various nasty things may happen at that point.

So you should filter the content-types under which you serve the files when they are downloaded. Mind the file name, too: even if the file is just saved on the administrator's system, it may still be a .exe file which the administrator will execute when clicking on it.
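As an illustration of the download-side filtering described above, here is a minimal sketch in Python. The allowlist, the function name `safe_download_headers`, and the `.bin` fallback are assumptions made for this example, not something from the original setup. The idea is to declare only content-types you chose yourself, ask the browser to save rather than render, and disable content sniffing.

```python
import re

# Hypothetical allowlist: extension -> content type the server will declare.
ALLOWED_TYPES = {
    ".pdf": "application/pdf",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".txt": "text/plain; charset=utf-8",
}


def safe_download_headers(original_name: str) -> dict[str, str]:
    """Build response headers that make the browser save the file, not render it."""
    # Drop any directory part, then keep only a conservative character set.
    base = original_name.replace("\\", "/").rsplit("/", 1)[-1]
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".") or "upload"

    dot = cleaned.rfind(".")
    ext = cleaned[dot:].lower() if dot > 0 else ""

    if ext not in ALLOWED_TYPES:
        # Unknown type: neutralize the extension so a double-click on the
        # saved file does not launch it (e.g. "x.exe" becomes "x.exe.bin").
        cleaned += ".bin"

    return {
        # Never echo a type supplied by the uploader.
        "Content-Type": ALLOWED_TYPES.get(ext, "application/octet-stream"),
        # "attachment" asks the browser to save instead of displaying inline.
        "Content-Disposition": f'attachment; filename="{cleaned}"',
        # Tell the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
    }
```

With this sketch, an uploaded `evil.html` would be served as `application/octet-stream` under the name `evil.html.bin`, so the administrator's browser has no reason to render it as a page from your origin.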

Moreover, allowing any kind of file to appear on your server may be an indirect tool to leverage an attack. There are some security holes in which the attacker can somehow force the execution of an arbitrary file on the server; an unfiltered upload mechanism allows the attacker to first push exactly the kind of executable file he would like to see executed on the server.
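One way to limit that risk on the upload side is sketched below, under stated assumptions: the extension allowlist, the function name `store_upload`, and the choice to store files under a random name with no extension are all hypothetical. The point is that the server never sees an attacker-chosen name or extension on disk, so it has nothing to map to an interpreter. The directory passed in is assumed to live outside the Web root.

```python
import secrets
from pathlib import Path

# Hypothetical policy: the only extensions accepted for resumes.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}


def store_upload(data: bytes, original_name: str, upload_dir: Path) -> Path:
    """Store an upload under a random, extension-less name.

    The original file name should be kept separately (e.g. in a database)
    and only used when building the download response.
    """
    ext = Path(original_name).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension {ext!r} is not allowed")

    upload_dir.mkdir(parents=True, exist_ok=True)
    # A random name with no extension: nothing here is attacker-controlled,
    # and a handler keyed on ".php" or similar will never match it.
    stored = upload_dir / secrets.token_hex(16)
    stored.write_bytes(data)
    return stored
```

An extension check alone does not prove the bytes are really a PDF or Word document; it only narrows what ends up on disk. It complements, rather than replaces, controlling how the file is served.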

Tom Leek
  • 1
    To summarize: is the advice to a) control the download content type and b) filter the upload extension? – 700 Software Feb 09 '12 at 17:40
  • 3
    A concrete example of this last kind; simply upload a PHP file with a `.php` extension instead of a PDF file with a `.pdf` extension and then request it in a browser. If you have PHP installed in your web server, this file will be executed on the server. – Ladadadada Feb 09 '12 at 17:43
8

Typically the problem is not due to uploading, but hosting and same-origin/malware issues.

Imagine the following scenario:

  1. Alice uploads a file which is served from https://example.com/alice/foo.
  2. Bob uploads a file which is served from https://example.com/bob/bar.
  3. Charlie downloads .../alice/foo, and later downloads .../bob/bar.

Alice and Bob are sharing an origin, https://example.com/. If the browser treats the files as static content, that doesn't introduce problems. But if the files can contain scripts, then a script in Alice's file might store credentials in cookies, and a script in Bob's file would be able to read those credentials.

Many file types can embed scripts: HTML, SVG, Flash, and others. Hosting active content introduces all kinds of same-origin issues.
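A common mitigation for the scenario above is to serve uploads from an origin that the application itself never uses. The sketch below is only an illustration: the host `usercontent.example.net`, the `/files/` path, and the helper names are assumptions. It shows how the origin comparison works (scheme, host, and port must all match) and which headers could additionally be sent with each uploaded file.

```python
from urllib.parse import quote, urlsplit

APP_ORIGIN = "https://example.com"
# Hypothetical separate domain used only for user uploads. A different
# registrable domain is preferable to a subdomain, since cookies can be
# scoped to a parent domain and shared across its subdomains.
UPLOADS_ORIGIN = "https://usercontent.example.net"

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple:
    """Return the (scheme, host, port) triple that browsers compare."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    return origin_of(a) == origin_of(b)


def upload_url(file_id: str) -> str:
    """Build the download URL for an upload on the isolated origin."""
    return f"{UPLOADS_ORIGIN}/files/{quote(file_id, safe='')}"


# Extra headers to send with every uploaded file: the CSP "sandbox"
# directive makes the browser treat the response as a unique origin with
# scripts disabled, even if it ends up being rendered.
ISOLATION_HEADERS = {
    "Content-Security-Policy": "sandbox",
    "X-Content-Type-Options": "nosniff",
}
```

In the example above, .../alice/foo and .../bob/bar are same-origin with each other and with the application, whereas a file served from the separate uploads host is not, so a script in it cannot read the application's cookies.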

Mike Samuel
  • Even static files may cause problems when the browser tries to *be smarter* and sniff their content type. – Pacerier Jul 13 '12 at 02:37
  • @Pacerier, When I talk about "static content" I am not referring to static files. By "if the browser treats it as static content", I mean if the browser does not treat it as containing privileged code. In this sense, code is *dynamic* -- regular textual or visual content is *static*. – Mike Samuel Jul 13 '12 at 06:26
  • I see, it should probably have been written as *non-code* content instead? – Pacerier Jul 13 '12 at 06:35