
I am working with a system that allows users to upload CSV files, which are then downloaded by other users.

The system validates (amongst other things) that all CSV files can be parsed by an RFC 4180 compliant parser, and are valid UTF-8. It ensures that when files are downloaded, they have Content-Type: text/csv; charset=utf-8, and Content-Disposition: attachment; filename="download.csv".
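
The checks described above could be sketched roughly as follows. This is only an illustration: `validate_upload` and `DOWNLOAD_HEADERS` are hypothetical names, and Python's built-in `csv` module in strict mode is used as a stand-in for an RFC 4180 compliant parser, which it only approximates.

```python
import csv
import io


def validate_upload(raw: bytes) -> list[list[str]]:
    """Return the parsed rows, or raise ValueError if the upload is rejected.

    Assumption: the csv module with strict=True is close enough to an
    RFC 4180 parser for illustration; a real system may use a stricter one.
    """
    try:
        # errors="strict" makes any invalid UTF-8 byte sequence raise.
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"not valid UTF-8: {exc}") from exc

    try:
        # newline="" leaves line endings untouched so the csv module sees them.
        reader = csv.reader(io.StringIO(text, newline=""), strict=True)
        return list(reader)
    except csv.Error as exc:
        raise ValueError(f"not parseable as CSV: {exc}") from exc


# Response headers quoted in the question, attached to every download.
DOWNLOAD_HEADERS = {
    "Content-Type": "text/csv; charset=utf-8",
    "Content-Disposition": 'attachment; filename="download.csv"',
}
```

Note that a file can pass both checks and still contain cells that a spreadsheet would treat as formulas, which is what the question is really about.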

A concern has been raised that the system could be used to transmit malware or malicious code.

Are there any known mechanisms where a malicious CSV file could cause code to be executed by the recipient? If so, is there any further validation that would reduce the risk posed?

Stevoisiak
James_pic
  • Anything can contain malicious code. – xvk3 Jul 12 '17 at 18:02
  • @WillV This is true, but malicious code that is not executed, or malicious code that requires significant social engineering to get the user to execute, is much less of a problem than malicious code that is executed as a result of doing something innocuous. – James_pic Jul 12 '17 at 20:16
  • The parser of said CSV (just values separated by commas) would need to have a vulnerability, which the file would need to exploit. Unlikely. – xvk3 Jul 12 '17 at 20:22

2 Answers


Yes, there are known examples of malicious CSV files causing arbitrary code execution. Many people open CSV files in MS Excel, OpenOffice, or similar spreadsheet software, which evaluates cell contents as formulas and can launch external commands from them.

Some examples:

https://www.contextis.com//resources/blog/comma-separated-vulnerabilities/

https://hackerone.com/reports/72785

If your environment does not use popular applications such as MS Excel to open CSVs, the risk is significantly reduced. I would also check the downloaded CSV for external links, since these may point to sites hosting drive-by downloads and should not be visited.

whoami
  • That's exactly the sort of thing I was interested in. We can't control the software our users will use, and Excel and Open Office are both likely to be targets. – James_pic Jul 12 '17 at 20:11
  • @James_pic, perhaps consider displaying the CSV file in your webpage instead, if applicable? But from the links, it seems that the main thing you'd have to do is ensure that users cannot have formulas in their CSVs (having them seems very unusual to me, as CSV is usually reserved for plain old data, with dynamic stuff using something like XLSX or ODS). If you really must support formulas, maybe offer a warning to users about possible malicious content? – Kat Jul 18 '17 at 19:00
  • From the first link, the suggested fix is that if a cell value starts with `=`, `@`, `+`, or `-`, you escape it by putting a single quote (`'`) before it. However, this might cause issues for legitimate CSV files that aren't being opened in spreadsheet applications. That might need to be communicated to users, otherwise you risk breaking people's automation, etc. – Kat Jul 18 '17 at 19:03
  • @Kat Based on this link: https://blog.zsec.uk/csv-dangers-mitigations/, we're thinking it'll be sufficient to block anything that starts with one of those prefixes *and* contains `|` (and maybe also `=` with `DDE`, to mitigate the equivalent issue in OpenOffice or LibreOffice). And only displaying it in a browser is a non-starter - the users need to be able to process the CSV files in the sorts of data processing tools that are common for this sort of thing. – James_pic Jul 19 '17 at 11:05
  • Oh, and escaping is also a non-starter in this case, since many users won't take their data anywhere near a spreadsheet (they'll put it through a Perl script, or stick it into an ETL routine, or import it into R, or that sort of thing), so this would break that workflow. – James_pic Jul 19 '17 at 11:09
  • And yes, a warning about malicious content is also likely to be part of the approach we take - something like blocking anything that we're sure is harmful, and putting a big warning in front of anything that could be interpreted as formulae. – James_pic Jul 19 '17 at 11:11
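
The rules discussed in these comments could be sketched roughly like this. It is a minimal illustration, not a vetted filter: `classify_cell`, `classify_csv`, and `escape_cell` are hypothetical names, ignoring leading whitespace is an extra conservative assumption, and the `DDE` check is only a loose text match.

```python
import csv
import io
import re

# Prefixes that spreadsheet applications may interpret as the start of a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")
# Loose match for the DDE function name (assumption: case-insensitive).
DDE_PATTERN = re.compile(r"\bDDE\b", re.IGNORECASE)
SEVERITY = {"ok": 0, "warn": 1, "block": 2}


def classify_cell(value: str) -> str:
    """Return 'block', 'warn', or 'ok' for a single cell value."""
    stripped = value.lstrip()  # assumption: leading whitespace is ignored
    if not stripped.startswith(FORMULA_PREFIXES):
        return "ok"
    if "|" in stripped:
        return "block"  # formula prefix plus pipe, as in cmd|' /C calc'!A0
    if stripped.startswith("=") and DDE_PATTERN.search(stripped):
        return "block"  # OpenOffice/LibreOffice DDE equivalent
    return "warn"  # could be read as a formula, but not clearly harmful


def classify_csv(text: str) -> str:
    """Return the most severe verdict found in any cell of the file."""
    worst = "ok"
    for row in csv.reader(io.StringIO(text, newline="")):
        for cell in row:
            verdict = classify_cell(cell)
            if SEVERITY[verdict] > SEVERITY[worst]:
                worst = verdict
    return worst


def escape_cell(value: str) -> str:
    """The single-quote escaping from the first link (alters the data)."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value
```

With these rules, ordinary data such as a negative number (`-5`) is classified as `warn`, which shows why escaping or blocking on the prefix alone would disrupt non-spreadsheet workflows.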

Yes, it may contain arbitrary system commands that will be executed on the machine where the CSV file is opened. Spreadsheet software will interpret the injected CSV values as formulas and, after showing you several warnings, execute the commands they contain.

Example: create a CSV file with the following two lines:

User name,Email,Designation

=2+5+cmd|' /C calc'!A0,a@b.com,SSE

Save it and open it using MS Excel. Calculator will open on your Windows system.
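
For contrast, here is a small sketch of what a plain CSV parser does with the same two lines. It uses Python's standard `csv` module; `SAMPLE` and `read_rows` are illustrative names. The payload comes back as an ordinary string and nothing is evaluated, which suggests the danger lies in the spreadsheet application rather than in the CSV format itself.

```python
import csv
import io

# The two lines from the example above, as they would appear in the file.
SAMPLE = "User name,Email,Designation\n=2+5+cmd|' /C calc'!A0,a@b.com,SSE\n"


def read_rows(text: str) -> list[list[str]]:
    """Parse CSV text into rows of plain strings; no formulas are evaluated."""
    return list(csv.reader(io.StringIO(text, newline="")))


rows = read_rows(SAMPLE)
# The first cell of the data row is just text to this parser.
print(rows[1][0])
```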

For further reading -

Arka