3

I have a form that takes user input and records it in a table. The form sanitizes the input by deleting certain characters (mentioned below). I can export the table as a CSV, which outputs the cells as they are.

The correct way to prevent a CSV injection is to prefix a single quote to the start of a cell that looks like a formula, i.e. those starting with =, +, -, @ or |.
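A minimal Python sketch of that single-quote approach follows. The helper names `neutralize_cell` and `export_csv` are hypothetical, the prefix list is just the one given above, and my real export code is not shown here.

```python
import csv
import io

# Leading characters treated as formula triggers (the list from the paragraph above).
FORMULA_PREFIXES = ("=", "+", "-", "@", "|")


def neutralize_cell(value: str) -> str:
    """Prefix a single quote when the cell looks like a formula."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def export_csv(rows) -> str:
    """Write rows to a CSV string, neutralizing every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(str(cell)) for cell in row])
    return buffer.getvalue()


print(export_csv([["=SUM(1+1)", "plain text"]]))
```

Note that this also prefixes legitimate values such as negative numbers ("-5"), so whether that trade-off is acceptable depends on the data.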

However, if the form instead strips characters like ;, ! and | from the input before it gets recorded in the table for CSV export, will that be sufficient by itself?

So far, the only 'injection' I have managed to do is '=SUM(1+1)'. Popular payloads like @SUM(1+1)*cmd|' /C calc'!A0 or DDE ("cmd";"/C calc";"!A0")A0 do not work, because the key characters get deleted.
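To illustrate what I am seeing, here is a rough Python sketch of a strip-based filter. The deny-list `STRIPPED_CHARS` and the function `strip_filter` are assumptions standing in for my actual form code; only the three characters named above are included.

```python
# Hypothetical deny-list: only the characters named in the question.
STRIPPED_CHARS = ";!|"


def strip_filter(value: str) -> str:
    """Delete every deny-listed character from the input."""
    return value.translate(str.maketrans("", "", STRIPPED_CHARS))


payloads = [
    "=SUM(1+1)",
    "@SUM(1+1)*cmd|' /C calc'!A0",
]

for payload in payloads:
    cleaned = strip_filter(payload)
    # The leading character survives, so the cell may still be read as a formula.
    print(repr(cleaned), "starts with formula trigger:", cleaned[0] in "=+-@")
```

The command-execution part of the second payload is mangled, but both cells still begin with a formula trigger, which is why '=SUM(1+1)' still evaluates.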

isopach
  • 2
    You’re a bit short on details; the actual implementation may meet the general terms described but still not be a complete defence. I suggest you provide more details on how it is implemented. – wireghoul Aug 02 '18 at 10:57
  • 1
    To follow up on wireghoul, it would be helpful to know what engine is parsing your CSV and executing the cells as formulas. Excel? A JavaScript web page? Is this text passing through a command line? Something else? What counts as a dangerous character for injection depends on what software is parsing the text, which you have not told us. – Mike Ounsworth Aug 02 '18 at 11:39
  • Also useful to know: where the data is coming from and what the trust model is, i.e. is your form used by internal employees, and therefore mostly trustworthy, or is it public-facing, where you expect the data to contain actual viruses? – Mike Ounsworth Aug 02 '18 at 12:59

1 Answer

6

I don't think stripping a few characters is a strong solution, especially if you develop the functionality yourself: someone could bypass the check, for example by encoding those characters, and get the formula executed anyway (your CSV reader, e.g. Excel, may parse the input differently than your filter does and still execute it). The best solution is to use already tested mitigations, like the one you mentioned of putting a single quote at the beginning of the string. Another solution is provided by NCC Group in this report:

When performing a CSV Export, for any cell that starts with an =, -, ", @, or +, add a space to the beginning and remove any tab characters (0x09) in the cell. Alternatively, prepend each cell field with a single quote, so that their content will be read as text by the spreadsheet editor.
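The first option in that quote could be sketched in Python roughly as follows. The function name `ncc_sanitize` is hypothetical, and this is only one reading of the quoted recommendation, not code from the report itself.

```python
# Leading characters listed in the quoted recommendation.
NCC_PREFIXES = ("=", "-", '"', "@", "+")


def ncc_sanitize(value: str) -> str:
    """Add a leading space and drop tab characters for formula-like cells."""
    if value.startswith(NCC_PREFIXES):
        return " " + value.replace("\t", "")
    return value


print(repr(ncc_sanitize("=SUM(1+1)")))
print(repr(ncc_sanitize("plain text")))
```

It would be applied to every cell at export time, leaving the data stored in the table untouched.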

meliot
  • I am very confident that the check cannot be bypassed. Also, how would encoded characters get executed in an Excel context if they are still encoded? – isopach Aug 02 '18 at 11:23
  • 2
    @Yuu I think he means "bypass" in the sense that the data is uploaded in some weird utf-8 encoding, so looks fine to your checker, but then Excel will helpfully convert it back to the local character set and it becomes dangerous again. Or something like that. Without deep knowledge of what the parser is doing internally it's much harder to be confident that you've thought of everything. – Mike Ounsworth Aug 02 '18 at 13:04