What security implications are involved when using a CSV file to fill a SQL table?

Question

I have some data on CSV files and I want automate the input of this data into an SQL database using Powershell. The data is mostly, if not completely, numbers.

I am using a script based on the one found here: SIMPLE POWERSHELL SCRIPT TO BULK LOAD CSV INTO A SQL SERVER TABLE.

What sort of security concerns would I have to address if using this? I don't know much about how SQL injection works, but would I need to be concerned about it in this case?

It's not a _security_ concern per se, but you should be aware that type conversions may not go as you expect, especially regarding decimal precision. If you previously precisely calculated $1.54, and you store that in a floating-point column, it'll actually get stored as (roughly) 1.5400000000000000355. For different reasons, if you have numbers that are too large, they may become wildly different numbers. — Nic, Aug 01 '19 at 01:21
SQL Server, like all major RDBMSs, has a [bulk load utility](https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-2017). I'd start by scripting around that, since presumably the utility is hardened against SQL Injection. — Clockwork-Muse, Aug 01 '19 at 17:54

svartedauden · Accepted Answer · 2019-08-01T17:52:41.703

SQL injection is the "injection" of an SQL query/command with the intention of execution. A good summary and reference is available from OWASP.

In this case, the risk would be that the CSV file includes commands which would, upon being loaded into the SQL database, be executed rather than simply inserted into a table.

You could approach resolving this problem from either the position of the sender or receiver. Approaching this from the sender side - that is, manipulating the input data - only makes sense if you are in control of the data and are concerned about malformed input rather than malicious attacks. Mitigating SQLi at the database side is the correct and far more secure way to do this, however in some situations it can be more effort.

On the sender side, you may choose to parse inputs before they are sent to ensure they only contain the intended data - perhaps only alphanumeric characters. This would generally make sense for something like a webpage form, where malformed input would prompt the user to do something like "Input a valid email address" which serves to both correct user input for accuracy as well as manage what sort of input is sent to the database. In your case, this may be simplest (though not necessarily best) solution if you know what the input CSV should look like and contain in all cases.

The generally better and more versatile solution involves handling this from the position of the data receiver, the SQL server itself. This means performing the same or similar validation as above, but instead as the SQL server receives and saves input. In the case of an SQL injection attempt, this should hopefully catch what looks like a query or malformed input and either discard it or treat it simply as text input - but most importantly, not execute it as a command.

The ways to validate input depend on what your input looks like and what you'd like to do with it in your own situation. OWASP also provides an excellent guide on this which I suggest you review.

It may be additionally useful to look at what SQL queries/injection attempts look like so you can see what sort of special characters and commands you will need to be able to handle if they may be a part of your input dataset.

What security implications are involved when using a CSV file to fill a SQL table?

1 Answers1