My team currently uses ASP.NET with MS SQL Server for our software. Security has only become a priority since I started, and there are injection vulnerabilities everywhere.

Because the software integrates with both Oracle and MS SQL, the business decided never to use parameterized queries. This has, of course, been a problem.

Implementing find-and-replace along with whitelisting of parameters has reduced the problem considerably.

My only remaining concern is that I have read a lot about Unicode and other encodings being a cause of SQL injection, and I don't quite understand this. Currently we sanitise everything like this:

        Const pattern As String = "^[a-zA-Z0-9.=,:\s\\\/\']*$"
        term = term.Replace("'", "''")

        If Not Tools.ValidString(term, pattern) Then
            term = String.Empty
        End If 

    Public Shared Function ValidString(ByVal source As String, ByVal pattern As String) As Boolean
        If source = String.Empty Then Return True

        Dim params As Text.RegularExpressions.RegexOptions = Text.RegularExpressions.RegexOptions.None
        Dim regex As New Text.RegularExpressions.Regex(pattern, params)
        ' The instance already holds the pattern and options, so only the input is passed.
        Return regex.IsMatch(source)
    End Function

Does anyone have an example where Unicode/encoded injection could be used, or just a plain example where this regular expression would fail to prevent SQL injection?



Please, no answers about standard SQL injection; I am already very familiar with it. Also, please stop posting that I shouldn't use string sanitisation. There are zero resources in the company to shift all queries to parameterised queries with ADO.NET while also building in logic to use ODP.NET if the client uses Oracle. OWASP mentions whitelisting characters if parameterising is out of the question, so, as in the regex, only a few characters are allowed. I am not blacklisting characters, as that is stupid.

There is no compliance required for the data we hold. The security is for database integrity, as it would be a nightmare if content was changed.

Our software is a very large cloud application, a CMS and DMS in one. 99% of it is used internally; only a small part is external, and that part is used only for public review of and commenting on the documents.

My new understanding of Unicode injection is that it can only occur if the data is being encoded before being placed into the query, so it only really occurs in applications that globalise their data. I am passing raw string fields straight into the query string after the sanitisation above.

Can I please only have an answer from an expert in injection, who can back up my claim that Unicode will not apply in my circumstance?

  • [Same question on Stack Overflow.](http://stackoverflow.com/q/22824597/53114) – Gumbo Apr 03 '14 at 02:16
  • possible duplicate of [No single quotes is allowed, Is this SQL Injection point still exploitable?](http://security.stackexchange.com/questions/37749/no-single-quotes-is-allowed-is-this-sql-injection-point-still-exploitable) – Eric G Apr 03 '14 at 19:57
  • Also related: https://security.stackexchange.com/questions/41139/getting-a-sql-injection-past-a-given-regex/41141#41141 – Eric G Apr 03 '14 at 19:58
  • Not a duplicate of "no single quotes is allowed". I'm looking specifically for how it could be exploitable via Unicode; I know exactly what is and isn't exploitable in query string structure. I need an expert with specific examples of how it could be exploitable. Someone on Stack Overflow mentioned that as long as I'm leaving the query as it is and not actually encoding the values as I pass them into the query (as is done when globalising data), then it's not exploitable via Unicode. Can any experts here confirm this? – Cyassin Apr 03 '14 at 22:13
  • Your castle walls are made of tissue paper, and you are asking us how to protect against a specific make and model of catapult. The problem is not the catapult; it is the tissue paper. – Stephen Touset Apr 03 '14 at 23:49
  • [This is related](http://stackoverflow.com/questions/910465/avoiding-sql-injection-without-parameters?rq=1). In brief, the fact that you don't know (and honestly, I don't know either) is a very strong argument for revisiting *why* you aren't using parameters. – Tim M. Apr 02 '14 at 23:00
  • [Here](http://stackoverflow.com/a/12118602) is an example of an attack vector in PHP. It's a very obscure case but it's just to give you an idea how crazy things can get. Using parameterized queries would be the way to go IMO. As for your regex, you might make it stricter replacing `\s` with a literal space since `\s` [matches more than just a space](http://rick.measham.id.au/paste/explain.pl?regex=%5Cs) depending on the regex engine. – HamZa Apr 02 '14 at 23:19
  • Parameterised queries are not an option with our resources. – Cyassin Apr 02 '14 at 23:21
  • I don't need examples of how SQL injection works. I previously did pen tester work without automated tools. I have only recently heard of Unicode attacks and believe I'm covered, but was hoping for someone with direct experience of Unicode attacks... – Cyassin Apr 02 '14 at 23:28
  • @user2338488 I'm not an ASP.NET developer, but if you're checking the query against `^[a-zA-Z0-9.=,:\s\\\/\']*$` then there is nothing to be afraid of "unicode" since you're including only ASCII characters – HamZa Apr 02 '14 at 23:28
  • I thought unicode or other encodings could be represented like /u1006 or /x9d etc? – Cyassin Apr 02 '14 at 23:33
  • They can be, but if you treat the whole encoded string as characters, you won't have anything to worry about. Don't decode the string. –  Apr 02 '14 at 23:35
  • Thanks Adam, so if for example I dump a sanitised string straight into a MS SQL query that contains encoded characters I will be fine due to them not being decoded anywhere? – Cyassin Apr 02 '14 at 23:38
  • That is correct. –  Apr 03 '14 at 00:03
  • @HamZa The obscure case only exists because of a [misuse of the API](http://stackoverflow.com/q/5288953/53114). – Gumbo Apr 03 '14 at 02:11
  • "Due to current integration with Oracle and MS SQL the business decision was never to use parameterized queries." This doesn't make any sense. Both SQL Server and Oracle provide support for parameterized queries. They both implement ADO.NET, giving you a common interface for using them even. – jpmc26 Oct 20 '17 at 01:48

4 Answers


There are cases of SQL Injections leveraging the implicit conversion of Unicode homoglyphs from Unicode character string types (NCHAR, NVARCHAR) to character string types (CHAR, VARCHAR). A character such as ʼ (U+02BC) in NVARCHAR may slip through the escaping routine and get translated to ' (U+0027) in VARCHAR, which may result in an SQL Injection when such a string is used to build an SQL statement dynamically.
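The mechanism can be sketched in a few lines of Python. This is only a simulation: `BEST_FIT` is a hypothetical stand-in that models the single mapping mentioned above, not the database's real conversion table, and `escape_quotes` and `narrow` are illustrative names.

```python
# Hypothetical stand-in for the NVARCHAR -> VARCHAR conversion described
# above; only the single mapping named in the text is modelled.
BEST_FIT = {"\u02bc": "'"}


def escape_quotes(term: str) -> str:
    """Double every ASCII apostrophe (U+0027), like a typical escaping routine."""
    return term.replace("'", "''")


def narrow(term: str) -> str:
    """Simulate the implicit conversion to a non-Unicode string type."""
    return "".join(BEST_FIT.get(ch, ch) for ch in term)


payload = "x\u02bc OR 1=1"
escaped = escape_quotes(payload)  # unchanged: there is no U+0027 to double
narrowed = narrow(escaped)        # the homoglyph has become a bare apostrophe

print(escaped == payload)
print(narrowed)
```

The escaping step runs before the conversion, so it never sees the apostrophe that the conversion later produces.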

However, your validation is pretty strict (only characters from the Basic Latin Unicode block and Unicode whitespace characters) and I can’t think of any case where this would fail.
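A rough way to check this is to run a few probe strings through the pattern. The sketch below transcribes the question's pattern for Python's `re` module; it assumes Python and .NET treat this particular pattern alike (in both, `\s` covers Unicode whitespace by default), and `is_whitelisted` and the sample strings are made up for illustration.

```python
import re

# The question's pattern, transcribed unchanged for Python's re module.
WHITELIST = re.compile(r"^[a-zA-Z0-9.=,:\s\\\/\']*$")


def is_whitelisted(term: str) -> bool:
    """Return True when every character of term is in the whitelist."""
    return WHITELIST.match(term) is not None


samples = {
    "ASCII apostrophe U+0027": "O'Brien",
    "modifier apostrophe U+02BC": "O\u02bcBrien",
    "fullwidth apostrophe U+FF07": "O\uff07Brien",
    "no-break space U+00A0": "a\u00a0b",
    "semicolon": "a; b",
}

for label, value in samples.items():
    print(f"{label}: {is_whitelisted(value)}")
```

The apostrophe look-alikes are rejected because they fall outside the character class, while the no-break space is accepted through `\s`. If only a plain space is intended, a literal space in the class would be stricter.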


Please, please, please, please do not use a handmade regex for preventing SQL injection. You should never be writing your own escaping, filtering, or sanitizing functions to prevent SQL injection, XSS, shell injection or the like. These are things you rely on built-in and vetted libraries for.

Where you can avoid it, don't even use a standard library escaping function. Use parameterized queries.

For one, that escaping won't help if you're using an unquoted integer value at the end of a query. In pseudocode:

    check ValidString(input)
    query("SELECT * FROM table WHERE id=" + input)

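The user can easily enter `1 OR 1=1` to see all rows in table.

...Or they can enter `1 UNION ALL SELECT password,1,1 FROM passwords` or the like (the whitelist has no underscore, so a name like `password_table` would be rejected, but any name built from permitted characters gets through). And now the output contains all user passwords.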
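The unquoted case can be checked against the question's pattern with a short Python sketch. `sanitise` mirrors the question's routine as I read it, while `build_query`, `parse_id` and the `documents` table are hypothetical names used only for illustration.

```python
import re

# The question's pattern, transcribed unchanged for Python's re module.
WHITELIST = re.compile(r"^[a-zA-Z0-9.=,:\s\\\/\']*$")


def sanitise(term: str) -> str:
    """Double apostrophes, then blank out anything that fails the whitelist."""
    term = term.replace("'", "''")
    return term if WHITELIST.match(term) else ""


def build_query(raw_id: str) -> str:
    """Concatenate a sanitised value into an unquoted numeric position (unsafe)."""
    return "SELECT * FROM documents WHERE id=" + sanitise(raw_id)


def parse_id(raw_id: str) -> int:
    """Convert numeric input to an integer, raising ValueError if it is not one."""
    return int(raw_id)


print(build_query("1 OR 1=1"))
```

The payload needs no quote at all, so doubling quotes does nothing, and every character in it is on the whitelist. Converting numeric input to an integer type before it reaches the query, as `parse_id` does, is one way to close that gap without parameters.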

Are you saying it is impossible to go back and convert all SQL code to use parameterized queries? If so, I suppose you had better be 100% sure you always quote SQL values and never do any integer comparisons or insertions...

If I were you, I would just convince your management and/or team that if security matters even slightly, then time needs to be taken to rewrite the code in a secure manner.

  • There are no unquoted integer values in the code. The code is also developed so that integer values are not passed as string parameters. We do not require any sort of compliance testing; it is more for the integrity of the database. – Cyassin Apr 03 '14 at 10:50
  • It is impossible to go back and convert all the code. It would also need to be compatible with Oracle, and we only have 2 full-time developers and a junior, plus there is almost 2 years' worth of work already scheduled to meet customer demands. This is a very large-scale custom CMS for the private sector too; it's not a small application. – Cyassin Apr 03 '14 at 10:55

The character you most need to block to stop SQL injection is the semicolon (;). However, it isn't always simple to just eliminate semicolons: there will be situations where one is used in a text field as an ordinary character, not as a SQL statement terminator.

There are plenty of articles that go into detail about how SQL injection works and how to prevent it in your queries.

If you write text output to a Web page, encode it using HttpUtility.HtmlEncode. Do this if the text came from user input, a database, or a local file.

This brings to mind one of the methods used to protect against SQL injection: replace characters that can cause problems with encoded forms that can't.

Specifically for Unicode versus any other encoding, you will still have to use the same techniques for finding, and either removing or replacing the offending characters in your strings.


If you encode Unicode strings to something like /u0000, then you can leave the string encoded and safely put it into your database without worrying about SQL injection. You will have to write the routine that performs the sanitization though, as it isn't built into SQL Server.

  • I have a good understanding of sql injection in its fundamental use as a previous pen tester. I just need a more direct answer to my question as I hear unicode is the method some large scale attacks have been made and would like more info directly on that. – Cyassin Apr 02 '14 at 23:26
  • You don't need semi-colons on SQL Server. Ie: `SELECT * FROM table1 DELETE FROM users` works fine. – Edson Medina Apr 27 '16 at 12:08

If you're looking at how Unicode can be exploited, see this question. If you're looking for a solution, there are really only two that can be considered secure: parameterized queries, and encoding all of your input into hex, base64, or some other encoding that leaves no possibility of changing the context of the value.
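As a rough illustration of the encoding option, here is a Python sketch that hex-encodes a value before it would be placed in a query. The choice of UTF-16LE and the `0x` prefix are assumptions, and `to_hex_literal` and `from_hex_literal` are hypothetical helpers; how the database turns the hex back into text is not shown and depends on the database in use.

```python
import re


def to_hex_literal(value: str) -> str:
    """Encode text as UTF-16LE and render it as a 0x-prefixed hex string."""
    return "0x" + value.encode("utf-16-le").hex()


def from_hex_literal(literal: str) -> str:
    """Reverse to_hex_literal, for round-trip checking."""
    return bytes.fromhex(literal[2:]).decode("utf-16-le")


hostile = "x\u02bc' OR 1=1; --"
literal = to_hex_literal(hostile)

# Whatever the input, the output uses only 0-9, a-f and the 0x prefix.
print(re.fullmatch(r"0x[0-9a-f]*", literal) is not None)
print(from_hex_literal(literal) == hostile)
```

Because the encoded form cannot contain a quote, a space, or any other SQL syntax, the value has no way to leave the position it was written into.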

I would seriously reconsider the use of parameterized queries. They are in fact remarkably easy to use, and while I don't know your code base, it's generally fairly easy to switch over, particularly since the introduction of extension methods.
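To give a sense of how little code is involved, here is a minimal sketch using Python's built-in `sqlite3` module rather than ADO.NET. The `docs` table and `find_by_title` are made up for illustration, and the placeholder syntax (`?` here) differs between database drivers.

```python
import sqlite3

# In-memory database with a small illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO docs (title) VALUES (?)", [("alpha",), ("beta",)])


def find_by_title(connection: sqlite3.Connection, title: str) -> list:
    """Look up rows by title; the value is bound as a parameter, never concatenated."""
    cursor = connection.execute(
        "SELECT id, title FROM docs WHERE title = ?", (title,)
    )
    return cursor.fetchall()


hostile = "alpha' OR '1'='1"
print(find_by_title(conn, "alpha"))
print(find_by_title(conn, hostile))
```

The hostile string is compared as a plain value, so it matches no row instead of rewriting the query.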
