ModSecurity OWASP Core Rule Set - unicode false positive


We run some web services.

We use ModSecurity for the Apache web server with the OWASP Core Rule Set.

We have problems with Greek and Russian requests because of the Greek and Cyrillic letters they contain.

The OWASP CRS rules contain patterns like:

"(^[\"'´’‘;]+|[\"'´’‘;]+$)"

In the ModSecurity log, UTF-8 code units appear where Unicode characters should be. All ASCII letters are shown as characters, as expected.

Example:

[Matched Data: \x85 2 \xce\xb7\xce\xbb\xce\xb9\xce\xbf\xcf\x85\xcf\x80\xce found within ARGS:q: 163 45 \xcf\x83\xce\xbf\xcf\x85\xce\xbd\xce\xb9\xce\xbf\xcf\x85 2 \xce\xb7\xce\xbb\xce\xb9\xce\xbf\xcf\x85\xcf\x80\xce\xbf\xce\xbb\xce\xb7]

[Pattern match "(?i:(?:[\"'\\xc2\\xb4\\xe2\\x80\\x99\\xe2\\x80\\x98]\\\\s*?(x?or|div|like|between|and)\\\\s*?[\\"'\xc2\xb4\xe2\x80\x99\xe2\x80\x98]?\\d)|(?:\\\\x(?:23|27|3d))|(?:^.?[\"'\\xc2\\xb4\\xe2\\x80\\x99\\xe2\\x80\\x98]$)|(?:(?:^[\\"'\xc2\xb4\xe2\x80\x99\xe2\x80\x98\\\\]*?(?:[\\ ..."]

Now we know that it was triggered by a request in Greek: σουνιου ηλιουπολη (a street in Athens). That is not our problem; we can figure that out.

The problem is that the byte 0x80 is part of the character ’ (e2 80 99) and is also part of a Greek letter (π is cf 80). Because the character class is applied byte by byte, the lone 0x80 matches and we get a false positive.
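The effect can be reproduced outside ModSecurity. The following is a minimal sketch in Python (3.7 or later), not ModSecurity's own matching code: it assumes the rule engine treats the pattern and the input as raw bytes, and it compares that with a character-aware match. The names `QUOTE_CHARS`, `byte_class` and `char_class` are made up for this illustration.

```python
import re

# The five quote-like characters from the CRS character class.
QUOTE_CHARS = ['"', "'", "\u00b4", "\u2019", "\u2018"]

# Byte-oriented class: every UTF-8 byte of every character becomes its own
# class member, so 0x80 (from e2 80 99 and e2 80 98) matches all by itself.
class_members = b"".join(ch.encode("utf-8") for ch in QUOTE_CHARS)
byte_class = re.compile(b"[" + re.escape(class_members) + b"]")

# Character-oriented class: the members are whole code points.
char_class = re.compile("[" + re.escape("".join(QUOTE_CHARS)) + "]")

query = "σουνιου ηλιουπολη"
byte_hit = byte_class.search(query.encode("utf-8"))
char_hit = char_class.search(query)

# byte_hit is the second byte of π (cf 80); char_hit finds nothing.
print(byte_hit.group(), byte_hit.start())
print(char_hit)
```

The byte-oriented class hits the continuation byte of π even though the request contains no quote character at all, while the character-oriented class does not match.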

The actual rule that was triggered:

SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|!REQUEST_COOKIES:/_pk_ref/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "(?i:(?:[\"'´’‘]\s*?(x?or|div|like|between|and)\s*?[\"'´’‘]?\d)|(?:\\x(?:23|27|3d))|(?:^.?[\"'´’‘]$)|(?:(?:^[\"'´’‘\\]?(?:[\d\"'´’‘]+|[^\"'´’‘]+[\"'´’‘]))+\s*?(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*?[\w\"'´’‘][+&!@(),.-])|(?:[^\w\s]\w+\s?[|-]\s*?[\"'´’‘]\s*?\w)|(?:@\w+\s+(and|x?or|div|like|between|and)\s*?[\"'´’‘\d]+)|(?:@[\w-]+\s(and|x?or|div|like|between|and)\s*?[^\w\s])|(?:[^\w\s:]\s*?\d\W+[^\w\s]\s*?[\"'`´’‘].)|(?:\Winformation_schema|table_name\W))" "phase:2,capture,t:none,t:urlDecodeUni,block,msg:'Detects classic SQL injection probings 1/2',id:'981242',tag:'OWASP_CRS/WEB_ATTACK/SQL_INJECTION',logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',severity:'2',setvar:'tx.msg=%{rule.id}-%{rule.msg}',setvar:tx.sql_injection_score=+1,setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},setvar:'tx.%{tx.msg}-OWASP_CRS/WEB_ATTACK/SQLI-%{matched_var_name}=%{tx.0}'"

As a workaround we changed some patterns like [\"'´’‘] to (\"|'||\xc2\xb4|\xe2\x80\x99|\xe2\x80\x98) so that they match the actual sequences of UTF-8 code units that form a character. We could do this for all 55 SQL injection rules of the Core Rule Set, but that would be very time-consuming.

We wonder whether this is simply a misconfiguration of the decoding in Apache or ModSecurity. We know that web browsers URL-encode all non-ASCII characters, and some ASCII characters as well, as %-escaped UTF-8.
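Generating the alternation rather than typing it by hand might reduce the effort. The sketch below is only an illustration in Python: `to_alternation` is a hypothetical helper, it writes every byte as a `\xNN` escape (including the ASCII quotes), and it does not parse or rewrite real CRS rule files.

```python
import re

# The five quote-like characters from the CRS character class.
QUOTE_CHARS = ['"', "'", "\u00b4", "\u2019", "\u2018"]

def to_alternation(chars):
    """Build a group that matches only complete UTF-8 sequences (hypothetical helper)."""
    parts = []
    for ch in chars:
        # Spell out each byte of the character as a \xNN escape.
        parts.append("".join("\\x%02x" % b for b in ch.encode("utf-8")))
    return "(?:" + "|".join(parts) + ")"

pattern = to_alternation(QUOTE_CHARS)
print(pattern)

# Compile as a byte pattern to imitate byte-wise matching.
alt = re.compile(pattern.encode("ascii"))

greek = "σουνιου ηλιουπολη".encode("utf-8")
quoted = "it\u2019s".encode("utf-8")

print(alt.search(greek))           # no complete quote sequence in the Greek text
print(alt.search(quoted).group())  # the full three-byte sequence of ’
```

Because each alternative is a whole byte sequence, the lone 0x80 inside π no longer matches, while a real ’ is still found.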
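For reference, this is what the browser-side encoding looks like for the letter that causes the hit. It is a small Python sketch using the standard library and says nothing about how Apache or ModSecurity decode internally; it only shows that plain percent-decoding yields the raw UTF-8 bytes seen in the log.

```python
from urllib.parse import quote, unquote_to_bytes

# Percent-encode the Greek letter π as UTF-8, the way a browser sends it.
encoded = quote("π")

# Plain percent-decoding gives back the raw UTF-8 bytes, including 0x80.
decoded = unquote_to_bytes(encoded)

print(encoded)
print(decoded)
```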

Marco Wagner

Posted 2016-08-16T12:39:45.240

Reputation: 101

Could you make the module reject anything that is not valid UTF-8? That way 0x80 will no longer occur alone (UTF-8 validity will ensure that). – user1686 – 2016-08-31T05:11:09.260

It is valid UTF-8 encoding. e2 80 99 is fine. You did not get the problem here. – Marco Wagner – 2016-08-31T14:13:36.003

No answers