8

How can untrusted values be included in an HTML page as JavaScript string constants? Although the case I am asking about uses JSF and Rails, I think this is a general problem independent of the server-side framework.

on-attributes

It is obviously not enough to escape \ ' " in on-attributes:

onclick='alert("[variable]")'
variable: hello, world&quot;); alert(&quot;XSS
result: alert("hello, world&quot;); alert(&quot;XSS")

In Firefox this shows both a "hello, world" and an "XSS" alert dialog. As the variable does not contain a literal ", the Rails function escape_javascript is not sufficient in this context. Applying HTML escaping seems to be secure, as only the original dialog box is shown in this case:

result: onclick='alert("hello, world&amp;quot;); alert(&amp;quot;XSS")'

script-tags

In <script> tags, however, different rules apply in Firefox:

<script type="text/javascript">
   alert("hello, world&quot;); alert(&quot;XSS");
</script>

This displays just one dialog box, but the quotes are displayed in their escaped form. There are two conclusions: first, &quot; is not parsed as a JavaScript quotation mark (good); second, it is not parsed according to HTML rules either (bad). So any attempt to HTML-escape the constant creates broken data.

Skipping html escaping in script-tags, however, is not an option:

<script type="text/javascript">alert("[variable]");</script>
variable: hello, world</script><script>alert("XSS
result: <script type="text/javascript">
        alert("hello, world</script><script>alert("XSS");</script>

The first script tag creates a syntax error because of the missing ". But in Firefox the injected second script tag is executed.

tl;dr

Is there a secure way to write untrusted values to JavaScript constants? This seems to be a total mess, and I am currently leaning towards using HTML elements (with style="display:none") instead. An XMLHttpRequest call using JSON would be another option. But both approaches feel like cheating.

AviD
Hendrik Brummermann
  • For more information: http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet. This also applies to what @AviD had mentioned. –  Dec 24 '10 at 00:04

4 Answers

7

Another interesting (older) encoding library is the OWASP Encoding Project (previously the "Reform" library) - http://www.owasp.org/index.php/Category:OWASP_Encoding_Project

This is functionally identical to Microsoft's AntiXSS 1.x (not unusual - Mike Eddington wrote a lot of the internal Microsoft library that later became AntiXSS, and then WPL), but supports most common web languages except Ruby/Rails.

It implements a pessimistic encoding approach (encodes everything except what is known to be safe), and gives you results like this:

eval('; --> eval\x28\x27\x3b

eval\u0080 --> eval\u0080

<script>. --> \x3cscript\x3e.
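The shape of those outputs can be reproduced with a few lines of JavaScript. This is only a sketch of the pessimistic idea, not the library's actual code: the function name `encodeJsStringPessimistic` is hypothetical, and the safe set (ASCII letters, digits, and ".") and the \xHH versus \uHHHH cut-off at 0x80 are assumptions inferred from the sample outputs.

```javascript
// Hypothetical helper, NOT the OWASP library's implementation.
// Assumption: only ASCII letters, digits and "." count as safe.
function encodeJsStringPessimistic(input) {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    const code = input.charCodeAt(i); // UTF-16 code unit
    if (/[A-Za-z0-9.]/.test(ch)) {
      out += ch; // known safe: pass through unchanged
    } else if (code < 0x80) {
      // remaining ASCII: two-digit hex escape
      out += "\\x" + code.toString(16).padStart(2, "0");
    } else {
      // everything else: four-digit Unicode escape per code unit
      out += "\\u" + code.toString(16).padStart(4, "0");
    }
  }
  return out;
}

console.log(encodeJsStringPessimistic("eval(';"));   // eval\x28\x27\x3b
console.log(encodeJsStringPessimistic("<script>.")); // \x3cscript\x3e.
```

Everything outside the safe set is escaped without asking whether it is dangerous, which is why no list of "bad" characters has to be maintained.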

Justin Clarke
6

As you've noticed, there are at least two different contexts, which require different escaping rules: SCRIPT tags, and attribute values containing Javascript. Rather than trying to find a single answer that works for both, you should be looking for a different solution for each context. I will comment primarily on string literals inside SCRIPT tags.

A simple way to encode string constants in Javascript is to take unsafe characters and translate them to Unicode escapes: e.g., a -> \u0061. I suggest building a small whitelist of safe characters (a-z, A-Z, 0-9), and encoding everything else.
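A minimal sketch of that whitelist approach, written in JavaScript for illustration (on the server it would be written in the server's language). The name `toJsStringLiteral` is hypothetical, and the payload is the one from the question.

```javascript
// Hypothetical helper: returns a double-quoted JavaScript string literal in
// which every character outside a-z, A-Z, 0-9 is written as a \uXXXX escape.
function toJsStringLiteral(value) {
  let body = "";
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    if (/[A-Za-z0-9]/.test(ch)) {
      body += ch; // whitelisted: emit as-is
    } else {
      // escape the UTF-16 code unit; surrogate pairs become two escapes
      body += "\\u" + value.charCodeAt(i).toString(16).padStart(4, "0");
    }
  }
  return '"' + body + '"';
}

// The attack string from the question, embedded in a script block.
const payload = 'hello, world</script><script>alert("XSS';
const page = "<script>alert(" + toJsStringLiteral(payload) + ");</script>";
console.log(page);
```

The literal consists only of letters, digits, and backslash escapes, so the HTML parser never sees a `<`, `&`, or quote coming from the untrusted value.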

Even better would be to use a standard, well-vetted library to do the escaping. I don't have a specific recommendation, but others might have one. AntiXSS has a good reputation, and JavaScriptEncode looks relevant. Just make sure that it is appropriate for the particular context that the string will be inserted into.

Non-obvious pitfalls:

  • As you noticed, </SCRIPT> can terminate a SCRIPT tag, even when it is inside a quoted string. That's because, when the browser sees a <SCRIPT> open tag, the browser's HTML parser starts looking for a closing </SCRIPT> tag before it starts parsing the Javascript itself using the Javascript parser, and a </SCRIPT> inside a string constant can prematurely close the SCRIPT tag. Therefore, angle brackets are not safe to appear unescaped inside a Javascript string literal. For similar reasons, ampersands are not safe, either. (I thank Adam Barth and Collin Jackson for teaching me about this subtlety.)
  • Quotes can be used to break out of a quoted string, and thus single and double quotes are not safe.
  • Backslashes can be used to break out of a quoted string (imagine ending with a \), and thus are not safe.
  • If you use JSON, and you include untrusted user input inside the name of a property (instead of the value), things can get tricky. Make sure you quote the name and apply the same rules as you would for other string literals. Also, other checks may be needed as well (e.g., disallowing 'valueOf', etc.).
  • Watch out for the character set used on the HTML page. UTF-7 can really ruin your day. I suggest that you treat + and - as unsafe characters, and that you ensure that the page includes an explicit character set in the Content-Type HTTP header or an HTML META tag (to minimize the chance that the browser tries to auto-detect the character set).
  • Be careful not to double-escape. Double-escaping can re-introduce security problems, in some cases.
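As a rough illustration of the "other checks" mentioned for untrusted property names, one possible guard is sketched below. Both function names are hypothetical, and rejecting every name that an ordinary object inherits is an assumption about what such a check might cover, not a complete policy.

```javascript
// Hypothetical check: refuse names that collide with anything a plain object
// inherits from Object.prototype (valueOf, toString, constructor, ...).
function isSafePropertyName(name) {
  return (
    typeof name === "string" &&
    name !== "__proto__" &&
    !(name in Object.prototype)
  );
}

// Hypothetical wrapper that only assigns when the name passes the check.
function setUntrustedProperty(target, name, value) {
  if (!isSafePropertyName(name)) {
    throw new Error("rejected property name: " + name);
  }
  target[name] = value;
  return target;
}

const settings = setUntrustedProperty({}, "nickname", "bob");
console.log(JSON.stringify(settings));      // {"nickname":"bob"}
console.log(isSafePropertyName("valueOf")); // false
```

This guards only against name collisions; when the name is written into generated source it still has to be quoted and escaped like any other string literal.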

Useful resources: http://code.google.com/p/doctype-mirror/wiki/ArticleXSSInJavaScript, http://code.google.com/p/doctype-mirror/wiki/ArticleXSSInEventHandlers. Also, I encourage you to check out Blueprint and Dan Kaminsky's Interpolique, two innovative methods for addressing these kinds of issues.

dmitris
D.W.
6

HTML escaping is not sufficient in onclick attributes if any of your security properties rely on code running -- e.g. a critical onsubmit handler or onload handler. If an attacker can embed codepoint U+2028 or U+2029, then they can cause parsing to break, since those are not allowed unescaped in JS strings.
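A small sketch of closing that particular gap: escape the two code points before the literal reaches the page. The helper name `escapeLineSeparators` is hypothetical. The concern applies to engines that predate ES2019, which treat U+2028 and U+2029 as line terminators inside string literals.

```javascript
// Hypothetical helper: rewrite raw U+2028 / U+2029 as \u escapes so they
// cannot act as line terminators inside a string literal.
// It handles ONLY these two code points; <, & and quotes need their own
// context-specific escaping.
function escapeLineSeparators(jsSource) {
  return jsSource.replace(/\u2028/g, "\\u2028").replace(/\u2029/g, "\\u2029");
}

const userText = "first\u2028second";
// JSON.stringify leaves U+2028 and U+2029 unescaped, so escape them afterwards.
const safeLiteral = escapeLineSeparators(JSON.stringify(userText));
console.log(safeLiteral); // "first\u2028second"
```

The escaped literal still evaluates to the original string, so the handler's behavior does not change.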

If you are dealing with XHTML instead of HTML, and you've defined your own % entities, then these are always an avenue of attack, unless you escape % as well, but if you're using XML, then there's less of a difference between script element bodies and attributes.

Mike Samuel
5

If you were on ASP.NET, I would recommend you use MS' WPL - the AntiXSS component provides a .JavaScriptEncode() method, exactly what you need.
If you were in JEE, I would have suggested you use OWASP's ESAPI library - there too there is a .encodeForJavaScript() method (I think I spelled it right, from memory...).
Unfortunately I'm not familiar enough with Rails to provide a solution for that, but maybe OWASP has something there too... (EDIT: OWASP is currently working on a port to Ruby, though it's not released yet...)

What both of these methods do: explicitly encode everything that is not considered a "safe" character.
What they do not do: rely on blacklists of certain "bad" characters, which would need constant updating because blacklists keep getting bypassed.
If necessary, you could probably hack the same approach together in Rails too, following the first part. BUT I would not recommend it unless you absolutely had to.

AviD
  • [owasp-esapi](http://code.google.com/p/owasp-esapi-java/) is a very nice library, and luckily I can use it in our Rails application because it runs on JRuby. The code is really simple, so porting it to Rails should be easy. Calls to Encoder.encodeForJavaScript() end up in Codec.encode() and JavaScriptCodec.encodeCharacter(); both methods together have fewer than 20 lines. The trick is to encode non-alphanumeric characters as \xHH, which includes the &. – Hendrik Brummermann Dec 24 '10 at 07:01