
I am learning about XSS and am trying to understand why escaped HTML added to the DOM triggers an XSS vulnerability.

The application draws a modal overlay (Bootstrap) for a form and inserts HTML (both escaped and unescaped) into that part of the DOM. It gets the data from a request to the server, which auto-escapes any user input that contains HTML. Below is the code that causes the vulnerability.

$('.update_button').live('click',function(){
    $('#name').modal()
    var id=$(this).attr('data-id');
    $('#form_holder').hide()
    $('#form_holder_loading').show()
    $.ajax({
        url:'/some/path/',
        data:'id='+id,
        dataType:'json',
        success:function(data){
            // data.form is an HTML string in which any HTML entered by the
            // user is encoded: HTML characters are replaced with their
            // entities, e.g. '<' becomes '&lt;'
            $('#user_form_holder_loading').hide()
            $('#user_form_holder').empty().html(data.form)
            $('#user_form_holder').show()
        }
    });
});
Anders

2 Answers


The way to safely print untrusted data in HTML is by replacing HTML-significant characters with character entities. (That is, you'd replace < with &lt;, etc.) I suppose this is what you mean by "HTML escaping".

Now, if you filter untrusted HTML that way, you can't trigger XSS with jQuery's html() anymore. E.g., this will just print plain text without markup:

$('#element').html("&lt;h1&gt;XSS&lt;/h1&gt;");

Also note that the recommended way to insert untrusted content with jQuery is text(), since you don't have to bother with your own filtering. This is safe:

$('#element').text("<h1>XSS</h1>");

See this JSFiddle for a demo.
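As a rough illustration of the entity replacement described above, here is a minimal sketch. The `escapeHtml` name is hypothetical (it is not part of jQuery), and it assumes the value ends up in HTML text or a quoted attribute value, not in a script or URL context.

```javascript
// Hypothetical helper (not part of jQuery): replaces the five
// HTML-significant characters with their character entities.
function escapeHtml(untrusted) {
  var entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  // A single pass over the input, so already-produced entities
  // are not encoded a second time within the same call.
  return String(untrusted).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

var escaped = escapeHtml("<script>alert('group1')</script>");
console.log(escaped);
// prints: &lt;script&gt;alert(&#39;group1&#39;)&lt;/script&gt;
```

In a browser, passing `escaped` to `.html()` should display the markup as plain text, like the `.html()` example above. With `.text()` the raw string can be passed directly, with no helper needed.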

Arminius
  • You are correct when talking about character entities. This particular version of jQuery is old (1.7.2) and might be vulnerable to XSS in that way? – Gerad Bottorff Oct 12 '17 at 20:46
  • @GeradBottorff No, [it's not](https://jsfiddle.net/3jwrr6oh/). – Arminius Oct 12 '17 at 20:51
  • So if it's not vulnerable and the HTML String being inserted is **.....<script>alert('group1')</script>.....** then why is that getting parsed and executed? – Gerad Bottorff Oct 12 '17 at 21:09
  • @GeradBottorff I don't know your code. I can just show how the function behaves. You may want to try minimize your test case and post it here. – Arminius Oct 12 '17 at 21:13
  • @GeradBottorff: I didn't dig through legacy jQuery code, but I know it used some DOM manipulations to work around various problematic inputs, in an effort to be more reliable on old browsers. Such DOM manipulations have weird parsing/escaping side effects, e.g.: `x=new Option();x.innerHTML="<b>"; alert(x.text);` – dandavis Oct 12 '17 at 21:33
  • @dandavis I have updated the original post with the JS code. – Gerad Bottorff Oct 12 '17 at 21:50
  • @GeradBottorff Your example still doesn't prove that the API escapes the HTML properly. What does `console.log(data)` return if you put it in the `success` function? – Arminius Oct 12 '17 at 21:54
  • The bold text in the 3rd comment shows exactly (though truncated to the HTML part that is escaped) what is returned from the server (I copied it from Burp Suite's proxy history). – Gerad Bottorff Oct 12 '17 at 22:00
  • @Arminius I was able to modify the JS to print the data it received to the console. Below is the relevant portion (sorry for formatting): `
    `
    – Gerad Bottorff Oct 13 '17 at 15:10

jQuery's .html() is dangerous. It parses and evaluates content using eval(), which is frowned upon. html() actually finds <script> tags and executes them, on purpose! Normally, when such tags are added as a string, they fall into place silently as benign content without being executed.

If you use elm.innerHTML = str; instead of $(elm).html(str), the "smart" content processing jQuery performs is skipped and the resulting DOM is the same, but <script> tags are not executed. You still need to escape event attributes (such as onerror), but html()'s glaring <script> vulnerability is bypassed.

Of course, if you don't need html formatting, Arminius's answer about adding it as plain text is apt.

dandavis
  • Where is the `eval` called? I've looked through the code for the `html` method and haven't seen one. Plus shouldn't it not execute any html that has been 'sanitized'? – Gerad Bottorff Oct 12 '17 at 21:04
  • @GeradBottorff: it looks like they moved from an indirect `eval()` to a dynamic script tag injection since I last looked. It's still just as bad, however: it executes script content. It's now called `DOMEval` in the 3.2.1 source. The comment in `domManip()`, "_Evaluate executable scripts on first document insertion_", is scary... – dandavis Oct 12 '17 at 21:25
  • The version of jQuery being used is extremely old and would actually pull out escaped script tags and then evaluate them, which caused the issue I was seeing. – Gerad Bottorff Oct 16 '17 at 15:07
  • @GeradBottorff: well the new versions do the same thing as well, they just hide `eval()` since everyone hates/fears it. I don't know of a strong advantage of dynamic script tags over eval() security-wise. Anyway, a lot of people are surprised a major library evaluates code that's supposed to be benign. – dandavis Oct 16 '17 at 19:56