12

I have always used strip_tags to prevent XSS attacks, but today I saw a post claiming that it's horribly unsafe. As the manual says, it doesn't check for malformed HTML!

Is it true?

What can I do to prevent XSS?

700 Software
Alireza

2 Answers

17

To prevent XSS you need to:

  1. Validate all user input that you'll process. For example, if the id GET parameter should be a number, ensure that it is, e.g. with PHP's ctype_digit() function or the Filter extension. This covers not only GET / POST parameters, but also cookie names, cookie values, HTTP headers, uploaded file names etc. Attackers can manipulate requests in many ways. If you need to accept and display HTML content from the user (e.g. in a CMS application), use the well-tested HTMLPurifier library to filter out JavaScript and other XSS payloads and leave only clean, sanitized HTML.

  2. When displaying the value, use contextual output encoding (sometimes called 'escaping'). The rules for encoding a user-supplied value differ depending on whether it occurs in:

    • HTML context e.g. <div>{$_GET['id']}</div>
    • HTML attribute context e.g. <div class='{$_GET['class']}'>
    • Javascript context e.g. <script> var a = '{$_GET['id']}'</script>
    • CSS context e.g. <div style='background:url({$_GET['image']})'>
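
As a rough illustration of the contexts listed above, here is a minimal sketch, assuming PHP 7.3+ and UTF-8 output. The helper names esc_html() and esc_js() and the sample values are hypothetical and only stand in for request data such as $_GET['id']. The CSS context is deliberately left out, since PHP has no built-in encoder for it.

```php
<?php
// Hypothetical untrusted values standing in for $_GET['id'] and $_GET['class'].
$id    = "<script>alert(1)</script>";
$class = "x' onmouseover='alert(1)";

// HTML body and quoted-attribute contexts: encode &, <, > and both quote types.
// The attribute in the template must always be quoted for this to be sufficient.
function esc_html(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// JavaScript string context: emit a JSON string literal (quotes included)
// with <, >, &, ' and " hex-escaped so it cannot close the <script> element.
function esc_js(string $s): string {
    return json_encode(
        $s,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

echo "<div>" . esc_html($id) . "</div>\n";
echo "<div class='" . esc_html($class) . "'></div>\n";
echo "<script>var a = " . esc_js($id) . ";</script>\n";
```

Note that esc_js() returns the surrounding quotes itself, so the template must not add its own.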

It's best to follow the rules described in the OWASP XSS Prevention Cheat Sheet. Read it thoroughly and adhere to it - XSS is the #2 risk to web applications today, so you really need to protect against it. See also the OWASP tutorial video on XSS.
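
For the validation in step 1, a minimal sketch using the Filter extension might look like this. The helper name parse_id() and the rule that ids start at 1 are assumptions made for the example, not part of the answer.

```php
<?php
// Hypothetical helper: returns the id as an int, or null when the raw
// request value is not a plain positive integer.
function parse_id($raw): ?int {
    if (!is_string($raw)) {
        return null; // e.g. ?id[]=1 arrives as an array, a missing key as null
    }
    // Assumption for this sketch: valid ids start at 1.
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

var_dump(parse_id('42'));        // int(42)
var_dump(parse_id('42abc'));     // NULL
var_dump(parse_id('<script>'));  // NULL
```

In a request handler you would typically call parse_id($_GET['id'] ?? null) and reject the request when it returns null. Validation like this complements output encoding; it does not replace it.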

Krzysztof Kotowicz
  • 3
    You deserve kudos for distinguishing between validation and escaping. If only more than 10 people would do this... – chris Dec 24 '11 at 10:18
  • 1
    @Krzysztof, did you miss a close brace in `{$_GET['id']}` and ``, or is that valid PHP? – 700 Software Dec 26 '11 at 18:48
  • `is_int` will never work for POST/GET values, since they're always strings. `ctype_digit` or, yes, the Filter extension is what you're looking for. :) – deceze Dec 28 '11 at 23:25
  • true, will fix now. – Krzysztof Kotowicz Dec 29 '11 at 01:05
  • I used to use HTMLPurifier extensively but found it to be extremely slow (can only process ~3KB/sec), bloated (~700KB minified), not actually standalone, and I don't like that it wants to write to a cache directory. I recently wrote TagFilter (https://github.com/cubiclesoft/ultimate-web-scraper/blob/master/support/tag_filter.php) to deal with those issues and to correctly parse Word HTML. TagFilter is small (~30KB), fast (~2MB/sec), state-engine driven, stream-friendly, and self-contained. Its HTML and XSS cleanup capabilities are on par with (if not better than) HTMLPurifier. – CubicleSoft Jan 29 '17 at 03:42
4

You should use contextual encoding. See the OWASP XSS Prevention Cheat Sheet. You have to encode differently depending on where the output ends up. For example, the following input passes through strip_tags untouched, yet causes XSS if it is output in the value attribute of an HTML text input: " autofocus onfocus="alert(1)
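
A small sketch of that example, assuming the value is echoed into a double-quoted value attribute. The variable names are hypothetical.

```php
<?php
// The payload from the answer: it contains no tags, only a quote and attributes.
$payload = '" autofocus onfocus="alert(1)';

// strip_tags() finds no tags to remove, so the payload comes back unchanged
// and breaks out of the value attribute.
$stripped = strip_tags($payload);
echo '<input type="text" value="' . $stripped . '">' . "\n";
// -> <input type="text" value="" autofocus onfocus="alert(1)">

// Encoding for the attribute context turns the quotes into entities,
// so the whole payload stays inside the attribute value.
$encoded = htmlspecialchars($payload, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
echo '<input type="text" value="' . $encoded . '">' . "\n";
// -> <input type="text" value="&quot; autofocus onfocus=&quot;alert(1)">
```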

bstpierre
Erlend