
I have a Java project in which the programmer used URLEncode, instead of HTMLEncode, to output content as part of HTML.

What is the risk that it is vulnerable to XSS? (Let's set aside the fact that it is a different encoding and will produce different output.)

What is the right way to do HTMLEncode in Java?

kalina
AaronS

1 Answer


Risk assessment. If the URLEncoded data is inserted into an HTML context (e.g., between tags), I do not know of any way to introduce an XSS attack. URLEncode escapes the <, >, and & characters (to %3C, %3E, and %26, respectively). In modern browsers, I believe this is sufficient to prevent XSS for values inserted between tags.
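To make the escaping concrete, here is a minimal sketch of what Java's standard `java.net.URLEncoder` does to a typical script-injection payload. The class name `UrlEncodeDemo` and the helper `urlEncode` are hypothetical names chosen for illustration, and the `encode(String, Charset)` overload assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the project in the question.
final class UrlEncodeDemo {

    // Thin wrapper around the standard library call (Java 10+ overload).
    static String urlEncode(String untrusted) {
        return URLEncoder.encode(untrusted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String payload = "<script>alert(1)</script>";
        // The angle brackets, parentheses, and slash are all percent-encoded,
        // so no tag can be opened when this is placed between HTML tags.
        System.out.println(urlEncode(payload));
        // prints: %3Cscript%3Ealert%281%29%3C%2Fscript%3E
    }
}
```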

There are some more obscure cases where attacks might be possible. If you are inserting untrusted (but URLEncoded) data into other parse contexts, such as a URL (e.g., the HREF attribute of the A tag), JavaScript, or CSS, then XSS attacks may still be possible despite the use of URLEncode. These contexts are less common, though, so I suspect they're not what you're talking about.

So, in short, this is unlikely to be vulnerable to XSS (as far as I can see).

Why you shouldn't do it. Nonetheless, as I think you recognize, URLEncode is clearly the wrong solution for this problem. It escapes data in the wrong way, and will cause the data to be mangled when the user views it in their browser. Don't use URLEncode; it is the wrong tool for the job. Instead, use the proper escaping function for the HTML context that you are going to use.
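As a small illustration of the mangling, the sketch below URL-encodes an ordinary, harmless string with `java.net.URLEncoder`. The class name `MangledOutputDemo` and the sample text are assumptions made up for this example, and the `encode(String, Charset)` overload assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing why URL encoding is wrong for HTML output.
final class MangledOutputDemo {

    static String urlEncode(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Spaces become '+', and '&' and '<' become percent escapes.
        // A browser rendering HTML does not decode these, so the user
        // sees the encoded form rather than the original text.
        System.out.println(urlEncode("Tom & Jerry <3"));
        // prints: Tom+%26+Jerry+%3C3
    }
}
```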

How to do it properly. See my discussion elsewhere about how to select the proper escaping function, and where to obtain an implementation of that escaping function. For instance, OWASP ESAPI is a fine library full of HTML escaping functions that can be used to prevent XSS vulnerabilities.
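For a rough idea of what HTML escaping for element content looks like, here is a hand-written sketch using only the standard library. It is not the ESAPI implementation, and the names `HtmlEscapeSketch` and `escapeHtml` are hypothetical. In real code I would rather call a maintained library, for example ESAPI's `ESAPI.encoder().encodeForHTML(...)`, which handles many more cases than this.

```java
// Hypothetical minimal escaper for text placed between HTML tags or in
// quoted attribute values. An illustration only, not a vetted library.
final class HtmlEscapeSketch {

    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The browser renders this back as the original text.
        System.out.println(escapeHtml("Tom & Jerry <3"));
        // prints: Tom &amp; Jerry &lt;3
    }
}
```

Unlike the URL-encoded form, the escaped string displays to the user exactly as the original text, which is the point of choosing the escaping function that matches the context.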

D.W.