Encoding output in JSON HTTP API

Question

I'm the author of a JSON REST API. This REST API is consumed by various clients, such as HTML/JS clients, .NET clients (console applications) and Ruby clients. The output of the API is in JSON format, so it's formatted according to JSON rules and the necessary special characters are escaped.

A security researcher reported to me that < and > are not escaped in the output, so if an attacker sends a request such as HTTP POST https://my.api.example.com/blabla with the JSON body {"Value": "<Hello>"}, my API will output something like {"Response": "<Hello> is not a valid value"}.

The security researcher then explained that an attacker may be able to use this to carry out a reflected cross-site scripting (XSS) attack.

My JSON API always returns a Content-Type: application/json header so my understanding is that modern clients will not try to interpret it as HTML.

< and > are valid characters in my API, so I can't filter them out. One option would be to encode the characters in the output, but HTML-encoding characters in a JSON API seems a bit strange to me. A third option would be to just remove the string from the output and instead say something like "The provided value was not valid", but that would reduce usability a bit.

Are there any common good practices for handling this?

score 3 · Answer 1 · answered Jan 25 '19 at 16:06

To some degree this is a matter of opinion, but I do not agree that there is a vulnerability here. If a client reflects HTML data from your JSON response without any sanitization, it is a vulnerability in that client and not in your API. Any webpage should treat API responses as untrusted data.

You are correct that HTML encoding the response is a bad idea. Why should the API make any assumptions about what is a proper encoding for the client? What if you want to use the API for a mobile app as well where the data will be displayed in a non HTML context? HTML encoding would just be bad design. It would be solving a problem for the client on the server side.

What I would do instead is to clearly document that the output may contain HTML special characters, and that the client should take appropriate action.
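As an illustration of what "appropriate action" could look like on the client side, here is a minimal sketch of a client that puts the API's message into an HTML page. The function name `render_error` and the `Response` key (taken from the question's example) are assumptions for illustration only. The client escapes for its own output context and the API stays context-neutral.

```python
import html
import json


def render_error(api_body: str) -> str:
    """Turn an API error response into an HTML fragment.

    The API response is treated as untrusted data: the message is
    HTML-escaped at the point where it enters an HTML context.
    """
    message = json.loads(api_body)["Response"]
    # html.escape converts <, >, & and quotes into HTML entities.
    return f'<p class="error">{html.escape(message)}</p>'


fragment = render_error('{"Response": "<Hello> is not a valid value"}')
print(fragment)
```

A console or mobile client would skip the escaping step entirely, which is the reason not to do it on the server.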

One thing, though: make sure to respond with a no-sniff header (X-Content-Type-Options: nosniff). Without it, MIME-sniffing browsers could interpret the response as HTML or JS, which would be a real vulnerability.
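A rough sketch of sending that header, using only Python's standard library. The names `build_json_response` and `ApiHandler`, the fixed payload, and the 400 status are assumptions for illustration, since the question does not say what stack the API runs on. Most frameworks have their own way to set response headers.

```python
import json
from http.server import BaseHTTPRequestHandler


def build_json_response(payload: dict) -> tuple[bytes, dict[str, str]]:
    """Serialize the payload and return it with the headers to send."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json; charset=utf-8",
        # Tells browsers not to guess a different type from the body.
        "X-Content-Type-Options": "nosniff",
        "Content-Length": str(len(body)),
    }
    return body, headers


class ApiHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that always answers with the example error."""

    def do_POST(self) -> None:
        body, headers = build_json_response(
            {"Response": "<Hello> is not a valid value"}
        )
        self.send_response(400)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```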

score 2 · Answer 2 · answered Jan 25 '19 at 15:46


Can you rely on the Content-Type header? No

As said perfectly in the answer to this security.stackexchange.com question, we can't rely on clients to respect Content-Type headers when it comes to security.

How to Encode JSON?

OWASP provides advice for exactly your situation. They say that if the context is HTML, then you encode your output for that context. They also say:

An alternative to escaping and unescaping JSON directly in JavaScript, is to normalize JSON server-side by converting '<' to '\u003c' before delivering it to the browser.
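A minimal sketch of that server-side normalization in Python, assuming the response is built with the standard `json` module. The helper name `dumps_html_safe` is made up for illustration, and escaping `>` and `&` in addition to `<` is my own extension of the quoted advice. Because `\u003c` is an ordinary JSON string escape, any conforming JSON parser decodes it back to `<`, so .NET, Ruby, and JS clients see the same value as before.

```python
import json


def dumps_html_safe(obj) -> str:
    """Serialize to JSON, writing <, > and & as \\uXXXX escapes."""
    text = json.dumps(obj)
    # In serialized JSON these characters can only appear inside string
    # literals, so swapping them for unicode escapes leaves the decoded
    # value unchanged.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


body = dumps_html_safe({"Response": "<Hello> is not a valid value"})
print(body)
```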

mcgyver5


Using JS / JSON escaping is the correct solution here. Output encoding should match the output format, and the output format is JSON. Any remotely well-behaved JSON parser already knows how to decode (un-escape) JSON character escapes - it is required to, by the spec, for some metacharacters - and since they're the same as JS literals it is generally easy to do. – CBHacking Jan 26 '19 at 00:22

score 0 · Answer 3 · answered Jan 25 '19 at 15:51

First of all, headers are not something you can strictly rely on. And while this potential XSS vulnerability might not be an issue now, you cannot know exactly how your API will be used in the future. Somewhere down the road someone might use the returned values in HTML, opening up the possibility of XSS.

This might not be your problem per se, but you should still validate your input and sanitize your output.

What exactly are the scenarios when <> would be valid characters in your input?
Would it hurt to remove <> whenever there's an input that does not match the previously defined scenarios?
Would it hurt to use your third option? This is probably the easiest to implement.
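For the third option, a sketch might look like the following. The allow-list pattern and the function name `handle_value` are purely hypothetical, since the question does not say what a valid value looks like. The point is only that the error message never echoes the submitted input.

```python
import re

# Hypothetical rule for what counts as a valid value; adjust to the real API.
ALLOWED_VALUE = re.compile(r"[A-Za-z0-9 _.-]{1,64}")

GENERIC_ERROR = "The provided value was not valid"


def handle_value(value: str) -> dict:
    """Validate the input and build a response that never reflects it."""
    if ALLOWED_VALUE.fullmatch(value):
        return {"Response": "ok"}
    # No part of the rejected input is copied into the response.
    return {"Response": GENERIC_ERROR}
```

If < and > really are legitimate in some inputs, the pattern has to allow them there, and this approach only helps for error messages, not for successful responses that return the value.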

As with anything, OWASP has a great guide (called "cheat sheets") for web devs on how to prevent XSS (especially in JSON). Their json-sanitizer project on GitHub might also be worth checking out.
