
I want to pass a PHP string directly to a JavaScript variable and keep the load on the server to a minimum. I have the following JavaScript in a PHP file for doing this:

<!DOCTYPE html>
<html>
...
<body>

<section id="section1"></section>

<script>
var example = <?php echo json_encode($string_var, JSON_HEX_TAG); ?>;
example = example.replace(/[<>]/g, (m) => {return m == "<"? "&lt;" : "&gt;";});
document.getElementById('section1').innerHTML = example;
</script>

...
</body>
</html>

The variable $string_var coming from PHP is user input that was not sanitized in any way (that is, I didn't even execute htmlspecialchars() on it).

Is my example safe? Or do I need to use all these flags for json_encode, i.e. JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS? I know this example doesn't follow best practices, but is there any concrete risk or vulnerability in it?

References:

  1. Why use the JSON_HEX_TAG?
  2. Answers recommending flags JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS for PHP's json_encode:
Anders
flen

1 Answer


JSON_HEX_TAG

If you are echoing into JS inside an HTML document (as you are in your example), this flag is necessary or you risk opening up a huge XSS vulnerability. Without it, an attacker can post something along the lines of this:

</script><script>alert("XSS");</script>

Since PHP 5.4 this has been fixed, as slashes are escaped by default (so </script> is emitted as <\/script>), but you never know which PHP version your code will run on. Better safe than sorry.

As OP points out, even in newer versions of PHP you can insert a <!-- to mess up the entire page, possibly causing unexpected behaviour. So the flag is indeed needed in all PHP versions.
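
To illustrate why the flag costs nothing on the JavaScript side, here is a small sketch in JavaScript (not PHP) that mimics what JSON_HEX_TAG does to the output. The function name encodeForScript is made up for this illustration, and it is only an approximation of PHP's behaviour. The point is that the encoded text contains no < or >, yet it still parses back to the original string.

```javascript
// Hypothetical helper: a JavaScript approximation of
// json_encode($value, JSON_HEX_TAG). Not PHP's actual implementation.
function encodeForScript(value) {
  return JSON.stringify(value)
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E');
}

const payload = '</script><script>alert("XSS");</script>';
const encoded = encodeForScript(payload);

// No raw angle brackets are left, so the HTML parser cannot find
// "</script>" or "<!--" inside the emitted string literal.
console.log(encoded.includes('<') || encoded.includes('>')); // false

// The JavaScript engine still decodes the escapes to the original value.
console.log(JSON.parse(encoded) === payload); // true
```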

JSON_HEX_QUOT and JSON_HEX_APOS

To understand what this does, take a look at the following example:

$array = array(
    "a" => "'",
    "b" => '"',
);

// This will output {"a":"'","b":"\""}
echo json_encode($array);

// This will output {"a":"\u0027","b":"\u0022"}
echo json_encode($array, JSON_HEX_QUOT | JSON_HEX_APOS);

So the quotes that delimit the string literals are never encoded. Without the flags, a double quote inside a string is backslash-escaped and a single quote is left as it is; with the flags, both are encoded as \uXXXX sequences.

According to your references, these flags are used to make the output safe to use inside event handlers (i.e. HTML attributes). It is true that they are not strictly needed inside script tags, but it is not entirely correct that they make the output safe in event handlers.

Take a look at this example for instance:

<?php $data = array(" onmouseenter=alert(1) " => "foo"); ?>
<a onclick="x = <?= json_encode($data, JSON_HEX_QUOT | JSON_HEX_APOS); ?>">test</a>

Resulting in:

<a onclick="x = {" onmouseenter=alert(1) ":"foo"}">test</a>

I think you are safe if you always enclose your attribute values in single quotes, but still this feels a little risky.

JSON_HEX_AMP

Quoting your reference here:

For compatibility with XHTML non-CDATA script blocks, do & as well.

So since you are doing HTML5 this doesn't apply to you, and I don't think there's a security vulnerability here anyway. Still, it doesn't hurt to encode.

Conclusion

  • For your usage - inside a script tag in HTML5 - just using JSON_HEX_TAG is enough.
  • Doing this inside attributes (event handlers) is dangerous, at least unless you enclose the attribute value in single quotes.
  • If I were you, I would create a little helper function called safe_json_encode that uses all four flags, and then only use it in script tags. Encoding more than necessary does not hurt you.

Further notes

  • Make sure to serve the page with a correct content type and character encoding. Messing this up can lead to ways to bypass the encoding.
  • If you later output the variable values to HTML, you need to think about XSS again. Using values from your JSON in e.g. innerHTML or document.write will not be safe. (I see you deal with this on your second line of script.)
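
As a sketch of the last point: if the value does have to go through innerHTML, all HTML-significant characters should be escaped, and the escaped result is what must be used, since String.prototype.replace returns a new string rather than modifying the original. The helper name escapeHtml below is hypothetical. In a browser, assigning to textContent instead of innerHTML avoids the need for manual escaping altogether.

```javascript
// Hypothetical helper: escape the characters that are significant in
// HTML text and in quoted attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const example = '<img src=x onerror=alert(1)> & more';
const escaped = escapeHtml(example); // replace() does not change `example`

console.log(escaped);
// &lt;img src=x onerror=alert(1)&gt; &amp; more

// In a browser (assumes a DOM; the element id is taken from the question):
// document.getElementById('section1').textContent = example;
```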

Disclaimer: This is based on research I did today. I'm no PHP guru. If you are relying on this for something critical, you probably want to do some more research on your own to make sure I am not missing something here.

Anders
  • Thank you for the thorough answer! Excellent point on the `HEX_QUOT` and `HEX_APOS`, but I think the `HEX_TAG` is needed even in PHP 7, otherwise inputting ` – flen Dec 17 '17 at 04:11
  • As a note: If the user is in control of the value passed to `json_encode`, they might find a way to exploit it so that `json_encode` fails to parse the value and will return `false`. In that case an empty string will be printed to your Javascript. The suggested `safe_json_encode` should handle this case and return `'null'` or similar to prevent JS Parse errors. – JensV Feb 26 '19 at 08:03
  • If I use `JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS` in php json_encode,can I correctly back to origin when php json_decode? – kittygirl Mar 20 '19 at 11:42
  • @kittygirl What do you mean by "back to origin"? – Anders Mar 21 '19 at 08:17
  • @Anders,`json_decode(json_encode($a,JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS),true)==$a` – kittygirl Mar 21 '19 at 08:25
  • @kittygirl That's a very interesting question! I don't know the answer, and the manual wasn't much help. If I were you, I would try asking on Stack Overflow! – Anders Mar 22 '19 at 12:35