JSON_HEX_TAG
If you are echoing into JS inside an HTML document (as you are in your example), this is necessary, or you risk opening up a huge XSS vulnerability. Without it, an attacker can post something along the lines of this:
</script><script>alert("XSS");</script>
Since PHP 5.4 this has been fixed, and slashes are escaped by default, but still, you never know what PHP version your code will run on. Better safe than sorry.
As OP points out, even in newer versions of PHP you can insert a <!-- to mess up the entire page, possibly causing unexpected behaviour. So the flag is indeed needed in all PHP versions.
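To make the difference concrete, here is a minimal sketch that encodes payloads like the ones above with and without the flag. The variable names are made up for illustration, and the outputs in the comments assume PHP 5.4 or later.

```php
<?php
// Hypothetical attacker-controlled strings, for illustration only.
$closeTag = '</script><script>alert("XSS");</script>';
$comment  = '<!--';

// Default flags: "<" and ">" are left as-is.
echo json_encode($comment), "\n";
// "<!--"

// With JSON_HEX_TAG, "<" and ">" become \u003C and \u003E.
echo json_encode($comment, JSON_HEX_TAG), "\n";
// "\u003C!--"

echo json_encode($closeTag, JSON_HEX_TAG), "\n";
// "\u003C\/script\u003E\u003Cscript\u003Ealert(\"XSS\");\u003C\/script\u003E"
```

With the flag set, the encoded string contains no raw < or > at all, so the HTML parser has nothing to interpret as a tag or comment.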
JSON_HEX_QUOT and JSON_HEX_APOS
To understand what this does, take a look at the following example:
$array = array(
    "a" => "'",
    "b" => '"',
);
// This will output {"a":"'","b":"\""}
echo json_encode($array);
// This will output {"a":"\u0027","b":"\u0022"}
echo json_encode($array, JSON_HEX_QUOT | JSON_HEX_APOS);
So the quotes around the string literals are never encoded. Quotes inside string literals will be escaped with a backslash without the flags, and encoded as \uXXXX sequences with them.
According to your references, these flags are used to make the output safe to use inside event handlers (i.e. HTML attributes). It is true that they are not strictly needed inside script tags, but it is not entirely correct that they make the output safe in event handlers.
Take a look at this example for instance:
<?php $data = array(" onmouseenter=alert(1) " => "foo"); ?>
<a onclick="x = <?= json_encode($data, JSON_HEX_QUOT | JSON_HEX_APOS); ?>">test</a>
Resulting in:
<a onclick="x = {" onmouseenter=alert(1) ":"foo"}">test</a>
I think you are safe if you always enclose your attribute values in single quotes, but still this feels a little risky.
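As a rough check of the single-quote idea, the sketch below builds both variants of the attribute as strings and prints them. The payloads and variable names are made up for the illustration. It only looks at quote characters, so it does not show that a single-quoted attribute is safe against everything (HTML entities in the data, for instance, are not considered here).

```php
<?php
$flags = JSON_HEX_QUOT | JSON_HEX_APOS;

// Double-quoted attribute: the JSON delimiters are raw double quotes,
// so the very first one ends the attribute value.
$data = array(' onmouseenter=alert(1) ' => 'foo');
$json = json_encode($data, $flags);
echo '<a onclick="x = ' . $json . '">test</a>', "\n";
// <a onclick="x = {" onmouseenter=alert(1) ":"foo"}">test</a>

// Single-quoted attribute: apostrophes in the data are encoded as \u0027,
// so the encoded JSON contains no raw single quote at all.
$data = array("' onmouseenter=alert(1) '" => 'foo');
$json = json_encode($data, $flags);
echo "<a onclick='x = " . $json . "'>test</a>", "\n";
// <a onclick='x = {"\u0027 onmouseenter=alert(1) \u0027":"foo"}'>test</a>
```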
JSON_HEX_AMP
Quoting your reference here:
For compatibility with XHTML non-CDATA script blocks, do & as well.
So since you are doing HTML5, this doesn't apply to you, and I don't think there's a security vulnerability here anyway. Still, it doesn't hurt to encode.
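For completeness, a small sketch of what the flag changes in the output (the sample data is invented):

```php
<?php
$data = array('q' => 'fish & chips');

// Without the flag, the ampersand is passed through unchanged.
echo json_encode($data), "\n";
// {"q":"fish & chips"}

// With JSON_HEX_AMP, it is encoded as \u0026.
echo json_encode($data, JSON_HEX_AMP), "\n";
// {"q":"fish \u0026 chips"}
```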
Conclusion
- For your usage - inside a script tag in HTML5 - just using JSON_HEX_TAG is enough.
- Doing this inside attributes (event handlers) is dangerous, at least unless you enclose the attribute value in single quotes.
- If I were you, I would create a little helper function called safe_json_encode that uses all four flags, and then only use it in script tags. Encoding more than necessary does not hurt you.
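A minimal sketch of what such a helper could look like. The function body and the sample data are my own illustration, not a vetted library function, and it is still only meant for output inside a script tag:

```php
<?php
/**
 * Hypothetical helper: json_encode() with all four JSON_HEX_* flags set.
 * Intended only for output inside a <script> block.
 */
function safe_json_encode($value)
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP
    );
}

// Sample data containing every character the four flags deal with.
$data = array('name' => "</script> & 'quotes' \"too\"");
echo safe_json_encode($data), "\n";
// {"name":"\u003C\/script\u003E \u0026 \u0027quotes\u0027 \u0022too\u0022"}
```

In a template you would then write something like var data = <?= safe_json_encode($data); ?>; inside the script tag.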
Further notes
- Make sure to serve the page with a correct content type and character encoding. Messing this up can lead to ways to bypass the encoding.
- If you later output the variable values to HTML, you need to think about XSS again. Using values from your JSON in e.g. innerHTML or document.write will not be safe. (I see you deal with this on your second line of script.)
Disclaimer: This is based on research I did today. I'm no PHP guru. If you are relying on this for something critical, you probably want to do some more research on your own to make sure I am not missing something here.