
According to an article I read 65% of all websites globally suffer from XSS. Why can't developers find and fix it?

Please help me understand. I'm not from a security or tech background.

Ishan Mathur
    If this ends up as another [famous question](https://security.stackexchange.com/questions/128412/sql-injection-is-17-years-old-why-is-it-still-around?lq=1), I call dibs on the rest of the **Why does `X` still exist** series... – Jedi Jul 07 '16 at 13:14
  • Another remark: XSS is still widespread, because having an XSS vulnerability doesn't hurt as much as other vulnerabilities do. – Lukas Jul 07 '16 at 13:19
  • I said it once and I'll say it again, lack of awareness. I spoke to a developer once with 10+ years of experience in web development that had never even heard of XSS. – Paradoxis Jul 07 '16 at 14:48
  • @Jedi Why does why does `x` still exist questions still exist? – smilebomb Jul 07 '16 at 15:21
  • Just like SQL injection and similar vulnerabilities, it's all about stupidity and people who call themselves "developers" when in reality they shouldn't even be allowed to be near a computer. – André Borie Jul 07 '16 at 15:21
  • @AndréBorie if by "stupidity" you mean "ignorance", then I can agree with that, but how does that have anything to do with people who call themselves "developers" or: engineer, programmer, coder, architect, sr. professional e-mailer, customer handler, pretender of doing hard work, unit test writer, build breaker extraordinaire, meeting engineer, legacy code maintainer, browser of the webs, corrector of wrong people on the internet; for that matter? – gattsbr Jul 07 '16 at 18:53
  • @Jedi, the question you refer to was asked by the same author as this one. This made me wonder whether this is some sort of smart attempt to bring visitors to the website (commercial) that is mentioned in both questions. – VL-80 Jul 07 '16 at 19:44
  • two words: backwards compatibility. That said, I don't think 65% of all sites even accept user input, especially since comments are now outsourced. A basic CSP will stop XSS in its tracks, so if you do show others' submissions, make sure to use a Content-Security-Policy HTTP header. – dandavis Jul 07 '16 at 20:02
  • Are you kidding? You already asked one terrible question which attracted many upvotes and now you come again. Waiting for your next question: according to X many people have bad passwords, help me to understand why. According to X many people store passwords in clear text. Why? – Salvador Dali Jul 08 '16 at 04:14
  • @RoryAlsop and not a surprise that after one bad question you will receive many of the same kind-of-useless questions. Still waiting for my ultimate question: "according to X people still write program with bugs. I am not technical, so can anyone explain me why can't developers just learn how not to make bugs" – Salvador Dali Jul 08 '16 at 04:19
  • @VL-80 he has asked 3 questions and all are referencing this website. Something certainly is wrong – Limit Jul 08 '16 at 06:49
  • @AndréBorie inexperience !== stupidity. ignorance !== stupidity. let's help them instead of calling them names. That is what this site is for, after all. – jammypeach Jul 08 '16 at 09:39
  • @jammypeach that is, assuming they actually want to be helped. – André Borie Jul 08 '16 at 09:44
  • @AndréBorie if they are here, reading this, then I'd say they probably do. – jammypeach Jul 08 '16 at 09:45
  • @jammypeach sadly, I doubt the people who we're talking about are reading this. Rather, I think they are making yet another vulnerable and awful app by Googling "how do I PHP" and taking the first, outdated link instead of reading proper documentation. – André Borie Jul 08 '16 at 09:47
  • @IshanMathur - you lie: https://www.linkedin.com/in/ishanmathur You ***WROTE*** the article. – schroeder Jul 08 '16 at 11:30

7 Answers

32

XSS is a form of code injection, i.e. the attacker manages to inject their own malicious code (usually JavaScript) into trusted code (HTML, CSS, JavaScript) provided by the site. It is similar to SQLi in that it is caused by dynamically constructed code, i.e. SQL statements, HTML pages, etc.

But while there are established techniques to solve SQLi (i.e. parameter binding), XSS is even harder. To defend against XSS, every user-controlled input must be properly escaped, which is much harder than it sounds:

  • Escaping rules depend on the context, i.e. HTML, HTML attribute, XHTML, URL, script, CSS contexts all have different escaping rules.
  • These contexts can be nested, i.e. something like <a href=javascript:XXX> is a JavaScript context inside a URL context inside an HTML attribute.
  • On top of that you have encoding rules for characters (UTF-8, UTF-7, latin1, ...) which again can be context specific.
  • The context and encoding need to be known when constructing the output: while most template engines support proper HTML escaping they are often not aware of the surrounding context and thus cannot apply the context specific escaping rules.
  • On top of that browsers can behave slightly different regarding escaping or encoding.
  • Apart from that, in organically grown code with no proper separation of logic and presentation, you'll often find HTML fragments stored in the database, and it is not always clear to the developer whether a specific value is already HTML-clean or still needs to be escaped.
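
These context rules can be made concrete with a small sketch (the helper name and payload below are invented for illustration, not taken from any particular library): an escaping routine that is perfectly safe between tags fails as soon as the same output lands in an unquoted attribute.

```javascript
// Minimal HTML escaper: adequate for text between tags.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;")
          .replace(/</g, "&lt;")
          .replace(/>/g, "&gt;")
          .replace(/"/g, "&quot;")
          .replace(/'/g, "&#39;");
}

const input = "x onmouseover=alert(1)";

// Safe: between tags, no markup characters survive escaping.
const body = `<p>${escapeHtml(input)}</p>`;
// -> <p>x onmouseover=alert(1)</p>  (inert text)

// Unsafe: in an unquoted attribute, spaces and `=` pass through
// untouched, so the payload becomes a brand-new event handler.
const attr = `<div class=${escapeHtml(input)}></div>`;
// -> <div class=x onmouseover=alert(1)></div>  (executable)
```

The same escaped string is harmless in one context and an active payload in the next, which is why per-context escaping rules (and quoted attributes) matter.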

And while there are some simple ways to protect against XSS by design (especially Content-Security-Policy), these only work with the newest browsers, have to be explicitly enabled by the developer, and are often only effective after large (and expensive) changes to the code (like removing inline script).

But still, there is a good chance to do it right if you have developers with the proper knowledge and are able to start from scratch using modern toolkits which already take care of XSS. Unfortunately, in many cases there is legacy code which just needs to be kept working. And development is usually done by developers who are not aware of XSS at all or don't know all the pitfalls.

And as long as web development is done in environments very sensitive to cost and time, the chances are low that this will change. The aim in such environments is to make the code work at all, in a short time and at a small cost. Companies usually compete on features and time to market, not on who has the most secure product. And to many managers, XSS protection currently just means additional cost with no obvious benefit.

Steffen Ullrich
  • Can't you use the same encoding (`<>"'&` to their equivalent entities) for HTML, HTML attributes and XHTML? – CodesInChaos Jul 07 '16 at 13:26
  • @CodesInChaos: If you have a pure (X)HTML context this should be ok. But there are slight differences in Javascript context in HTML vs. XHTML context. Try `'; alert(foo); ]]>` within HTML and XHTML context. – Steffen Ullrich Jul 07 '16 at 13:53
  • What difficulties would there be with having a DOM object attribute which would indicate "This object will never legitimately contain any 'dangerous' objects or attributes outside certain categories, so the effects of any such objects/attributes should be stifled"? While ensuring safety with browsers that lack such a feature might be difficult, having a means of setting one string to create some text that includes bold and italic would seem much cleaner than having to assemble such text as a bunch of DOM objects. – supercat Jul 07 '16 at 16:55
  • @supercat: have a look at [Content Security Policy 2.0](https://www.w3.org/TR/CSP/) which has nonces to mark parts where inline script is safe and can otherwise forbid inline script at all. But of course you need [browser support for this](http://caniuse.com/#feat=contentsecuritypolicy2), support in the frameworks and knowledge of the developers that this feature exists at all and why they should use it. – Steffen Ullrich Jul 07 '16 at 18:44
  • @supercat: markdown is a safe way to allow user markup. – dandavis Jul 07 '16 at 20:05
  • This answer explains perfectly to me why I have a fairly good grasp of what SQL injection is but really can't talk about XSS beyond "Uhh, that's where one visitor to a site is able to use weaknesses in the site to attack another visitor. The details are complicated. Now, let's talk more about [anything else]." – mostlyinformed Jul 07 '16 at 22:48
  • Excellent answer for pointing out that XSS is actually just another form of injection. When everything is a single base string type and hybrid control/data structures are created by concatenating strings, it _feels_ very complicated to apply the correct rules at the correct time. However, if we used more specialized string types (and a strongly typed language), it would be much more obvious when we make mistakes, e.g. you can't pass a UrlQueryParameterString where an HtmlAttributeValueString is expected. – Mark E. Haase Jul 08 '16 at 05:50
  • The last paragraph is most significant here. Nothing will change while dev contracts are put out to tender without security as a primary concern from the outset. – kaybee99 Jul 08 '16 at 10:03
  • @dandavis False. Many Markdown processors allow HTML through unfiltered, and even those that don't are likely to be manipulable to generate unsafe output. See for example https://github.com/showdownjs/showdown/wiki/Markdown's-XSS-Vulnerability-(and-how-to-mitigate-it) It would be possible to design a markup language which was secure-by-design, but Markdown is definitely not that language. – IMSoP Jul 08 '16 at 15:36
  • @IMSoP: Just because there are poorly-written markdown libs out there doesn't make markdown itself unsafe. Markdown itself is completely harmless. I use/recommend https://github.com/chjj/marked, which restisted my (albeit brief) attack attempts, and it's used far and wide, so any holes would be quickly patched; ex: https://github.com/chjj/marked/issues/203 (about 48 hours) – dandavis Jul 08 '16 at 16:40
  • @dandavis I didn't say Markdown was inherently unsafe, I just said it wasn't inherently safe. The safety is an aspect of the parsing library used, whether that library is parsing Markdown, BBCode, or a restricted subset of HTML. So your bald statement that "markdown is a safe way to allow user markup" is dangerously misleading. (I'm also not sure what that comment is doing attached to this answer, and apologise to the answer's author for the topic drift.) – IMSoP Jul 08 '16 at 17:17
  • @IMSoP:Is "driving" unsafe because there's a Corvaire? Arguable i guess... My initial comment was to supercat, who talked about having a safe mechanism for untrusted content presentation, which markdown provides for many sites, very similar to his "restricted html" concept. Reading his comment, i thought "markdown"... We could debate semantics of "false", "safe", implication/negation, etc, but I'll concede your point about wording/hijacking as i forgot about all those older libs that _could_ be used; good point! – dandavis Jul 08 '16 at 19:18
22

It seems easy, but is hard

First of all, the attack surface is huge. You need to deal with XSS if you want to display user input in HTML anywhere. This is something almost all sites do, unless they are built purely in static HTML.

Combine this with the fact that while XSS might seem easy to deal with, it is not. The OWASP XSS prevention cheat sheet is 4,000 words long. You need quite a large sheet to fit all that text in.

The complexity arises because XSS works differently in different contexts. You need to do one thing if you are inserting untrusted data into JavaScript, another if you insert into attribute values, yet another if you insert between tags, etc. OWASP lists six different contexts that all need different rules, plus a number of contexts where you should never insert data.

Meanwhile, developers are always searching for panaceas and silver bullets. They want to delegate all the hard work to a single sanitize() function that they can always call on untrusted data, call it a day, and go on with the real programming. But while many such functions have been coded, none of them work.

Let me repeat: XSS might seem easy to deal with, but it is not. The main danger lies in the first part of that sentence, not the second. The perceived simplicity encourages developers to make their own home-brewed solutions - their own personalized sanitize() functions, filled with generalisations, incomplete blacklists, and faulty assumptions. You filtered out javascript: in URLs, but did you think about vbscript:? Or javaSCripT :? You encoded quotes, but did you remember to actually enclose the attribute value in quotes? And what about character encodings?
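
To make the danger concrete, here is a sketch of such a home-brewed filter (the function and payloads are invented for illustration) along with two of the bypasses hinted at above:

```javascript
// A typical home-brewed "sanitizer": strips the literal string
// "javascript:" from a URL. Invented for illustration only.
function sanitizeUrl(url) {
  return url.replace("javascript:", "");
}

sanitizeUrl("javascript:alert(1)");   // "alert(1)" - the obvious case
sanitizeUrl("JaVaScRiPt:alert(1)");   // unchanged - case variation
sanitizeUrl("vbscript:msgbox(1)");    // unchanged - different scheme

// Even a case-insensitive, global regex is beatable by nesting the
// forbidden word inside itself; stripping it once reassembles it:
"javajavascript:script:alert(1)".replace(/javascript:/gi, "");
// -> "javascript:alert(1)"
```

Each fix for one bypass leaves the next one open, which is exactly why blacklist-based filtering keeps failing.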

The situation is quite similar to SQLi. Everybody kept looking for the one sanitisation function to rule them all, be it named addslashes, mysql_escape_string or mysql_real_escape_string. It's cargo cult security, where people think it is enough to ritually call the right function to appease the Gods, instead of actually embracing the complexity of the problem.

So the danger is not that developers are blissfully unaware of XSS. It is that they think they have the situation under control - because hey, I programmed a function for it.

But why is it hard?

The web suffers from the problem that structure (HTML), presentation (CSS) and dynamics (JavaScript) are all mixed into long strings of text we put in .html files. That means there are many boundaries you can cross to move from one context to another.

Again, this is the same as with SQLi. There you mix instructions and parameter values in one string. To stop SQLi, we figured out that we have to separate the two completely and never mix them - i.e. use parametrised queries.

The solution for XSS is similar:

  1. Separate the scripts from the HTML.
  2. Turn off inline scripts in the HTML.

We created an easy way to do the second part - Content Security Policy. All you need to do is set an HTTP header. But the first part is hard. For an old web app, a complete rewrite might be easier than surgically disentangling the scripting from the HTML. So it is easier to just rely on that magic sanitize() and hope for the best.

But the good news is that newer apps are often developed according to stricter principles that make the use of CSP possible, and almost all browsers support it now. I'm not saying we are going to get the percentage of vulnerable sites down to 0%, but I am hopeful that it will fall a lot from 65% in the coming years.


TL;DR: This turned into a rather long rant. Not sure if there is much useful information in here - just read Steffen Ullrich's great answer if you want a clear analysis.

Anders
  • Another thing you may want to add while you are at it later, is overlong UTF-8 sequences. – user Jul 07 '16 at 14:18
  • I wrote [my own sanitize function](https://gist.github.com/rndme/709b79625af301d76bd432dfb2ad8feb), and i thought it was easy, what am i missing? it's only hard if you try to implement 1,000,001 things yourself. I run `marked` on the sanitized text to get clickable links and allow markdown formatting, and from everything i've seen, it's safe and sound. It was harder back in the IE days, but most bugs have been fixed, so now it's only gaping holes that need patched up. – dandavis Jul 07 '16 at 20:08
  • @dandavis I think the function works fine to stop XSS between tags. If you use it to sanitize attribute values when you concatenate an HTML string, it would not work, since it does not escape `'`. So understanding the context is key. – Anders Jul 07 '16 at 21:05
  • really good to know, thanks! you're right that there's not something equally simple for attribs, since `new Option(0,str).outerHTML.split('"')[1]` doesn't work in IE/Edge... – dandavis Jul 07 '16 at 21:59
7

My set of opinions on security and XSS:

  1. Rule of programming: You can't know everything. Sooner or later you are going to make a mistake.
  2. Rule of the programmer: A programmer works 12 hours a day: 3 discussing random things with other programmers, 3 thinking about other things, 3 discussing what to code, and 3 actually programming... yet projects are planned as if those were 12 hours a day of almost non-stop programming.
  3. XSS is simple. And there are a lot of inputs waiting for XSS.
  4. Security still comes last, and most of the time it is treated as a deficit.

Lucian Nitescu
  • Why is #2 relevant here? – Hatted Rooster Jul 07 '16 at 13:56
  • Sorry, but my work day is 8 hours, not 12 hours. MAYBE 10 if I include my commute. – Nzall Jul 07 '16 at 14:56
  • @GillBates I think the idea is something like: programmers already have a full day, not all of which it is humanly possible to be "productive" and so adding more concerns would require re-engineering how they spend their time. Might be worthwhile for management to give them 6 hours a day, and streamline communication and meetings, then expect real productivity. Managing their own expectations of what is reasonable would be good, too. –  Jul 08 '16 at 23:26
6

As mentioned in the answer to a similar post of yours (SQL injection is 17 years old. Why is it still around?):

There is no general fix for SQLi because there is no fix for human stupidity

Developers sometimes get lazy or careless, and that causes them not to check the application they are developing.

Another popular reason is that the developers aren't aware that there is a security issue. They never went through any security training, so they don't know what is a potential security threat and what's not. And for companies that don't understand the importance of penetration testing, this lack of security knowledge is a threat in itself.

Simple security issues such as SQLi and XSS will only be fixed when companies decide to spend time on code review and testing. Until then, 65% of all websites globally will keep suffering from XSS vulnerabilities.

Bubble Hacker
  • I think the use of the word "stupidity" in this context is rather extreme, insulting and simply incorrect. Programmers who leave their code open to such vulnerabilities are most likely not stupid, but instead (a) ignorant, (b) unskilled or (c) under a lot of pressure from management to deliver features fast and cheap (while security concerns are last on the list, or not at all). – Radu Murzea Jul 08 '16 at 07:48
  • @RaduMurzea I completely agree with you, you are more then welcome to edit my answer! – Bubble Hacker Jul 08 '16 at 08:14
6

The root problem

The web was simply not designed to allow secure multi-authorship or rich interaction. Nobody talked about separating content from presentation until the late 1990s. By that point, like the QWERTY keyboard, we were basically stuck for no good reason with an existing system. Nobody wanted to "break the web", so mistakes were copied and ported to browsers beyond the "prime vector".

The aggravating factor

Before ~2005, virtually everything you saw on a given website was authored or approved by the people who control the site. As the "Web2.0" movement progressed, and people demanded interaction (at least add comments, come on!), the pressure to add these features to existing sites was great. Web folks did what they could to satisfy their bosses' demands to accomplish a task they knew nothing about: secure interaction. And for a while, probably too long, they got away with it. How many folks buy burglar alarms before they've even been broken into?

Modern Times

Using a CSP will greatly reduce the harm potential of dropping another ball. That doesn't mean you should forgo validation and sanitization by any means (because of Save As... copies, pages cached without headers, etc.), but it does mean that you can be a LOT safer with a couple hours of effort. Yes, the header is long-winded and in a confusing format at first, but invest the hour or two it takes to grok, and the few more it will take to test your site. CSP is really easy to test since you can do a silent warn-only run (look it up).
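
That warn-only run uses the Report-Only variant of the header; a sketch (the `/csp-report` endpoint path is a made-up example):

```javascript
// Content-Security-Policy-Report-Only enforces nothing: the page
// keeps working, but CSP-aware browsers POST a JSON report of every
// would-be violation to the report-uri endpoint.
const reportOnlyHeader = "Content-Security-Policy-Report-Only";
const reportOnlyPolicy =
  "default-src 'self'; script-src 'self'; report-uri /csp-report";

// Once the reports stop flagging your own legitimate resources,
// switch the header name to Content-Security-Policy to enforce.
const enforcingHeader = reportOnlyHeader.replace("-Report-Only", "");
```

Nothing on the live site can break during the warn-only phase, which is what makes this cheap to try.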

If you roll out a CSP and take traditional measures, those old browsers will die when their hard drives stop spinning, and we'll all live safe and happily ever after...

dandavis
  • With Solid State Disks, we might end up with long-term problems down the road. Y2038 anyone? –  Jul 08 '16 at 23:22
3

Others have touched on the classic issues surrounding systems designed by humans for other humans: The reality is laziness and—sometimes—stupidity coupled with “Why would this happen to me?” arrogance.

Oh, how many hours of my life have been spent patching systems and—more importantly—fighting with management to get the time/resources allocated to patch systems so they can be “bullet proof.”

And I’m lucky in that respect: While I am frustrated fighting with management at an organization to get a system patched, many places hire developers to develop a system but then don’t consider basic (and boring) maintenance to be something worth investing in. Why pay a tech a monthly retainer to maintain a system when it is often cheaper to let the system stand until it collapses and then chase a fix in the middle of a breach?

An ounce of prevention being better than a pound of cure often takes a backseat to folks being pennywise, pound-foolish.

That said, the best thing anyone administering a system can do is to have a solid disaster (or non-disaster) recovery plan in place. Have the code version controlled, have the server deployment setup automated in a provisioning script and have backups of databases. That way if/when a system goes down, cleanup is quick instead of being a tedious disaster.

Giacomo1968
2

The accurate detection of dangerous security vulnerabilities such as SQLi and XSS during the code implementation phase of the SDLC is still limited by the type of programming language.

  • Static analysis for such vulnerabilities is most widely done in PHP.
  • Some dynamic methods employ fuzzing techniques during the testing or deployment phase.

Unfortunately, both methods are still rather limited in their accuracy. Furthermore, XSS and SQLi vulnerabilities target computer applications and can appear in so many disguised forms that are unknown to software developers. As such, I don't think it is a question of stupidity. I can assure you that some companies have already trained their developers in secure programming but still cannot totally block these vulnerabilities. I would agree that there is much more that needs to be done to prevent such critical attacks.

BitsInForce
  • The hackers seem to be able to find and exploit the vulnerabilities, probably using automated tools to do so. Perhaps we would be better off simply hiring them? At least they would be trying to fix problems instead of cause them. For enough money, they would probably take the bait. –  Jul 08 '16 at 23:21