I've been asked to compile a list of the number of published web vulnerabilities or exploits grouped by platform.

With the understanding and caveat that numbers are just statistics, shouldn't be used to predict the future, are totally unfair, worse than lies and damn lies, and probably will kill your cat, etc., etc.: Do these numbers exist anywhere? How can such a list be compiled?

My attempt
I figured the CVE database was a good place to start; you can download their entire database in CSV format and then just grep it. That seemed reasonable enough, except that each row doesn't appear to contain enough data.

For example, of the 59 CVEs this year that mention "Wordpress", there are 14 that don't contain the letters "php". So clearly grepping for "php" won't net the right result. Presumably neither will grepping for any other platform name.

The data don't have to be perfect; they just have to be reasonably substantiated (i.e. not pulled out of one's derriere). Most importantly, the measurement should be neutral (that is, it shouldn't obviously favor or penalize any given platform), except to the extent that any one platform is naturally more popular than another.

Recent numbers (last year, this year, whatever) would be best, but whatever exists is even better.

tylerl
  • You should include cats as a possible target platform. I'll start you off: all of them are vulnerable to a laser-pointer DoS. – Polynomial Sep 21 '12 at 08:36
  • @Polynomial: I never thought about it this way, but I really can't disagree. Tylerl: You have to remember that any such data is negatively skewed to OSS: even trivial vulnerabilities are disclosed, while with proprietary software it's basically critical and nothing else. – Hubert Kario Sep 21 '12 at 12:20
  • @HubertKario that's really not true at all. See http://security.stackexchange.com/q/4441/33. Not every OSS vulnerability necessarily reaches CVE, and plenty of non-critical vulnerabilities in closed-source do. – AviD Sep 23 '12 at 16:16
  • Can you define what you mean by "platform"? Do you mean the programming language (Java, ASP.NET, PHP) that the web application is written in? The web programming framework that the web application is written on top of? The web application that is running? Something else? – D.W. Sep 23 '12 at 17:57
  • @AviD: (I don't argue one way or the other now) OK, so let's assume that you're right. Then there is a *positive* skew to OSS. What I wanted to say, is that because of different development practices, there *is* a skew in the data based only on number of vulnerabilities. (following is with RMS hat on ;) The only proprietary closed source vendors I deal with are MS and Adobe (Flash only). I don't remember if I ever saw medium or low risk vulnerability fixed by MS Update hotfix... (which would show that the data is non-uniform, on top of being skewed) In conclusion: this data is useless. – Hubert Kario Sep 23 '12 at 22:06
  • @HubertKario I don't disagree that the data is non-uniform, not to mention incomplete and partial. My only point is that it is not skewed in *any* particular direction; in fact, the randomness of the incompleteness is the only uniform aspect. On the other hand, I don't think this data is useless; it has value even without being accurate. – AviD Sep 24 '12 at 00:21
  • @D.W. I agree that "Platform" is a bit vague. But I was thinking "PHP" as opposed to "Wordpress" or "Joomla" but at the same time ".NET" as opposed to "C#" or "VB.NET" and "Java" as opposed to "Scala" or "Groovy". Perhaps "runtime" would be a more apt term? – tylerl Sep 24 '12 at 04:20
  • @AviD - I don't agree with HubertKario that the data is useless. But it might be worth considering potential biases. For example: commercial web applications are a lot more likely to use languages like ASP.NET or Java; open-source web applications are more likely to use PHP. Now consider that vulnerabilities in open-source web applications are more likely to be reported in a CVE than vulnerabilities in commercial web applications. The consequence: the database might be skewed towards PHP vulnerabilities, not because of inherent problems in PHP, but because of a non-reporting bias. – D.W. Sep 24 '12 at 04:25
  • @HubertKario "*Useless*" is a little extreme. *Imperfect* and *unfair* perhaps, but I already acknowledged that it *would* be unfair and necessarily so. If you could suggest even a single security-related comparison that *is* fair, I'm interested, but AFAIK, there are none. – tylerl Sep 24 '12 at 04:26
  • @D.W. I absolutely agree that there are biases aplenty in the CVE data, but my point is that these biases are *not uniform*, and thus impossible to reliably account for. Besides, as I've said, I don't believe that OSS will automatically and necessarily report more than commercial. Sure, anecdotally this might be common, but not every reported bug in OSS is automatically redirected straight to CVE, just as not every bug disclosed to commercial companies is squashed and hidden. – AviD Sep 24 '12 at 08:38

4 Answers

I know of two studies in the research literature that examine the incidence of vulnerabilities on different platforms:

  • An Empirical Analysis of Input Validation Mechanisms in Web Applications and Languages. Theodoor Scholte, Davide Balzarotti, William Robertson, Engin Kirda. SAC 2012.

    • This paper analyzes over 7000 CVEs that reported XSS and SQL injection vulnerabilities in 79 web applications. Then, it uses this data to estimate the prevalence of different kinds of input validation vulnerabilities in different web programming languages.

      The paper's analysis finds some differences between languages. For instance, in the data set, 50% of web applications were built in PHP, but 80% of SQL injection vulnerabilities were in PHP applications, so the paper concludes that web applications written in PHP may be especially at risk for SQL injection. In contrast, web applications written in Java appear to be less prone to XSS and SQL injection vulnerabilities. See the paper for more discussion.

  • Exploring the Relationship Between Web Application Development Tools and Security. Matthew Finifter and David Wagner. WebApps 2011.

    • This paper analyzes the relationship between security and the choice of programming language and web development framework, by looking at code from 9 teams who built the same web application (same spec, same functionality) in different languages.

      The paper does not find any evidence that choice of programming language affects the security of the resulting application. The experiment does find evidence that the degree of support for security (particularly automatic support for escaping, sanitization, session management, etc.) benefits security. It concludes that, because of the small sample size, additional experiments would be helpful to gain a deeper understanding of the effect of programmer tools on web security.

D.W.

The National Vulnerability Database (NVD) is a good reference for statistical data. I'm not sure whether it has a per-platform listing, but you can check it out: http://web.nvd.nist.gov/view/vuln/statistics

John Santos

Secunia has some statistical information on their site about vulnerabilities, broken down by vendor and product. For example, for Windows XP Professional:

http://secunia.com/advisories/product/22/?task=statistics

They have been collecting information since 2003. The page consists mainly of graphs, and at this time most of them are being shown.

Pipe
  • I *think* the question is asking about vulnerabilities in web applications (I agree the question is not 100% clear, but that's how I'm reading it). That Secunia site doesn't seem to have statistics on web applications. – D.W. Sep 24 '12 at 04:28

I did something similar for a client a few years ago, and I think starting from CVE is a decent direction.
However, as you saw, it is not completely straightforward, and you will never be able to get accurate stats: often, the platform is not mentioned at all.

What you will need to do is create a list of terms that are common for each platform you're interested in, and then grep cumulatively for all of them. E.g. for PHP you would need php, wordpress, zend, drupal, joomla, etc.

Yes, you will need to build this up iteratively. No, you shouldn't ever expect to have the complete set of stats, but it can give you some kind of picture.
Anyway, CVEs do not cover all of the known vulnerabilities in these products, not to mention that home-grown apps or specific sites built on these platforms would not hit CVE either...
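
A minimal sketch of the term-list approach in Python, in case it helps to see it spelled out. Everything specific here is an assumption for illustration: the sample rows are made up, the CSV layout (ID, then free-text description) is hypothetical, and the term lists for platforms other than PHP are guesses that would need to be built up iteratively as described above.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical, deliberately incomplete term lists. Only the PHP terms come
# from the discussion above; the others are illustrative guesses.
PLATFORM_TERMS = {
    "php": ["php", "wordpress", "zend", "drupal", "joomla"],
    "dotnet": ["asp.net", "sharepoint", "dotnetnuke"],
    "java": ["java", "jsp", "struts", "tomcat"],
}


def compile_patterns(platform_terms):
    """Build one case-insensitive, whole-word regex per platform."""
    return {
        platform: re.compile(
            r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )
        for platform, terms in platform_terms.items()
    }


def count_by_platform(rows, patterns):
    """Count rows whose text matches any term of each platform.

    A row may count toward several platforms; rows that match none are
    tallied under "unclassified" so the gap stays visible.
    """
    counts = Counter()
    for row in rows:
        text = " ".join(row)
        matched = [p for p, rx in patterns.items() if rx.search(text)]
        for platform in matched:
            counts[platform] += 1
        if not matched:
            counts["unclassified"] += 1
    return counts


# Made-up rows standing in for the real CVE CSV download.
SAMPLE_CSV = """\
CVE-0000-0001,SQL injection in a WordPress plugin allows remote attackers to execute arbitrary SQL commands
CVE-0000-0002,Cross-site scripting in index.php in ExampleCMS
CVE-0000-0003,Apache Struts allows remote code execution via a crafted request
CVE-0000-0004,Buffer overflow in an unrelated desktop application
"""

counts = count_by_platform(
    csv.reader(io.StringIO(SAMPLE_CSV)),
    compile_patterns(PLATFORM_TERMS),
)
for platform, n in sorted(counts.items()):
    print(platform, n)
```

To run it against the real download, pass a `csv.reader` over the opened file instead of the `io.StringIO` sample. Watching the "unclassified" count shrink as terms are added is one rough way to judge how much of the data the term lists still miss.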

AviD