19

IIUC, the Heartbleed vulnerability comes from a bug in the C source code of OpenSSL: a memcpy() reads past the end of a buffer that is too short. I'm wondering whether the bug would have been prevented automatically in other languages with higher-level memory management than C or C++.

In particular, my understanding is that Go, D and Vala each compile to native code, don't need a VM to run, and should allow writing native libraries which provide a C-compatible binary interface.

So, could these languages be used to implement the OpenSSL library with the same interface, and would they offer protection against bugs like the Heartbleed vulnerability?

oliver
    Bugs happen. The moment you start assuming that using a particular programming language will protect you from bugs, they will happen *more* and it'll be all the worse when (not if) they happen because you never assumed they could. – Shadur Apr 09 '14 at 07:25
  • 6
    @Shadur But not all bugs are created equal, and different languages have propensities to different types of bugs. It feels disingenuous to hand-wave the issue by saying "you'll always have bugs", which is true but only vacuously. It's like suggesting that people still die in car accidents, therefore seat belts are ineffective at preventing deaths. It's not that I disagree with you in spirit, but it's silly to expect certain classes of bugs that can only occur in an unsafe language to crop up in a safe language. – Doval Apr 09 '14 at 11:53
  • 1
    @Doval It's equally disingenuous to suggest or imply (as the question does) that if only those poor ignorant people had seen the light and written OpenSSL in a "*proper*" language this problem wouldn't have existed. To use the metaphor, I'm not saying seatbelts are useless, I'm saying that the asker is erroneously implying that wearing one protects you from all possible accidents. – Shadur Apr 09 '14 at 13:01
  • 2
    @Shadur But no one suggested such a thing. To ask if *this particular bug* could've been prevented in a safe language is a perfectly legitimate question. If the OP were under the impression that a safe language could prevent *all* bugs OpenSSL has had/will have, that'd be different; but he was specific in mentioning `memcpy` and "higher-level" memory management. – Doval Apr 09 '14 at 13:07

5 Answers

20

Actually, none of these languages would have prevented the bug, but they would have lessened its consequences.

OpenSSL's code is doing something which, from the abstract machine's point of view, is nonsensical: it reads more bytes from a buffer than the buffer actually contains. With C, the read still "works" and returns whatever bytes linger after the buffer. With stricter languages, the out-of-bounds memory access would have been trapped and would have triggered an exception: instead of reading and sending the bytes, the offending code would just crash, leading (in the context of a Web server) to termination of the current thread and probably closure of the connection, without disturbing the rest of the server.

So, still a bug, but no longer a real vulnerability.

This has only an indirect relationship with automatic memory management. The real safety net here is the systematic bounds check on array accesses. That systematic check is in turn supported by strict typing (which prevents anything other than an "array of bytes" from being used as an "array of bytes"). Strict types are themselves indirectly supported by automatic memory management (the GC), because it prevents dangling pointers and thus the use-after-free conditions that would violate strict typing.
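
To make that concrete, here is a minimal sketch of the trap in action, using Go as a stand-in for any bounds-checked language (the buffer contents and lengths are invented for illustration):

package main

import "fmt"

func main() {
    payload := []byte("HAT") // the 3 bytes the peer actually sent
    claimed := 64            // the attacker-supplied payload length

    // In C, memcpy(out, payload, claimed) would happily read past the
    // end of the allocation. Here, the out-of-bounds slice expression
    // is trapped at runtime and the program panics instead of leaking
    // adjacent memory.
    echo := payload[:claimed]
    fmt.Println(echo)
}

Running this aborts with a "slice bounds out of range" panic; a server would typically confine the damage to the offending goroutine or connection rather than silently shipping neighbouring memory to the peer.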

The recent "heartbleed" is not qualitatively new thing; OpenSSL already has had quite a few of buffer overflow-related bugs over the years (a number of which coming from the ASN.1 handling code, for certificate parsing). This one is just another to the list.

Thomas Pornin
16

If you consider the bug as reading out of bounds of the current structure, then it would probably have been prevented in other languages, because they do not give unbounded access to memory and one would need to implement these things differently.

But I'd rather classify this bug as missing validation of user input: the code believes that the size field sent in the packet is actually the size of the payload. These kinds of bugs are not simply fixed by using another language.
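
For illustration, the missing validation amounts to a single comparison of the attacker-controlled length field against what actually arrived. A Go-flavoured sketch (the function and its names are hypothetical):

// Distrust the length field before using it.
func handleHeartbeat(payload []byte, claimedLen int) ([]byte, error) {
    if claimedLen > len(payload) {
        // RFC 6520 requires a heartbeat whose payload_length is too
        // large to be discarded silently.
        return nil, errors.New("malformed heartbeat: discarding")
    }
    return payload[:claimedLen], nil
}

No language inserts this check for you: it expresses a protocol rule, not a memory rule.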

Steffen Ullrich
  • So the bug would be there but it wouldn't be a huge security flaw – Daniel Little Apr 09 '14 at 00:48
  • 3
    Exactly: this specific bug might not be there, or might not have the same consequences, but there are enough serious input-validation bugs of similar severity. Just look at all the shiny Web 2.0 world, which does not have the problem of unbounded reads but has XSS and CSRF attacks compromising millions of routers and their users. – Steffen Ullrich Apr 09 '14 at 03:43
  • But detection of unsanitized input data *can* be a language feature, as it was in Netscape JavaScript and is today in Ruby. See my answer below – piers7 Apr 10 '14 at 14:26
10

Unfortunately, the bug would not have been prevented, because OpenSSL uses its own memory allocator, rather than the one provided by the system.

The buffer from which the infamous heartbeat data is read is allocated by a function called freelist_extract in ssl/s3_both.c. This function, by default, manages OpenSSL's own list of used/unused memory, and does none of the modern safety checks.
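
To see why such an allocator defeats the safety net of a higher-level language, here is a hypothetical sketch (not OpenSSL's actual code) of a freelist-style allocator written in a memory-safe language:

// Recycled buffers are handed out without being cleared, so stale
// data from an earlier request survives inside a slice whose bounds
// checks all still pass.
type freelist struct {
    bufs [][]byte
}

func (f *freelist) get(n int) []byte {
    if k := len(f.bufs); k > 0 && cap(f.bufs[k-1]) >= n {
        b := f.bufs[k-1][:n] // old contents are NOT zeroed
        f.bufs = f.bufs[:k-1]
        return b
    }
    return make([]byte, n)
}

func (f *freelist) put(b []byte) {
    f.bufs = append(f.bufs, b)
}

Any read inside such a recycled buffer is perfectly in-bounds as far as the language is concerned; the leak happens within memory the program legitimately owns.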

Even if OpenSSL had been written in another language, assuming it had still kept maintaining its own buffer allocator, this bug would have happened just the same: when a previous buffer structure is reused, the memcpy (or equivalent "buffer copy" function) does the same thing without raising any errors, regardless of the programming language.

In a modern, memory-safe language, this would be something like the following Go-flavoured sketch (variable names are illustrative):

request := lastUsedBuffer // recycled buffer; old contents still present
i := 0
/* I'm sure it doesn't actually read bytes like this, but you get the idea */
for {
    b, err := readByte(connection)
    if err != nil {
        break
    }
    request[i] = b
    i++
}

/* ... some time later, in the heartbeat processing function ... */

output := append([]byte(nil), header...)
/* dutifully copies from the request buffer, but since the claimed length
   was never checked against i, and the recycled buffer is big enough that
   the bounds check passes, it copies stale bytes straight out of
   lastUsedBuffer */
output = append(output, request[start:start+length]...)

If OpenSSL had instead been using the system (libc) malloc and free directly rather than its own allocator, this bug might have been caught a couple of years ago: many libc implementations provide much better checks on allocated/freed memory, and tools like valgrind could have picked up this bug easily.

The role of OpenSSL's memory allocator in the workings of the Heartbleed bug was discussed by Ted Unangst at: http://www.tedunangst.com/flak/post/heartbleed-vs-mallocconf

codebeard
7

I think I can answer this question for the specific case of a crypto library written in Go. It is easy, and not at all hypothetical, because Go already has a standalone TLS package, crypto/tls, which does not depend on any outside library.

Whilst idiomatic Go is much safer than traditional C with regard to typical buffer overflows, it offers the keen developer plenty of options to sidestep that safety, such as imitating C's pointer arithmetic through Go's unsafe.Pointer. One would hope that people could agree not to use such fragile code in a critical piece of software.
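
As a minimal (and deliberately invalid) sketch of that escape hatch, this is the kind of C-style pointer arithmetic Go's safety model is meant to forbid, yet unsafe readily permits:

package main

import (
    "fmt"
    "unsafe"
)

func main() {
    buf := []byte{1, 2, 3}
    p := unsafe.Pointer(&buf[0])

    // Step 8 bytes past the end of the slice. No bounds check fires;
    // whatever happens to sit in adjacent memory is read back, which
    // is exactly the class of access Heartbleed exploited.
    leaked := *(*byte)(unsafe.Pointer(uintptr(p) + 8))
    fmt.Println(leaked)
}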

Cryptography, of course, is exactly the kind of software that uses such fragile code, for good reasons. After all, the constant-time comparisons implemented in the Go package crypto/subtle do indeed require, and carry, similarly dire warnings about careful use as those attached to unsafe. The only remaining question, really, is whether any bug can still survive in such an environment.

As far as I can tell, Go does implement constant-time comparisons of hash values correctly. I haven't even looked at whether complicated primitives involving S-boxes are computed in a constant-time fashion; nobody bothered putting them into a package with a name like subtle and warnings about how easy it is to break things, so I'm actually doubtful.
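
For reference, careful use of that package looks roughly like the standard MAC-verification pattern below, a sketch following the documented crypto/hmac idiom:

package main

import (
    "crypto/hmac"
    "crypto/sha256"
    "crypto/subtle"
    "fmt"
)

// validMAC compares a received MAC against the expected one in
// constant time. A naive byte-by-byte comparison can return early at
// the first mismatch, leaking its position through timing.
func validMAC(message, messageMAC, key []byte) bool {
    mac := hmac.New(sha256.New, key)
    mac.Write(message)
    expected := mac.Sum(nil)
    return subtle.ConstantTimeCompare(messageMAC, expected) == 1
}

func main() {
    key := []byte("example key")
    msg := []byte("heartbeat")
    mac := hmac.New(sha256.New, key)
    mac.Write(msg)
    fmt.Println(validMAC(msg, mac.Sum(nil), key)) // true
}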

What I did check is that elliptic-curve cryptography is not implemented in a constant-time fashion in Go. Nobody even seems to have considered trying: the implementation calls many arbitrary-length integer functions that were not designed for cryptographic use, and that indeed use non-constant-time algorithms. When this kind of timing side channel was recently shown to also occur in OpenSSL on one architecture, it was good enough for a paper demonstrating private-key compromise.

The same situation persists in Go, on all architectures. And, well, nobody really seems to care about such details as crypto that actually works; the focus is on readability and speed (well, and on working just well enough to fool the casual user and some unit tests, I guess). After all, why even bother choosing a suitable algorithm when the language already keeps you safe from most ways of introducing buffer overflows, and when choosing a correct algorithm risks ruining Google's assertion that TLS has become computationally cheap? Sarcasm aside, making the language responsible for catching our bugs just means that all the bugs the language cannot catch will still be there.

Finally, libraries like OpenSSL have the advantage that timely bugfixes are a reasonable possibility. Whilst in principle the same is true for Go packages, once something becomes part of Go's core, like the TLS package, it is affected by the release schedule, which can obviously interfere with timely deployment of bugfixes. Given the feature freeze, I suppose Go 1.3, due this summer, will certainly not fix the ECC timing issue, even if it were recognized as the critical issue it is.

  • So do I understand this correctly: the advantages of Go would be nullified for a good OpenSSL reimplementation, because the code would have to be written with unsafe low-level primitives to prevent timing attacks? That's something I would never have thought of. – oliver Apr 09 '14 at 09:26
  • No; if my answer is giving that impression, then my answer is wrong. The point is that a very similar kind of danger lurks in high-level aspects, which are in at least as bad a shape in the existing Go library as they are in OpenSSL. –  Apr 09 '14 at 10:11
  • You're not answering the OP's question, though you make a fair point. I'd say "a safe language would indeed prevent an out-of-bound read, which is what you ask, but since you need to reimplement crypto software, you still need an expert security audit to prevent crypto high-level bugs". Writing OpenSSL code needs that same auditing + preventing bugs only possible in an unsafe language. – Blaisorblade Apr 27 '14 at 16:58
  • So, if you want to reimplement SSL (which some people are doing, see PolarSSL), you need a crypto expert, and you might want to use a better language (alternatively, to incorporate static analysis in your workflow, like done for PolarSSL); I'd say the latter is an inferior alternative security-wise, but better than nothing. – Blaisorblade Apr 27 '14 at 17:02
4

Data tainting was implemented in Netscape JavaScript (in Navigator 3, and on the server in Enterprise Server) in response to fairly early realizations about the nature of security on the internet. All input coming from the user was considered tainted unless the flag was cleared, and the taint flag spread via operations on data (so the result of combining untainted data with tainted data was itself tainted). As a result, tainted data could always be checked for.

This was over 10 years ago. It blows me away that this hasn't found its way into mainstream managed languages like Java or C# (or anyone else's JavaScript implementation for that matter).

If you combined compile-time data taint analysis with a safe memory model (managed code, or at least verifiable), you'd still be left with logic bugs, for sure, but you'd have eliminated whole categories of attacks at a stroke, including both contributing factors to this one.
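
Absent language support, the idea can at least be approximated at the library level. Here is a hypothetical runtime sketch in Go (the Tainted type and its methods are invented for illustration; real enforcement would put the type in its own package so the raw field is inaccessible from outside):

package main

import (
    "errors"
    "fmt"
)

// Tainted wraps untrusted input so the raw string cannot be used
// until it has passed an explicit validation step.
type Tainted struct {
    s string
}

// FromUser marks data arriving from the user or network as tainted.
func FromUser(s string) Tainted {
    return Tainted{s: s}
}

// Untaint releases the raw value only if the validator accepts it,
// mimicking the "clear the taint flag" step of the Netscape model.
func (t Tainted) Untaint(valid func(string) bool) (string, error) {
    if !valid(t.s) {
        return "", errors.New("tainted input failed validation")
    }
    return t.s, nil
}

func main() {
    name := FromUser("alice")
    isLowerAlpha := func(s string) bool {
        for _, r := range s {
            if r < 'a' || r > 'z' {
                return false
            }
        }
        return len(s) > 0
    }
    if clean, err := name.Untaint(isLowerAlpha); err == nil {
        fmt.Println("hello,", clean)
    }
}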

piers7
  • 3
    Data tainting has been available in Perl for far longer, and you can still use it today. But it will only enforce that you check the data somehow, not the quality of the checking. I've seen enough code that just checks against a wildcard to make the data untainted. And it definitely does not help you work out the exact logic needed to embed a string inside a JavaScript statement that is put into an HTML attribute etc., nor to work around browser-specific differences in interpreting these data. But I agree that tainting is a useful feature that solves part of the problem. – Steffen Ullrich Apr 10 '14 at 17:52