15

Buffer overflows are nothing new. And yet they still appear often, especially in native (i.e. not managed) code...

Part of the root cause is the use of "unsafe" functions, including C/C++ staples such as memcpy, strcpy, strncpy, and more. These functions are considered unsafe because they operate on raw buffers with no knowledge of the destination's size, so without careful, thorough bounds checking they can easily overflow the target buffer.

Microsoft via SDL has banned use of those "unsafe" functions, and provides replacement functions for C++ - e.g. strcpy_s for strcpy, memcpy_s for memcpy, etc (depending on environment). Latest versions of Visual Studio will even let you do this automatically...

But what about "pure" C (i.e. not C++)?
And especially, what about non-MS platforms - including Linux and even non-VS compilers on Windows...
Does anyone have safer replacement functions for these? Any recommended workarounds (besides simply doing more bounds checking...)?
Or are we all doomed to continue repeating our use of memcpy?

AviD
  • +1 interesting, but I can see how its argumentative and otherwise full of rhetoric. – rook Apr 20 '11 at 16:43
  • 3
    @Rook really? I reread it, a day later, and I'm still struggling to find anything remotely argumentative (except at the buffer overflows) or rhetoric - except for suggesting that Microsoft may have done something non-evil... Can you please point it out? (The fellows at the SO question were too busy shouting anti-MS slogans to give any information...) – AviD Apr 21 '11 at 10:25
  • Well yeah, I mean this was closed on SO for that reason. The Linux kernel actually prefers using `strcpy()` for static text because it reduces overhead. To be honest I'm not convinced that a blind dedication to "safe functions" is the best solution. I think that implicit memory protection systems have shown that a buffer overflow alone is worthless. – rook Apr 21 '11 at 16:58
  • 3
    @Rook it was closed on SO for the reason I mentioned. strcpy for static text is different from user input, but debating whether or not to use "safe functions" isn't argumentative, it's part of the answer... so where is the argumentative? Please tell me and I will change it. – AviD Apr 21 '11 at 20:58
  • I don't care. Honestly, there are many different approaches to security that work well and are avidly disagreed with. – rook Apr 21 '11 at 21:24
  • 2
    @MikeSamuel `strncpy` is inefficient and unsafe: it writes as many '\0' as there are remaining `char` in the buffer (possibly 0 `\0`). The other functions can easily be misused. – curiousguy Aug 17 '12 at 00:49
  • 1
    @MikeSamuel in addition to what curiousguy noted, also there is no verification that the `size_t` parameter is *less* than the size of the buffer. Sure, in a perfect world, a careful uberprogrammer can perform those checks herself - but I have still found way too many mistakes, even off-by-one errors, even from some of the greatest programmers out there, to think that a mistake in this sensitive area is such a begone conclusion. – AviD Aug 17 '12 at 14:34
  • @AviD, fair enough. – Mike Samuel Aug 17 '12 at 15:06
  • 1
    I am wondering the same today in 2021. This was asked in 2011 and many attacks(heartbleed, wannacry(windows!!!)) were made since then which made use of buffer overflow. I don't understand why organizations don't enforce strict rules around usage of unsafe functions, is it because of severe loss of performance resulted by not using unsafe functions? – Nagarjuna Borra Nov 19 '21 at 08:05

8 Answers

17

Let's be clear here.

  1. insecure is a matter of context, not a case of "the use of" a specific function. If I use memcpy insecurely on a process running as a user with little to lose, no setuid or any such flags, the "worst" I can do is get a shell for that user and go from there. To achieve privilege escalation you need to attack something you can fool into giving you higher privileges.
  2. C/C++ are not "dangerous tools". They are tools. Insert appropriate metaphor about dangerous DIY implement.

The fact is, as Rook has already said, C/C++ and other compiled-to-the-machine languages have a place in the world. They're for building fast systems, operating systems, system services etc. They afford you the ability to manage your own memory as you see fit. You're in control.

Unless you introduce some form of automatic memory management, there's no way of truly working out if you've gone past the allocated memory. So, ok, let's introduce a container. Now every memory access call that ever was needs to be checked. Is it in scope for that particular function? How many objects are pointing at it? Where are they in scope? Can I have pointers outside of the scope of the original reference and if so how do I keep track of it? Very quickly, you've got a virtual machine and so you have a managed language.

Also, as was demonstrated over on StackOverflow, it is possible to invoke memcpy_s in a way that is insecure anyway. It doesn't really solve the fundamental problem, just makes it a little harder to make mistakes.

That is the difference. C/C++ is fancy assembly with all the power you need to do anything. Java/Python etc protect you from that at a price: speed and power.

You've said repeatedly over the course of today (I've followed the SO version too) that you still haven't got an answer as to how you develop securely with C/C++ on other systems like Linux. Well firstly, with VS2005/2008 I pretty much habitually set _CRT_SECURE_NO_DEPRECATE, but anyway, here are a few things you could do:

  1. You ask on StackOverflow. You take note, learn, read and re-read.
  2. Your compiler is (mostly) your friend. Listen to it. Use warnings. Turn them into errors. So cl /W4 and gcc -Wall -Werror -pedantic -std=c99. Yes, it is OTT in terms of error messages. But if you can't explain each one and justify why you're ignoring it, you don't understand your own code.
  3. Check your memory allocation. valgrind's default invocation checks that your allocations and deallocations match, meaning you're not losing memory. If you've got memory leaks, you haven't thought enough about your code. Leaks are a good sign that you've also got off-by-one errors, invalid bounds checking etc.
  4. Use valgrind again. I've just picked this up, but look, there's an experimental over/under-run checker in valgrind. It will tell you if, for example, you run off the end of the stack (possible, seeing as that's where most local variables are allocated) or outside the brk()'d heap.
  5. Use splint a.k.a. secure lint, take note of its output. Again, if you can't explain any output it gives, you don't understand your code.
  6. Inwardly digest the C Secure Coding Standard. Specifically relevant here:

    • STR31-C. Guarantee that storage for strings has sufficient space for character data and the null terminator.

      Copying data to a buffer that is not large enough to hold that data results in a buffer overflow. While not limited to null-terminated byte strings (NTBS), buffer overflows often occur when manipulating NTBS data. To prevent such errors, limit copies either through truncation or, preferably, ensure that the destination is of sufficient size to hold the character data to be copied and the null-termination character.

    • STR35-C. Do not copy data from an unbounded source to a fixed-length array

      Functions that perform unbounded copies often rely on external input to be a reasonable size. Such assumptions may prove to be false, causing a buffer overflow to occur. For this reason, care must be taken when using functions that may perform unbounded copies.

      Splint will provide you with some guidance similar to CERT. Note that memcpy is considered the compliant solution in the first example of STR35-C.

  7. If you are using C++, use std::string and boost::shared_ptr (and related; use the appropriate one). There is absolutely no argument for malloc'ing and memcpying strings in C++ except for interaction with C. Even then, use string.c_str(), please, and leave the manipulations to C++.
  8. Dynamically link where possible. If you've statically linked any 3rd party code into your app and that turns out to have a security problem, you've got it too and you have to redistribute your image too. Not only are shared objects just more convenient generally, you also get security updates by default. I know there are cases where you can't, that's why I say "where possible".
  9. Build this into your development cycle.
  10. Hope you didn't miss something.
  11. Keep up with security community updates that might be relevant to your code.
  12. Be nice to the security community. Acknowledge sec vulns, take steps to fix them. All code worth running has (or had) bugs. Simple as that.
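To make the STR31-C item in the list above concrete, here is a minimal sketch of the "ensure the destination is of sufficient size" option. The helper name `copy_string` is hypothetical (it is not from CERT or any library); it simply sizes the destination from the source, including the terminator, before copying.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src that is sized to fit it
 * exactly, or NULL if src is NULL or the allocation fails.
 * The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    if (src == NULL)
        return NULL;

    size_t len = strlen(src);      /* characters, excluding the terminator */
    char *dst = malloc(len + 1);   /* +1 leaves room for the terminator */
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len + 1);     /* copies the terminator as well */
    return dst;
}
```

The memcpy here is only as safe as the `len + 1` calculation that precedes it, which is the point of the guideline: the function you call matters less than knowing the destination size.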

At the end of the day I don't think the problem is solved by "secure" functions. I think the problem is solved by the use of some decent tools, a proper development process that rejects poor code from critical versions (late betas and release candidates) and finally an awareness of good practice/current issues.

Finally, memcpy_s isn't part of the C99 standard. It's an extension to the C library (as in not part of the core) and therefore not guaranteed to be on the platform I'm using. memcpy is. For a software project that needs to be compiled cross platform, that'll probably be the deciding factor in which function to use.

  • 3
    A solid answer with zero argumentatitve content! +1 from me – Rory Alsop Apr 21 '11 at 08:24
  • As much as I dislike the 'Good Job' comments.... Very well put. – Scott Pack Apr 21 '11 at 12:32
  • Overall, generally good advice... But specifically wrt the question, there are a few good points here: `... makes it a little harder to make mistakes`, yes, that's kind of the point... and that's what I (and the other security pros) are trying to achieve. No silver bullet. Just trying to enforce and validate STR31/35... – AviD Apr 21 '11 at 21:33
  • The C++ comment wrt `std:string` is apt, however that is unfortunately not the case here... – AviD Apr 21 '11 at 21:34
  • I also want to raise 2 more points wrt `memcpy_s`: though it is not part of C99, it *is* part of ISO/IEC TR 24731. More importantly, memcpy_s (or anything similar) is just another "decent tool" that helps with "awareness of good practice" and enables me (the security auditor) to better validate the "poor code". I *know* it's not on the platform I'm using, that is why I am asking, since there is no reason to compile it cross-platform. – AviD Apr 21 '11 at 21:38
  • @AviD It is, agreed. However, let me add some context as to why people might consider Microsoft slightly... aggressive. MSVC10 does not even support all of C99; specifically, they support `stdint` functions but `stdbool` is missing and so is `inline` which could replace some of those horrid macros. Another feature missing from MSVC 64-bit is inline assembly, which their 32-bit compilers have had for a while I believe. –  Apr 22 '11 at 18:58
  • @AviD you also probably know gcc supports a number of extensions to C, like computed gotos, `__attribute__` decorators and others. Then there's Intel Compilers and LLVM. Getting software to work across all of these is hard work, so the common denominator is to just try and support a standard, like C89, given that C99 is missing from one of the fundamental targets. –  Apr 22 '11 at 19:01
  • 1
    Also, even if memcpy_s were widely used, I'd still advise what I suggest above. The existence of "secure" functions alone isn't going to make the problem of bad coders go away, especially when a friend of mine was taught to use `char` types to index arrays at one of the leading universities in the UK. I guess I'd like to say you should hire great programmers who know what they're doing but most companies don't have an ace team of C devs, so I guess for quality control purposes it isn't such an awful idea to enforce as a coding standard. –  Apr 22 '11 at 19:19
  • @Ninefingers, I understand from your comments that there are some anti-MS sentiment. However, I really don't care a whit about that, one way or another. It has no bearing on the question, which was "any more secure replacements for memcpy" (which has been proven time and again as probabilistically dangerous, that is, likely to be used in an unsafe manner). As an example, I mentioned the well-known bandaid that MS implemented (which btw, was not their "invention"). – AviD Apr 23 '11 at 20:54
  • 1
    Two practical points: cross-compilation is not an issue here, I don't need a common denominator (if I did, I'd probably consider Java... ;) ). Second, as you mention at the end of your last comment, most programmers are *not* great programmers, and that's exactly why I'm looking for this. I would also point out, this wouldnt help for BAD programmers, they can flub anything up... it's for the basically GOOD (but not supergreat) programmers, who will do the right thing when it's pointed out to them - it's for *them* that memcpy_s (or similar) would help immensely. – AviD Apr 23 '11 at 20:58
  • 1
    @AviD I was not being sarcastic. "as you probably know" was an acknowledgement that you probably do understand the different extra features of various compilers. I do not hold any anti-MS sentiment and what I'm stating is fact - go try compile `.c` files with `inline` prefixed functions and it won't work with MSVC, which from my point of view is a shame because I otherwise like Visual Studio. Also, in my final sentence, I practically agreed with your point on Good/Supergreat programmers. –  Apr 23 '11 at 21:39
14

And especially, what about non-MS platforms - including Linux and even non-VS compilers on windows...

There's a cross-platform open source project called CoreFoundation Lite as part of Apple's Core OS, which provides C types for safe manipulation of byte blocks, strings and other basic data types. Relevant to this discussion they implement type checking, bounds checking, memory management, and distinguish between mutable and immutable objects.

Any recommended workarounds (besides simply doing more bounds checking...)?

Microsoft's banned.h includes GCC support, by the way. It does this by using GCC's poison pragma to cause an error whenever you use an unsafe function. I also wrote about that in a book (disclaimer: I wrote it).
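As a rough illustration of the poison pragma (not a copy of Microsoft's header), the sketch below poisons two identifiers after the standard headers have been included. The choice of identifiers and the helper `set_name` are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Poison AFTER the standard headers, so their own declarations are not
 * affected. Any later use of these identifiers is a compile-time error
 * under GCC (Clang accepts the same pragma). */
#pragma GCC poison strcpy strcat

/* Hypothetical helper: bounded copy into dst.
 * Returns 0 on success, -1 if src did not fit (dst then holds a
 * truncated, terminated prefix when dst_size > 0). */
int set_name(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;

    /* strcpy(dst, src);   <- would no longer compile */
}
```

This only bans names; it cannot check that the size you pass to the replacement is correct.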

Or are we all doomed to continue repeating our use of memcpy?

Fundamentally, the solution is "don't do that". So-called safe implementations can stop you from scribbling over memory that isn't yours, but still leave a lot of "abuse cases" open. Bounds-checking is only a small part of the problem. For example, let's say you malloc(1024), and decide to create a pointer to byte 56 and treat that as a reference to a particular struct that's 16 bytes long. If you then ("safely") copy 58 bytes over to the original pointer, then the copy operation will succeed but your structure is broken. Can you detect the breakage? Maybe not: it might still look like a valid structure (especially if you put a magic number at the end, or don't have a magic number).
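The scenario above can be sketched as follows. The struct layout, the magic value and the function name are all made up for illustration; the sizes (1024, offset 56, 16-byte struct, 58-byte copy) are the ones from the paragraph.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 16-byte record that lives inside a larger allocation. */
struct record {
    uint32_t id;
    uint32_t flags;
    uint32_t length;
    uint32_t magic;            /* sanity marker at the END of the struct */
};

#define RECORD_MAGIC  0xC0FFEEu
#define RECORD_OFFSET 56
#define BLOCK_SIZE    1024

/* Returns 1 if the record was altered while its magic number still looks
 * valid, 0 otherwise, -1 on allocation failure. */
int demo_in_bounds_corruption(void)
{
    unsigned char *block = malloc(BLOCK_SIZE);
    if (block == NULL)
        return -1;
    memset(block, 0, BLOCK_SIZE);

    struct record rec = { .id = 7, .flags = 0, .length = 0,
                          .magic = RECORD_MAGIC };
    memcpy(block + RECORD_OFFSET, &rec, sizeof rec);

    unsigned char incoming[58];
    memset(incoming, 0xFF, sizeof incoming);

    /* The bounds check passes: 58 bytes fit easily into 1024. */
    if (sizeof incoming <= BLOCK_SIZE)
        memcpy(block, incoming, sizeof incoming);

    /* Bytes 56 and 57 (the start of rec.id) have been overwritten. */
    struct record after;
    memcpy(&after, block + RECORD_OFFSET, sizeof after);
    int corrupted = (after.id != 7) && (after.magic == RECORD_MAGIC);

    free(block);
    return corrupted;
}
```

No bounds-checked copy function would reject this call, because from the allocator's point of view nothing went out of bounds.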

The problems "stack smashing" and "heap corruption" are actually special cases of the problem "treat memory as a big untyped bag o' bytes". Some other examples of this problem include:

  • STR30-C. Do not attempt to modify string literals. While mutable and immutable C strings look the same from the "big bag o' bytes" perspective, in fact attempting to edit a string literal in place leads to undefined behaviour. But code cannot tell whether a char * represents a character array or a string literal, so all of the pain of ensuring correct behaviour must be handled by calling code.
  • MEM01-C. Store a new value in pointers immediately after free(). Because it's possible to have a reference to a block of memory that's no longer yours, and in general it's hard to tell whether it's valid or not.
  • MEM05-C. Avoid large stack allocations. This one is particularly hilarious when you're dealing with someone who claims you just need to get your bounds correct and the rest falls into place. With this problem, you can have entirely internally consistent and correct code that nonetheless smashes the stack.

The list continues: see the rest of the CERT C Secure Coding Standard.

Luna
  • 2
    +1 for the first half (although that doesnt really help me - unless there's a port to Linux?), and another +1 for the 2nd half - I agree completely. I always hated the `union` form - it just doesnt sit well with me. But of course, tell this to any veteran C programmer, and he'll jump down your throat with both hands around your windpipe: "you just don't understand C philosophy, thats the correct way to do things here, and the only unsafe thing is because you don't know what you're doing." – AviD Apr 21 '11 at 12:02
  • 1
    I can't help you with arrogant C neckbeards, but I can tell you that CF Lite works on *nix and Windows in addition to Mac OS X and iOS. –  Apr 21 '11 at 12:09
  • 'course, that's one of the benefits of working in C, and low-level memory in general - no constraint. Plenty o' rope... – AviD Apr 21 '11 at 12:09
  • Can I interject here and say that the `union` form is useful for detecting platform endian-ness? It is also useful for breaking up bit fields, for example a 64-bit field could be represented by a union of two 32-bit fields. Admittedly, I personally would use bitshift and logical ands to extract that because those constructs are endian-independent, but they are sometimes useful. –  Apr 22 '11 at 19:26
  • I also disagree that the solution is "don't do that". The problem isn't treating memory as a bag of byte since that's what it is. The problem is input validation, or a lack of it. "I'll never need to check the length of that..." etc. Also, let's face it, C developers aren't the only ones... SQL Injection comes to mind, for example. –  Apr 22 '11 at 19:31
  • 1
    @ninefingers as you can tell, we don't agree on that. Input validation is a small subset of the problems you get by failing to use low-impedance abstractions on top of memory storage. –  Apr 23 '11 at 07:52
  • @AviD: someone (I forget who, maybe Alan Cox) said C++ gives you enough rope "to shoot yourself in the foot" ;-) –  Apr 23 '11 at 19:40
  • @Graham, I know, that's what I was referring to with my "Plenty o' rope" comment :D – AviD Apr 23 '11 at 19:50
  • @Graham I'm intrigued. I admit I'm not an expert on hardware, an area I'd like to improve. Do you have any links / relevant articles to hand on that? I have a high level understanding of Von Neumann vs Harvard Architecture but that's where my knowledge such as it is ends. I'll hold fire on agreeing or disagreeing with you until I understand what you mean fully. Alternatively if it is perhaps better made into a question, I can do that too. –  Apr 23 '11 at 22:19
  • I've put some examples in the answer, there are plenty more. –  Apr 24 '11 at 16:58
6

The comprehensive answer is: stop using C (or C++) ! It is an archaic and downright dangerous tool. Switch to a language with integrated safety (Java, C#, Scheme, OCaml, Python... the choice is large).

A buffer overflow is a bug, among the generic class of bugs known as "the programmer is not fully aware of what he is doing". A language with bounds check (or even the use of memcpy_s()) will not remove such bugs; it will only make consequences a bit less dire (e.g. an exception is thrown, usually implying immediate thread termination). It is like a safety belt: a safety belt does not prevent car accidents, it just tries to keep you alive while your car is mangled into oblivion. So you should use bound checks (in particular languages which incorporate such checks in a transparent manner) for the same reason that you should always fasten your seat belt.

As for C: memcpy_s() and its ilk are actually defined in ISO/IEC TR 24731 (a standard from 2005) so they are supposed to percolate, at some point, into C and C++ compilers. But it seems that only Microsoft is pushing them right now. For the "string" functions (strcpy() and strcat()), there are standard (C99) functions (strncpy() and strncat()) and some BSD operating systems (mainly OpenBSD) are using variants which have a somewhat easier-to-use API (strlcpy() and strlcat()). However, none of those functions correct the bug, which is that the programmer is trying to put data into a too small buffer; they merely make the programmer aware of the fact that his buffer may have an inadequate size.
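For readers without a BSD libc at hand, the strlcpy()-style contract can be sketched like this. The name `bounded_copy` is hypothetical and this is a re-implementation of the idea, not the OpenBSD source: the caller passes the destination size, the result is always terminated, and the return value reveals truncation.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of strlcpy-style semantics under a hypothetical name.
 * Copies at most dst_size - 1 characters and always terminates dst
 * (when dst_size > 0). Returns strlen(src), so a return value
 * >= dst_size tells the caller that the copy was truncated. */
size_t bounded_copy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size != 0) {
        size_t n = (src_len < dst_size) ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

As the paragraph above says, this does not fix the undersized buffer; it only makes the truncation visible to a caller who bothers to check the return value.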

Thomas Pornin
  • 2
    +1, love hearing some common sense from someone such as you (see some of the ridiculous rhetoric going on at the linked SO question...). Unfortunately, stopping to use C/C++ is not likely, though I would love to get everyone to do that... – AviD Apr 20 '11 at 13:14
  • `strncpy` and such are still considered unsafe, though slightly better than `strcpy`. The problem is the same as with memcpy - you only define how many bytes to copy, but that has no relation to the target buffer... Good to know about the ISO bit, though... Thanks! – AviD Apr 20 '11 at 13:16
  • What's wrong with `strncpy`? If the idea is to control the size of buffers, and `strncpy` supposedly does it, how would you exploit it? – Marcin Apr 20 '11 at 13:49
  • 1
    @Marcin, as one of examples - http://cwe.mitre.org/data/definitions/193.html –  Apr 20 '11 at 14:02
  • 1
    @Marcin strncpy only controls the size of the *source* buffer, not the *target* buffer. Same as with the original memcpy... – AviD Apr 20 '11 at 14:12
  • Wow, then what did they improve really? :) Good stuff, thanks. – Marcin Apr 20 '11 at 14:25
  • @Marcin MS provides additional, "more secure" versions of these methods - e.g. strcpy_s, strncpy_s, amongst others - that do the same thing as memcpy_s, i.e. check the target buffer against an additional argument. Again, a programmer can provide wrong values here too, but good programmers would have to *think* about the size of the target buffer. – AviD Apr 20 '11 at 14:35
  • @Thomas, re ISO/IEC - I remember reading somewhere (I can't find it now) that the gcc/linux implementors officially declared that they would not be implementing memcpy_s or similar... But, of course I am still looking for an equivalent or a workaround... – AviD Apr 20 '11 at 15:56
  • 5
    @Thomas Pornin This is a rather inflammatory post. Like it or not (Java, C#, Scheme, OCaml, Python...) are all **written in C or C++**. You need to pick the language best suited for the problem, and C/C++ can be a great choice. That being said, I love using python when it is appropriate. – rook Apr 20 '11 at 16:45
  • 3
    @Rook: and C and C++ are transformed into machine language by a compiler, and machine language runs on a CPU which mostly consists of silicon and various metals coming from a mine somewhere. This does not mean that programming is nothing more than glorified shoveling. – Thomas Pornin Apr 20 '11 at 18:03
  • @Thomas Pornin My point was two fold, all of which you have failed to address and instead just insulted me. You would not write an VM in python, you would do it in C/C++. Further more all I see is 0-days in flash player. – rook Apr 20 '11 at 19:04
  • @Rook, sorry to enlighten you, but @Thomas is a well recognized cryptographer. – AviD Apr 20 '11 at 20:27
  • 2
    Gentlemen - please keep your comments polite and relevant! – Rory Alsop Apr 21 '11 at 08:21
  • 4
    "Stop using C/C++" is not a useful answer. The languages are the correct tool for the job in certain circumstances (mem/cpu efficiency sensitive applications or circumstances where you cannot require the distribution of a vm/framework) – TobyS Apr 21 '11 at 10:24
  • 3
    @TobyS: the _detailed_ answer is that there are not many jobs where C is the "correct tool" -- much less than usually assumed. As an example, look at http://jikesrvm.org/ : that's a Java VM written in Java. C programming is close to assembly: you are in control, but there is no other control. If a programmer feels the urge to ban `memcpy()` and relies on automatically-applied `memcpy_s()` then he does not really want to program in C -- so saying that he should not is not really stretching it. – Thomas Pornin Apr 24 '11 at 21:47
  • 2
    @TobyS: so I think that "Stop using C/C++" is still an effective, if terse, answer. Feel free to read it with an implicit "(unless you know what you are doing)". I may add that when faced with an embedded system which knows only a subset of C, _writing your own VM_ may still be a better option than trying to do some complex processing in C, especially security-wise. – Thomas Pornin Apr 24 '11 at 21:52
  • The `strncpy` function is not a "bounded strcpy"; its purpose is to convert a zero-terminated string into a zero-padded string. I'm not really sure what `strncat` is for, since it would only be useful in cases where one neither knew how long the source string was, nor cared how long it would be afterward, but somehow knew there was a certain amount of slack space after the end of the source string. I can't think of any situation where all conditions would apply, and if they don't one would be better off computing the lengths of the strings and using `memcpy` and writing a terminating-zero. – supercat Jul 29 '15 at 15:40
3

I answered this question over 10 years ago. Crikey. I have been linked to this question again recently, and I feel it is time to provide an updated answer. 10 years is not a long time, except in computing, where it is an eternity.

I have also learned a significant amount in the intervening time.


To address a comment left on this question: there's a significant amount of code written in C and it simply isn't economically feasible to rewrite all of it. Slowly it will hopefully be replaced, but in the mean time we have exploit mitigations to try to limit the damage. No one function renamed "secure" or otherwise will kill this bug class entirely: this needs a more thorough approach.

There is still no way in pure C to have a safe memcpy alternative, because it relies amongst other things on bounds checking, which means types need to be annotated with their length at some point and that information needs to propagate to all dependent pointers. ISO standard C simply doesn't let you do that.

There is also a slight cost to this tracking, but given the security gain it is almost always justified. Finally, interactions with hardware are always going to be tricky: length annotations might not exist, and you may need a small amount of glue to transpose these into your system.
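To show what "annotating pointers with their length" looks like when done by hand in C, here is a small sketch. The `struct span` type and both functions are hypothetical, and the compiler enforces none of it: any code can still reach into `data` directly, which is exactly the limitation described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "fat pointer": a pointer that carries its own length. */
struct span {
    unsigned char *data;
    size_t len;
};

/* Derive a sub-span so the length travels with the dependent pointer.
 * Returns false (leaving *out untouched) if the range is out of bounds. */
bool span_slice(struct span s, size_t offset, size_t count, struct span *out)
{
    if (offset > s.len || count > s.len - offset)
        return false;
    out->data = s.data + offset;
    out->len = count;
    return true;
}

/* Copy src into dst only if it fits in dst. */
bool span_copy(struct span dst, struct span src)
{
    if (src.len > dst.len)
        return false;
    if (src.len != 0)
        memcpy(dst.data, src.data, src.len);
    return true;
}
```

The extra comparison on every slice and copy is the "slight cost" of tracking; the hardware-glue problem appears as soon as a raw pointer arrives without a length to wrap it in.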

It should not be a surprise to you that the system I have just described is exactly what Rust does. Rust is by no means a perfect language, and soundness issues in the compiler are a risk, but the overall idea is one I am strongly in favour of: all accesses by default come with safety checks, unless you enclose the code in an unsafe block. This is a good design: it highlights precisely where code needs the most careful review (the places where unchecked length accesses or type punning are truly unavoidable), and injects compile-time and runtime checks everywhere else to detect errors.

This is not the only game in town, and for systems programming, particularly userspace daemons on Unix systems where C interaction is required, Go is also a strong contender these days. It uses a GC, but GCs outdo manual memory management for most workloads, and Go compiles to native code so you don't incur any interpretation or virtual machine costs. There are undoubtedly other languages.

I want to also mention Ada, because SPARK adds another dimension to this that is interesting. Ada supports compile-time checks, with runtime checks where necessary, similar to Rust's, and SPARK is an extension to Ada (more like a carefully chosen subset) that tries to enable formal verification. Formal verification is also not a silver bullet: it is not a cast iron proof of correctness, because provers themselves can have bugs and humans can incorrectly model code. Nevertheless, it is an extremely high standard of assurance. I think Ada also shows promise if you need very high assurances of the correctness of your implementation. For interest's sake, there's a Rust formal verification working group.

It is actually possible to formally verify C code as well. seL4 is the poster child for formally verified kernels. This was done by proving the Haskell code correct, and proving the C code that actually runs as equivalent to the Haskell code. However, this is ridiculously expensive: hundreds of PhDs have been sacrificed (spent their whole time on this) just to verify one kernel (with nowhere near the hardware support of Linux). It is economically infeasible to do this for any even mildly complicated commercial software.

If you look around the internet, there will be any number of research papers and theses trying to harden pure C implementations, but it comes down to encoding length in the types. C was never designed to do this. If you are going to memcpy in pure C, I think my original answer still stands somewhat: you need to thoroughly check it. However I would modify that with the proviso that you should definitely be asking yourself why you are writing C in 2021, and if you really need to, or if you can't do what you need to in a language that offers better safety guarantees. Certainly, sometimes you have no choice, but that question should still be asked along the way. If you want points of justification, here would be my arguments:

  • Can another language help you catch bugs faster in development? Diagnosing customer issues is expensive and time consuming: if you can catch them during QA, good. If you can catch them before QA: best.
  • Do you really need the speed you say you do? Have you benchmarked a test case? I've seen production C++ in memory databases obliterated by someone's weekend C# reimplementation, so I'm now a strong proponent of measuring.
  • If you're still here and the only viable choice is to use C, can you minimise the amount you need and make sure it has appropriate testing and validation?

I think it would also be useful to learn from safety critical coding, and enforce the same standards on your own organization.


Finally, I'd like to end with an obvious reminder. With memory-safe languages we should hopefully reduce the number of bugs that arise from this particular bug class, but this won't magically make code safe. There are still logic issues, type confusion and a whole host of other bug classes to contend with.

diagprov
3

The Linux kernel makes many thousands of calls to so-called unsafe functions such as strcpy() and strcat(), yet buffer overflows in this code base are very uncommon. In fact "unsafe functions" are preferred for copying static text because it reduces overhead. You can also use "safe function calls" in an insecure way, e.g. strncpy(dest,source,strlen(source));. The choice of functions doesn't mean anything for how secure the system is.
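To spell out why that strncpy call is no better than strcpy, here is a sketch that contrasts it with a copy bounded by the destination. The function name `copy_dest_bound` is hypothetical; the misuse is kept in a comment so that the block itself does not contain an overflow.

```c
#include <stddef.h>
#include <string.h>

/* The misuse from the text, shown only as a comment:
 *
 *     strncpy(dest, source, strlen(source));
 *
 * The bound comes from the SOURCE, so it never limits the write to the
 * size of dest, and it also stops before copying the terminator. */

/* Sketch of the intended use: bound the copy by the DESTINATION size and
 * terminate explicitly. Returns 0 if source fit, -1 if it was truncated
 * or dest_size is 0. */
int copy_dest_bound(char *dest, size_t dest_size, const char *source)
{
    if (dest_size == 0)
        return -1;

    strncpy(dest, source, dest_size - 1);
    dest[dest_size - 1] = '\0';   /* strncpy does not guarantee this */

    return (strlen(source) < dest_size) ? 0 : -1;
}
```

Even this version is only correct if the caller passes the real size of dest, which is the same trust problem in a different argument.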

This problem is better solved with modern memory protection systems. These are implicit and code doesn't have to be rewritten. In modern software with canaries, ASLR, and NX zones, buffer overflows alone aren't very helpful. Dangling pointers were used in the 2010 pwn2own for Windows 7 and IE8, and the most recent Flash 0-day didn't use a buffer overflow at all.

rook
  • 1
    I hope nobody ever actually uses `strncpy(dest,source,strlen(source))` in their code ever. It's functionally equivalent to strcpy. – Yuliy Apr 22 '11 at 20:35
  • 1
    @Yuliy you would be surprised, I have seen it first hand. It comes up more often when you disallow `strcpy()` and a programmer just wants to finish up and go home. – rook Apr 22 '11 at 20:50
3

But what about "pure" C (i.e. not C++)?

I don't think that functions such as memcpy_s really qualify as 'secure' replacements for memcpy, but either way they are equally applicable within pure 'C', there is not much that is 'C++' about them. Nor are they platform specific (in the sense that it is not all that difficult to write a portable version of them, even if you can't rely on them being part of the platform C library).
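As a rough idea of how little is involved in a portable version, here is a sketch in plain C. The name `checked_memcpy` and its return convention are assumptions for this example; it is deliberately simpler than the TR 24731 memcpy_s (for instance, it does not clear the destination on failure).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical portable helper in the spirit of memcpy_s.
 * Copies count bytes from src to dst only if they fit in dst_size.
 * Returns 0 on success, -1 if the request was rejected (dst is then
 * left unmodified). */
int checked_memcpy(void *dst, size_t dst_size, const void *src, size_t count)
{
    if (dst == NULL || src == NULL)
        return -1;
    if (count > dst_size)
        return -1;

    memcpy(dst, src, count);
    return 0;
}
```

The check is only as good as the dst_size the caller supplies: passing the wrong size (or simply passing count twice) turns it back into a plain memcpy.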

As far as safety goes, memcpy & co are defined such that they allow arbitrary scribbling on memory. That is what they are for, and no amount of adding size parameters will change that.

At the same time I don't think it's reasonable (yet) to say "Don't use 'C' / C++" as there are still plenty of systems where a 'C' like language is the only practical option.

The real question I guess is whether there is a safer alternative to buffer management that has more of a safety net than memcpy and that can be used from C and C++. Well, sure. A 'C' API or C++ class could be defined to allocate, manage and access bounds checked buffers and arrays and this is exactly the kind of thing that the interpreter for the Java VM et al is accessing.

There are also languages like Objective-C which aren't managed code yet which have much of the safety aspects of managed code, despite the fact that memcpy etc are available. This is because the Cocoa framework tends to encourage and at least allows the use of safer alternatives.

frankodwyer
2

Another 'safe' API for handling strings in C is Daniel J Bernstein's string library. Credit to André Pang for pointing this out to me.

  • That link doesn't work for me. Perhaps it has moved somewhere closer to [a](http://www.skarnet.org/software/skalibs/libstddjb/) or [b](http://cr.yp.to/software.html) ? – David Cary Aug 16 '12 at 00:50
  • Or are you maybe referring to Bernstein's [netstring](http://en.wikipedia.org/wiki/netstring) format? – David Cary Aug 16 '12 at 01:14
0

People will never stop making mistakes. Even in so-called "safe implementations" and the languages that Thomas Pornin has listed, there will be room for some vulnerability. In this answer I won't delve into the specifics of C/C++; it wouldn't add much to what has already been said. Instead, I'd like to point out that another approach is not the prevention of bugs itself, but mitigation of the consequences of exploitation. That's just another part of what can be done for the sake of security. As one of the best examples, look at Google's browser, Chrome: they have implemented sandboxing. Other possibilities are compiler security options and OS-level exploit mitigations. These follow from the very first sentence and acknowledge the fact that the introduction of bugs is an inevitable part of all software development.