I answered this question over 10 years ago. Crikey. I have been linked to this question again recently, and I feel it is time to provide an updated answer. 10 years is not a long time, except in computing, where it is an eternity.
I have also learned a significant amount in the intervening time.
To address a comment left on this question: there is a significant amount of code written in C, and it simply isn't economically feasible to rewrite all of it. Hopefully it will slowly be replaced, but in the meantime we have exploit mitigations to try to limit the damage. No single function, whether renamed "secure" or otherwise, will kill this bug class entirely: that needs a more thorough approach.
There is still no way in pure C to have a safe `memcpy` alternative, because safety relies, amongst other things, on bounds checking, which means types need to be annotated with their length at some point and that information needs to propagate to all dependent pointers. ISO standard C simply doesn't let you do that.
There is also a slight runtime cost to this tracking, but given the security gain it is almost always justified. Finally, interactions with hardware are always going to be tricky: length annotations might not exist there, and you may need a small amount of glue to translate hardware-provided lengths into your system's representation.
It should not be a surprise to you that the system I have just described is exactly what Rust does. Rust is by no means a perfect language, and soundness issues in the compiler are a risk, but the overall idea is one I am strongly in favour of: all accesses come with safety checks by default, unless you enclose them in an `unsafe` block. This is a good design, as it highlights precisely where code needs the most careful review (the places where unchecked accesses or type punning are truly unavoidable) and injects compile-time and runtime checks elsewhere to detect errors.
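To make that concrete, here is a small sketch of what safe-by-default looks like in Rust: a slice carries its length with it, so ordinary copies and indexing are bounds-checked, and genuinely unchecked access has to be spelled out in an explicit `unsafe` block.

```rust
fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];

    // Safe by default: a slice is a pointer *plus* a length, and
    // copy_from_slice panics unless the two lengths match exactly.
    dst.copy_from_slice(&src);
    assert_eq!(dst, src);

    // Out-of-range access is caught rather than silently reading
    // past the end of the buffer.
    assert!(src.get(10).is_none());

    // Unchecked access still exists, but it must be marked
    // `unsafe` -- exactly the code that deserves the most review.
    let third = unsafe { *src.get_unchecked(2) };
    assert_eq!(third, 3);
}
```

Note how the length never has to be passed alongside the pointer by hand: it travels with the slice, which is precisely the propagation that pure C cannot express.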
This is not the only game in town. For systems programming, particularly userspace daemons on Unix systems where interaction with C is required, Go is also a strong contender these days. It uses a GC, but a modern GC performs as well as, or better than, manual memory management for most workloads, and Go compiles to native code, so you don't incur any interpretation or virtual machine costs. There are undoubtedly other suitable languages too.
I also want to mention Ada, because SPARK adds another interesting dimension to this. Ada supports compile-time checks similar to Rust's, falling back to runtime checks where necessary, and SPARK is an extension of Ada (really a carefully chosen subset) that aims to enable formal verification. Formal verification is not a silver bullet either: it is not a cast-iron proof of correctness, because provers themselves can have bugs and humans can model code incorrectly. Nevertheless, it is an extremely high standard of assurance, and I think Ada shows promise if you need very high assurance of the correctness of your implementation. For interest's sake, there is also a Rust formal verification working group.
It is actually possible to formally verify C code as well: seL4 is the poster child for formally verified kernels. This was done by proving an executable Haskell specification correct, then proving that the C code which actually runs is equivalent to it. However, this is ridiculously expensive: hundreds of PhDs have been sacrificed (people who spent their entire doctorate on this) just to verify one kernel, with nowhere near the hardware support of Linux. It is economically infeasible to do this for even mildly complicated commercial software.
If you look around the internet, you will find any number of research papers and theses trying to harden pure C implementations, but it always comes down to encoding length in the types, and C was never designed to do that. If you are going to `memcpy` in pure C, I think my original answer still stands somewhat: you need to check your bounds thoroughly. However, I would add the proviso that you should be asking yourself why you are writing C in 2021 at all: whether you really need to, or whether you could do what you need in a language that offers better safety guarantees. Certainly, sometimes you have no choice, but that question should still be asked along the way. If you want points of justification, here would be my arguments:
- Can another language help you catch bugs faster in development? Diagnosing customer issues is expensive and time consuming: if you can catch them during QA, good. If you can catch them before QA: best.
- Do you really need the speed you say you do? Have you benchmarked a test case? I've seen production C++ in-memory databases obliterated by someone's weekend C# reimplementation, so I'm now a strong proponent of measuring.
- If you're still here and the only viable choice is to use C, can you minimise the amount you need and make sure it has appropriate testing and validation?
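On that last point, a common pattern for minimising the dangerous surface is to confine the unchecked operation behind a small wrapper that validates lengths before crossing the boundary, so the unsafe part stays tiny and auditable. A minimal Rust sketch of the idea (the raw pointer copy here stands in for a hypothetical call into legacy C; `checked_copy` is a name I've made up for illustration):

```rust
use std::ptr;

/// Safe wrapper around an unchecked copy: all length validation
/// happens here, so the unsafe block stays small and auditable.
fn checked_copy(dst: &mut [u8], src: &[u8]) -> Result<(), String> {
    if src.len() > dst.len() {
        return Err(format!(
            "source ({} bytes) does not fit in destination ({} bytes)",
            src.len(),
            dst.len()
        ));
    }
    // SAFETY: the bounds check above guarantees dst has room for
    // src.len() bytes, and the borrow rules guarantee the two
    // slices do not overlap.
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
    Ok(())
}

fn main() {
    let src = [0xAAu8; 4];
    let mut dst = [0u8; 8];
    assert!(checked_copy(&mut dst, &src).is_ok());
    assert_eq!(&dst[..4], &src);

    // An undersized destination is rejected instead of overflowed.
    let mut tiny = [0u8; 2];
    assert!(checked_copy(&mut tiny, &src).is_err());
}
```

The same shape works whichever safe language is doing the wrapping: validate at the boundary once, and everything on the safe side of that boundary gets to assume the lengths are right.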
I think it would also be useful to learn from safety-critical coding standards (MISRA C, for example) and enforce similar standards in your own organisation.
Finally, I'd like to end with an obvious reminder. Memory-safe languages should greatly reduce the number of bugs arising from this particular bug class, but they won't magically make code safe: there are still logic errors, type confusion and a whole host of other bug classes to contend with.