141

I have heard that there will always be vulnerabilities in code and software. However, I don't understand why it is not possible to have exploit-free software. If companies keep updating their software, eventually there will be no vulnerabilities, right?

chicks
Zheer
  • 51
    It depends on how big of a software we are talking about. If it's a small toy with a $0.20 microcontroller in it, where all the software does is blinking a LED, then yes, it's possible. But most software is way more complex than that. – vsz Oct 21 '19 at 04:41
  • 60
    "if companies keep updating their software, eventually there will be no vulnerabilities" No, because they keep adding new features, removing old ones or changing them. Each time this happens there is a chance a mistake was made that adds a new vulnerability. – Qwertie Oct 21 '19 at 04:44
  • 27
    Theoretically software can be bug free. Large, complex software is unlikely to be bug free but with enough effort you could make it bug free. – Michael Oct 21 '19 at 09:12
  • 11
    @marshalcraft What exactly do you mean by that comment? How does Turing-Completeness affect the exploitability of code? –  Oct 21 '19 at 10:23
  • 1
    @Michael Bugs and exploits are two different things. – MonkeyZeus Oct 21 '19 at 12:16
  • 3
    @MonkeyZeus Not really. When dealing with formal semantics you'd completely define the behaviour of the program. This includes the fact that it will not execute random stuff given as input, so in that case an exploit is just a bug since if it works it means the software is going against its specification. – Giacomo Alzetta Oct 21 '19 at 12:17
  • @GiacomoAlzetta Yes, they overlap but they are not interchangeable. – MonkeyZeus Oct 21 '19 at 12:21
  • I hope the ICBM launch facilities are bug free, but, you know... – Mast Oct 21 '19 at 13:07
  • 14
    @MonkeyZeus Every vulnerability is a bug, but not every bug is a vulnerability. –  Oct 21 '19 at 13:19
  • @MechMK1 Correct. – MonkeyZeus Oct 21 '19 at 13:21
  • 2
    @MonkeyZeus Which means that a software that is bug-free is thus by definition vulnerability-free. –  Oct 21 '19 at 13:22
  • 2
    Technically yes, practically no. 10 PRINT "Hello World" 20 END There you go. – aslum Oct 21 '19 at 13:24
  • @MechMK1 Yes, and when your manager asks why it's taking so long to fix that SQL injection issue then you can happily tell them that you first had to fix the complicated CSS bug which was adding 1px too much margin on some obscure Android phones. So by process of bug elimination you will eventually get to that pesky SQL issue. – MonkeyZeus Oct 21 '19 at 13:35
  • 1
    @MonkeyZeus Yes, the bug will *eventually* be fixed, and in a sensible company the bug will have priority over the CSS bug. But the whole point is that once there are no bugs, you will have no vulnerabilities. The problem is that "once you have no bugs" will never happen. –  Oct 21 '19 at 13:37
  • 1
    @MechMK1 It sounds like you're finally on the same page. Fixing all exploits, not bugs, is the goal. – MonkeyZeus Oct 21 '19 at 13:40
  • 2
    @MonkeyZeus It's your goal, it's my goal, but it's likely not the goal of the AwesomeNewApp.io, because they focus on getting millions of teens to use their app to share selfies. –  Oct 21 '19 at 13:45
  • 1
    Related - https://en.wikipedia.org/wiki/Halting_problem – Ondrej Svejdar Oct 21 '19 at 14:11
  • 4
    @OndrejSvejdar That only means we can't *prove* that *all* software is exploit-free. We could prove all *useful* software is exploit free (or tweak it until it can be proven) and we could have software which is exploit-free which we can't prove is exploit-free. – user253751 Oct 21 '19 at 15:49
  • 2
    Read [The Mythical Man-Month](https://en.wikipedia.org/wiki/The_Mythical_Man-Month) – kelalaka Oct 21 '19 at 18:01
  • 3
    To begin with, declare all bugs features ... – Hagen von Eitzen Oct 21 '19 at 18:33
  • 3
    @MechMK1 I can think of vulnerabilities that are not bugs. For instance, a certain advocate of free software once by design refused to have a secure password. – Michael Oct 21 '19 at 19:04
  • 1
    If companies keep updating their software they will keep introducing bugs. – eckes Oct 22 '19 at 02:37
  • Any bug-free code is so trivial it probably didn't need writing. – VLAZ Oct 22 '19 at 05:28
  • @MechMK1 - If every vulnerability is a bug, but not every bug is a vulnerability, this would suggest that security flaws are bugs. If so, from experience, executives in large companies don't share that view: security flaws are perceived as a "risk" and judged by the impact they bear. I also imagine that "not every bug is a vulnerability" isn't quite true, since it depends on numerous factors. Just because a bug isn't exploitable today doesn't mean it won't be open to attack tomorrow. – Motivated Oct 22 '19 at 07:24
  • @Motivated An application displays labels wrong in RTL languages. The documentation does not mention the shorthand version of a flag. Particles with a certain display mode are not drawn. All of those are bugs, but no vulnerabilities. –  Oct 22 '19 at 08:00
  • @Mast there are security concerns with the ICBM launch facility. One hack is to install a puppet into the White House and the missiles are yours. – emory Oct 22 '19 at 12:06
  • The software is only as strong as the language it is written in or a module in the language that is being used. What that means is that if the language or module has an exploitable problem, which happens a lot, then the software can be hacked even if it is well written. Look at flaws in openssh historically for example. See https://nakedsecurity.sophos.com/2018/08/23/vulnerability-in-openssh-for-two-decades-no-the-sky-isnt-falling -- perfect example. – Jeff Clayton Oct 22 '19 at 12:27
  • The problem is an old one - a tanks vs. missiles issue. You build a tank to defend against the current missiles of the day, so someone builds a missile to defeat the new tank. Smart people write code, smart people break code. Endless cycle. – Jeff Clayton Oct 22 '19 at 12:30
  • 1
    This is an interesting question. Let us assume that exploit free software was possible and you the user forgot your password, challenge questions, etc. Then would it not be impossible for you to reclaim your account? – emory Oct 22 '19 at 14:26
  • 2
    Security flaws are only bugs **if the design requirements include security** . Otherwise, the software does what it was intended to do, and there are no requirements restricting the software from doing other things as well. – Carl Witthoft Oct 22 '19 at 17:44
  • @Qwertie It's correct that every time companies add new features to their software there is a chance that a mistake was made. But, for instance, Microsoft recommends that you update from Windows XP to the newest version of Windows; in theory, the newest Windows could have more vulnerabilities than the older version, am I right? – Zheer Oct 22 '19 at 20:03
  • @Carl Witthoft Is there a way to reclaim your account if the software was exploit-free and you got locked out of your account, as emory said? Any method? – Zheer Oct 22 '19 at 20:14
  • 1
    I suspect that it's easier to figure out how to prove your software is bug-free than to figure out how to prove your software is vulnerability-free. – user253751 Oct 23 '19 at 09:55
  • 1
    The halting problem and Rice's theorem come to mind, which make it impossible to design an algorithm that checks arbitrary software for being free of exploits. – Polygnome Oct 23 '19 at 10:05
  • 1
    @Zheer Yes, in the beginning, the early adopters of Vista had just that problem. Not just security bugs, but also functionality was not working as expected. But as time went on, Windows XP stopped receiving support and any vulns found will be here to stay. –  Oct 23 '19 at 10:43
  • 1
    @Zheer I wasn't talking about exploits and their outcomes. – Carl Witthoft Oct 23 '19 at 12:16
  • Related: http://complogic.cs.mcgill.ca/beluga/. – Neil Oct 23 '19 at 18:56
  • 1
    @aslum Are you implying the '10 PRINT "Hello World" 20 END' is an example for a program that is bug free? Not so fast. First, there is no specification to test against that defines what is right. But hey, that's really obvious! No, it is not. What should happen when it is run two times and the result is concatenated. Is it one line or two lines? Or maybe it is three lines?! Note the character at the end of a line in UNIX, '\n' is called newline. If the output is 'Hello World\n', the result is three lines, because the start is an implicit new line, and the last character creates an empty line. – Volker Siegel Oct 23 '19 at 22:32
  • Your question reads to me like a conflation of two kinds of inevitability. "Software will always have vulnerabilities in it" does not mean "it is impossible for *any* software to ever be free of vulnerabilities", but rather "software tends to have *and gain* vulnerabilities *at a high enough rate* that over long enough time scales, over enough lines of code, you are practically guaranteed that at any given moment some software has some vulnerability in it". – mtraceur Oct 23 '19 at 23:56
  • How much are you willing to spend on the software? (Space Shuttle software was written in Ada and meticulously pored over by multiple people. Naturally, it was secure, and ear-bleedingly **expensive**.) – RonJohn Oct 24 '19 at 01:58
  • @RonJohn There are almost certainly bugs in it somewhere. They may just be extremely unlikely to be triggered in real-world conditions. The Ariane 5 exploded despite an extremely rigorous software development process. – Artelius Oct 24 '19 at 04:04
  • @Artelius but are the bugs **exploitable**? (That is, after all, the question.) – RonJohn Oct 24 '19 at 07:19
  • @VolkerSiegel No, I'm implying it's exploit-free software. The tongue-in-cheek answer to the question is "Yes, but not usefully so". – aslum Oct 24 '19 at 12:42
  • @RonJohn If there are enough of them, then certainly. In other words as the code base grows, bug count increases, and exploits increase. Exploits are more likely to be exploitable by an insider than an external attacker but that is still a security risk. – Artelius Oct 25 '19 at 08:51

17 Answers

273

Software is too complex

This is by far the most important factor. Even if you just look at something like a web application, the number of work hours put into the codebase is immense. The code works with technologies whose standards are pages upon pages long, were written decades ago, and offer features that most developers have never even heard of.

Combine that with the fact that modern software is built on libraries, which are built on libraries, which abstract away some low-level library based on some OS functionality, which again is just a wrapper for some other OS function written in the 1990s.

The modern tech stack is just too big for one person to fully grok, even if you exclude the OS side of things, which leads to the next point:

Knowledge gets lost over time

SQL injections are now 20 years old. They are still around. How so? One factor to consider is that knowledge inside a company gets lost over time. You may have one or two senior developers who know and care about security and who make sure that their code isn't vulnerable to SQL injection, but those seniors will eventually take on different positions, change companies or retire. New people will take their place, and they may be just as good developers, but if they don't know or care about security, they won't look for the problem either.

People are taught the wrong way

Another point is that security isn't really something that schools care about. I recall the first lesson about using SQL in Java: my teacher used string concatenation to insert parameters into a query. I told him that was insecure, and got yelled at for disturbing the lesson. All the students in that class came away believing that string concatenation is the way to go - after all, that's how the teacher did it, and the teacher would never teach anything wrong, right?
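
To make the contrast concrete, here is a minimal sketch (my own illustration, in Python with the built-in sqlite3 module rather than Java; the table and input are made up). The first query is built the way that lesson demonstrated; the second passes the value separately from the SQL text, so it can never be interpreted as SQL:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

user_input = "nobody' OR '1'='1"

# What the lesson taught: user input concatenated straight into the query.
# The quote inside user_input changes the meaning of the whole statement.
unsafe_query = "SELECT * FROM users WHERE name = '" + user_input + "'"

# Parameterized version: the driver keeps the value out of the SQL text.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()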

All those students would now go into the world of development and happily write SQL code that is easily injectable, just because nobody cares. Why does nobody care? Because

Companies are not interested in "perfect code"

That's a bold statement, but it's true. A company cares about investment and returns. It "invests" the time of its developers (which costs the company a specific amount of money) and expects features in return, which it can sell to customers. Features to sell include:

  • Software can now work with more file formats
  • Software now includes in-app purchases
  • Software looks better
  • Software makes you look better
  • Software works faster
  • Software seamlessly integrates into your workflow

What companies can't sell you is the absence of bugs. "Software is not vulnerable to XSS" is not something you can sell, and thus not something companies want to invest money in. Fixing security issues is much like doing your laundry - nobody pays you to do it, nobody praises you for doing it, and you probably don't feel like doing it anyway, but you still have to.

And one more final point:

You can't test for the absence of bugs

What this means is that you can never be certain whether your code contains any bugs. You can't prove that a piece of software is secure, because you can't know how many bugs are left. Let me demonstrate this:

int Compare(string a, string b)
{
    if (a.Length != b.Length)
    {
        // If the length is not equal, we know the strings will not be equal
        return -1;
    }
    else
    {
        for(int i = 0; i < a.Length; i++)
        {
            if(a[i] != b[i])
            {
                // If one character mismatches, the string is not equal
                return -1;
            }
        }

        // Since no characters mismatched, the strings are equal
        return 0;
    }
}

Does this code look secure to you? You might think so. It returns 0 if strings are equal and -1 if they're not. So what's the problem? The problem is that if a constant secret is used for one part, and attacker-controlled input for the other, then an attacker can measure the time it takes for the function to complete. If the first 3 characters match, it'll take longer than if no characters match.

This means that an attacker can try various inputs and measure how long it will take to complete. The longer it takes, the more consecutive characters are identical. With enough time, an attacker can eventually find out what the secret string is. This is called a side-channel attack.
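
As a rough sketch of how such a measurement could work (Python; the secret, the service and every name here are made up, and a real attack needs far more samples and better statistics to beat timing noise):

import time

SECRET = "hunter2secret"   # stand-in constant secret; the attacker cannot read it

def check_secret(candidate: str) -> bool:
    # Stand-in for a service that internally compares like Compare() above.
    if len(candidate) != len(SECRET):
        return False
    for x, y in zip(candidate, SECRET):
        if x != y:
            return False
    return True

def average_time(candidate: str, trials: int = 5000) -> float:
    # Average over many calls to smooth out timing noise.
    start = time.perf_counter()
    for _ in range(trials):
        check_secret(candidate)
    return (time.perf_counter() - start) / trials

# One round of the attack: try each possible first character (padded to the
# secret's length) and keep whichever candidate takes longest to reject.
pad = "a" * (len(SECRET) - 1)
best_first = max("abcdefghijklmnopqrstuvwxyz0123456789",
                 key=lambda c: average_time(c + pad))
print(best_first)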

Could this bug be fixed? Yes, of course. Any bug can be fixed. But the point of this demonstration is to show that bugs are not necessarily clearly visible, and fixing them requires that you are aware of them, know how to fix them, and have the incentive to do so.
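
For completeness, a minimal sketch of the usual fix (again Python, with a function name of my choosing). The standard library already provides a comparison whose running time does not depend on how many leading characters match:

import hmac

def compare_secret(expected: str, provided: str) -> bool:
    # hmac.compare_digest looks at every byte regardless of where the first
    # mismatch occurs, so the timing no longer reveals matching prefixes.
    # (Differing lengths can still be distinguished, which is usually fine.)
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))

But the broader point stands: you have to know that you need such a function before you will ever reach for it.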

In Summary...

I know this is a long post, so I am not blaming you for skipping right to the end. The quick version is, writing exploit-free code is really really hard, and becomes exponentially harder the more complex your software becomes. Every technology your software uses, be it the web, XML or something else, gives your codebase thousands of additional exploitation vectors. In addition, your employer might not even care about producing exploit-free code - they care about features they can sell. And finally, can you ever really be sure it's exploit free? Or are you just waiting for the next big exploit to hit the public?

Nayuki
  • So it is possible to have exploit-free software in theory, am I right? Just like how some open source software like tor and bitcoin is (or is almost) exploit-free? – Zheer Oct 20 '19 at 21:21
  • 61
    It is possible in theory, but I have **no idea** why you would claim that tor or any particular bitcoin client are exploit-free. Just look at [tor](https://hackerone.com/torproject/hacktivity), for example, and you see plenty of bugs with it. The extremely short short-version is **"No sufficiently complex software is bug-free."** –  Oct 20 '19 at 21:24
  • 30
    @Zheer we sometimes find horrible vulnerabilities in security-critical open source code that was written 10 or more years ago by respected experts doing so carefully (because the code was considered important security-wise) and that has been reviewed many times by many people over these years not noticing that vulnerability. So we have evidence that all these factors are *not* sufficient to have vulnerability-free code. – Peteris Oct 21 '19 at 09:06
  • 4
    To most companies, security is a cost. If they could get away with having no walls and no doors, they would. Obviously they can't, so they put up walls that you can kick in with your bare foot and locks that you can rip out with your bare hands. When they get robbed, they tell their insurance that they had secured the place. It's a matter of perspective and scale, really. – Nelson Oct 21 '19 at 09:21
  • 1
    @Nelson Exactly, which is why I equated it to doing laundry. –  Oct 21 '19 at 09:44
  • 8
    That SQL injection thing is a real deal. I'm doing a degree now and I just saw this... concatenated user input to generate a query. Not even prepared statements. – Nelson Oct 21 '19 at 09:48
  • 3
    @Nelson Yes, it happens. Because teachers, much like companies, are focused on the *functional* part of the code. They want the code to work, not to be secure. –  Oct 21 '19 at 09:49
  • I really dislike your tone on essentially the Windows NT framework and web standards, which is very common in this Google era. The fact is that for some problems the most optimal solution was discovered fairly quickly, in the 90s. Something being dated has no effect on its utility; it could still be the plain and simple solution, and people still aren't getting that simple concept. The same goes for web standards. Like TCP: it solved a concrete, finite problem back in the 80s, of a server/client bidirectional lossless logical connection. But then Google ignores it, and re-invents... – marshal craft Oct 21 '19 at 10:26
  • 1
    the wheel, and wastes everybody's time and makes no effort to work with standards, or to reach any kind of consensus with hardware OEMs, software, etc. Instead it hopes it can throw some 1998 Intel Pentium 4s it dug out of an NYC garbage dump into Windows-driven PCs, and let UEFI browser script do everything (thanks Intel, Microsoft, and everybody else for agreeing on UEFI for Google), so that Google can carve out a pathetic market nest. And again, let's remember that Google would never have been possible if it weren't for Windows, Microsoft and standards. – marshal craft Oct 21 '19 at 10:32
  • 18
    @marshalcraft [Here](https://googleprojectzero.blogspot.com/2019/08/down-rabbit-hole.html) is a concrete story of an old Windows component screwing folks over big time. Furthermore, I never mentioned Windows itself. I just said "OS component". –  Oct 21 '19 at 10:34
  • 3
    @marshalcraft I love your implication that discovering techniques "in the 90's" is "fairly quickly". I would describe techniques that were discovered in the 60's as "fairly quickly". – Martin Bonner supports Monica Oct 21 '19 at 13:17
  • 2
    A correction - Companies do not care about secure code *until it is exploited* - then they start asking why the number of vulnerabilities isn't zero. – Zibbobz Oct 21 '19 at 13:25
  • 14
    @marshalcraft The Google Nexus 7 phones used to get _really_ slow after they'd been running a while. Eventually Google tracked it down to a 27 year old memory leak in the Linux kernel itself. It's not just Windows. All complex software has vulnerabilities. – Mooing Duck Oct 21 '19 at 13:58
  • "Knowledge gets lost over time", I was wondering why can't we just make a software or an AI, that can help us detect this vulnerabilities. – Zheer Oct 21 '19 at 14:31
  • 11
    @Zheer Automatic vulnerability scanners exist, and they get used. But with automatic systems, you get false positives and false negatives - meaning you will get reports for things that are not vulnerabilities, and no reports for things that are. And keep in mind **AI is not a magic bullet** –  Oct 21 '19 at 14:45
  • 4
    @Zheer, for an AI to know if something is a feature or a bug, it would need to understand the design intent. That's not something that's even really on the horizon. Pattern-matching is easy -- sure, we can easily look for variables being substituted into SQL queries or shell invocations -- but corner cases in how intended behaviors combine to result in unintended ones are a very different matter. – Charles Duffy Oct 21 '19 at 15:32
  • 4
    @Zheer The problem of detecting all software bugs likely turns into the [Turing Halting Problem](https://en.wikipedia.org/wiki/Halting_problem). I think we can do a lot better though. The point about teaching is a good one. One of the problems I face is getting programmers to realize that much of what their teacher/trainer/professor showed them is bad practice. There are a lot of easily avoidable bugs that result in vulnerabilities. – JimmyJames Oct 21 '19 at 15:37
  • 4
    If I was in charge of that class, I'd introduce SQL using concatenation, and then in the next lesson, show them SQL injection as a "cool trick", and then show them how to fix it. – user253751 Oct 21 '19 at 15:51
  • 1
    Is your example a bug though? It _is_ a security issue if it's used for something that can be exploited - no sane password-matching algorithm will do string matching like this (and will compare hashes instead) - so yes, it can be an issue, but does that really mean it's a bug? – Baldrickk Oct 21 '19 at 15:59
  • 2
    @Baldrickk Yes, it's a bug. And yes, it's not prod-ready code, but the point was to illustrate code that is *functionally correct*, but still leaks information that it shouldn't. –  Oct 21 '19 at 16:09
  • 2
    I think something this answer missed (or maybe I missed it) is that as bugs are fixed, new bugs could likely be introduced and in some cases more bugs are introduced than are fixed. – Reimus Klinsman Oct 21 '19 at 16:52
  • The given example is interesting and shows its point, but it contains another bug which is not discussed. If string b is shorter than a, the compare will read off the end of b. Depending on the language, this might throw an exception or might read memory that isn't supposed to be accessible. The worst case is if string a is attacker controlled -- they could use the side-channel attacks to examine a potentially unlimited amount of memory beyond string b. – kwan3217 Oct 21 '19 at 19:29
  • Correctness testing is possible in theory, it is just not used in practice, due to its extreme resource need. – peterh Oct 21 '19 at 20:40
  • Some software is sold on its resilience to *specific* bugs and exploits -- firewalls, for example -- but a wise company would never claim that it is resilient to *all* bugs and exploits. That's just a recipe for getting sued. – Phlarx Oct 21 '19 at 20:59
  • Context also matters, i.e you could use the compare function almost everywhere where you don't have secret data or when you can't reliably measure the time needed to execute such function. But then some programmer may use that function assuming it is safe, or perhaps a programmer uses a library which in some function uses that compare. etc. – kingW3 Oct 21 '19 at 21:14
  • The compare function does not necessarily have a bug. The security code should know how long the worst case takes, and simply wait until that time has passed if it fails sooner. That way it doesn't need to ensure that all of its sub-components run in constant time. – CJ Dennis Oct 21 '19 at 21:19
  • @MechMK1 - Companies are interested building the perfect feature that gives them a return on the investment. Companies are focused on protecting their assets and if the feature that is being built compromises their asset, the conversation changes from the perfect feature to an acceptable feature that minimizes risk. – Motivated Oct 22 '19 at 07:18
  • @MechMK1 - Security is about managing risk and its impact. I would be hard pressed to find a CEO that would be willing to accept a risk that results in significant financial losses. He/she may be open to an acceptable level of loss. In my view, the conversation regarding security needs to focus on risk management, risk reduction & protection of revenue. – Motivated Oct 22 '19 at 07:19
  • I don't think complexity really is the problem. Look at your first example: We've **solved** SQL injections over a decade ago. The difference in complexity between using a safe SQL and an unsafe SQL are trivial in almost all frameworks, if the frameworks allow unsafe SQL at all. That SQL injections still exist isn't because anything is complex, but because people are idiots (with regards to security). It's a mindset, not a complexity issue. – Tom Oct 22 '19 at 10:52
  • @Tom You conveniently forget the complexity added by the library that has to handle the SQL formatting. That's a very complex problem which - surprise - means there are bugs in there. – Voo Oct 22 '19 at 11:28
  • @kwan3217 Since people are looking at the snippet with the mentality of "Okay, there's an intentional bug here somewhere, and I'll find it, dammit!", it's an easy oversight to make, but there's no such bug: If the strings are different lengths, the code won't enter the else branch in the first place. – Aleksi Torhamo Oct 22 '19 at 11:35
  • 1
    ...Honestly, `Compare()` doesn't seem all that difficult to make bug-free, really. Initialise a "result" variable to 0, subtract 1 on each failed test, subtract 0 on each passed test, disable optimisations, compare `a` to a third string `z` (which is initialised after the length test, to `b` if result is 0 or `a` if it's -1) instead of to `b` itself, and clamp to [-1..0] & return at the end. Should result in uniform execution time regardless of inputs. ;P Not necessarily a good way to compare, but at least it's not leaky. – Justin Time - Reinstate Monica Oct 22 '19 at 23:02
  • A friend of mine made a career change and took a 9 month course on web development. He's at his first real job for programming and he inherited a php site from an old client of the firm he works for. I was giving him some guidance debugging and did a record scratch freeze-frame when I saw `$query = ... + $_POST[...]` I was like okay so this is a problem - it's a bigger one than we have time for right now but we need to get to this eventually! – corsiKa Oct 22 '19 at 23:21
  • 4
    The above code has another bug that I don't think anyone's spotted: **if the strings are both MAX_INT long, then the compare loop will fail**.  (Depending on the language, it'll loop for ever, stop with an overflow error, stop with an array index out-of-bounds exception or access violation, or give the wrong result.) — Which nicely illustrates how even nasty exploits can be incredibly hard to spot. – gidds Oct 23 '19 at 10:17
  • @gidds That's a very nice catch! –  Oct 23 '19 at 10:37
  • 1
    @JustinTime The problem isn't that comparing is hard, the problem is essentially that comparing is easy, and that's why people don't think anything could go wrong. Sure, you could write a constant-time comparison with little effort, but you would have to **know** that this is something you need. –  Oct 23 '19 at 10:54
  • @kwan3217, it would NOT read off the end because the code starts with a length comparison. Further, a suitably high-level language can’t read off the end of a string unless there’s a compiler bug. Don’t assume that every language is as error-prone as C. – WGroleau Oct 23 '19 at 14:26
  • @WGroleau I didn't specify any language on purpose. –  Oct 23 '19 at 14:28
  • @MechMK1, but kwan assumed a language that handles strings as poorly as C—and ignored your length check. – WGroleau Oct 23 '19 at 14:34
  • @WGroleau And he is absolutely right in assuming that someone may implement exactly that logic in C. –  Oct 23 '19 at 14:37
  • @WGroleau, you are correct, I forgot about that. – kwan3217 Oct 23 '19 at 15:54
  • I think code isn't exploit free, but I do believe that code can be bug free. Absolutely. – The Anathema Oct 23 '19 at 23:39
  • "Sufficiently complex" sounds extremely bad. Software should not be complex. Increase in complexity increases proportionally the probability of bugs and vulnerabilities. – Overmind Oct 24 '19 at 05:11
  • ...Fair enough, @MechMK1. ;P ...In this case, now that I think about it, I'd personally say there should be a `SecureCompare()` counterpart to `Compare()`, so the consumer can choose whether they want/need safety at the expense of speed. – Justin Time - Reinstate Monica Oct 24 '19 at 07:49
  • @Overmind I define complexity as "the amount of functionality a software has to offer". –  Oct 24 '19 at 09:22
  • Well, that is misleading if you define it that way, because higher complexity of the software design does not necessarily mean more functionality. I can prove hands-down that a file manager from the MS-DOS era has more file-management-related functionality than any other known file manager from the last 20 years. – Overmind Oct 24 '19 at 09:28
  • @Overmind Let me ask in a different way: How would you define complexity? –  Oct 24 '19 at 09:29
  • From a pure software perspective it can be reduced to an approximation of code size, which then dictates the resulting software. What bytes of code used to do, we now have an equivalent of kilobytes or more that do the exact same thing with no extension in functionality whatsoever. This is unnecessary complexity. It can generally be seen by comparing how Windows evolved. – Overmind Oct 24 '19 at 12:45
95

The existing answers, at the time of writing this, focused on the difficulties of making bug-free code and why it is not possible.

But imagine if it were possible. How tricky might that be? There's one piece of software out there which has earned the title of "bug-free": a member of the L4 family of microkernels called seL4. We can use it to see just how far the rabbit hole goes.

seL4 is a microkernel. It is unique because, in 2009, it was proven to have no bugs. What is meant by that is that they used an automated proof system to mathematically prove that, if the code is compiled by a standards-compliant compiler, the resulting binary will do precisely what the documentation of the language says it will do. This was later strengthened to make similar assertions about the ARM binary of the microkernel:

The binary code of the ARM version of the seL4 microkernel correctly implements the behaviour described in its abstract specification and nothing more. Furthermore, the specification and the seL4 binary satisfy the classic security properties called integrity and confidentiality.

Awesome! We have a non trivial piece of software that is proven to be correct. What's next?

Well, the seL4 people aren't lying to us. They then immediately point out that this proof has limits, and enumerate some of those limits:

Assembly: the seL4 kernel, like all operating system kernels, contains some assembly code, about 340 lines of ARM assembly in our case. For seL4, this concerns mainly entry to and exit from the kernel, as well as direct hardware accesses. For the proof, we assume this code is correct.
Hardware: we assume the hardware works correctly. In practice, this means the hardware is assumed not to be tampered with, and working according to specification. It also means, it must be run within its operating conditions.
Hardware management: the proof makes only the most minimal assumptions on the underlying hardware. It abstracts from cache consistency, cache colouring and TLB (translation lookaside buffer) management. The proof assumes these functions are implemented correctly in the assembly layer mentioned above and that the hardware works as advertised. The proof also assumes that especially these three hardware management functions do not have any effect on the behaviour of the kernel. This is true if they are used correctly.
Boot code: the proof currently is about the operation of the kernel after it has been loaded correctly into memory and brought into a consistent, minimal initial state. This leaves out about 1,200 lines of the code base that a kernel programmer would usually consider to be part of the kernel.
Virtual memory: under the standard of 'normal' formal verification projects, virtual memory does not need to be considered an assumption of this proof. However, the degree of assurance is lower than for other parts of our proof where we reason from first principle. In more detail, virtual memory is the hardware mechanism that the kernel uses to protect itself from user programs and user programs from each other. This part is fully verified. However, virtual memory introduces a complication, because it can affect how the kernel itself accesses memory. Our execution model assumes a certain standard behaviour of memory while the kernel executes, and we justify this assumption by proving the necessary conditions on kernel behaviour. The thing is: you have to trust us that we got all necessary conditions and that we got them right. Our machine-checked proof doesn't force us to be complete at this point. In short, in this part of the proof, unlike the other parts, there is potential for human error.
...

The list continues. All of these caveats have to be carefully accounted for when claiming proof of correctness.

Now we have to give the seL4 team credit. Such a proof is an incredible confidence building statement. But it shows where the rabbit hole goes when you start to approach the idea of "bug free." You never really get "bug free." You just start having to seriously consider larger classes of bugs.

Eventually you will run into the most interesting and human issue of all: are you using the right software for the job? seL4 offers several great guarantees. Are they the ones you actually needed? MechMK1's answer points out a timing attack on some code. seL4's proof explicitly does not include defense against those. If you are worried about such timing attacks, seL4 does not guarantee anything about them. You used the wrong tool.

And, if you look at the history of exploits, it's full of teams that used the wrong tool and got burned for it.

†. In response to the comments: the answers here do speak to exploit-free code. However, I would argue that a proof that code is bug-free is necessary for a proof that it is exploit-free.

forest
Cort Ammon
  • 57
    It's also worth noting that all this proof does is ensure that the behavior of the resulting binary 100% matches the abstract specification. Thus, if the abstract specification has a bug - some unintended consequence of the interaction of specified features enables some interesting attack - then the binary will also have it. And many security flaws really are holes in the specification or some protocol, not in the implementation. – Peteris Oct 21 '19 at 09:11
  • 6
    See also [CompCert](http://compcert.inria.fr/) a C compiler proved to follow the C standard (without this you cannot really guarantee any correctness of C programs...) – Giacomo Alzetta Oct 21 '19 at 12:18
  • 5
    Hardware-based security failures are not nearly as impossible as we'd like. Well-known, current examples include row hammer, meltdown, and spectre. Further, it is normally impossible to prove the absence of side-channel attacks. Side-channel attacks are reliant on implementation details; there is no guarantee that a proof covers every possible category of side-channel (admittedly, such a guarantee might be theoretically, but not practically, possible in the case of quantum computing). – Brian Oct 21 '19 at 13:01
  • 1
    Please note, the OP does not ask about "bug-free"... – CGCampbell Oct 21 '19 at 13:12
  • @Peteris Yep, Eventually you will run into the most interesting and human issue of all: are you using the right software for the job? – Cort Ammon Oct 21 '19 at 15:16
  • 11
    @CGCampbell I'd argue that "bug free" is the low bar to pass. Only once you are bug free can we even talk about exploit free meaningfully. – Cort Ammon Oct 21 '19 at 15:21
  • 2
    +1 for "[History is] full of teams that used the wrong tool and got burned for it." – J. Chris Compton Oct 21 '19 at 15:51
  • 5
    And in many of those cases, the “wrong tool” was C. Even one of the inventors of C wrote something like “If your code is more than X SLOC, you should be looking at Ada.” X was either 1500 or 15K—I no longer have that book to check. Today, there are several languages more robust than C, yet people still cling to the dinosaur. In the book “C traps and pitfalls” almost two thirds of the errors described are impossible in better languages. – WGroleau Oct 21 '19 at 16:35
  • It is also worth noting that the mentioned proof only shows that the code works *according to the specs*. One has to take into account that the specs themselves might introduce a vulnerability. – Hagen von Eitzen Oct 21 '19 at 18:39
  • 2
    @HagenvonEitzen However, from the spec you can obtain an operational semantics and use temporal logics to prove whether certain properties hold. These properties can be things like "no SQL injection is present", for example, or "the value entered here by user A can never be seen by user B before user A grants them permission". So you can definitely prove that specific properties hold, and we generally have a good idea of which properties we want to achieve, so you can definitely prove that big classes of attack cannot work. – Bakuriu Oct 21 '19 at 19:26
  • For any program `P` that does not exercise at least one translation limit given in the Standard, any action `X`, and any conforming implementation `I`, the following "implementation" would also be conforming: "if input doesn't match `P`, process it using `I`, and otherwise bomb the stack in a way that causes `X` to happen". Any "proof" of a C program's correctness will have to rely upon behavioral guarantees beyond those given in the Standard. – supercat Oct 21 '19 at 19:53
  • I noticed that "if the code is compiled by a standards-complient compiler" would one of the caveats be that there does not exist a standards-compliant compiler and that the create one would involve a process similar to proving seL4 bug free and itself require a standards-compliant compiler? – emory Oct 22 '19 at 12:12
  • 3
    @emory That issue is best captured by [this famous issue](https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html) with CC and the login program. The concerns here are actually why the seL4 team then went an additional step to prove the validity of an ARM binary of seL4 using any "reasonable" compiler. I don't know how they did it, but I presume it involved writing code which was always easy to prove whether it halted or looped forever (a subset of a Turing Complete language), and doing an analysis from there. – Cort Ammon Oct 22 '19 at 12:25
  • @CortAmmon So their proof is robust to a "reasonable" but not necessarily bug-free compiler? That is highly impressive. – emory Oct 22 '19 at 13:41
  • @emory Yeah. It blew me away. The last time I read about it, they still had a compiler dependency. Now obviously this is a 2 step process. First you generate the binary with the compiler, then you test said binary for spec conformance. If a compiler bug manifests during the code generation, they simply fail the binary during testing. – Cort Ammon Oct 22 '19 at 18:05
  • I want to give this answer a 200 Rep bounty, but I have to wait 24h –  Oct 23 '19 at 12:08
  • @Peteris For seL4, it also proves that the "abstract specification" has no unintended consequences (short of certain out-of-scope issues like timing attacks, which are being worked on), which is due to the extreme simplicity of the kernel's API. It has a tiny handful of syscalls, half of which are just asynchronous implementations of the other half, which control IPC and allow processes to communicate with userspace drivers. – forest Jan 03 '21 at 03:47
24

You can have high-quality code, but it becomes massively more expensive to develop. The Space Shuttle software was developed with great care and rigorous testing, resulting in very reliable software - but much more expensive than a PHP script.

Some more day-to-day things are also very well coded. For example, the Linux TCP/IP stack is pretty solid and has had few security problems (although, unfortunately, one recently). Other software at high risk of attack includes OpenSSH, Remote Desktop and VPN endpoints. Their developers are typically aware that their software often provides a "security boundary", especially against pre-authentication attacks, and in general they do better and have fewer security problems.

Unfortunately, some key software is not so well developed. A notable example is OpenSSL, which is very widely used yet has messy internals, where it's easy to introduce security flaws like Heartbleed. Steps have been taken to address this, e.g. LibreSSL.

A similar effect happens in CMS software. For example, WordPress core is generally well engineered and has few issues. But plugins are much more variable, and outdated plugins are often how such sites get hacked.

Web browsers are a front line in this. Billions of desktop users rely on their web browser to be secure and keep malware off their systems. But browsers also need to be fast, support all the latest features, and still handle millions of legacy sites. So while we all really want web browsers to be trustworthy security boundaries, they are currently not.

When it comes to bespoke software - which is often web applications - the developers working on it are typically less experienced and security-aware than core infrastructure developers, and commercial timescales prevent them from taking a very detailed and careful approach. But this can be helped with architectures that contain the security-critical code in a small area, which is carefully coded and tested. The non-security-critical code can be developed more quickly.

All development can be helped with security tools and testing, including static analyzers, fuzzers and pen tests. Some can be embedded in an automated CI pipeline, and more mature security departments do this already.
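
As a small illustration of what "embedded in an automated CI pipeline" can look like, here is a property-based test (a close cousin of fuzzing) using the Hypothesis library for Python; parse_age is a made-up stand-in for real input-handling code:

from hypothesis import given, strategies as st

def parse_age(text: str) -> int:
    # Hypothetical input-handling code under test.
    value = int(text)
    if value < 0 or value > 150:
        raise ValueError("age out of range")
    return value

@given(st.text())
def test_parse_age_is_well_behaved(s):
    # Property: for arbitrary input, parse_age either returns an in-range
    # value or raises ValueError - it never fails in any other way.
    try:
        assert 0 <= parse_age(s) <= 150
    except ValueError:
        pass

Such a test runs on every commit; it cannot prove the absence of bugs, but it cheaply catches whole classes of them before they ship.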

So we've got a long way to go, but there is definitely hope that in the future there will be far fewer security bugs, and many opportunities for innovative tech that gets us there.

paj28
  • 10
    You mentioned the shuttle, and that reminded me: The Apollo Mission Computer had a divide by zero bug that *almost* prevented Apollo 11 from going to the moon. (when the fault occurred, it took out a protective algorithm that prevented the gyros from tumbling.) – Cort Ammon Oct 21 '19 at 15:24
  • This is an interesting one. Of course, there is always a trade off between robustness and time to deliver, and if we are designing a simple web form with low risks incurred from usage then we aren't going to spend a fortune on testing and quality assurance. On the other hand, in my experience, most projects are non trivial and nearly all of them would benefit in long term costs if more testing and quality assurance was given. The problem I see often is that many software projects are carried out in situations such as banks/telcos, where safety-critical culture is lacking. – Sentinel Oct 23 '19 at 08:35
  • To add some (ironic) humor: The space shuttle programs (developed in the late 1960s and early 1970s) may have been bug-free, but the Space 'Program' itself was full of bugs! (The poor 'Challenger' crew knows this... R.I.P.) – Mr. de Silva Oct 23 '19 at 15:20
11

Yes...

As others have pointed out, it's possible to prove your code and, by such means, demonstrate that your code will work exactly as intended. Proving properties about timing and non-deterministic behaviour (such as network interactions) is a matter of difficulty, not impossibility. The patches to Meltdown and Spectre show that even side-channel timing attacks can be accounted for and addressed.

The primary approach to building code like this is to treat code as mathematics. If you cannot prove your code correct, do not treat it as bug-free. If you can, then you have ... only a shot at bug-free.
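
To give a flavour of what treating code as mathematics looks like, here is a toy sketch in Lean (entirely my own illustration, orders of magnitude simpler than any real verification effort). The function and a statement about it live side by side, and the proof is checked by the machine; if the proof does not hold, the file does not compile:

-- Toy example: a function together with a machine-checked property.
def addOne (n : Nat) : Nat := n + 1

-- Specification: the result is always strictly greater than the input.
theorem addOne_gt (n : Nat) : n < addOne n :=
  Nat.lt_succ_self n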

... but ...

Even if you can prove that your code is pristine, cannot release data except as intended, cannot be put into an erroneous or aberrant state, etc., remember that code on its own is worthless. If a developer writes code that has such a proof, but runs that code on hardware that itself contains vulnerabilities, the security of the software becomes moot.

Consider a simple function that retrieves some encrypted data from memory, stores it in a CPU register, and transforms that register in place to decrypt, process, and re-encrypt the data. Note that at some point the unencrypted data is in a register. If the CPU's available opcodes make it possible for a program running in parallel with your proven code to observe that register without it being clobbered, then there is a hardware-based attack.

What this means, ultimately, is that to have such exploit-free software, you would first need to prove that you have exploit-free hardware. As Meltdown and Spectre (among many others) have demonstrated, commonly available hardware just doesn't pass that mark.

Even military-spec and space-spec hardware fails to meet this bar. The LEON line of processors, which sees use in military and space deployments, is only hardened against Single Event Upsets (SEUs) and Single Event Transients (SETs). This is great, but it means there's always the possibility of an attacker placing the system in an environment with enough radiation to induce enough upsets and transients to put the hardware into an aberrant state.

... and more buts ...

So proving the software and hardware is not enough. We must also consider environmental effects when proving our hardware. If we expose a LEON4 to enough radiation to either overwhelm the casing OR cause enough induced radiation in the casing to overwhelm the processor, we can still cause aberration. At this point, the total system (software, hardware, environment) would be insanely complicated to define fully and properly enough to attempt such a proof.

... so no, not really...

Assume that we have devised an RDBMS for which we've proven the code, proven the hardware, and proven the environment. At some point, we finally hit the weak point in any security chain:

Idio... er, Users.

Our glorious database and our illustrious PFY make for an insecure system. The PFY -- let's be more charitable and bestow upon them the title 'JrOp'... The JrOp accesses the database and is given only that data the JrOp needs to know and nothing more, nothing less. In a moment of brilliance only JrOps can muster, our JrOp leans over to a coworker and mutters, "Did you see what User12358W just uploaded? Look at this!"

So much for our security...

... one last hope (and defeating it with absurdity) ...

Let us say, however, we imagine the future hypothetical where we've finally figured out human consciousness. Humanity has finally achieved a scientific and technological accounting of all human mental functioning. Let's further say this allows us to proof our system against even our users -- the appropriate feedback systems are built into the system to ensure our JrOp doesn't even THINK of revealing that which was revealed to the JrOp. We can leave the question of meta-ethics and manipulation to the philosophers -- speaking of philosophers...

Note that at every single step, we've utilized proofs.

"Ah-hah," exclaims the Pyrrhonic skeptic with glee. "You've assumed that some formal system, such as ZF/ZFC, Peano, non-naive Set theory, classical propositional logic, is sound. Why?"

What answer can be given? Between Gödel and Tarski, we cannot even formally define truth (see Gödel's incompleteness theorems and Tarski's undefinability theorem), so even the assertion, "well, we pick it because it seems good to use a system in alignment with reality," is at its core just an unfounded assumption -- which means any proof that our system is exploit-free ultimately rests on an assumption itself.

... so no, it isn't.

While it may be possible to write bug-free code, by writing it as mathematical proofs, and thus technically attaining the top-level goal of 'exploit-free code', this requires looking at code in a vacuum. There is some value in this -- it's a worthwhile goal ("But that assumes wor--" "Most people do, deal with it Pyrrho"). However, never allow yourself the comfort of thinking you've ever succeeded at that goal -- and if you do, have the humility to title your code "HMS Titanic".

LRWerewolf
  • I was wondering, what does "writing it as mathematical proofs" mean? – Zheer Oct 21 '19 at 20:59
  • 1
    It could mean something as literal as using something like Prolog or more conceptually as 'writing the proofs then writing the code according to the proofs and the chosen language's specification.' – LRWerewolf Oct 21 '19 at 21:06
  • 1
    We might have to reduce our requirements a bit, down from "make code that cannot be exploited even in principle" to just "code that is difficult enough to exploit that no conceivable human would be smart enough to figure out how to do so". Then we just have to worry about the AIs... – Jeremy Friesner Oct 22 '19 at 03:34
  • 2
    This is the correct answer: yes you can write perfect software with enough effort (read: $$$) but it's irrelevant because that does not guarantee security. +1. – Jared Smith Oct 22 '19 at 19:32
  • 1
    @Zheer it means, for example, writing in [Coq](https://en.wikipedia.org/wiki/Coq), a [pure functional language](https://en.wikipedia.org/wiki/List_of_programming_languages_by_type#Pure) which allows to prove mathematical assertions, _declaring_ lemmas and _implementing_ proofs. – CPHPython Oct 23 '19 at 14:11
  • @JeremyFriesner Given that some specialized machine learning implementations can do some pretty incredible things and bots already outnumber humans, I think neglecting AI is accepting defeat. – called2voyage Oct 23 '19 at 20:12
  • @called2voyage indeed :/ – Jeremy Friesner Oct 23 '19 at 21:08
8

I want to come at this sideways relative to the previous answers. I don't believe that bug-free software is theoretically impossible or that software is too complex. We have other complex systems with much lower error rates.

There are two reasons why exploit-free code will not happen within the foreseeable future:

Performance and other Optimizations

Many issues, including exploitable ones, are not cases where we don't know how to write the code correctly; it is just that correct code would be slower. Or use more memory. Or be more expensive to write. Many shortcuts are taken in software to squeeze out more speed or some other gain, and some of these shortcuts are the source of exploits.

Fundamental Problems

The systems we use to create software today have fundamental flaws that lead to exploits, but those flaws are not unavoidable in principle. Our compilers aren't proven to be safe. The library system - especially the Node ecosystem (now copied by composer, cargo and others) of dynamically integrating hundreds or thousands of small packages through automated dependencies - is a huge security nightmare. I'd have to use 72pt fonts to show just how huge. Almost all our languages contain fundamentally insecure constructions (the thinking behind Rust illustrates a few of them). Our operating systems are built on even older systems with even more flaws.

In short: At this time, the best we can do is basically "try not to mess up" and that just isn't enough for a complex system.

Conclusion

So in summary, with the software world as it is today: no. Exploit-free code is impossible with these tools, mindsets and dev environments, unless we are talking about trivial or extremely self-contained code (such as the seL4 kernel mentioned already).

Theoretically, however, nothing stops us from building software from small modules, each of which can be formally proven to be correct. Nothing stops us from modelling the relations, interactions and interfaces of those modules and formally proving their correctness.

In fact, we could do that today, but without fundamental advances in software design, that code would crawl, not run.

schroeder
Tom
  • 4
    You should give some examples of what you mean by those performance-related concerns. This seems dubious to me. Sure – if you write some given algorithm in Coq for the purpose of verifying it, that'll generally run stupidly slow. But most of that is due to the runtime handling of the dependent types. Often enough, the actually needed _program_ parts can be extracted as OCaml or Haskell, and that can give quite decent performance. Perhaps not quite as fast as a hand-tuned C or Rust implementation, but also not “crawl, not run”. – leftaroundabout Oct 22 '19 at 11:33
  • 2
    @leftaroundabout I assumed that was referring to the perf optimizations enabled by e.g. undefined behavior in C: compiler generates faster code by assuming integer operations don't overflow, no out-of-bounds array accesses, no null pointer dereferences, no uninitialized memory reads, etc. – Jared Smith Oct 22 '19 at 15:37
  • @JaredSmith maybe that's what Tom meant, but then the point is definitely not valid except in the sense “don't use C for security-critical applications”. E.g. Java avoids most of these problems without major performance hits, and certainly Rust and OCaml do. – leftaroundabout Oct 22 '19 at 15:41
  • @leftaroundabout no argument from me on that score. – Jared Smith Oct 22 '19 at 15:47
  • 2
    I don't mean one specific type. The Intel CPU issues were also caused by overzealous performance optimizations. Lots of input validation issues are caused by people thinking "nah, this won't happen, I don't need to check for it", then yes, compiler optimisations, then shortcuts in code, then assumptions about data, the list goes on. – Tom Oct 23 '19 at 05:26
  • I'm really glad you mentioned the libraries element. Libraries save a lot of time but you're basically relying on alot of that code (some which mightn't have been touched in years) from being free of security defects. – ChrisFNZ Oct 24 '19 at 03:39
7

Is it possible? Yes. But not for the software you're looking for.

"Bug/Exploit Free" basically means that a program will have a sensible, safe response to any input. This can include ignoring that input.

The only software for which this can be achieved is small, trivial programs barely beyond a Hello World. There are no exploits in this:

print("Hello World")

Because this code ignores all inputs, and outputs only a hardcoded string.

However, this code also accomplishes exactly 0 useful work for you.

As soon as you want to, for example, connect to the internet and download something, you'll be downloading data that you have no control over and that might be malicious. Of course, there are a lot of restrictions our downloading software puts on that data to defend you, but it is impossible to defend against a threat angle that you're not aware of.
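
As a rough sketch of the kind of restrictions meant here (Python with the requests library; the limits are arbitrary examples): a downloader can refuse to hang forever and refuse to read unbounded data. That reduces the attack surface, but it cannot anticipate threats nobody has thought of yet.

import requests

MAX_BYTES = 1_000_000  # arbitrary example limit

def fetch(url: str) -> bytes:
    resp = requests.get(url, timeout=10, stream=True)
    resp.raise_for_status()
    data = bytearray()
    for chunk in resp.iter_content(chunk_size=8192):
        data.extend(chunk)
        if len(data) > MAX_BYTES:
            raise ValueError("response larger than allowed")
    return bytes(data)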

Gloweye
  • 2
    If I compile this code and invoke the executable with output redirection and your hard drive is completely full then bad things might happen. – gnasher729 Oct 21 '19 at 21:26
  • 4
    @gnasher729 I don't think that you could blame that problem on the 'hello world' program -- any bad things that happened (i.e. worse than an 'out of disk space' error message) in that case would be due to a bug in the OS or the OS's filesystem. – Jeremy Friesner Oct 22 '19 at 03:30
  • Then it would throw an exception and exit "gracefully" with an appropriate error code. – Gloweye Oct 22 '19 at 06:36
  • 5
    Note that it is possible to exploit the code above, but not where you think. Way back when Unix was being invented, Ken Thompson, as a joke, introduced a Trojan horse into the one and only C compiler in existence that would crash Unix if any program intercepts a specific input (see his seminal essay https://www.win.tue.nl/~aeb/linux/hh/thompson/trust.html). While it's true that this "cheats" by exploiting the compiler, all exploits exploit things outside a program's source code: rowhammer exploits how RAM is manufactured, CRIME exploits a statistical anomaly of compressed data, etc. – slebetman Oct 22 '19 at 07:07
  • .. In summary, even Hello World is not guaranteed to be bug free – slebetman Oct 22 '19 at 07:08
  • 1
    That's not a bug in the hello world program, though. That's a bug in the compiler, the unrelated compression algorithm, or the hardware (RAM). – Gloweye Oct 22 '19 at 07:11
  • Is it possible to exploit an unexploitable program just by changing hardware components ? – Zheer Oct 22 '19 at 13:38
  • 2
    @Zheer IMO, you'd be exploiting the hardware, not the program. – Gloweye Oct 22 '19 at 13:39
  • 3
    @Zheer per gloweye's comment, the difference is semantic not practical. No matter how often we pretend otherwise for the sake of our sanity, no program is independent of the hardware it runs on. Or the software it runs on (e.g. the operating system or virtual machine) for that matter. In other words, yes. – Jared Smith Oct 22 '19 at 15:32
  • Then how does this technique (arstechnica.com/tech-policy/2014/05/…) work? – Zheer Oct 22 '19 at 17:02
  • @Zheer could you provide a full link? Hard to say without seeing what it is. – Jared Smith Oct 22 '19 at 19:27
  • My bad, (https://www.google.com/amp/s/arstechnica.com/tech-policy/2014/05/photos-of-an-nsa-upgrade-factory-show-cisco-router-getting-implant/%3famp=1) – Zheer Oct 22 '19 at 19:48
  • 1
    That looks to be deliberately introducing a hardware "weakness". There is no unsafe behaviour that the above Hello World program would cause that wouldn't be there with the same amount of idle time on your computer. – Gloweye Oct 22 '19 at 20:37
6

In security, we like to believe that nothing can be secured, only hardened.

This is because no matter how much you try to update your software and applications, zero-days exist, especially if your software is worth hacking into. This means that although your team of security engineers might be able to patch the issue, the software can be exploited before the vulnerability goes public.

And the more applications you create in your software, the higher the chance of zero-days.

s h a a n
  • 335
  • 1
  • 3
  • 14
6

Yes, if the security of the system is mathematically proven. It is not a new idea; the Trusted Computer System Evaluation Criteria, in short the "Orange Book", originates from 1985.


In it, the highest level of security, named A1, is verified design. It means that it is mathematically proven that there is no way to break the system.

In practice, proving the mathematical correctness (including security) of any software is very hard and very painstaking work. As far as I know, no complete computer system has such a proof, but some systems (at least the VM/ESA kernel) were partially proven.

Note also that IT security inherently deals with possible attacks that we don't know about or can't predict where they are coming from. For example, such a mathematical model would be fine and working for a system which - directly or indirectly - assumes that there is no way to eavesdrop on its internal TCP communications. It would thus be eligible for the A1 certificate, while in practice such a system could be easily broken through a compromised router.

In general, automated (or partially automated) correctness testing of programs, including their security testing, has been an active field of computer science for decades. It has resulted in many well-cited publications and PhD degrees, but it is still as far away from wide practical use as it was 25 years ago.

peterh
  • 2,938
  • 6
  • 25
  • 31
  • 1
    Well, there is a [C compiler](http://compcert.inria.fr/compcert-C.html) which "is formally verified, using machine-assisted mathematical proofs, to be exempt from miscompilation issues". I suppose that this does not imply *bug-free*, but it is a big step. Of course a C compiler is probably small by today's standard (for example it is probably much smaller than the software in my color TV remote or my watch) but it is not trivial and does something genuinely useful. – Peter - Reinstate Monica Oct 23 '19 at 14:41
  • @PeterA.Schneider Thanks, nice to know! However, it is only a proof that the compiler is correct, i.e. that it does not generate a bad binary from a good source. It does not imply that all the software created with it is correct - for that, the source itself would also have to be correct. – peterh Oct 23 '19 at 16:55
  • Indeed :-). The software you compile with it may be buggy as hell. But I was referring to the compiler itself as a non-trivial proven-correct program. – Peter - Reinstate Monica Oct 23 '19 at 17:12
  • @PeterA.Schneider I imagine a language, or more precisely an extension to any language, which we could call *proof*. Proof would be a language describing *why the code is correct*. The proof could be created only by humans, but *both the validity of the proof* (i.e. that the proof is correct) and that it relates to the underlying code could be verified algorithmically. There could be interesting side effects. For example, in practical software development, the complexity / resources of the code + proof would need to be minimized, and not only the code alone. – peterh Oct 25 '19 at 08:30
  • @PeterA.Schneider Such a proof + code would be essentially *a language in which it is impossible to write buggy code*. There would be no need for automated testing any more, actually nothing would be needed, and the only source of problems would be bad modelling or a bad task statement. – peterh Oct 25 '19 at 08:33
  • The bad modelling -- or more general, wrong specification -- is not to be underestimated. In a sense program code *is the only complete specification* of a program (in the sense that the universe is the only complete model of the universe). The designers of Algol68 had hopes that such a universal and abstract language could serve as a *specification tool* for algorithms, but we now know that making programming errors is still easy with abstract languages. If you completely specify a program by other means, e.g. with CASE tools, producing the actual code can be largely automated. – Peter - Reinstate Monica Oct 25 '19 at 08:58
  • It is true though that the type or class of errors changes with the move from assembly to abstract languages to CASE. If the code is generated and compiled with proven correct tools we don't have array boundary violations any longer; instead what's left are misunderstood requirements as well as underspecifications like missed use/corner cases and unexpected input or other conditions. – Peter - Reinstate Monica Oct 25 '19 at 09:03
6

Yes

I'm surprised nobody has mentioned formal verification by its name (though Cort's answer does mention the L4 microkernel, which has been formally verified).

I'm not personally very familiar with formal verification, so I'll point to some relevant bits from the Wikipedia page on the topic; please refer to it for more information.

In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.[1]

Formal verification of software programs involves proving that a program satisfies a formal specification of its behavior. [...]

The growth in complexity of designs increases the importance of formal verification techniques in the hardware industry.[6][7] At present, formal verification is used by most or all leading hardware companies,[8] but its use in the software industry is still languishing.[citation needed] This could be attributed to the greater need in the hardware industry, where errors have greater commercial significance.[citation needed] [...]

As of 2011, several operating systems have been formally verified: NICTA's Secure Embedded L4 microkernel, sold commercially as seL4 by OK Labs;[10] OSEK/VDX based real-time operating system ORIENTAIS by East China Normal University;[citation needed] Green Hills Software's Integrity operating system;[citation needed] and SYSGO's PikeOS.[11][12]

As of 2016, Yale and Columbia professors Zhong Shao and Ronghui Gu developed a formal verification protocol for blockchain called CertiKOS.[13] The program is the first example of formal verification in the blockchain world, and an example of formal verification being used explicitly as a security program.[14]

As of 2017, formal verification has been applied to the design of large computer networks[15] through a mathematical model of the network,[16] and as part of a new network technology category, intent-based networking.[17] Network software vendors that offer formal verification solutions include Cisco[18], Forward Networks[19][20] and Veriflow Systems.[21]

The CompCert C compiler is a formally verified C compiler implementing the majority of ISO C.
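
To give a small flavour of machine-checked reasoning, here is a toy sketch of mine (not from the answer or the quoted Wikipedia text; real verified systems like seL4 or CompCert use interactive proof assistants, not a one-liner like this) using the z3-solver Python bindings; the property and the bounds are invented for the example:

    # Minimal sketch, assuming `pip install z3-solver`. We prove that under a
    # stated precondition, a toy array index can never go out of bounds.
    from z3 import Int, Solver, And, Not, unsat

    i, n = Int("i"), Int("n")

    precondition = And(0 <= i, i < n, n <= 100)  # what the caller promises
    safe_access = And(0 <= i, i < n)             # what bounds-safety requires

    s = Solver()
    # Look for a counterexample: the precondition holds, yet the access is unsafe.
    s.add(precondition, Not(safe_access))

    if s.check() == unsat:
        print("proved: no out-of-bounds access under the precondition")
    else:
        print("counterexample:", s.model())

The point is not the (trivial) property itself, but that the claim is checked mechanically rather than by code review; scaling this up to a whole kernel is what makes projects like seL4 so expensive.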

Dan Dascalescu
  • 1,945
  • 2
  • 15
  • 23
  • 1
    Correctness is not bug-free. In my 20 years of experience, 90% of bugs in the wild come from a buggy specification, not buggy code. So a 100% correct program can still be buggy – slebetman Oct 22 '19 at 07:11
  • Keep in mind, that things can be verified in implementation, but still have unforeseen vulnerabilities. Quantum cryptography is known mathematically (by all known physics) to be unbreakable. However, attacks on implementations have been found. The hardware is implemented exactly as specified, the mathematics behind the specification is verified... yet, attacks were still possible: https://www.researchgate.net/publication/46581503_Hacking_commercial_quantum_cryptography_systems_by_tailored_brightillumination – Jarrod Christman Oct 22 '19 at 13:35
  • I mentioned "verified design", from the orange book. But I admit that formal verification looks more accurate. – peterh Oct 23 '19 at 07:27
  • 1
    +1 and I'd add that you might also use formal [proof assistant](https://en.wikipedia.org/wiki/Proof_assistant) tools. – CPHPython Oct 23 '19 at 15:01
  • While exploit free software is possible, it can NEVER be quantified/verified. – Mr. de Silva Oct 23 '19 at 15:08
  • @slebetman and from my experience buggy specifications are harder than usual to prove. – user2284570 Oct 23 '19 at 19:38
  • Formal Verification was a big push in the 80's but the closest anyone ever achieved, on a program of complexity, was rigorous validation of function. The linked Wikipedia article says, *"Formal verification of software programs involves proving that a program satisfies a formal specification of its behavior."* Classic testing verifies that a program does what it's supposed to do. **Testing rarely extends to trying to make it do something it is not intended to do.** Even if such tests are performed, how much is enough? – user10216038 Jan 03 '21 at 04:44
5

It's possible, but not economical without regulations that currently do not exist.

The answer about the proven-to-be-correct kernel seL4 is very good in giving an example of bug-free code, in the sense that it will perform exactly as described - and if the description is wrong, well, that might be called an exploit. But bugs in the description/specification are comparatively rare, and it's debatable whether they really even are bugs.

The limits that are also cited in the other answer all boil down to "we limited ourselves to the kernel, because we had limited resources". All of them could be solved by developing the hardware and surrounding software and client software in the same manner as the seL4 kernel.

If everyone did this, then writing, say, a provably correct website would become trivial, because all the tools you would be using would be provably correct and you would only be writing a little glue code. So the amount of code that would need to be proven correct for a small project would be small. Right now, the amount of code that needs to be proven correct if you want to write a small provably correct program is huge, because you would basically need to start over from scratch without any of the tools that have been developed since the start of computing.

Some people today call for oppressive tools like surveillance, censorship, trade blockades and counterattacks in response to digitization. If they instead switched to incentivizing secure software, for example by requiring a certain amount of liability (also called responsibility) from software and hardware manufacturers, then we would soon only have secure software. It would take much less time to rebuild our software ecosystem in a totally secure manner than it took to create it in the first place.

Nobody
  • 686
  • 4
  • 10
  • We'd also have to throw away all existing software (except for seL4 and CompCert). Yes, really, all of it. – user253751 Oct 21 '19 at 15:59
  • 1
    @immibis That might or might not be what would be economic if there were penalties to insecure software. I do imply in my very last sentence of the answer that it might come to that. But it's unlikely. It would be stupid to introduce sudden and large penalties, it would need to be gradual transition. So people might also gradually harden their software. Provable correctness isn't the only way to write secure software, although it would probably make insurance of liability much easier. – Nobody Oct 21 '19 at 16:16
  • Regardless of what is most economically viable, we'd have to throw away all existing software *right now*, because it's not secure. Until the new software is written, it's back to pencils and paper. – user253751 Oct 21 '19 at 16:19
  • @immibis As I reiterated in my comment, obviously this is not an instant switch. That's why maybe we would want to start now as opposed to when we're even deeper in the mess and someone starts a "cyber war". – Nobody Oct 21 '19 at 17:15
  • Bugs, vagueness, or other problems in the spec are frighteningly common, at least in my experience. Especially when some feature is added at a future point in time, the interaction between New Feature and Old Features is often not fully explored in the spec. It's fine to say "we implemented it as written", but if it's a real problem you're still going to be there in the middle of the night/weekend/your vacation fixing it. – user3067860 Oct 21 '19 at 21:26
  • @user3067860 There are several answers to that. If software is formally validated, then the spec must be complete. If security is important, that also separately means that specs will be more complete and better thought through. If most software was formally validated and/or generally made to be secure, then I think that implies that easy programming tasks are done in "easy" restricted environments that make exploitation of bugs impossible (e.g. writing a program in a turing-complete language that has access to the whole system is total overkill for a web page). – Nobody Oct 22 '19 at 16:09
3

Currently, it's very costly to write bug-free code once it gets complicated enough. It's even more costly to verify that it is actually bug-free, or that the verifier program is bug-free, and so on ad infinitum. I don't think anyone has a solution yet for the scale of most commercial software.

But I'd argue that some programs, even if they have bugs, would at least be free of vulnerabilities. For example, a program that runs in a perfect sandbox such as a browser and doesn't attempt to interact with anything except the user, or at least doesn't make any documented promises that other programs are supposed to trust. If something goes wrong, it's a vulnerability of the sandbox, not of the program itself.

We have ways to design systems that accept a result only if multiple differently designed versions of a program agree. And we have ways to make parts of a program stateless. We could recreate the promises by using these methods. As a sandboxing program would have limited complexity, I'd argue that, in the distant future, there is some hope of eventually making it possible to write exploit-free code, as long as all the algorithms used are provable. I don't know if it will ever become economically viable, though.
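
As a rough illustration of the "accept a result only if multiple differently designed versions agree" idea (often called N-version programming), here is a minimal sketch of mine; the three toy implementations and the majority rule are made-up examples:

    # Hypothetical example of majority voting over independently written versions.
    from collections import Counter

    def version_a(x: int) -> int:
        return abs(x)

    def version_b(x: int) -> int:
        return x if x >= 0 else -x

    def version_c(x: int) -> float:
        return (x * x) ** 0.5  # deliberately different approach

    def voted_result(x: int) -> int:
        results = [version_a(x), version_b(x), int(version_c(x))]
        value, votes = Counter(results).most_common(1)[0]
        if votes < 2:  # no majority: refuse to hand out a possibly wrong answer
            raise RuntimeError("versions disagree; rejecting the result")
        return value

    print(voted_result(-7))  # 7

The scheme only helps against independent implementation bugs; a shared flawed specification would defeat it.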

donjuedo
  • 659
  • 1
  • 5
  • 8
user23013
  • 660
  • 5
  • 11
2

Most of the answers have focused on bugs that enable exploits. This is very true. Yet there is a more fundamental avenue for exploits.

If it can be programmed, it can be hacked.

Any system that can be programmed can be told to do stupid things, even malicious things.
Programmability can take many forms, some of which are not very obvious. For example, a word processor or a spreadsheet has a macro feature. This feature gives the user sequencing. If, in addition, there are features providing selection and repetition, suddenly it's very programmable.

If it cannot be programmed, the users will demand more flexibility.

Just about any complex application package will eventually create an environment where the users want to automate their routine behavior. This automation sometimes takes the form of scripting, like PowerShell or Python, but sometimes it comes about through something like a macro feature with a few extra bells and whistles for automation. When the builders accommodate the users, it's suddenly a programmable system.
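
As an illustration of how little it takes (this is my sketch, with made-up macro commands, not anything from a real office suite), a "macro feature" that offers sequencing, selection and repetition is already a small programming language:

    # Hypothetical toy macro interpreter: steps are executed in order
    # (sequencing), can branch on data (selection) and can loop (repetition).
    def run_macro(steps, cell):
        for step in steps:
            op = step[0]
            if op == "set":                      # sequencing
                cell = step[1]
            elif op == "if_gt":                  # selection
                if cell > step[1]:
                    cell = run_macro(step[2], cell)
            elif op == "repeat":                 # repetition
                for _ in range(step[1]):
                    cell = run_macro(step[2], cell)
        return cell

    # "Harmless" building blocks compose into arbitrary behaviour:
    print(run_macro([("set", 1), ("repeat", 10, [("if_gt", 0, [("set", 42)])])], 0))  # 42

Once users (or attackers) can compose such steps, the host application has effectively shipped a programmable system, with all the exploit potential that implies.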

Walter Mitty
  • 128
  • 3
  • "If it can be programmed, it can be hacked." No. If it was programmed wrong or to a bad spec - it can be hacked. Exploits are not some magical living things that pop up on their own. – Oleg V. Volkov Oct 22 '19 at 18:29
  • If you can come up with a clean way to divide programs into a productive group and a counterproductive group, then you are right. I assume that such a clean partition is not possible. – Walter Mitty Oct 23 '19 at 07:11
2

Just think in terms of 'developing' an impenetrable building... and think of a few possible scenarios and assumptions:

  • Is the budget limited? It always is! A bad actor with a bigger budget could buy a means of getting in (as in buying tools, bribing builders, ...)
  • There is always an environmental scale beyond which you have no control: a region going rogue, a meteor striking crucial safety infrastructure, technological advances further down the line which you had no way of planning for, ...

You could let your imagination run wild with this example.

And now accept the fact that buildings are often simpler to defend: they are physical objects, most likely simpler, and rarely built from components with dependency chains as long, or provenance as hard to establish, as those of 3rd-party software libraries.

diginoise
  • 121
  • 3
1

Theoretically, yes.

Although exploit-free software is possible, it is extremely hard to achieve. If you could program a piece of software to program for you, this would technically be possible. I have heard of people attempting to build something like this, although creating a bot that can program for you is harder than it seems. Another way a program could be exploit-free is if the software is mathematically proven. Man-made code could hardly achieve something like this, but other kinds of programming could be exploit-free if they didn't require human input.

yosh
  • 11
  • 2
1

Writing perfect code is like building a perfect car. We might be able to build a perfect car, but only for the age we are in. As technology grows, ideas get shared, and more brains get together to solve problems, you end up with something much better.

You are correct in saying that if a company keeps working on a piece of software, then at some point in time it will be bug-free. That's true, but over time different technologies evolve, and you make a choice to either stay up to date with technology or just keep the same old perfect codebase.

Let's take the example of Facebook, because they are a large group focused on a single product. Facebook used to use the jQuery library for all the dynamic stuff a few years back. It was cutting-edge technology, everything was going great, and nobody thought of replacing it. But to keep users engaged they needed to become much more dynamic. So as Facebook grew and needed more and more dynamic functionality, they realised that jQuery was not fulfilling their needs.

Because no other website had that many users, nobody else actually understood the need for newer libraries. So they started to work on their own library, called React. As time passed, more people started using the internet because of Facebook, and so obviously they got introduced to other sites as well. Now other websites also started to have the problems Facebook was facing, but fortunately they now had the React library to fulfill their needs instead of having to build a new one.

Google was having a similar problem, and instead of using Facebook's React they decided to build their own to address their specific needs. This will keep on going, and there won't ever be a single codebase that is perfect.

It's a law of nature: whenever something bigger arrives, it drives more people to think bigger and do better, similar to how more and more powerful characters keep appearing in the Avengers movies.

Because time is the one resource of which there is never an unlimited amount, business owners as well as developers make trade-offs. Trade-offs in code can be something like:

  • Be more optimized/faster, or be more maintainable?
  • Focus more on security, or have a better user experience?
  • Test new features more thoroughly, or ship new features on time?

We make these trade-offs every day...

omer Farooq
  • 111
  • 3
1

For specific cases (programs), almost. In general, NO

  1. For specific cases:

You can repeatedly refine a given program until most or all known forms of vulnerabilities (e.g. buffer overflows) are gone, but many forms of vulnerabilities happen outside the source code. For example, suppose you compile such an almost-perfect or perfect program. This produces an object or executable program that you distribute. On the target computer it is exposed to malware that can modify that binary code, e.g. by inserting jumps to malicious code that, of course, is not in the original program.

  2. In general:

Is it possible to have a program, now or in the future, that is able to check the source code of any program for vulnerabilities?

A bit of theory: being vulnerability-free is a semantic property of a program, not a syntactic one. A syntactic property can be formalized (and hence detected by formal methods), but a non-trivial semantic one cannot:

A non-trivial semantic property is one that is not trivial; a trivial semantic property is one that is always present or always absent in each and every program. A well-known non-trivial semantic property of programs is "this program will run forever" (the famous Turing halting problem), because some programs will run forever while others won't. Turing proved that the halting problem is undecidable, so a formal method to test the halting nature of any program cannot exist.

Rice's theorem states that all non-trivial semantic properties of programs are also undecidable. In fact, the proof is based on the fact that if a non-trivial semantic property of programs were decidable, it could be used to solve the halting problem, which is impossible.

As another example of a semantic property, consider "this program is harmful". It is of course a semantic property, and hence, as a consequence of Rice's theorem, a formal and deterministic malware-detection program cannot be built; most such tools use heuristics for their detection procedures.

Of course, as is done in malware detection, you can use heuristics, artificial intelligence, machine learning, etc. to approximate a method for finding vulnerabilities in code, but a formal, perfect and deterministic one cannot exist.
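
For readers who want the shape of the argument, here is a minimal sketch of mine of the classic diagonalization; `halts` is a hypothetical decider that, as the argument shows, cannot actually exist:

    # Hypothetical code: assume, for contradiction, a perfect decider existed.
    def halts(program_source: str, program_input: str) -> bool:
        """Pretend oracle: True iff the program halts on the given input."""
        raise NotImplementedError  # no total, correct implementation can exist

    def paradox(program_source: str) -> None:
        # Do the opposite of whatever the oracle predicts about
        # running this program on its own source.
        if halts(program_source, program_source):
            while True:      # oracle says "halts" -> loop forever
                pass
        # oracle says "loops forever" -> halt immediately

    # Running `paradox` on its own source contradicts either answer the oracle
    # could give, so `halts` cannot exist. Rice's theorem extends the same
    # trick to every non-trivial semantic property, e.g. "is harmful" or
    # "has no vulnerabilities".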

0

The first rule of software testing (QA):

'It cannot be confirmed that the last bug has been found'.

I have coded since 1980 (I am also an electronics engineer), and none of my software has been exploited. That does not mean it couldn't be, just that nobody did it. Banking systems (and 'Snowden'-like systems) have auto-triggering warnings/audits to log unauthorized access (I have worked on similar systems).

So, yes, exploit free software is possible, but how would you quantify/verify it?

Finally, look up FCC (USA) rules:

Part 15 of the FCC rules, which governs unlicensed devices, incorporates a fundamental tenet of U.S. spectrum policy: an unlicensed device must accept interference from any source, and may not cause harmful interference to any licensed service

Which means your Wi-Fi signal is 'exploitable', which in turn means the software on it is 'exploitable'.

schroeder
  • 123,438
  • 55
  • 284
  • 319
Mr. de Silva
  • 109
  • 4
  • How does the "auto-triggering warning/audits to log unauthorized access" work? If you exploit their system, how can they log the unauthorized access? – Zheer Oct 23 '19 at 15:37
  • There are two types of exploitation, authorized and unauthorized. An authorized person is allowed to access a system within specific parameters (times/days/allowed menu items, etc.). An unauthorized user also falls into that category, as well as someone doing something without logging in. Logging is 'easy'; triggering means the script and/or database must be able to alert on/control the activity of the user/script ($$$). – Mr. de Silva Oct 24 '19 at 01:51
  • Your WiFi section does not appear to relate to the rest of your answer, appears to make a colossal leap in logic, and a misinterpretation of Part 15; "radio interference" does not mean "exploit". – schroeder Oct 25 '19 at 12:08
  • @schroeder: I can exploit anything! I have the skill, which means there are others out there. The question is 'exploit free software', and it cannot be done because the vessel is exploitable! Also, I have DESIGNED and BUILT FCC-compliant devices, so I know the FCC regulations and implications. – Mr. de Silva Oct 25 '19 at 13:57
  • And you are still reading Part 15 wrong. And you keep making errors in logic. You are correct, the question is 'exploit free software' not 'hazard-free systems'. Exploits in the "vessel" are out of scope in the question, even if Part 15 somehow meant "WiFi exploits". – schroeder Oct 25 '19 at 14:19