152

What analysis was Bruce Schneier referencing when he wrote:

Viruses have no “cure.” It’s been mathematically proven that it is always possible to write a virus that any existing antivirus program can’t stop.

From the book Secrets & Lies by Bruce Schneier, page 154.

Luc
  • 31,973
  • 8
  • 71
  • 135
Cate
  • 1,235
  • 2
  • 6
  • 4
  • 5
    I think it is already impossible to give a complete and formal definition of what a computer virus is in the first place, i.e. to describe what exactly the "maliciousness" is. But such a definition is needed in order to even argue about decidable vs. undecidable problems. And even if you find a kind-of-useful, albeit not perfect, definition, it will probably be undecidable because of the [halting problem](https://en.wikipedia.org/wiki/Halting_problem). – Steffen Ullrich Mar 17 '18 at 17:17
  • 4
    Post is locked because it is generating a ton of speculation and opinion. The question is asking about the source of a paper referenced in another article. – schroeder Jan 24 '19 at 07:51
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/88818/discussion-on-question-by-cate-has-it-been-mathematically-proven-that-antivirus). – Rory Alsop Jan 25 '19 at 21:25
  • 4
    This questions reminds me of a dialogue by Douglas R. Hoffstadter explaining Gödel's Incompleteness Theorems: https://genius.com/Douglas-hofstadter-contracrostipunctus-annotated – 0112 Jan 26 '19 at 23:03
  • 1
    One example of not being able to detect a "virus", or what a JavaScript function returns, **unless the function is called**: [How do I check if a JavaScript function returns a Promise?](https://stackoverflow.com/q/43416214/); [Can a regular expression be crafted which determines the return type of a function?](https://stackoverflow.com/q/43417236/); https://astexplorer.net/#/gist/7b371f354537e9d3415bc8ed9fad9c94/9cb99f18e70987ff09d1195b7c3189be87d67105 – guest271314 Jan 27 '19 at 20:46
  • 3
    The Halting Problem says yes? If you can't tell if a program halts, it would seem to follow that you also can't tell if a program does something malicious/is a virus. – aroth Jan 29 '19 at 10:39
  • 2
    Your title asks a different question than was claimed in the quote. The quote claims not all viruses can be stopped. This is obviously different from asking whether they can all be detected, as in your title. (This is the case since it is possible to stop a virus without detecting it.) – Matt Jan 29 '19 at 13:38
  • Do you consider a monitoring application to be an antivirus? It is always possible to create a monitor that detects attempts to write, delete, and even read outside an assigned directory (let's say /usr/local/virus), or to grow this directory beyond some limit. This would for most practical purposes eliminate vast classes of viruses. – rs.29 Jan 31 '19 at 07:18
  • 2
    Easy counter-example: A program which displays cat-videos and sends itself to all your friends. Some may see it as a virus, others will see it as a great app with cool features. So this single instance is already undecidable. – Falco Jan 31 '19 at 13:20

19 Answers

213

Under one possible interpretation of that, it's a result of Rice's theorem. A program is malicious if it performs some malicious action, which makes it a semantic property. Some programs are malicious and some aren't, which makes it a non-trivial property. Thus, by Rice's theorem, it's undecidable in the general case whether a program is malicious.
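
A hedged sketch of the contradiction behind that, in Python-like pseudocode (run_program, do_known_malicious_action, and the perfect decider is_malicious are all hypothetical names, assumed to exist only so they can be contradicted):

def decides_halting(program, program_input):
    def wrapper():
        run_program(program, program_input)    # placeholder: runs the candidate program, may or may not halt
        do_known_malicious_action()            # placeholder: reached only if the line above halts
    # If is_malicious were a perfect decider for the semantic property
    # "performs a malicious action", its verdict on wrapper would also
    # decide whether program halts on program_input, which the halting
    # problem shows no algorithm can do.
    return is_malicious(wrapper)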

  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/88726/discussion-on-answer-by-joseph-sible-has-it-been-mathematically-proven-that-anti). – schroeder Jan 24 '19 at 07:42
  • 30
    Not to mention, "malicious" is an almost entirely subjective term. I've seen disassembly programs get flagged as malware because the AV author apparently considered reverse engineering to be a "malicious" thing (even though I can't think of a situation where someone who didn't want said disassembler on their computer would end up with it there). – Doktor J Jan 25 '19 at 20:17
  • 7
    @DoktorJ: The point is that even if you come up with some rigid non-syntactic definition for "malicious" _(regardless of if that definition misses some viruses, or includes some non-viruses)_, it's still impossible to flag all/only malicious programs. – BlueRaja - Danny Pflughoeft Jan 25 '19 at 22:49
  • 2
    The OP asked about "viruses", not malicious software. Viruses are not necessarily malicious. The only determining property is that they self-replicate. (Now, it's likely the OP _meant_ malware (virus or not)). – Stéphane Chazelas Jan 26 '19 at 22:45
  • 6
    @Stéphane Chazelas Self-replication is a non-trivial semantic property as well, ergo it is undecidable. – The_Sympathizer Jan 26 '19 at 23:49
  • 1
    @BlueRaja-DannyPflughoeft: it's important that this is true only insofar as it's impossible to tell up front, with a guarantee of termination. Given a rigorous definition of actions the program is not allowed to perform it's perfectly possible to do it while the program is running -- this is the principle behind sandboxing. Most users would settle for a program that stops malware before it does something bad, regardless of whether it also got the opportunity to show some dancing monkeys first. The problem is that these "rigorous definitions" will inevitably turn out to be incomplete. – Jeroen Mostert Jan 27 '19 at 11:40
  • 4
    I think Rice's theorem only applies if you have to decide *ahead of time* if a program is malicious, and if you don't allow false positives. If the virus scanner can e.g. intercept API calls, modify the program (e.g. to insert runtime checks) and occasionally reject a "safe" program, it should be possible to prove safety. That's basically how e.g. a Java VM can offer safety guarantees, isn't it? – nikie Jan 28 '19 at 18:07
  • 3
    @JeroenMostert: No, it has nothing to do with static vs runtime analysis. The difference is in whether the definition is semantic or syntactic. "A list of the actions the program is allowed to perform" is a syntactic definition. This is why writing a sandbox is so much easier than writing an antivirus. – BlueRaja - Danny Pflughoeft Jan 28 '19 at 23:11
  • @JeroenMostert: Sandboxing does **not** solve the problem. How do you know whether the latest version of your browser that you installed has a hidden malware that captures your keyboard input on bank websites? Are you going to use a different sandbox for every sensitive website? How do you know that your latest sandboxing software is not itself malware? How do you know that your operating system... – user21820 Jan 30 '19 at 14:32
  • @Joseph: I agree with your answer, but one possible quibble is that real computers are not Turing-complete, and hence it is theoretically possible (though practically impossible) to determine whether any program would cause your specific computer to enter an undesirable state. Naturally, you would not be able to do so within your computer itself, but theoretically you can use another computer that has more memory to do the analysis. Of course, the time needed is proportional to the number of possible states your computer can be in... – user21820 Jan 30 '19 at 14:36
  • @user21820: I never said sandboxing was or could ever be a perfect solution (it should be quite obvious that *no* solution can be an absolute guard against malware, from a simple practical point of view, let alone theory). The question is whether Rice's theorem is the correct thing to invoke here as mathematical proof that such malware can't be detected. (Or, going up one more level, if Schneider was referring to that at all, which does not appear to be the case -- despite being undervoted, [this answer](https://security.stackexchange.com/a/202003/147318) appears to be the correct one.) – Jeroen Mostert Jan 31 '19 at 11:11
  • @user21820 that is not even complete, because two programs could differ only in where your data is actually sent. A banking app and a password sniffer will both read your password and send it to a server - your machine will do exactly the same thing, but the receiver is different - so a local scan can never decide whether the receiver is someone you actually intend to send your data to, so the whole internet with all reachable machines together should minimally be your state machine for checking all states - good luck... – Falco Jan 31 '19 at 13:19
  • @Falco: Yes of course. There are many such details but the comment box was too shallow to contain it. – user21820 Jan 31 '19 at 14:07
193

Actually, the opposite can be easily proved: since all computer viruses are executable code in one way or another, all you have to do is write an antivirus program that will report ANY executable code as viral. It logically follows that such a program will detect ALL possible viruses:

  • All code is detected by your antivirus (C → D)
  • All viruses are code (V → C)
  • All viruses are detected by your antivirus (V → D)

Sure, an argument could be made about this antivirus software barfing out too many "false positives". But what are the criteria for deciding whether a positive is false or true? Ah! It turns out the distinction between benign code and malicious code, between an honest "remote PC control" suite and a trojan like NetBus, is completely arbitrary, and so the whole question is pointless.
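
A minimal sketch of that trivial detector in Python-like pseudocode (purely illustrative):

def is_virus(anything):
    # Report every input as viral. By construction nothing slips through;
    # the entire cost is paid in false positives on benign code.
    return True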

walen
  • 1,765
  • 1
  • 6
  • 6
  • 127
    Cute. I mean, utterly irrelevant, but also cute and clever. – Jack Aidley Jan 23 '19 at 14:39
  • 4
    One can somewhat more usefully require that programs be composed of certain actions which cannot be composed in such fashion as to yield "virus" behavior, and then reject any programs that are not composed solely of such actions. This approach is practical for many purposes; the biggest problem is that the range of tasks that can be done using only actions that can't yield virus behavior falls somewhat short of the range of tasks people want computers to do. – supercat Jan 23 '19 at 17:40
  • 18
    @supercat, the problem with your version is that you need to come up with a definition of "virus behavior" that can be programmatically applied. For example, a Javascript Bitcoin miner might be virus-like when visiting the New York Times website, but desired behavior when visiting Joe's Discount Pirate Software. Encrypting the contents of your hard drive may be good behavior when it's your FDE software doing it, and virus-like when it's being done by ransomware. – Mark Jan 23 '19 at 21:17
  • 5
    This antivirus program would detect *itself* as a virus and shut itself down, making it incapable of detecting anything else. – Nuclear Hoagie Jan 23 '19 at 21:43
  • @Mark: I would define a virus as a program that installs itself to be run automatically without the computer owner or operator's consent. If the set of allowable actions does not include any means of installing a program to run automatically, then it will not be possible to build a virus from actions in that set. – supercat Jan 23 '19 at 22:06
  • 8
    @supercat A lot of viruses are installed with the operator's consent, but do things the operator didn't consent to. However, if you prevent automatic running, then a reboot would kill it, just not the side effects. – Poik Jan 23 '19 at 22:41
  • @Poik: Not all malicious programs that spread themselves are viruses. For a piece of code to be a virus, it must among other things be able to, without an operator's consent, cause itself to be executed at a future time when no part of it is running. Preventing applications from modifying other programs or the startup sequence will 100% prevent viruses, though other forms of malware such as worms would still be possible. – supercat Jan 24 '19 at 00:31
  • 3
    Except, how do you detect whether something is executable code? E.g., the JPG virus that exploited a bug in Microsoft's JPG rendering (causing code to execute when Microsoft Internet Explorer visited a web page). So the only way would be to expand your pseudo-code to flag all bits, since any bits could be executable code. If we're going to get so ridiculous, I could say that "bytes" (reported by "dir") is actually a translation of "virus found", so "dir" is actually anti-virus software that does that. Utterly ridiculous. (Not to say that I don't appreciate the ridiculousness. Bravo.) – TOOGAM Jan 24 '19 at 03:16
  • Antivirus that detects all executable code as malicious would detect whatever OS it’s running on as malicious too. If this AV program were to actually do anything about that, it would crash the OS pretty quickly, every single time. Wouldn’t make a great AV program, but would make for a nasty virus. – HopelessN00b Jan 24 '19 at 04:27
  • 12
    Sounds like an argument for *white-listing* wanted programs only, as in a repository of only tested software to install (maybe even FOSS, not unlike most linux repositories), to at least avoid installing malicious things on purpose – Xen2050 Jan 24 '19 at 10:53
  • 1
    When asking a question about something even remotely practical, you can assume that the trivial solution is excluded by default. – jpmc26 Jan 24 '19 at 22:44
  • 2
    I'd challenge the first premise: *All code is detected by your antivirus.* as being the piece of this which is practically impossible. – TemporalWolf Jan 24 '19 at 22:57
  • 2
    @Xen2050 "Sounds like an argument for white-listing wanted programs only..." Yes, it does. Unfortunately there are millions of apps for smartphones for example. I don't want to be the one who has to test them all. – Trilarion Jan 25 '19 at 10:14
  • Can one consider a script to be "code"? If so, I think the first premise becomes a bit less trivial – UKMonkey Jan 25 '19 at 13:19
  • What does Virical mean? I can't find a definition for it. –  Jan 25 '19 at 19:33
  • @TemporalWolf No, it's agonizingly trivial: The first instruction of the machine code is halt. If at any point you find your machine to be halted, then it has detected code. – Fax Jan 26 '19 at 00:27
  • 3
    -1 because this answer is not helpful in any way. – mbomb007 Jan 27 '19 at 19:14
  • 1
    @TOOGAM: exactly. If we're talking about x86, most byte sequences decode as valid x86 instructions. Usually ASCII text results in a sequence of instructions that doesn't make any sense, or includes privileged instructions like `6c ins` and `6e outs`, but it's not rare for a sequence to contain no illegal instructions. (Especially in 32-bit mode.) I put the text of this answer inside `.ascii "..."` in foo.s (replacing newlines with `\n` and escaping the quotes), and assembled it with `gcc -m32 -c foo.s`. Lots of short conditional-branch instructions, but no illegal instructions! – Peter Cordes Jan 28 '19 at 05:12
  • Oh, I can put it on Godbolt in a way that will show disassembly *with* hexdump: https://godbolt.org/z/IIXBUc (unlike this lang=asm first attempt that doesn't: https://godbolt.org/z/-2ATIj). Some of the fun longer instructions include `imul esi,DWORD PTR [edx+esi*2+0x61],0x202c7972`. – Peter Cordes Jan 28 '19 at 05:47
  • 1
    @Fax that assumes you are able to scan the program to begin with. Perhaps it's run through an attached peripheral, runs as a boot loader, decrypted on demand, or it mimes user interaction by pretending to be a keyboard. All four of these require different, non-trivial approaches to enforcing and are all things which are currently known to be exploited in the wild. Your system also requires the absolute, non-trivial restriction that no non-executable files contain the binary representation of a machine halt instruction... here are several unicode chars which contain 0xF4, the x86 code: ôǴ˴ϴӴ״ – TemporalWolf Jan 28 '19 at 19:07
  • 1
    @TemporalWolf No, it does not assume that. From a program's perspective, "any code is present" is a logical tautology. Nobody said you computer can't be an excellent paperweight. – Fax Jan 29 '19 at 01:38
  • @JackAidley There is nothing "clever" in there. It's like concluding that if we kill ourselves, then we won't have more problems in life. – Red Banana Jan 29 '19 at 11:37
  • This answer would work perfectly in a setup where the AV is called to judge a bit of code, and has to give an answer. Unfortunately this leaves a gap in your proof as you would need to prove that the AV will be asked to judge each bit of code. (And the smaller gap that it needs to be able to answer). – Dennis Jaheruddin Jan 29 '19 at 11:46
  • 1
    I have a cure for every disease known to man! – Hot Licks Jan 30 '19 at 02:58
  • I feel like this is an incorrect use of "detect". It's analogous to saying that if a jury finds a man guilty for the wrong reasons, even if he was really guilty, that they *knew* he was guilty. To *know* means something more than happening to be correct, and "detect" here is used very similarly; it means something more than just deciding. It should mean that some analysis has been performed, and that some justification for the decision exists. – Dave Cousineau Feb 01 '19 at 16:57
  • @Steve Looks like an incorrect attempt at turning "virus" into an adjective. If so, it should be "viral." – jpmc26 Mar 28 '19 at 00:00
  • @jpmc26 More like non-native mistake. I know it is "viral" in English, but in Spanish it is "vírico, vírica" and I inadvertently mixed them. – walen Mar 28 '19 at 07:20
95

According to Wikipedia:

In 1987, Fred Cohen published a demonstration that there is no algorithm that can perfectly detect all possible viruses.

It also references this paper. That might be the analysis Mr. Schneier was referring to.
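
Very roughly, Cohen's argument is a diagonalization. A hedged sketch in Python-like pseudocode (not Cohen's exact construction; detect, do_nothing, and spread_and_infect are illustrative names, with detect standing in for the hypothetical perfect detector):

def contrary_program():
    if detect(contrary_program):
        do_nothing()           # flagged as a virus, yet behaves harmlessly: a false positive
    else:
        spread_and_infect()    # declared clean, yet spreads: an undetected virus
# Either way the hypothetical detect misclassifies contrary_program,
# so no such perfect detector can exist.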

Harry Johnston
  • 1,667
  • 10
  • 14
  • 17
    Schneier states here that this is the result he was referring to when he made a very similar statement some time later in another context: https://www.schneier.com/blog/archives/2009/07/making_an_opera.html – Steve Jessop Jan 23 '19 at 11:31
  • 13
    @Stephan: For every possible binary there exists a hypothetical platform for which it is virulent and a hypothetical platform for which it is not virulent. This theorem is stronger than Rice's theorem as it applies to hypercomputers also. – Joshua Jan 23 '19 at 16:47
  • 1
    So it was proven that in 1987 the perfect antivirus did not exist? Neat. But did he also prove that the perfect antivirus could not be created, or merely that it didn't exist as of 1987? – user1067003 Jan 24 '19 at 15:11
  • 9
    @user1067003, I'm not a computer scientist, but I'm pretty sure the phrase "there is no algorithm" means that there is no possible algorithm, not just that there is no *known* algorithm. Otherwise it would not be a very interesting result. (The proof as written probably only applies to classical computers though, it might need to be redone to cover quantum computers or any other more abstruse architectures.) – Harry Johnston Jan 24 '19 at 20:45
  • 7
    This answer heavily relies on an external link to answer the question. Can you maybe sumarize the arguments from that paper? – Philipp Jan 25 '19 at 08:52
  • 3
    @HarryJohnston: Quantum computers are equivalent to Turing Machines in terms of computability, so no, it doesn't need to be redone. In fact, pretty much every computer scientist believes that _every_ physically-realizable computer is, at best, [equivalent to a Turing Machine](https://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis) – BlueRaja - Danny Pflughoeft Jan 25 '19 at 22:57
  • 1
    @BlueRaja-DannyPflughoeft the Church-Turing thesis states that the human mind is equivalent to a Turing machine. That has no bearing on whether or not a hypercomputer is possible to construct. All it means is that we wouldn't be able to do the calculations by hand. We'd have to rely on the machine being correct. How does that article at all back up that claim? – user64742 Jan 26 '19 at 17:35
26

The statement can't be mathematically proved unless it is reformulated as a mathematical proposition.

At the very least, that requires a mathematically sound definition of what a "virus" is: which is challenging; and you might end up with an abstraction that isn't useful in practice, because it includes some behaviours which people regard as entirely benign and useful, and/or excludes some behaviours which people consider antisocial.

The tough thing here is that a virus is a program that changes its environment in some way, and any attempt to define the environment rigorously is going to be too limiting for practical use.

So I would say no: the proposition can't be mathematically proven, and that's because it can't be mathematically formulated.

Michael Kay
  • 491
  • 3
  • 6
  • 2
    A counterexample: I'll specialize the original question (losing its generality), by defining *a virus* as any `bash script` which invokes `rm -rf /*` when run with no arguments; and I'll define *an antivirus* as a version of bash interpreter which denies that `rm -rf /*` command from execution. I hope you agree that we can go all the way down rigorously defining what a "bash interpreter", "execute", etc means mathematically. Did I just formulate a weaker version of the original question? Does this disprove your claim that *it can't be mathematically formulated*? Yes and yes. – ulidtko Jan 23 '19 at 18:36
  • 16
    @ulidtko No. You cannot just arbitrarily define a virus to be anything you want. Otherwise, I could define a virus as "software which has 'anti-virus' in its name", which would not help the discussion any, as that is a discussion of philosophy, not of security or mathematics. In fact, your definition of virus is like saying "I define the set of even numbers to be {2}". Go ahead and make such a system if you want, but you have not disproven anything by doing so. As Michael has suggested, "virus" cannot be mathematically defined in any way useful to this conversation. – Aaron Jan 23 '19 at 20:48
  • 6
    Re: "At the very least, that requires a mathematically sound definition of what a 'virus' is": Not necessarily. If there are characteristics that would apply to *any* definition of "virus", then you may be able to mathematically formalize some of those characteristics without needing to produce a specific complete definition. That partial characterization could be enough to let you apply [Rice's theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem). – ruakh Jan 23 '19 at 21:16
  • 2
    @ulidtko On the same basis, I could define a virus as any program that sends an email. I'm sure I could do some reasoning about that class of programs, but it would tell me nothing about the class of programs that ordinary people classify as viruses in the real world. – Michael Kay Jan 23 '19 at 22:07
  • @ruakh Rice's theorem only applies if you restrict your anti-virus program to operate by static analysis of (source or object) code. The question is about "detecting viruses", not limited to detecting them by static analysis. What happens if you detect them by observing their self-replicating behaviour? – Michael Kay Jan 23 '19 at 22:12
  • 1
    @MichaelKay: I'm aware; I only mentioned Rice's theorem as an example of why your statement was incorrect. Sorry if that was unclear. – ruakh Jan 23 '19 at 22:17
  • On a 64-bit system with a full 64-bit memory space and no expandable memory, the size of memory will be $2^{64}$ bytes. If we then consider all possible contents of that memory space, there are $256^{2^{64}}$ programs in that memory space (fewer if we consider that the compiled binary will coexist in that memory). Because the number of programs is finite, they can be divided into a set of malicious programs and a set of non-malicious programs. That would be the definition of the malicious 64-bit programs. Your claim is that we do not have a definition that would expand to 128 bits and so on. – user64742 Jan 26 '19 at 18:50
  • Your claim might also be that one particular combination of bits might be malicious to one user and non-malicious to another user. In that situation the issue can be resolved by allowing a user to whitelist something that was flagged as malicious. It's not ideal, but all it really means is that people will have different definitions of malicious, or, more realistically, that some users *want* to run malicious code on their own machine. – user64742 Jan 26 '19 at 18:52
19

tl;dr- The answer depends on exactly what requirements you impose on the question.

  1. If you merely want to detect all viruses without further constraint, then simply flag anything and everything as a virus, and you're done.

  2. If you want to properly identify all programs as either a virus or not, then it's impossible in the unbounded case since the classification problem reduces to the halting problem.

  3. If you want to properly identify all programs as either a virus or not, and you're considering a finite machine, then it's theoretically possible but not generally feasible in practice.

  4. If you allow that the computer can produce random errors, then any program can be a virus.

Case 1: Complete virus detection

Obviously, flagging all programs as viruses catches 'em all. (Pokémon!)

This case comes first to make the point that it's not hard to detect all viruses; rather, the specific theoretical problem is correctly classifying whether programs are viruses.

Case 2: Impossible to correctly classify in a general, unbounded scenario

Consider the program:

doHaltingProblem();          //  Not a virus operation itself
installEveryVirusEver();     //  Definitely a virus operation, but will it happen?

In this case, the program is a virus only if doHaltingProblem() halts, allowing installEveryVirusEver() to occur. So virus detection reduces to the halting problem in the general, unbounded case.

Case 3: Possible by brute-force search in bounded scenarios

If programs to be classified as viruses-or-not are to operate on a finite machine, then you can simply simulate the machine running from every possible starting state. Finite machines will eventually loop back to a prior state, so it's necessarily a finite (if lengthy) analysis.
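
A sketch of that brute-force check for a single starting state (Python; step is the machine's transition function, is_bad is the assumed-given predicate for states considered malicious, and the full classification would repeat this over every possible starting state):

def reaches_bad_state(start, step, is_bad):
    seen = set()
    state = start
    # A finite machine must eventually revisit a state, so this loop terminates.
    while state not in seen:
        if is_bad(state):
            return True
        seen.add(state)
        state = step(state)
    return False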

Case 4: Machines that can error can have viruses spontaneously emerge

If a machine can run a program that'd be considered a virus, and there's a non-zero chance of a random error shifting it into that state, then it should eventually arrive at a virus state.

Which is kind of a boring point, but just for completeness's sake.


Discussion on the quote in the question

Viruses have no “cure.” It’s been mathematically proven that it is always possible to write a virus that any existing antivirus program can’t stop.

"Secrets & Lies", Bruce Schneier, page 154

As pointed out in Case (1) above, it's possible to flag all viruses by just flagging everything as a virus; that's easy. What's impossible is to, in the unbounded case, determine whether every possible program is a virus or not.

Further, it's difficult to establish whether particular programs are viruses even in bounded cases. For example, consider the program:

var toExecute = decryptByBruteForce([ciphertext]);  // Decrypt the next part of the program by brute-force
run(toExecute);                                     // Run the now-decrypted part of the program

As discussed in Case (3), this program can be classified as a virus-or-not when run on a finite machine, but since doing so would require brute-forcing an encrypted message, it's not likely feasible in practical scenarios.

So, in real-world applications, it reduces to an issue of heuristics: anti-virus programs make guesses at what's a virus or not. Or if you want more reliable security, you could have an anti-virus program flag anything that it can't prove to be safe, dodging the issue of needing to classify every possible program.

Unfortunately, the use of heuristics gives knowledgeable attackers security holes to target. Without reading the source of the quote, I suspect that this problem is what they were trying to refer to.

Nat
  • 1,443
  • 2
  • 10
  • 13
  • Case (4) is not just boring; it indicates a more likely and more problematic scenario. Consider code that contains a call like `if (cannotHappen) installEveryVirusEver();` where cannotHappen cannot happen if the machine works correctly, but is likely to occur if the machine is overclocked. How do you correctly classify that? – Hans Olsson Jan 25 '19 at 12:34
  • 1
    @HansOlsson Yeah, it'd require generalizing the ontology to capture problems that don't fit well into the deterministic model; presumably we'd have to consider the ensemble of possible outcomes, then classify them according to how secure we'd deem them, or else just concede that programs in general may act maliciously even if not designed to. I was hoping to provide a nod to issues like [row hammer](https://en.wikipedia.org/wiki/Row_hammer), where the presumption of error-free operation can really mess up analysis. – Nat Jan 25 '19 at 12:42
  • "brute forcing an encrypted message" Why not run the program in a sandbox, and figure out how it decrypts the message itself? IIRC that's how some antivirus already works. – micsthepick Feb 01 '19 at 05:50
  • @micsthepick Running a program in a sandbox is a better-scaling solution that's liable to be generally superior in non-trivial cases, so you're right that that'd tend to be preferred in practice. Just, the question is about anti-virus in the sense of preemptive analysis; sandboxing is a different subject. – Nat Feb 01 '19 at 06:03
  • Worth noting that the halting problem example can be bypassed by simply flagging any program that has the potential to install a virus as a virus (i.e., dedicated byte code to do that exists, so presumably the software can do so). – user64742 Jan 03 '20 at 03:29
16

It depends on your definition of "stop".

If you define stop as being "detect, ahead of time, that code could do something malicious, and prevent it running", then as others have mentioned, this is impossible, by Rice's theorem.

If you define stop as "detect when a running program is attempting to do something bad, and then stop it", then Rice's theorem doesn't apply. Your antivirus doesn't need to decide if the program could do something malicious, only whether it's doing something malicious now.

AFAIK, the second version hasn't been proven mathematically impossible. And indeed, for any sufficiently specific definition of "malicious", it's very doable - this is essentially sandboxing.
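
A toy sketch of that second interpretation (Python; the policy and names are illustrative, not any real AV or OS API):

ALLOWED_WRITE_PREFIXES = ("/home/user/sandbox/",)   # example policy: writes only inside the sandbox

class PolicyViolation(Exception):
    pass

def mediated_write(path, data):
    # The decision is about what the program is doing right now, not about
    # what it might do on some future input, so Rice's theorem doesn't apply.
    if not path.startswith(ALLOWED_WRITE_PREFIXES):
        raise PolicyViolation("write to " + path + " denied by policy")
    with open(path, "w") as f:
        f.write(data)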

It seems likely, however, that there is no good definition of "malicious" that would cover all the forms of malice a virus might attempt. What about a virus that mines bitcoin? Or that serves up pirate movies? Or that spams message boards on your behalf? All of these are indistinguishable from code being run by users who genuinely want to do exactly these things. As such, I suspect (although I don't know of any attempt to prove it) that creating antivirus is AI-complete.

James_pic
  • 2,520
  • 2
  • 17
  • 22
  • 1
    The problem of defining malicious is hardly a show-stopper. Different people will have different interests that go into constructing a definition appropriate to them, but that's okay. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 23:55
  • So, someone produces a link on a web site that installs software on your machine to monitor what movies you are watching, and feeds the data to MegaCorp. Is this a virus? Does your answer vary depending on whether the "someone" is Microsoft, versus a teenage hacker? Does your mathematical assessment of the software take into account the trustworthiness of the originator? In short, can this actually be formulated as a mathematical problem? – Michael Kay Jan 24 '19 at 00:25
  • @MichaelKay: Yes. No. No, just whether the action violates your policy. Yes. – R.. GitHub STOP HELPING ICE Jan 24 '19 at 00:36
  • I've heard of NP complete, but what is AI complete? – Michael Jan 24 '19 at 19:06
  • 1
    @Michael AI-complete is the (I believe somewhat more woolily defined) idea that "you can't solve this problem without also creating an artificial general intelligence". I don't think it's as clearly defined as NP, so it's probably not amenable to mathematical proof. – James_pic Jan 25 '19 at 07:37
  • @R The increasing sophistication of malware, and the increasing shittiness of commercial software vendors, is making that grey area very big. The most sensible definition to me of malware would be software that did something that the user didn't want, and wasn't aware would happen - which depends not just on the user's desires, but on their ability to understand what's happening. Does that include the crappy browser toolbar Oracle include with Java updates? Or the MITM proxy Lenovo included with some of their laptops? Pretty much everything but GNU HURD is in the grey area. – James_pic Jan 25 '19 at 07:46
  • 1
    Thanks, that makes sense. I had heard a less rigourous version of this, along the lines of: here are five (or 10) problems which would be high value to solve, but it turns out that solving just one of them (creating an AGI/ASI) would actually provide the means to solve the others as well (assuming it doesn't wipe us out / make us its pets / etc.) – Michael Jan 25 '19 at 18:25
  • 1
    Reminds me of the old question "what is a weed?" with the answer "anything you don't want growing in your garden" , re: your comment to @R – Michael Jan 25 '19 at 18:27
8

Yes, it was mathematically proven by Alonzo Church in 1935–36 and independently shortly thereafter by Alan Turing in 1936.

https://en.wikipedia.org/wiki/Entscheidungsproblem

  • 23
    If the statement *was* somehow reducible to the halting problem, the interesting bit would be showing that this is the case, not just name-dropping it. Or, to put it more directly: no, this is definitely not what Church and Turing proved (it's not even close to what they set out to prove). – Jeroen Mostert Jan 23 '19 at 13:13
  • 1
    Yes @JeroenMostert, you are right. I was wrong. Church and Turing proved that there does not exist a mechanism to decide whether such a program that detects all viruses could even be written. I have confused the concepts. Uncle Bob was blogging about it: https://blog.cleancoder.com/uncle-bob/2018/06/21/IntegersAndEstimates.html – Michał Kuliński Jan 23 '19 at 13:20
  • 1
    So, does this answer need to be deleted or completely rewritten? – schroeder Jan 23 '19 at 13:53
  • 5
    Martin's blog, while amusing, commits some of the more common fallacies when it comes to explaining the halting problem. For example, "the specification of a program is a great big Diophantine equation in a bazillion unknowns" is wrong, plain and simple. Certainly that's *a* way of specifying a program (if you squint), but you need to be *much* more careful than that to make meaningful statements over what proofs say is and is not possible. Explanations like these tend to obfuscate rather than clarify -- the actual math is subtle, but for the good reason that these things *are* subtle. – Jeroen Mostert Jan 23 '19 at 14:15
  • 1
    I'm not too sure this answer is wrong: If "to halt" means "to produce a report listing all viruses on the system, no more or less", then we can't prove in advance that the program will halt with a correct answer unless we run it to find out. Note that the referenced paper also refers to the halting problem: https://web.archive.org/web/20140525233117/http://grnlight.net/index.php/computer-articles/109-an-undetectable-computer-virus – stevegt Jan 23 '19 at 22:16
  • @schroeder deleted. – user64742 Jan 26 '19 at 18:42
  • @stevegt the halting problem isn't about whether it will give a correct answer. It is about whether it can even halt in the first place, which is unanswerable, period. You'll never know whether it will halt, but it's not useful here, as no connection has been made between a machine that detects whether a machine halts and a machine that detects whether a machine does something malicious. – user64742 Jan 26 '19 at 18:44
8

No, you can not, because the difference between malware and a useful program is completely subjective. Any "evil" behavior can be intended behavior. For example, I have the following programs running on my computer right now:

  • A program which encrypts all my files. Is it a cryptolocker ransomware or a full disk encryption tool?
  • A program which allows remote access to control my computer. Is it a trojan or is it Team Viewer?
  • A program which creates network connections to all kinds of computers all over the internet and exchanges obscure data with them. Is it a botnet or a distributed computing platform?
  • A program which sends all my personal files to a remote server. Is it spyware or is it Cloud Backup?
  • A program which downloads executable files from the Internet and runs them. Is it a malware dropper or is it Steam?

You can't tell the difference, because from a purely technical perspective they do the exact same things. The only difference is in the intention. One program deceives the user about what it does and acts against the user's interests; the other does exactly what the user wants it to do. But what the user really wants from their software is something a machine cannot decide... at least not at the current stage of AI technology. That's why all virus scanners are primarily signature-based.

Philipp
  • 48,867
  • 8
  • 127
  • 157
5

To mathematically prove that it is always possible to write a virus that can bypass all existing antivirus programs, you would first have to mathematically define what a virus is, and good luck with that.

Perhaps a "program that performs undesired actions"? Well, that is simply impossible: imagine a program that opens a connection to your PC and enables remote control. The antivirus can see that the program does this, but how can it know whether it is desired or not?

It might be a legitimate remote-control program like TeamViewer, or it might be a virus masquerading as a simple image-viewing program. Either way, your antivirus will see a program that can read and view images from your PC and open remote connections, and it will have no way to tell whether that is "desired" behaviour or not, since it can't know why you are installing that program.

kajacx
  • 151
  • 3
4

As pointed out by @walen, it is actually possible to detect all viruses if false-positives are allowed: just report everything as a virus.

Let's assume that it is also possible to detect all viruses if false positives are not allowed. We'll have an IsVirus function which can be run on any program and returns whether or not that program is a virus. Now, consider this program, which we'll call P:

if IsVirus(P):
    exit
else:
    DoVirusThings

What is the value of IsVirus(P)? If it's true, then P just exits without doing anything and we therefore have a false-positive. But if it's false, then P does virus things and we have an undetected virus.

This proves that it is not possible to detect all viruses if false-positives are not allowed.

Matthew
  • 161
  • 3
  • 1
    Downvoter please explain why. As far as I can tell, this proof is valid. – Matthew Jan 25 '19 at 11:54
  • I didn't downvote, though this does look a bit off, as there doesn't appear to be a contradiction. That is, if `IsVirus(P)` is true, then your script isn't viral; however, `IsVirus(P)` claimed that `P` was viral, not your script. – Nat Jan 25 '19 at 12:20
  • Realistically, nobody would care if `IsVirus` reported that `P` was indeed a virus, even though it wouldn't actually be one with that definition of `P`. Indeed, it's not at all unreasonable to *call* `P` a virus, as it's designed to do bad things and get clever with us if we attempt to identify it. In other words, the very act of invoking `IsVirus` could be used as a virus identifier for any program that isn't our own anti-virus program. (There are ways to "fix" this line of reasoning, but they involve getting much more precise with our definitions.) – Jeroen Mostert Jan 25 '19 at 13:08
  • @Nat `P` is the script – Matthew Jan 25 '19 at 21:25
  • @Matthew Hah, oops, missed that part! That does seem to be a contradiction then, though it appears to be a self-referential problem, like "_This statement is false._", rather than virus-classification. – Nat Jan 25 '19 at 21:59
  • An antivirus program could identify everything containing `DoVirusThings` as a virus, because it has no use in non-virus programs, and it can be called in other ways. – user23013 Jan 26 '19 at 12:02
  • I think you assume that 'code being a virus' is fully correlated with 'code doing virus things when executed'. This is not a crazy assumption, but it is an assumption nonetheless (and with this assumption I think your proof works). – Dennis Jaheruddin Jan 29 '19 at 11:57
  • Hm, you can't easily feed `P` into a function if `P` is supposed to be the entire program. Instead, have the `IsVirus` "function" be an executable that checks for viruses, and feed the current executable into it as a filename. Now it's much easier for the program to "feed itself into the antivirus" and create the contradiction. However, then the program `P` *will* do virus-y things when some arbitrary executable `IsVirus` returns false on it, so flagging it as a virus is actually correct. – kajacx Feb 04 '19 at 20:28
2

The original question was "Has it been mathematically proven that antivirus can't detect all viruses?"

It's likely accurate to say that we can never prove that we have written code that will detect all viruses.

A general-purpose computer attached to the Internet with the ability to download and execute code is likely equivalent to a universal Turing machine. This equivalency includes Turing's infinite tape size: If the bandwidth of the machine's network interface is less than the total growth rate of Internet-accessible data, the machine can never reach the "end of the tape". (I covered this a little in the conclusion of this paper a long time ago. While it's demonstrable in a practical sense, coming up with a mathematical proof would probably require adding some constraints.)

If the above is true, and if "to halt" means "to produce a report listing all viruses on the system, no more or less", then we can't prove in advance that the program will halt with a correct answer; we have to run it in order to find out.

If both of the above paragraphs are true, then we can never verify correctness of the resulting report, because we can never compile a complete list of all possible viruses to compare it with: Those viruses are all out there somewhere on the tape, it's infinite in size, and we can never read the whole thing.

If all three paragraphs are true, then we can never prove that we have written a 100% correct virus detector.

stevegt
  • 189
  • 3
1

Well, the definition of a virus is pretty vague. Yes, it is a malicious entity, but "malicious" is just as vague. Depending on the system and its policies, the possibilities for a malicious entity will change. Defining an ever-changing entity is something that has come up in various fields (number theory, state machines, etc.), and it has been shown in different ways that it is not possible, at least based on what we know.

One way would be, instead of defining what is malicious, to define what is allowed: a very strict and independent system which allows only certain sequences of operations to be performed. That way it MAY be kept safe.

This problem IMO is just as hard as defining random.

Adithya Sama
  • 121
  • 3
1

A virus is just code. It's kind of like asking "Can my AI lawn mower program tell the difference between weeds and plants?" And if so, does it pull daisies?

If you download a program that sends emails to everyone in your contact list, is it a virus or are you a spammer? It depends on why you downloaded the program, not on any specific bytes in the program.

So you don't even have to go to mathematical proof; you can just reason that it can't be done.

On the other hand, you could say it's easy to identify viruses if you define a virus by behavior. Programs are updated as part of an update process; if anything attempts to change code on your computer outside this process, you could define it as a virus. With this definition you could easily detect changes that happen outside of specific installation procedures. It might take hardware that can separate code from data and lock down the code space when you aren't holding in a button, but it's possible (if annoying).

This also assumes that the code you deliberately installed during the update process is not a virus itself, but since we are defining a virus by behavior for this example, then by definition I guess it's not.

Bill K
  • 407
  • 2
  • 6
1

Has it been mathematically proven that antivirus can't detect all viruses?

What analysis was Bruce Schneier referencing when he wrote:

Viruses have no “cure.” It’s been mathematically proven that it is always possible to write a virus that any existing antivirus program can’t stop. [0]

[0] Secrets & Lies. Bruce Schneier. Page 154

This answer does not directly address what analysis Bruce Schneier was referencing. An individual who is interested in what a primary source meant when they made a statement should make the effort to contact the primary source themselves to ask their specific questions, to avoid speculation, conjecture or confusion.

Kurt Gödel's incompleteness theorems, published in 1931 in Über formal unentscheidbare Sätze der "Principia Mathematica" und verwandter Systeme (called in English "On Formally Undecidable Propositions of 'Principia Mathematica' and Related Systems"), are, when carefully considered, very instructive and relevant to the analysis of any formal system, from computer viruses to politics:

1. If a (logical or axiomatic formal) system is consistent, it cannot be complete.
2. The consistency of axioms cannot be proved within their own system.

These theorems ended a half-century of attempts, beginning with the work of Frege and culminating in Principia Mathematica and Hilbert's formalism, to find a set of axioms sufficient for all mathematics.

In hindsight, the basic idea at the heart of the incompleteness theorem is rather simple. Gödel essentially constructed a formula that claims that it is unprovable in a given formal system. If it were provable, it would be false. Thus there will always be at least one true but unprovable statement. That is, for any computably enumerable set of axioms for arithmetic (that is, a set that can in principle be printed out by an idealized computer with unlimited resources), there is a formula that is true of arithmetic, but which is not provable in that system. To make this precise, however, Gödel needed to produce a method to encode (as natural numbers) statements, proofs, and the concept of provability; he did this using a process known as Gödel numbering.

0

Not a mathematical proof, but AFAIK there are two ways to detect malware:

Signatures

As new malware can be developed (including by obfuscating or modifying existing malware), the signature of the new malware won't be in the antivirus database, and therefore it won't be detected.
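
A minimal sketch of signature matching (Python; the byte patterns below are made up for illustration), which is exactly why anything not yet in the database gets through:

KNOWN_SIGNATURES = [
    bytes.fromhex("deadbeefcafebabe"),    # made-up signature of some known sample
    b"TOTALLY-FICTIONAL-MALWARE-MARKER",  # likewise invented for illustration
]

def matches_known_signature(file_bytes):
    # Only byte patterns already in the database can match; a brand-new
    # or repacked/obfuscated sample simply isn't in the list.
    return any(sig in file_bytes for sig in KNOWN_SIGNATURES)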

Heuristics

This method uses automatic dynamic and/or static analysis to understand the behavior of the software, and according to what it does, the antivirus decides whether it's malicious or not.

And here comes the tricky part: not everything that is considered innocuous today will be considered innocuous in the future.

For example, 20 years ago a piece of software using encryption libraries may not have been seen as something malicious; now we know that it may be some kind of ransomware encrypting your data. At the same time, you may (and you should) use a password manager that also uses encryption libraries to securely store your data. So how can we decide whether it's malware or not based only on the fact that it encrypts data? Similarly, a TCP connection may be used to leak information or to browse websites.

The only difference is semantic, which is hard to analyze in an automatic way because technologies are constantly evolving and malware adapts to that evolution. After all, malware is no different from any other software except for its owner's bad intentions.

Mr. E
  • 1,954
  • 9
  • 18
  • 4
    You are answering the question like a programmer (which I am). It's useless. Their question does not live in the realm of reality but rather theory. It's like when I try to convince people that no "AI" (Artificial Intelligence) can be evil or have a soul. They point out some research or some Elon Musk/Alex Jones-tier article and I point out the word "Artificial". They still won't listen. Being an expert does not help you when arguing with most people. – NicVerAZ Jan 23 '19 at 22:26
  • 2
    "Being an expert does not help you when arguing with most people." While true, they are asking in website whose only purpose is to ask "experts" in the subject. Why ask somebody if not willing to accept his/her answer? – Mr. E Jan 24 '19 at 13:09
  • 1
    Everybody is an expert. – NicVerAZ Jan 26 '19 at 05:32
-1

No, it has not been proven, and cannot be proven because it's not true. An antivirus program that actually works is simply a suitable permissions/access control model enforced by the runtime environment. Rice's theorem is irrelevant because you don't have to determine ahead of time if a program is safe. You just run it under the access controls, and abort it (or just deny access) if it attempts to do something contrary to your policy.

Mathematically, this means you have a computable test that gives you a positive result if a program is malicious, but may not halt if the program is not malicious. That's fine, because the test is running the program, and you can just keep running it as long as it's not malicious.

Of course real-world antivirus programs don't do the simple obviously right thing (which would be integrated with the OS, runtime VM, or similar component, not a separate AV product). Instead they use bogus signatures and heuristics to classify programs. It is provable (see other answers) that such an approach cannot work.
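
A hedged sketch of that "positive if malicious, may not halt otherwise" test, in Python-like pseudocode (perform and the model of a program as a stream of attempted actions are illustrative, not any real API):

def run_under_policy(actions, violates_policy):
    # Semi-decision procedure: returns a definite "malicious" verdict the
    # moment an attempted action violates the policy, but simply keeps
    # running (never halting with a "benign" verdict) for a program that
    # never violates it.
    for action in actions:
        if violates_policy(action):
            return "malicious"
        perform(action)    # hypothetical: actually carry out the allowed action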

  • "No" what? Not it has not been proven? But read the other answers that show that it has been proven? A very confusing start to your answer. Can you clarify? – schroeder Jan 23 '19 at 16:41
  • The no is a direct answer to the question subject line. I'll elaborate. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 16:55
  • Shall I get out the virus that uses rowhammer to overwrite the kernel security checks? – Joshua Jan 23 '19 at 17:02
  • @Joshua: The runtime environment can fully detect or prevent that. For example by running the program under an interpreter. See "suitable access control model". In this case the hardware is buggy and allowing any direct low-level access to that part of the system (making requests to the memory bus) is insufficient access control. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 17:13
  • Unlike spectre (so far), rowhammer can be exploited through an interpreter. – Joshua Jan 23 '19 at 18:21
  • Can you *prove* the presumed existence of that *suitable runtime environment* ? – ulidtko Jan 23 '19 at 18:24
  • Suppose you have two programs: a legitimate program that cleans up old temporary files that match certain criteria, and a malware program that deletes important files that match certain criteria. How do runtime access controls reliably distinguish between these cases? – Ray Jan 23 '19 at 20:09
  • @ulidtko: Yes. Formal methods people do stuff like that all the time with verification of logic designs. But it depends on a formalization of your rules for what's allowed and what's not. Lots of people don't even have in mind a model for what is and isn't malicious, and expect a machine (or another person) to read their mind and apply social norms from inside it to make the determination. Of course that seems unlikely to happen. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 20:32
  • @Ray: The access control allows write/delete on one set of files and not on another. This is how normal filesystem permissions work. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 20:34
  • @Joshua: That depends on the nature of the interpreter. It most certainly can't be exploited on an interpreter that permutes physical and virtual addresses via a strong cryptographic cipher (without also breaking the cipher), or just one that throttles memory access in a suitable way. *If* your physical machine is subject to such bugs, *then* your access control must include measures to preclude access to the mechanisms of triggering those bugs. This is not rocket science, people. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 20:36
  • 1
    While good, it will not prevent all viruses. @Ray allowed you to side-step the issue with that specific example, but that will not always be the case. What about where _the exact same software_ can be viewed in one use case as useful but another use case as malicious? Consider software that allows remote control of your computer, written and used initially for good. Then others abuse it. The gamer calls it awesomeware when his buddy gets him XP while he sleeps, and their neighbor considers it malware when their enemy blasts obnoxious white-noise. One person's trash is another's treasure. – Aaron Jan 23 '19 at 21:07
  • @Aaron: **For the nth time**, that is a matter of choosing the suitable access control model/defining malicious. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 23:51
  • BTW aside from tone, my answer here is basically the same as [James_pic's answer](https://security.stackexchange.com/a/202047/2836) which appeared later and isn't so ridiculously downvoted by people who haven't read the comments. – R.. GitHub STOP HELPING ICE Jan 23 '19 at 23:56
  • This answer explains how to prevent _unauthorized programs_ from running, not viruses in particular. The question is about malware specifically. – forest Jan 25 '19 at 04:37
  • @forest: Malware is just a program that does something you don't want it to do. – R.. GitHub STOP HELPING ICE Jan 25 '19 at 15:59
  • @R.. wrote "choosing the suitable access control..." If it is that simple, then you should write up a paper on it. Because, with the specific example I provided, if you actually can write such an access control or definition to differentiate between "Action A is ok" and "(the very exact same) Action A is malicious", then you would become a millionaire. No, better yet, because you would have completely defied logic itself and discovered an entirely new kind of logic to make that possible, you may well become a billionaire, or be able to defy physics with the new logic and become superman. – Aaron Jan 25 '19 at 19:41
  • ... my point is, the exact same, the very same, the _same same_ action can be both malicious **and** not malicious. Not in the same place at the same time, of course. But at a different place, or a different time. Maybe even on the same computer, with the same access controls defined, doing the same action (same file, same modification, etc.), but Susy considers it a good thing, then a half hour later when her brother uses the PC he considers it malicious. Or Susy considers it malicious at time X but not at Y ("play-me-a-tune-app" is cool during free time, but can easily be used as malware) – Aaron Jan 25 '19 at 19:46
  • @R.. I want to reiterate, you are on the right track. My comments are not meant otherwise. Better access controls would make blocking malware way better. What you propose is a huge improvement over modern antivirus. The points being brought against it are merely that, in this specific Q&A, the question asker very specifically asked about mathematical proofs about whether _every single possible_ malware could be detected. We need assume that means with minimal false positives, otherwise the answer is "block everything." So, when "p != q" and "p = q" are both true, our comments suggest it can't. – Aaron Jan 25 '19 at 19:57
  • The core idea in this answer seems correct, though the framing is off - specifically, it looks like this answer is to a slightly different question, but misframed to match this question. The problem is that this question is about how to analyze a program to determine iff it's a virus based on its specification, e.g. a binary of it, which this answer doesn't even attempt to address. However, I guess that this answer's kind of a frame-challenge in that the reason it doesn't attempt to address the question is that it proposes a better solution than anti-virus: run-time-management. – Nat Jan 29 '19 at 03:13
-4

Can we have an antivirus that detects all viruses?

No math needed. Experiment time. Let's arrange a pool of viruses, a pool of systems-that-try-to-defend-themselves, and mechanisms that constantly change/improve both parties.

Give them a planet. Wait billions of years.

Do we still have viruses?

For me the sheer number of possibilities already tested is far beyond satisfactory.

About "proving"

Here I'd like to explain why I've drastically reframed your question.

The direct answer to your question "Has it been mathematically proven that <statement about the real world>" is: No, mathematicians have never proven your statement about the real world (or any other such statement). By the real world I mean the one that you see when you look out your window.

You can only prove anything in axiomatic systems ("worlds" that are ruled by a set of axioms, where the axioms themselves are taken as given rather than proven). I don't know of axioms that govern real computer viruses, so I cannot prove anything about them.

I can try to predict reality, and I'll often be correct, sure. But I will use falsifiable theories for that, not proofs.

kubanczyk
  • 1,182
  • 6
  • 11
  • 6
    Your argument goes a bit too much in the other direction -- there are obvious gaps in what biological evolution can practically achieve (no cow has jumped over the moon), and the absence of "solutions" to certain "problems" is hard to accept as absolute proof that something can't be done unless you demonstrate why we couldn't possibly do any better. In particular, the mechanisms by which viruses and anti-virus software are "evolved" and "selected" are quite different. Even the word "virus" has become a misnomer when it comes to current malicious software (which rarely copies itself). – Jeroen Mostert Jan 23 '19 at 11:18
  • @JeroenMostert I've just reported a fact. While you imply that fiddling with thoughts inside our heads would yield more satisfactory results, in this case the number of repetitions and the huge pools of specimen involved is good-enough for me. That is, I assume it's vastly more probable for a human brain to overlook something here, than for the entire biosphere checking it for so many generations. (But - yes! - in some other cases our thinking can be superior, of course. Good for us!) – kubanczyk Jan 23 '19 at 11:32
  • 14
    Cars are impossible. Proof: billions of years of evolution have not yielded wheels. Q.E.D. This fails, of course, because the parallels between biology and engineering only go so far -- you don't even need to presuppose smart inventors for this. Likewise computing. We've spotted some cute parallels between replicating programs and biological viruses way back when, but that's as far as it goes, and actual replicating programs have become no more than theoretical curiosities, while malicious programs are dead yet kicking. Is the biological parallel still valid and insightful? I wouldn't assume. – Jeroen Mostert Jan 23 '19 at 11:49
  • @JeroenMostert Yes, I think I mentioned it too. – kubanczyk Jan 23 '19 at 12:07
  • 1
    Whether or not proofs are possible is not the question. You have side-stepped answering the question. There was a reference made and the framing is made in that context. This is a non-answer. – schroeder Jan 23 '19 at 12:11
  • @schroeder Ah, this is a valid concern - I've edited myself to correct my omission. – kubanczyk Jan 23 '19 at 12:35
  • 4
    In addition to the shortcomings pointed out in previous comments, this answer (probably accidentally) implicitly relies on incorrect assumptions about the equivalence between biological and computer viruses. Just because both are called “viruses” doesn't mean that they have identical relevant properties. The name is a metaphor, not an accurate definition. – Konrad Rudolph Jan 23 '19 at 14:56
  • **Evolutionary processes are not optimizing**; they are [satisficing](https://en.wikipedia.org/wiki/Satisficing). You cannot claim that the failure of an evolutionary system to develop an optimal strategy against an adversarial agent proves its impossibility. Also, I'm pretty sure OP doesn't care about the axiomatic system being used in the proof. No one would condemn you for using [Zermelo–Fraenkel axioms](https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_theory#Axioms), and pointing out its problems is a red herring. – forest Jan 25 '19 at 07:00
-4

I would agree with Najib myself. Model theorems can only ever be applied to a very formal, logical subset of a formal logic. I don't know that anything can be formulated about the incompleteness of reality based on some formal logical grammar. It seems to be a matter of greater philosophy at some point. That said, assuming you can concretely describe what it means to be a virus or to be malicious, i.e. that this can be formulated as a mathematical question, then sure, it can be shown.

Computers have finite resources. Thus, through brute force, every state can be considered and checked for whether it is malicious.

This fails if combinations of states through time are considered, maybe? I can't say myself.

-5

More or less, yes. The mathematics behind it, in layman's terms, boils down to a bootstrap paradox, a type of temporal paradox that questions where information originated. This relates to the problem in that if you have an antivirus that detects all known viruses, but then the next day somebody creates a new virus, there would be no signature in the old virus scan to detect the new virus. If you could conceivably take all computer viruses from the end of time, then beam them back to before now and create virus signatures from them all, where did the information come from? The future, or the past? Since the general consensus is that time travel is impossible for a number of mathematical/physical reasons, you can't have coverage of all viruses, only all known viruses.

  • 3
    This relies on the unstated assumption that "antiviruses can't detect viruses that didn't exist, or weren't known, when the antivirus was created/last updated". This is key to the question, so just assuming it's true means that your answer has no actual substance. Also, for **some** viruses, this assumption is clearly false, since heuristics do work on some viruses. All of the talk of time travel is just fluff. – Joseph Sible-Reinstate Monica Jan 24 '19 at 05:32