Is it safer to compile open source code vs simply running the binary?

Question

I understand that with open source software, my mileage may vary depending on how much I trust the author and the distribution platform they use (CodePlex, Git, or a private server).

Oftentimes a FOSS website will offer me a link to download the binary and another link to download the source code.

Under what conditions can I simply download the exe (and verify the checksum, hoping there is no rootkit, etc.)?
Are there times when I should compile the binary from source?

What may help answer this question is information about well-known incidents where the precompiled EXE was infected with a virus or other malicious code. If you are aware of any such incident, I'd appreciate that information as well.

score 8 · Answer 1 · answered Mar 20 '13 at 02:51

Most of the time, it's only just barely safer, and sometimes it's less safe.

Under what conditions can I simply download the exe(cutable?)

Signed packages from major distributions are built on the distribution's build servers. In that regard, it's almost certainly best to use the packaging system.

Are there times when I should compile the binary from source?

Generally, a source control repository such as Git provides a non-repudiable history that anyone can inspect, whereas the binary output could be altered separately. In that regard, source code is safer, but only if you're actually reviewing it. It is possible for malicious binaries to be put in place, or for the packaging developer's system to be compromised with a malicious compiler, but recorded instances of the former are rare, and the latter is all but unheard of.

Compiling can expose you to unknown bugs as well. If you use different compiler options, you might introduce a bug that isn't seen in the regularly distributed binary. There are plenty of cases of machine-dependent flaws or changes in behavior due to different compiler flags. In that regard, compiling from source could be less safe.
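As a rough illustration of why different build options can change behavior: floating-point addition is not associative, so a build that is allowed to reorder arithmetic (for example with GCC's `-ffast-math`) may produce a different result than one that evaluates strictly left to right. The sketch below only mimics that reordering by hand in Python; the function name and the values are made up for the example.

```python
def sum_left_to_right(values):
    """Add the values strictly in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two evaluation orders.
strict = sum_left_to_right([1e17, 1.0, -1e17])     # (1e17 + 1.0) - 1e17
reordered = sum_left_to_right([1e17, -1e17, 1.0])  # (1e17 - 1e17) + 1.0

print(strict)     # 0.0: the 1.0 is lost when added to 1e17
print(reordered)  # 1.0
```

Neither result is "the bug" on its own; the point is that code tested with one set of flags may behave differently when you rebuild it with another.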

Ultimately, the answer comes down to how much time you're willing to spend verifying the origin of the binary or the source code.

Regarding the malicious compiler thing: there have been incidents for iOS devices (XcodeGhost). While most apps aren't open source, this is a problem for developers in countries with a censored internet -- if you cannot access the official build environment downloads, it is tempting to use a mirror, and some of them are malicious. — dst, Dec 09 '15 at 15:18

score 6 · Answer 2 · answered Mar 20 '13 at 02:39

Let me first state that I do not know any case where only the precompiled executable file of a FOSS project contains malicious code. So if you are looking for concrete examples, this answer probably isn't for you.

The biggest advantage of compiling the code yourself is the ability to read through said code and determine what the code actually does. This is often touted as the primary advantage of FOSS code. See Linus' Law.

Given enough eyeballs, all bugs are shallow

Realistically though, a single developer or even a small team of developers will not have the time and inclination to go through the source code of every piece of FOSS code used in a project. Not to mention that to actually spot malicious code, the developers need to have a background in security. Imagine going through the entire codebase of the Linux kernel before compiling it... You will need to have some trust in the FOSS developers and the project hosting site.

Personally, the approach I will take depends on the reputation of the FOSS project in question. If it's something huge and reputable like the Linux kernel, I will just use whatever binaries they provide and trust the many developers to spot malicious code and issue fixes promptly. If the project isn't very reputable, I would make an attempt to go through the source code myself to see if there are any hints of malicious code present before compiling it myself. If that isn't possible, I would question the use of said FOSS project and instead go for a more reputable one or even write the functionality myself.

"`I do not know any case where only the precompiled executable file of a FOSS project contains malicious code`" - how about the original precursor to the rootkit, [Ken Thompson's Unix C compiler and Login command](http://en.wikipedia.org/wiki/Rootkit#History)? Sure, not exactly a typical example, but there you have it... — AviD, Mar 20 '13 at 11:34

score 2 · Answer 3 · edited Dec 09 '15 at 02:27

Source code is "safer" in the following ways:

Planting a discreet backdoor in source code (as opposed to binary code) is hard, in proportion with the number of people who review the source code.
Very few viruses will automatically infect source code.

The first way is not a strong guarantee. Firstly, since you are envisioning a hostile author, you must consider ways by which that author could give you a specific source code with a backdoor, distinct from the source code seen by everybody else. Thus, the set of people who are in a position to review the source which you are about to compile may be reduced to a single person, i.e. you. In that case, if you, personally, are not willing to perform a full source code review, then compiling the source code will not make you any safer than using a binary, at least against a hostile author who wants to plant a backdoor.

Secondly, backdoors and other vulnerabilities are bugs. It has been amply demonstrated that no amount of code reviewing can detect all bugs, even very serious ones, even for bugs which are honest mistakes. If we cannot reliably find bugs which are the product of mere bad luck and inattention, how can we hope to recognize bugs created intentionally by a supposedly intelligent attacker who is intent on evading detection?

A classical must-read reference is Ken Thompson's Turing award lecture. For a more recent story, see all the drama about the alleged backdoor in the OpenBSD IPsec implementation, which turned out to be (officially) a dud. While in this latter case it seems that there was no actual backdoor (I have not checked myself), it highlights the fact that planting some backdoor in a subtly flawed PRNG seems highly possible, even as source code which is in plain view of many people.

In the end, it is a matter of risk. From the attacker's point of view, putting the backdoor in source code is risky: if the said source code is widely available, then the risk of being caught is higher; whereas a backdoor in binary code is mostly safe -- again, for the attacker. Relying on source code being safer for you means relying on the attacker's rationality, i.e. using source code because you believe that the attacker would not be mad enough to take the risk of putting the backdoor in visible source code.

Protection against viruses probably has a higher practical value, in the case of binaries for Windows at least. By using the source code, you are immune to most viruses which could run on the software author's machine (but, of course, not to the viruses on your own machine -- but these are already there).

score 0 · Answer 4 · edited Apr 12 '17 at 07:31

Firstly, as others have stated, open source doesn't mean that those who look at the code are skilled enough to detect the problems you mention. Also, code can be made to look benign when in reality it does something behind the scenes: see the Underhanded C Contest for an example:

a competition that challenges coders to solve a simple data processing problem by writing innocent-looking C code that is as readable, clear, and seemingly trustworthy as possible, yet covertly implements a malicious function.
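To give a feel for how easily such code slips past a reviewer, here is a small made-up sketch in the same spirit (in Python rather than C, and not taken from the contest). The hypothetical `tokens_match` function reads like a careful character-by-character comparison, but `zip` stops at the shorter input, so an empty or truncated token is accepted.

```python
import hmac

def tokens_match(supplied: str, expected: str) -> bool:
    """Looks like a thorough character-by-character comparison."""
    # zip() silently stops at the end of the shorter string, so any
    # prefix of the expected token (including "") compares as equal.
    return all(a == b for a, b in zip(supplied, expected))

def tokens_match_fixed(supplied: str, expected: str) -> bool:
    """Compares the full byte strings, including their lengths."""
    return hmac.compare_digest(supplied.encode(), expected.encode())

print(tokens_match("", "s3cret"))        # True: the covert flaw
print(tokens_match_fixed("", "s3cret"))  # False
```

Whether a flaw like this is an honest mistake or a deliberate backdoor is impossible to tell from the code alone, which is exactly what makes it hard to catch in review.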

Secondly, even if code doesn't necessarily have explicit malicious bits, it might be vulnerable to attacks that can compromise the system. Heartbleed might be the most obvious example here.

So assuming you trust the author(s) and the community, and that there are no subtle issues such as vulnerabilities in the random number generation, the problem now is: when can I trust the binary, and when is it more advisable to compile from source?

To answer this question I suggest you check out reproducible builds. They are being implemented in Debian and others, including Tor. If the FOSS program you're interested in provides reproducible builds and an automated system to distribute checksums, then the chances of a binary being compromised are slim. In particular, check out how the Qubes project does it when it comes to signing.
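The core check behind a reproducible build is simple: if you compile the published source in the documented environment, the result should be bit-for-bit identical to the distributed binary, so their hashes match. A minimal sketch of that comparison follows; the helper names and the throwaway demo files are assumptions for illustration, not any project's actual tooling.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def builds_match(local_build, downloaded_binary):
    """True if both files have the same SHA-256 digest."""
    return sha256_of_file(local_build) == sha256_of_file(downloaded_binary)

# Demo with throwaway files standing in for real build outputs.
with tempfile.TemporaryDirectory() as tmp:
    local = Path(tmp) / "local_build.bin"
    official = Path(tmp) / "official.bin"
    tampered = Path(tmp) / "tampered.bin"
    local.write_bytes(b"\x7fELF demo payload")
    official.write_bytes(b"\x7fELF demo payload")
    tampered.write_bytes(b"\x7fELF demo payload!")
    identical = builds_match(local, official)
    differs = not builds_match(local, tampered)

print(identical, differs)
```

A matching hash only tells you the binary corresponds to the source you built; it says nothing about whether that source is benign, and the published checksum itself still has to come from a channel you trust.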

Lastly, don't forget that the problem might not be in the application source code in a scenario where your compiler is compromised. In this case you're better off with (trusted) binaries.

score -2 · Answer 5 · answered Dec 09 '15 at 02:59

You are safer with the solution that is most widely used. If there are no checksums, then it is not weighted either way. Also this is time dependent and culture dependent.

m.kin
