9

Many projects that offer binaries also offer hashes (e.g. SHA256) of those binaries, either as .ASC files or directly on the web page near the binary. This isn't to protect against network-caused corruption, as that's ensured by the TCP protocol.

Given that the binary and the hash are downloaded from the same server (example from very sensitive software, bitcoin-core), what attack scenarios does this technique prevent?

If an attacker managed to tamper with the binary, why wouldn't they be able to change the checksum in the same way? The same goes for an attacker performing MITM and tampering with the download in transit.

Note that I'm not talking about public/private key signing, which is far more secure because the attacker would also need to get the signer's private key. I'm only talking about the point of providing checksums/hashes along with downloads. This is even stranger for bitcoin-core, which has both the mechanisms and the audience to sign the file with a private key.

I can imagine that a separate, secret monitoring bot, hosted on a completely different system, could download the signature file every minute (given its tiny size) and check it for tampering, but I haven't heard of this being done.

Dan Dascalescu
  • 1,945
  • 2
  • 15
  • 23
  • 2
    Does this answer your question? [Why verify a file / firmware downloaded online against a checksum?](https://security.stackexchange.com/questions/196648/why-verify-a-file-firmware-downloaded-online-against-a-checksum) Also take note of the linked duplicates – Arminius Aug 24 '20 at 01:54
  • @Arminius: good find. Better than the SE similar questions suggestions when I typed mine :) – Dan Dascalescu Aug 24 '20 at 04:12
  • Not all file transfer protocols use TCP. TFTP uses UDP for example. – Teun Vink Aug 24 '20 at 07:40
  • "This isn't to protect against network-caused corruption, as that's ensured by the TCP protocol." --- this is a false assumption. Checksums are not a security control. – schroeder Aug 24 '20 at 11:45

4 Answers

5

It allows distribution of the bulk data through untrusted channels.

For example, you could give me a USB stick and ask me to download the file for you overnight using my cheap DSL connection, while you get the checksum off the webpage (that you pulled up on your laptop with an expensive metered connection). I recognize the file name, look into my "large files I have downloaded already" folder, and there it is, so I copy it onto your USB stick.

You get the file instantly, and you don't have to trust me for this to work.

Simon Richter
  • 1,482
  • 11
  • 8
3

If the organization hosting the downloadable file simply posts the checksum of the file along with the file, then you are right. If an attacker manages to breach the server and replace the downloadable file with a malicious file, then it is trivial for the attacker to replace the original checksum with the checksum of the malicious file as well.

This is why most organizations go a step further and post a digital signature of the checksum as well, where the digital signature is created using a private signing key that is associated with the organization and stored offline. For example, see https://releases.ubuntu.com/focal/ for digital signatures of Ubuntu images made using Ubuntu's private signing key. This way, even if an attacker manages to breach the server and replace the downloadable file with a malicious file, the attacker would not be able to create a valid signature for the file, because the private signing key is (hopefully) not stored on the server.
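As a rough sketch of the step that follows signature verification: once a checksum list has been verified with a separate tool (for example `gpg --verify`), the expected hash for a given download can be looked up from it. The function name `parse_sha256sums` is hypothetical, and the code assumes the usual `sha256sum`-style layout of one `<hex digest>  <file name>` pair per line. It does not check the signature itself, so parsing an unverified list this way gives no protection against a compromised server.

```python
def parse_sha256sums(text):
    """Map file name -> lowercase hex digest from sha256sum-style lines.

    Assumption: the text has ALREADY been signature-checked elsewhere.
    Lines that do not look like "<64 hex chars> <name>" are skipped.
    """
    hex_chars = set("0123456789abcdefABCDEF")
    hashes = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        digest, name = parts
        if len(digest) != 64 or not set(digest) <= hex_chars:
            continue
        # sha256sum marks binary-mode entries with a leading "*".
        hashes[name.lstrip("*")] = digest.lower()
    return hashes


# Illustrative data only: these digests and file names are made up.
sample = (
    "a" * 64 + "  example-1.0.tar.gz\n"
    + "B" * 64 + " *example-1.0.zip\n"
    + "not a checksum line\n"
)
expected = parse_sha256sums(sample)
print(expected["example-1.0.tar.gz"])
```

The digest returned for a file name is then compared against a hash computed locally from the downloaded file.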

mti2935
  • 19,868
  • 2
  • 45
  • 64
  • That'd be true for PGP signatures, but there's no private key for SHA256 checksums! – Dan Dascalescu Aug 24 '20 at 00:51
  • 1
    @DanDascalescu I was editing my answer as you posted your comment, to make the same distinction that you made in your comment. FYI, at https://bitcoin.org/bin/bitcoin-core-0.20.0/SHA256SUMS.asc, if you scroll down, you'll see that the SHA256 checksums are in fact signed using a PGP signature. – mti2935 Aug 24 '20 at 00:58
1

You raise a valid point regarding the TCP protocol ensuring the downloaded file isn't corrupted. As for an "attacker scenario", and I'll use that as a blanket statement, the signature can't be tampered with in the way you've proposed, nor can the binary be tampered with and the signature be the same when the end-user hashes the file. For example:

A website tells me their installer has a SHA256 hash of ABC (often they tell me this right next to the download link). I download that file, generate the SHA256 hash of it, and it tells me the hash is 123. I now know the installer does not match what the server was offering me -- for whatever reason that may be.
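The check described above can be sketched in a few lines of Python using the standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are made up for this illustration, and the published hash is assumed to be a hex string copied from the project's page.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the locally computed hash with the one shown by the site."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

A mismatch only says that the file differs from whatever the hash was computed over. It says nothing about whether the published hash itself can be trusted.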

It sounds like you understand more than the basic idea behind all this but are getting stuck on the idea of checksum/hash being offered and why that couldn't be tampered with (unless I'm misunderstanding you). Now theoretically if the "attacker" could compromise the website as well and change the hash to match the hash of the tampered file (123 back in our example), then their attack would be successful. But at that rate they might as well stop tampering with the file in-line and just host their own malicious binary.

This all doesn't even take into account signed binaries or counter-sign signatures often used in distributing software.

  • 2
    Exactly - if the attacker can tamper with the binary, why couldn't they tamper with the hash as well? As for hosting their own, that's less interesting. The attacker wants the reputation of the original host, but with their own subverted binary. Re. signed binaries, or PGP signature, no beef with that. Clarified the question. – Dan Dascalescu Aug 24 '20 at 00:53
  • The hash is calculated first by the developer and let's say posted on their website for this example. Then you check their claimed hash against one you calculate locally. An attacker wouldn't be able to "tamper with the hash" as there isn't one being sent in-transit. You calculate it on your system once the binary is downloaded. – coderichardson Aug 24 '20 at 23:31
  • The attacker could tamper with the hash displayed on the developer's website by editing the source code (HTML etc.) that outputs it. If it's a static HTML served from the same server where the download is served, then there's no security benefit in posting the hash. Anyway, turns out my question was a dupe (unsurprisingly; I just hadn't stumbled upon the same search terms). – Dan Dascalescu Aug 25 '20 at 00:03
1

You seem to be assuming that all security measures inherently prevent some type of attack.

This is a rather common misconception. All security measures increase the difficulty of some type of attack, but only a few truly make a particular type of attack impossible.

This is important because making an attack more difficult inherently reduces the likelihood that it will actually happen. If making the attack is either too difficult or too expensive for the attacker relative to the perceived value of success, they just won't attempt the attack (or they'll give up part way through).

In the case of providing a hash for a download, you get the following security benefits:

  • MitM attacks become just a little bit more complicated. You have to intercept and modify two (or possibly more) requests instead of just one for your attack to go undetected until it's too late.
  • If the hashing isn't done on access, attacks on the server have to replace more than one file, or may even need to replace things in two completely different network locations (I've seen sites that store the hashes in a separate location from the files for this reason).

You also get the following two non-security benefits:

  • The user can be confident that the file they downloaded is what they were supposed to get in the event of there not being an attack. In other words, it protects them against getting a corrupted file due to things like at-rest data corruption on the server or bad memory on the server or their local system.
  • It provides a reasonable degree of confidence that intermediary protocols did not mangle the data. TCP kind of protects against this (but a 2-byte checksum for potentially 1k+ bytes is going to see a lot of potential collisions), but not everything uses TCP, and the user may not even be getting the file from the origin server (see Simon Richter's answer for an example of this).
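To illustrate the point about the 2-byte checksum, here is a small sketch of the 16-bit ones' complement sum that TCP uses (the Internet checksum described in RFC 1071). It is a simplification: the real TCP checksum also covers the TCP header and a pseudo-header, which this example ignores. Because the sum does not depend on word order, swapping two 16-bit words leaves the checksum unchanged, while the SHA-256 digest changes.

```python
import hashlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over the payload only."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"AABBCCDD"
reordered = b"BBAACCDD"  # first two 16-bit words swapped

same_checksum = internet_checksum(original) == internet_checksum(reordered)
same_sha256 = (
    hashlib.sha256(original).digest() == hashlib.sha256(reordered).digest()
)
print(same_checksum, same_sha256)
```

The corrupted payload passes the 16-bit check but fails the SHA-256 comparison, which is the kind of undetected mangling a published hash helps catch.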

So, overall, you get a couple of limited security benefits and some limited insurance against a couple of problems that may crop up (if you look at old usenet and BB posts with files, you'll sometimes see checksums provided in the post for these same non-security reasons). In contrast, it costs the server operator essentially nothing and doesn't create any new problems for the user, so even though it doesn't add much security, it's pretty easy to see as being worth doing because the overhead is essentially nonexistent.

Austin Hemmelgarn
  • 1,625
  • 7
  • 9