7

Possible Duplicate:
Does hashing a file from an unsigned website give a false sense of security?

On many servers providing files for download, there is also a file listing a checksum for each download. Example1 Example2

I understand that these checksums can be provided in order to check that the download succeeded (file not corrupt). File downloads over HTTP or FTP are quite reliable though.

But why would it prevent me from downloading a malicious file? If an attacker can modify the download on the server, it means that the server has been compromised. Thus, the attacker might also have modified those MD5SUMS files.

So is it really necessary to use the provided checksum if both the checksum and the original download are provided by the same server?

Benoit
  • A hash can confirm that the file is complete. In the past, servers owned by groups like Apache and Debian have been compromised by a third party, and files have been modified. In the case of, say, Sourceforge, a mirror was compromised and a file on it was modified, so any user who downloaded the file from that specific mirror received the malicious file. – Ramhound Jan 08 '13 at 14:00
  • See the duplicate question for more details. Hashes used in this manner are for transmission integrity validation only - not for origin authentication. To have the latter, you must at minimum have some other means of origin authentication (e.g.: SSL) on the site that is hosting the checksum. After that point, it doesn't matter where your actual download comes from - if the origin of the checksum can be validated, then the checksum can be used to validate the origin and integrity of the file. – Iszi Jan 08 '13 at 14:44
  • Note that this also means it doesn't matter whether the checksum and download are on the same server or not in your scenario. If there's no validation for the checksum, the download can just as easily (though perhaps less likely) be malicious regardless of where the two are hosted in relation to each other. – Iszi Jan 08 '13 at 14:51
  • @Iszi: yes this is a duplicate. Could it be closed and marked as duplicate please? – Benoit Jan 08 '13 at 15:11
  • @Benoit Rory & I beat you to it. But thanks for the acknowledgement. – Iszi Jan 08 '13 at 15:12

3 Answers

9

Really, when they are on the same site, there is no real security value. A checksum only has security value when you are trying to verify that a file from another source is the same file (though even that may be dubious, thanks to the success of collision attacks against a lot of typical error-detection hashes). The primary reason, as you mentioned, is simply to let you validate that the transmission was OK.

AJ Henderson
6

The MD5 checksum is about downloading the big archive through HTTP (possibly from a mirror) while obtaining the MD5 value from a "secure" Web site (HTTPS). This way you just need to concentrate on obtaining the right MD5 value, but you could get the archive from any source over any medium (download from a fishy-looking mirror, p2p network, CD/DVD, avian carrier... it does not matter as long as the hash value matches). The hash value is just a way to concentrate the security requirements, on a value which is small enough to (supposedly) make security easier. Of course, this does not make the issue go away: you still have to worry about whether the hash value is genuine and unaltered.

The MD5 checksum is also good at detecting non-malicious alterations, e.g. a bit flip because of bad RAM on your computer (a much more common occurrence than usually assumed).
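As a rough illustration of the workflow described above (get the digest from a source you trust, get the archive from anywhere), here is a minimal Python sketch. The function names `file_digest` and `matches_trusted_digest` are made up for this example, and the default algorithm of SHA-256 is my assumption; pass `"md5"` to match an MD5SUMS-style file.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks.

    Reading in chunks keeps memory use flat even for large archives.
    The default algorithm is an assumption made for this sketch.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_trusted_digest(path, trusted_hex, algorithm="sha256"):
    """Compare the file's digest with a digest obtained over a trusted channel.

    `trusted_hex` is whatever hex string you copied from the trusted page;
    surrounding whitespace and letter case are ignored.
    """
    return file_digest(path, algorithm) == trusted_hex.strip().lower()
```

A `True` result only tells you the file matches the digest you supplied; whether that digest is genuine is still the part you have to worry about.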

Thomas Pornin
  • Thanks. My question was, indeed, about those cases when the hash and the file are served by the same web site or server. You confirm that it cannot protect me from maliciously altered downloads since the hash can be altered, too. Then why do those example sites provide them? Only to check download integrity? – Benoit Jan 08 '13 at 13:40
  • @Benoit The answer gives additional reasons (non-malicious data corruption). Also consider this: you can re-download the hash from a mirror or at another point in time (it's very concentrated data), so the effort needed to feed you malicious data increases. It is not a total solution, but surely an enhancement to the overall security. Also, there are almost no costs involved. You are right, a hash may not be the ultimate solution/protection, but it can be helpful and is cheap. Maybe that's the "why" :) – humanityANDpeace Jan 08 '13 at 13:44
  • @Benoit - It's less likely that the malicious person has both the ability to modify the website (in order to update the hash) and the ability to change the file itself. – Ramhound Jan 08 '13 at 14:01
  • @Ramhound: see the two examples provided… – Benoit Jan 08 '13 at 14:13
0

preface

The question asks about the "necessity" of doing the checksum test. Even though (or maybe because) I assume a security context, I am a little troubled by that wording. I assume you meant to ask something like "what is the security gain of using hashes/checksums transmitted in plaintext from the same origin"?

No, it is not of much use for safety/security

As your sceptical and astute question already suggests, the security gain would in many cases be quite limited. If somebody is able to tamper with a 600MB download on the fly as a "transparent man in the middle", that attacker is surely also able to generate a matching hash file that will prevent you from noticing the tampering with the 600MB file.
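To make that point concrete, here is a small toy sketch in Python. Everything in it (the helper names `checksum_line` and `verify`, the byte strings standing in for the download, the file name `image.iso`) is invented for illustration, and SHA-256 is used in place of MD5 only for convenience; the argument is the same for any unkeyed hash.

```python
import hashlib


def checksum_line(name, data):
    """Build one line in the style of a checksum file: '<hex digest>  <name>'."""
    return f"{hashlib.sha256(data).hexdigest()}  {name}"


def verify(data, line):
    """Check downloaded bytes against the digest in a checksum-file line."""
    expected = line.split()[0]
    return hashlib.sha256(data).hexdigest() == expected


# Toy stand-ins for the real download and a tampered copy (hypothetical data).
original = b"genuine installer image"
tampered = b"installer image with a backdoor"

honest_line = checksum_line("image.iso", original)
forged_line = checksum_line("image.iso", tampered)  # attacker recomputes the hash

# Attacker altered only the file: the mismatch is detected.
print(verify(tampered, honest_line))  # False
# Attacker altered the file and the checksum file: nothing is detected.
print(verify(tampered, forged_line))  # True
```

Recomputing the digest costs the attacker essentially nothing, which is why a checksum delivered over the same unauthenticated channel cannot catch this.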

The (not so security-related) reason of "potential data corruption during transmission" has already been stated.

I do not want to oversell this additional point as "a very secure feature", but consider the attack in terms of its effort and cost. An attacker may well be able to tamper with a 600MB file (which takes some time to download) at one or two points in time. The hash/checksum file, which as @Thomas Pornin put so well concentrates the security requirements, is only a few bytes and downloads almost instantly. You can easily download it several times, at different points in time, with no real effort on your side, whereas the attacker would have to be constantly on watch. Because the hash concentrates the security, it can shift the "effort balance" to your side. You would not download the whole file (let's say a 600MB *.iso) 3-4 times just to check that it is still the same (i.e. that there is no obvious sign of tampering), but you might well download the hash, at very little expense, from multiple sources on different dates. Thanks to the hash, the effort is small for you. The attacker, on the other hand, still has to be there even for those few bytes; otherwise a mismatching hash/checksum reveals the discrepancy.

Of course, this is only a very thin layer of added security (resulting from a potential increase in the effort it takes to run the attack successfully), but since the hash can be computed cheaply and fetched several times from multiple sources, it can increase security somewhat.
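The "fetch the hash several times from several places" idea could be sketched like this in Python. The helper names and the source labels are hypothetical, and how you actually obtain each digest (mirror, HTTPS page, a copy saved last week) is left out on purpose.

```python
def group_by_digest(observations):
    """Group source labels by the digest they reported.

    `observations` maps a label you choose (e.g. "mirror A, Monday")
    to the hex digest string obtained from that source.
    """
    groups = {}
    for source, digest in observations.items():
        groups.setdefault(digest.strip().lower(), []).append(source)
    return groups


def all_sources_agree(observations):
    """Return True if every observed digest is identical."""
    if not observations:
        raise ValueError("need at least one observed digest")
    return len(group_by_digest(observations)) == 1
```

For example, calling `all_sources_agree` with three labels that all map to the same digest returns `True`; if one label maps to a different digest it returns `False`, and `group_by_digest` shows which sources disagree. Agreement does not prove the digest is genuine. It only means an attacker would have had to be present for every one of those fetches.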

Regarding multiple sources: the examples you gave can be seen as a combination, allowing you to download the ISO via HTTP (saving the computational cost of encrypting the transmission) and then download the hash/checksum via HTTPS, in order to reduce the chance of an attack to some extent.

One more thing worth mentioning: file signatures are also often used. They do not merely contain a hash value; because they are created with public/private key signing, they can be checked for authenticity of origin, and in terms of security they beat plain hash-value/checksum files.

humanityANDpeace