Why is MD5 still used heavily?

16

2

MD5 has well-documented vulnerabilities, yet it remains in widespread use. Does anyone have reasons for it remaining a viable option when alternatives (e.g. SHA-2) seem to be more robust?

David in Dakota

Posted 2009-10-27T14:09:13.890

Reputation: 301

Answers

16

It's quick to generate, and often the fact that collisions are theoretically possible isn't a massive problem, e.g. when checking whether a cached file has changed in order to avoid downloading a new copy.
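As a rough illustration of the cache check described above, here is a minimal sketch. It assumes the server publishes an MD5 digest for the current version of the file; the function names `md5_hex` and `needs_download` are made up for this example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def needs_download(cached_data: bytes, remote_digest: str) -> bool:
    """Return True when the cached copy no longer matches the remote digest.

    `remote_digest` is assumed to be an MD5 hex string supplied by the server
    (hypothetical; how it is delivered depends on the system in question).
    """
    return md5_hex(cached_data) != remote_digest.strip().lower()
```

If the digests match, the cached copy can be reused and the download skipped. An accidental mismatch only costs one extra download, which is why collision resistance matters little here.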

A quick benchmark done in 1996 shows the following:

            Digest Performance in Megabytes per Second

        Pentium P5    Power Mac    SPARC 4    DEC Alpha
            90 MHz       80 MHz    110 MHz      200 MHz

MD5           13.1          3.1        5.1          8.5
SHA1           2.5          1.2        2.0          3.3

For a modern use case: on embedded chips, MD5 can be 2-3x faster to compute than SHA1 for the same input.
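To get comparable numbers on current hardware, a small timing sketch along these lines could be used. The helper name `throughput_mb_per_s` and the payload size are arbitrary choices for this example. Results depend heavily on the CPU and on how the Python build implements each digest, so the 1996 ratios may not carry over.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm: str, payload: bytes, rounds: int = 20) -> float:
    """Hash `payload` `rounds` times and return the rate in megabytes per second."""
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, payload).digest()
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (len(payload) * rounds) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    payload = b"\x00" * (4 * 1024 * 1024)  # 4 MB of zero bytes
    for name in ("md5", "sha1", "sha256"):
        print(f"{name:>7}: {throughput_mb_per_s(name, payload):8.1f} MB/s")
```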

Rich Bradshaw

Posted 2009-10-27T14:09:13.890

Reputation: 6 324

10

An MD5 hash is "good enough" for most menial tasks. Recall that it's still incredibly difficult to produce meaningful collisions in the same number of bytes.

For instance, say you download the new Ubuntu 9.10 next week from a trusted mirror. You want to verify that the file was downloaded correctly and completely. Simply fire up MD5 and hash the ISO. Compare the hash against the published hash. If the hashes match, you can be sure that the ISO was copied correctly and completely.
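The download check described above might look like the following sketch. The function names are made up for this example, and the published hash is assumed to be a plain hex string copied from the mirror's checksum file. The algorithm is a parameter, so the same code works with "sha256" if the publisher provides that instead.

```python
import hashlib


def file_digest_hex(path: str, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large ISO does not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "md5") -> bool:
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return file_digest_hex(path, algorithm) == published_hex.strip().lower()
```

A match shows that the local file has the same digest as the published one. It says nothing about whether the published hash itself can be trusted.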

eleven81

Posted 2009-10-27T14:09:13.890

Reputation: 12 423

It's not difficult anymore. And how is it harder to run sha256sum filename.iso instead of md5sum filename.iso? – Mechanical snail – 2011-10-15T23:11:44.450

Further, if your ISP is evil, MD5 does not guarantee that the ISO was downloaded correctly. The ISP could tamper with the ISO image to do something evil. – Mechanical snail – 2011-10-15T23:15:39.873

4

  1. It is short - easier to read.
  2. It is widespread - great interoperability with other systems.
  3. It is usual - everyone is just used to it.

Security can also be improved by salting it.
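Salting, as mentioned above, could be sketched roughly like this. The names `salted_md5` and `verify`, the 16-byte salt length, and the salt-then-secret ordering are all assumptions made for this example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def salted_md5(secret: bytes, salt: Optional[bytes] = None) -> Tuple[bytes, str]:
    """Return (salt, hex digest) for MD5(salt + secret).

    A fresh random 16-byte salt is generated when none is supplied.
    """
    if salt is None:
        salt = os.urandom(16)
    return salt, hashlib.md5(salt + secret).hexdigest()


def verify(secret: bytes, salt: bytes, expected_hex: str) -> bool:
    """Recompute the salted digest and compare it in constant time."""
    _, digest = salted_md5(secret, salt)
    return hmac.compare_digest(digest, expected_hex)
```

The salt is stored next to the digest. It makes identical inputs produce different digests and defeats precomputed lookup tables, but it does not change how fast MD5 is to compute or how resistant it is to collisions.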

Josip Medved

Posted 2009-10-27T14:09:13.890

Reputation: 8 582

3

MD5 is widely used as a checksum hash function because it's fast and has an extremely low accidental collision rate. An MD5 checksum is composed of 32 hexadecimal digits (128 bits), which gives any two given inputs odds of roughly 1 in 3.4e38 of colliding by chance. You could theoretically hash all the files in all computers in a country the size of the USA and not produce a collision(*).
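The arithmetic behind these odds can be checked with a short sketch. 32 hexadecimal digits encode 128 bits, so there are 16**32 = 2**128 possible digests. The function below uses the standard birthday-bound approximation and assumes digests behave like uniformly random values, which covers accidental collisions only, not deliberately constructed ones. The function name and the sample file count are made up for this example.

```python
import math

MD5_BITS = 128
DIGEST_SPACE = 2 ** MD5_BITS  # same as 16 ** 32, about 3.4e38


def accidental_collision_probability(n_items: int, bits: int = MD5_BITS) -> float:
    """Approximate chance that any two of `n_items` random digests collide.

    Birthday approximation: p = 1 - exp(-n * (n - 1) / 2**(bits + 1)).
    """
    exponent = -n_items * (n_items - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(exponent)


if __name__ == "__main__":
    n_files = 10 ** 12  # hypothetical: a trillion files
    print(f"digest space: {DIGEST_SPACE:.3e}")
    print(f"p(collision among {n_files:.0e} files): "
          f"{accidental_collision_probability(n_files):.3e}")
```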

For cryptography, MD5 is a valid alternative if security is only a moderate concern. It's a viable option for hashing database passwords or other fields requiring internal security, mostly for its speed, but also because MD5 offers a reasonable level of security where strong encryption is not a concern.


(*) For most checksum purposes, a collision is only meaningful if it happens between two objects of similar origin and with the same size. Despite MD5's high probability of uniqueness, collisions could eventually occur between two very different files, say a 1.5 MB database file and a 35 KB GIF file. For most purposes, this is a meaningless collision, even more so because MD5 is just one element of file indexing, file size being another important one.
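One way to read the footnote is that files are indexed by size and digest together, so a digest collision between files of different sizes never counts as a match. A minimal sketch of that idea follows; `index_key` and `group_duplicates` are names invented for this example, and in-memory byte strings stand in for real files.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def index_key(data: bytes) -> Tuple[int, str]:
    """Combine size and MD5 digest into a single index key."""
    return len(data), hashlib.md5(data).hexdigest()


def group_duplicates(blobs: Dict[str, bytes]) -> Dict[Tuple[int, str], List[str]]:
    """Group names whose contents share both size and digest."""
    groups: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for name, data in blobs.items():
        groups[index_key(data)].append(name)
    return dict(groups)
```

For example, `group_duplicates({"a": b"x", "b": b"x", "c": b"yy"})` places "a" and "b" under one key and "c" under another.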

A Dwarf

Posted 2009-10-27T14:09:13.890

Reputation: 17 756

2

MD5 is widely used because it has been widely used, and the breaks are not yet significant enough to be an obvious problem in existing systems.

Douglas Leeder

Posted 2009-10-27T14:09:13.890

Reputation: 1 375