3

A project I am working on currently requires the user to run an MD5 hash-checking tool on the entire project after it has been unzipped. They do not currently ask for the ZIP itself to be checked.

If they were to switch to checking the MD5 of the zip, would there be any value in verifying the integrity of the unzipped files with MD5 - or is this covered by CRC checks when unzipping?

Craig Mason

5 Answers

1

Technically yes, practically probably not.

Chopper3
1

There is the possibility of your decompression software doing something strange, or the data being corrupted in the storage step. For highly critical data you should always verify after storing it to the disk.

In practice, zip and unzip are old, mature programs, and the risk of a bug in the zip program shipped with your Linux distribution is rather low. This is primarily a concern on unstable platforms or when there is a problem with the storage. I have seen routers corrupt images when decompressing them, and failed writes over NFS can cause interesting file corruption.

If you think somebody might craft an "evil" zip archive to bypass your checks, the situation is a bit different. Note that the CRC in the zip provides no protection against an attacker, and that MD5 is a rather old and weak algorithm. Most systems are shifting towards the SHA family of algorithms to verify file integrity instead (SHA-256 being the most popular, I think). Hashing both the archive and the expanded files makes an attack on MD5 a lot harder.
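As a rough illustration of hashing the expanded files, here is a minimal Python sketch using SHA-256 from the standard library. The names `sha256_of_file` and `verify_tree`, and the manifest layout (relative path mapped to expected hex digest), are assumptions made up for this example, not something the project in the question already uses.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(root, manifest):
    """Check extracted files against a manifest.

    manifest is a hypothetical dict: {relative path: expected SHA-256 hex digest}.
    Returns a sorted list of paths that are missing or whose digest differs.
    """
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of_file(target) != expected:
            bad.append(rel)
    return sorted(bad)
```

The same `sha256_of_file` helper can be pointed at the archive itself before unzipping, so both checks share one code path. The manifest would have to reach the user over a channel the attacker cannot tamper with, otherwise the check only catches accidental corruption.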

pehrs
0

Individual file checksums should be checked to thwart Birthday Attack collisions... if you have reason to be concerned about such things.

danlefree
0

As far as I know, each file in a zip archive has its own checksum. When you extract a file, the unzip tool computes the checksum of the extracted data and compares it to the checksum stored in the archive; if they differ, it assumes that the archive is corrupted. That checksum is not an MD5 sum; I think it is a CRC. I'm not sure, but I think it is reliable enough. However, if you want to be 100% sure that the file you have is the same file that the provider distributes (a CentOS ISO, for example) and has not been modified by third parties, then you might want to check the MD5 sum.
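For anyone who wants to see the per-file checksum described above, here is a small Python sketch using the standard `zipfile` module. `ZipFile.testzip()` reads every member and returns the name of the first one whose CRC does not match (or `None`), and each `ZipInfo` exposes the stored CRC-32 as `.CRC`. The helper names `first_corrupt_member` and `stored_crcs` are made up for this example.

```python
import io
import zipfile
import zlib


def first_corrupt_member(archive):
    """Return the name of the first member failing its CRC check, or None."""
    with zipfile.ZipFile(archive) as zf:
        return zf.testzip()


def stored_crcs(archive):
    """Return {member name: CRC-32 value recorded in the archive}."""
    with zipfile.ZipFile(archive) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}


# Usage: build a tiny archive in memory and inspect it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", b"hello")

buf.seek(0)
print(first_corrupt_member(buf))  # None for an intact archive
buf.seek(0)
print(stored_crcs(buf) == {"a.txt": zlib.crc32(b"hello")})
```

Note that this only detects accidental damage: anyone who modifies the archive deliberately can simply store a matching CRC.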

Troydm
0

The only way I see this being valid is as a double check for security and to guarantee that the file was not corrupted when written to the disk (that would take extremely bad luck or a bad disk!).

coredump