How do downloads become corrupt? (SHA1 value varies)

5

1

I downloaded VS 2012 using IDM. Although I paused and resumed several times, the download completed successfully. However, the SHA1 and CRC values do not match the values given on the website, which means my download is corrupted. How does this happen?

Abhishek Sha

Posted 2012-09-06T16:46:30.527

Reputation: 606

Was this a Microsoft site? Did you try their download manager? – Dave M – 2012-09-06T17:26:48.717

I haven't said that I downloaded from MSDN/DreamSpark Premium/TechNet. I've downloaded from their site with the link provided for the trial version. – Abhishek Sha – 2012-09-07T10:44:24.820

Answers

3

All it means when a file "downloads successfully" is that the number of bits the server has to offer equals the number of bits delivered to you. There is no guarantee that the exact sequence of bits is the same. It very often is the same sequence, but there are no guarantees.

Remember that all it takes is one bit of difference between the original and your copy for the checksum to fail. SO has a good question about error rates in TCP checksums; errors that slip past those checksums are one possible cause of the problem. Because there are so many moving parts in a transfer, it's hard to pick out exactly where the problem occurred.
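To make both points concrete, here is a minimal Python sketch. It is only an illustration: the 1 KiB `original` buffer is a made-up stand-in for the file, and `internet_checksum` is my own helper written in the style of the 16-bit ones' complement checksum that TCP uses, not code from any real network stack. It shows that a single flipped bit leaves the size unchanged but changes the SHA-1 and CRC32 values, and that swapping two 16-bit words is a kind of damage the 16-bit checksum cannot see at all.

```python
import hashlib
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style TCP uses (illustrative helper)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes(range(256)) * 4  # made-up stand-in for the downloaded file

# Case 1: flip a single bit somewhere in the middle.
flipped = bytearray(original)
flipped[500] ^= 0x01
flipped = bytes(flipped)

# Case 2: swap the first two 16-bit words.
swapped = bytearray(original)
swapped[0:2], swapped[2:4] = original[2:4], original[0:2]
swapped = bytes(swapped)

for name, data in [("original", original), ("flipped", flipped), ("swapped", swapped)]:
    print(
        name,
        len(data),
        hashlib.sha1(data).hexdigest(),
        format(zlib.crc32(data), "08x"),
        format(internet_checksum(data), "04x"),
    )
```

All three buffers have the same length, so a size check alone passes every time. The SHA-1 and CRC32 columns differ on every row, while the 16-bit checksum column is identical for `original` and `swapped`.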

Best advice is to try again, or get it off BitTorrent, where the integrity checking is more robust: every piece is verified against a hash, and a piece that fails is fetched again.
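The reason BitTorrent copes better is that the file is split into pieces and each piece has its own published SHA-1 hash, so a client can tell which piece is bad and fetch only that one again. A rough sketch of the idea follows. The piece size, the in-memory buffers and the helper names `piece_hashes` and `bad_pieces` are all my own assumptions for illustration; a real client reads the piece length and hashes from the torrent metadata.

```python
import hashlib
from typing import List

PIECE_SIZE = 16 * 1024  # assumed piece size for this demo only


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """SHA-1 digest of each fixed-size piece of the data."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def bad_pieces(data: bytes, expected: List[bytes], piece_size: int = PIECE_SIZE) -> List[int]:
    """Indexes of pieces that are missing or whose hash does not match."""
    actual = piece_hashes(data, piece_size)
    return [
        i for i, want in enumerate(expected)
        if i >= len(actual) or actual[i] != want
    ]


# Made-up "published" file: five pieces of distinct content.
published = b"".join(bytes([n]) * PIECE_SIZE for n in range(5))
expected = piece_hashes(published)

# Simulate a download where one byte inside piece 2 was damaged.
received = bytearray(published)
received[2 * PIECE_SIZE + 100] ^= 0xFF
received = bytes(received)

print(bad_pieces(received, expected))  # only the damaged piece needs re-fetching
```

With a single whole-file SHA1, as on the download page, you only learn that something is wrong somewhere, and the whole file has to be downloaded again.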

Green

Posted 2012-09-06T16:46:30.527

Reputation: 556

3

When transferring data over a network connection, some data corruption is always possible. It is less likely over a connection using TCP (which HTTP and most file transfers use) than over UDP (commonly used by streaming media services). The reason is that TCP checksums every segment and retransmits anything that arrives damaged or goes missing, while UDP has only an optional checksum (in IPv4) and never retransmits.

However, even with these methods employed, errors still may squeak by. There are a few different cases that can cause this:

  1. The file on the server is corrupted already. In this case, the checksums on the site, while correct for the uncorrupted file, will never match the checksum on the downloaded file.
  2. The file gets corrupted during transmission over the network. Network connections typically go through several points and large physical distances, over a lot of hardware controlled by different people. Physical problems with the hardware at some point in the path may introduce corruption, or it's possible that the devices messed up when routing the data packets.
  3. The file gets corrupted on your computer, while it is being written or after the download completes. This can occur if the file happens to be stored on a part of your disk that has bad sectors. Since the data can't be written reliably because of a defect in the disk itself, your file becomes corrupted. Similarly, a bug in the downloading program might make it assemble the file incorrectly (especially if you paused and resumed the download multiple times). It could also end the connection prematurely and mistakenly treat the download as complete.
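A quick way to narrow down which of these cases you are in is to compare the file size as well as the hash. The sketch below is only a starting point: `file_digests` and `diagnose` are hypothetical helper names of my own, and it assumes you know the expected size in bytes (for example from the download page) in addition to the published SHA1.

```python
import hashlib
import os
import zlib


def file_digests(path, chunk_size=1 << 20):
    """Return (size, sha1_hex, crc32_hex) for a file, read in chunks."""
    sha1 = hashlib.sha1()
    crc = 0
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC32 across chunks
            size += len(chunk)
    return size, sha1.hexdigest(), format(crc & 0xFFFFFFFF, "08x")


def diagnose(path, expected_size, expected_sha1):
    """Very rough classification of a failed download (assumed heuristics)."""
    size, sha1_hex, _ = file_digests(path)
    if size < expected_size:
        return "truncated: the download probably ended early"
    if size > expected_size:
        return "too long: the downloader probably assembled the file wrongly"
    if sha1_hex.lower() != expected_sha1.lower():
        return "right size, wrong content: bits were changed somewhere"
    return "ok"
```

A wrong size points at the download manager (a premature end or a bad resume). A right size with a wrong hash points at the network, the disk or the server copy. Running it twice on the same file and getting different hashes would suggest the disk or memory on your own machine.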

Usually, if I find my download is corrupted, I'll try redownloading it a couple of times. If that doesn't help, I will usually wait a day or so and try the download again. If it's a server issue, and the hosting company is relatively on top of things, they'll discover the issue and fix it pretty quickly. If it's a routing issue, just waiting will help, in much the same way that waiting lets you avoid the worst of the traffic when the roads are congested. And, while it's less likely, it's still always a good idea to keep an eye out for seemingly random data corruption on your disk and to scan your drive for errors once in a while. Disks do fail, and the first indication is usually either corrupted data or a sudden shrinking of drive capacity.

Ben Richards

Posted 2012-09-06T16:46:30.527

Reputation: 11 662