
Pretty straightforward question. Take the image here:

https://i.imgur.com/oEdf6Rl.png

Does it come with a checksum, which I can verify against after I have fully downloaded the file?

This question applies to any downloaded file. In my particular case, I do not care whether the file on their end has been corrupted; rather, I want to check BEFOREHAND what that file's signature / fingerprint / digest / hash is. Can I do that?

The reason I want to do this is that I want to bank those hashes and put them into a database, where I can ascertain if I have downloaded that particular file before, hence saving bandwidth. Thanks.

  • There are ways to avoid duplicate downloads, but they don't use checksums and don't answer your question. Also, you tagged md5; if you do use md5, that does NOT prevent different files from having the same checksum, because md5 has been broken for collisions (efficiently) for over a decade. – dave_thompson_085 Feb 26 '21 at 06:40
  • @dave_thompson_085 but has md5 been duped with an identical file size and mimetype? I don't think so. I think it's just a matter of creating identical hashes, which is unlikely to be all that pertinent anyway given that the attack vector does not make sense, and the odds are basically nil anyway. As to avoiding a duplicate download: I am all ears. – Jannies - They do it for free Feb 26 '21 at 06:48

1 Answer


Does it come with a checksum, which I can verify against after I have fully downloaded the file?

No, random files do not come with a checksum. Providing checksums to go with files is usually only done for executables and installation media.

check BEFOREHAND what that file's signature / fingerprint / digest / hash is; Can I do that?

No, you can only checksum a file after you've downloaded it.
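As an illustration of checksumming after the download, here is a minimal sketch in Python. The function name `file_sha256` and the chunk size are assumptions made for this example, and SHA-256 is used instead of MD5 because of the collision concerns raised in the comments.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`.

    `file_sha256` and `chunk_size` are hypothetical names for this sketch.
    The file is read in chunks so large downloads need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_sha256("oEdf6Rl.png")` on a locally saved copy would return a 64-character hex string that can be stored or compared later.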

I want to bank those hashes and put them into a database, where I can ascertain if I have downloaded that particular file before hence saving bandwidth.

I'm afraid what you're trying to do simply isn't supported. To an extent, web browser caching already avoids redundant image loads.
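For what it's worth, the "bank of hashes" itself is easy to sketch; the limitation is only that the hash is known after the download, so it can detect duplicates but not save the bandwidth. The following is a rough sketch using Python's built-in `sqlite3`; the table name `seen` and the helpers `open_bank` and `record_if_new` are hypothetical names chosen for this example.

```python
import sqlite3

def open_bank(db_path=":memory:"):
    """Open (or create) the hash database. `seen` is a hypothetical table name."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen (digest TEXT PRIMARY KEY, url TEXT)"
    )
    return conn

def record_if_new(conn, digest, url):
    """Bank `digest`; return True if it was new, False if already present."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen (digest, url) VALUES (?, ?)", (digest, url)
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the primary key already existed
    return cur.rowcount == 1
```

A file whose digest is already banked can then be discarded after download instead of being stored twice.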

gowenfawr
  • I'm not talking about loading web pages or images, but about downloading files locally to disk. I just used an image as an example. Also, I know you can't checksum an incomplete file; I just wanted to know if they provide one reliably. There are some headers out there such as `Want-Digest`, but I have yet to find a server that even supports this. In any case, if they don't give you a checksum, how are you verifying that whatever packet you are being sent was correct? I'm guessing those have smaller checksums? – Jannies - They do it for free Feb 25 '21 at 19:57
  • @Jannies-Theydoitforfree: TCP provides a simple checksum against transport errors. TLS (as in HTTPS) provides more reliable detection of errors and manipulation. Both of these are transparent to the application though. – Steffen Ullrich Feb 25 '21 at 20:13
  • @Jannies-Theydoitforfree and even with `Want-Digest` you get the Digest as a header _followed by_ the content, so that wouldn't meet your needs. And, yes, packets have their own checksums under TCP, which is why it's considered a "reliable" protocol. – gowenfawr Feb 25 '21 at 20:32
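If a server did honour `Want-Digest`, the returned `Digest` value could be checked against the received body roughly as follows. This is a sketch that assumes the older `Digest: SHA-256=<base64>` form of the header; the helper name `matches_digest_header` is made up for this example, and it only verifies a completed download rather than avoiding one.

```python
import base64
import hashlib

def matches_digest_header(header_value, body):
    """Compare `body` (bytes) with a Digest header value such as 'SHA-256=<base64>'.

    Returns True or False when a SHA-256 entry is present, or None when it is not.
    `matches_digest_header` is a hypothetical helper, not a library function.
    """
    for item in header_value.split(","):
        # partition on the first '=' only, so base64 padding stays in `encoded`
        algo, sep, encoded = item.strip().partition("=")
        if sep and algo.lower() == "sha-256":
            expected = base64.b64decode(encoded)
            return hashlib.sha256(body).digest() == expected
    return None
```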