27

We use sha1sum to calculate the SHA-1 hash values of our packages.

Clarification about the usage: We distribute some software packages, and we want users to be able to check that what they downloaded is the correct package, down to the last bit.

The SHA-1 cryptographic hash algorithm has been replaced by SHA-2 since SHA-1 is known to be considerably weaker.

Can we still use sha1sum?

Or should we replace it with sha256sum, or sha512sum?

WhiteWinterWolf
Michael
  • 1
    Use it for what purpose? What research have you done? Have you read https://en.wikipedia.org/wiki/SHA-1 and http://security.stackexchange.com/q/64049/971? It has discussion on the known weaknesses in SHA-1 and attacks on it and on the different types of security goals that applications might need. What specifically is your confusion, that isn't already covered by that Wikipedia article and other articles here on this site? See also http://security.stackexchange.com/q/94690/971 and [tag:sha]. – D.W. Sep 02 '15 at 20:30
  • My question is not regarding SHA1. My question is about sha1sum. Tom Leek understood it correctly, and I accepted his answer. – Michael Sep 03 '15 at 08:57
  • Your question is too a question about SHA1 -- I have no idea why you would think it isn't. Asking whether `sha1sum` is secure amounts to asking whether the SHA1 algorithm is secure, as `sha1sum` just computes the SHA1 algorithm. – D.W. Sep 03 '15 at 16:44

7 Answers

57

I suppose you "use sha1sum" in the following context: you distribute some software packages, and you want users to be able to check that what they downloaded is the correct package, down to the last bit. This assumes that you have a way to convey the hash value (computed with SHA-1) in an "unalterable" way (e.g. as part of a Web page which is served over HTTPS).
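The check the user performs in that scenario can be sketched in a few lines of Python. This is only an illustration: the helper names `file_digest` and `matches_published` are made up for the example, and the published digest is assumed to have been obtained over a trusted channel such as the HTTPS page mentioned above.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large packages need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the computed digest with the published one (case-insensitive)."""
    computed = file_digest(path, algorithm)
    return hmac.compare_digest(computed, published_hex.strip().lower())
```

Passing `algorithm="sha1"` reproduces what `sha1sum` computes; the default here is SHA-256 only because that is one of the replacements being discussed.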

I also suppose that we are talking about attacks here, i.e. some malicious individual who can somehow alter the package as it is downloaded, and will want to inject some modification that will go undetected.

The security property that the used hash function should offer here is resistance to second-preimages. Most importantly, this is not the same as resistance to collisions. A collision is when the attacker can craft two distinct messages m and m' that hash to the same value; a second-preimage is when the attacker is given a fixed m and challenged with finding a distinct m' that hashes to the same value.

Second-preimages are a lot harder to obtain than collisions. For a "perfect" hash function with output size n bits, the computational effort for finding a collision is about 2^(n/2) invocations of the hash function; for a second-preimage, this is 2^n. Moreover, structural weaknesses that allow for a faster collision attack do not necessarily apply to a second-preimage attack. This is true, in particular, for the known weaknesses of SHA-1: right now (September 2015), there are some known theoretical weaknesses of SHA-1 that should allow the computation of a collision in less than the ideal 2^80 effort (this is still a huge effort, about 2^61, so it has not been actually demonstrated yet); but these weaknesses are differential paths that intrinsically require the attacker to craft both m and m', therefore they do not carry over to second-preimages.
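To make the gap between the two generic attacks tangible, here is a toy experiment in Python. It is not an attack on SHA-1: it assumes SHA-256 truncated to 16 bits is a reasonable stand-in for a small "perfect" hash, and the names `toy_hash`, `find_collision` and `find_second_preimage` are invented for the illustration. With n = 16, a collision should typically show up after a few hundred tries, while a second-preimage should typically need tens of thousands.

```python
import hashlib

BITS = 16  # toy output size (assumption for the demo); real SHA-1 has n = 160


def toy_hash(message: bytes) -> int:
    """SHA-256 truncated to BITS bits, standing in for a small ideal hash."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - BITS)


def find_collision():
    """Attacker picks both messages: stop at the first repeated hash value."""
    seen = {}
    i = 0
    while True:
        m = b"c-%d" % i
        h = toy_hash(m)
        if h in seen:
            return seen[h], m, i + 1  # two distinct messages, number of tries
        seen[h] = m
        i += 1


def find_second_preimage(target: bytes):
    """Attacker is given `target` and must hit its one specific hash value."""
    wanted = toy_hash(target)
    i = 0
    while True:
        m = b"p-%d" % i
        if m != target and toy_hash(m) == wanted:
            return m, i + 1  # the forged message, number of tries
        i += 1


if __name__ == "__main__":
    _, _, collision_tries = find_collision()
    _, preimage_tries = find_second_preimage(b"original package")
    print("collision tries:      ", collision_tries)
    print("second-preimage tries:", preimage_tries)
```

The exact counts depend on the inputs chosen, so treat them as one sample rather than a measurement.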

For the time being, there is no known second-preimage attack on SHA-1 that would be even theoretically or academically faster than the generic attack, with a 2^160 cost that is way beyond technological feasibility, by a long shot.
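A back-of-the-envelope calculation shows how far apart these costs are. The attacker throughput below (10^12 hash evaluations per second) is purely a made-up assumption for the sake of the arithmetic, not a figure from any source.

```python
HASHES_PER_SECOND = 10**12  # hypothetical attacker throughput (assumption)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_for(log2_work, rate=HASHES_PER_SECOND):
    """Years needed to perform 2**log2_work hash evaluations at `rate` per second."""
    return 2**log2_work / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    for label, bits in (("collision, theoretical attack", 61),
                        ("collision, generic", 80),
                        ("second-preimage, generic", 160)):
        print(f"{label:30s} 2^{bits:<3d} -> {years_for(bits):.3g} years")
```

Under that assumed rate, the 2^61 collision effort comes out at under a year, while the 2^160 second-preimage effort is on the order of 10^28 years.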

Bottom-line: within the context of what you are trying to do, SHA-1 is safe, and likely to remain safe for some time (even MD5 would still be appropriate).

Another reason for using sha1sum is the availability of client-side tools: in particular, the command-line hashing tool provided by Microsoft for Windows (called FCIV) knows MD5 and SHA-1, but not SHA-256 (at least so says the documentation)(*).

Windows 7 and later also contain a command-line tool called "certutil" that can compute SHA-256 hashes with the "-hashfile" sub-command. This is not widely known, but it can be convenient at times.


That being said, a powerful reason against using SHA-1 is that of image: it is currently highly trendy to boo and mock any use of SHA-1; the crowds clamour for its removal, anathema, arrest and public execution. By using SHA-1 you are telling the world that you are, definitely, not a hipster. From a business point of view, it rarely does any good not to yield to the fashion du jour, so you should use one of the SHA-2 functions, e.g. SHA-256 or SHA-512.

There is no strong reason to prefer SHA-256 over SHA-512 or the other way round; some small, 32-bit only architectures are more comfortable with SHA-256, but this rarely matters in practice (even a 32-bit implementation of SHA-512 will still be able to hash several dozens of megabytes of data per second on an anemic laptop, and even in 32-bit mode, a not-too-old x86 CPU has some abilities at 64-bit computations with SSE2, which give a good boost for SHA-512). Any marketing expert would tell you to use SHA-512 on the sole basis that 512 is greater than 256, so "it must be better" in some (magical) way.
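One practical difference between the candidates is simply how long the published value is. A small sketch using Python's standard `hashlib` (the helper name `digest_lengths` and the sample input are only for illustration):

```python
import hashlib


def digest_lengths(names=("sha1", "sha256", "sha384", "sha512")):
    """Map each algorithm name to (output size in bits, hex string length)."""
    result = {}
    for name in names:
        h = hashlib.new(name, b"example package contents")
        result[name] = (h.digest_size * 8, len(h.hexdigest()))
    return result


if __name__ == "__main__":
    for name, (bits, hex_chars) in digest_lengths().items():
        print(f"{name:7s} {bits:4d} bits  {hex_chars:3d} hex chars")
```

A SHA-1 value is 40 hex characters, SHA-256 is 64, and SHA-512 is 128, which matters if users are expected to compare the strings by eye.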

Tom Leek
  • 8
    If it's for download verification, you might as well offer both. – user253751 Sep 02 '15 at 14:53
  • 6
    There is a good reason to prefer SHA-256 much of the time: SHA-512 digests are really long. – Ry- Sep 02 '15 at 15:21
  • Congrats on your rep birthday Mr 100k Leek :-) – paj28 Sep 02 '15 at 21:00
  • 2
    There's quite a tenuous use for a collision attack on the hash in this context, which is an insider who can manipulate the package but can't avoid it being QAed before signing. If they have a collision then they could submit one of the pair to QA for signing and then later arrange for someone to receive the other (malicious) one. However, someone in that position can probably pull all kinds of tricks to sneak underhanded code through QA. Messing with the signing would in practice be an absolute last resort, and so it's quite reasonable to exclude it from the threat model IMO. – Steve Jessop Sep 03 '15 at 00:08
  • @minitech There is also SHA-384 which is more or less just a truncated SHA-512. – kasperd Sep 03 '15 at 08:34
  • Getting the use of SHA-1 for certificates phased out now is a very sensible move. Better do it now than wait until it is as broken as was the case with MD5. There are use cases where SHA-1 and even MD5 can still be secure. But for those people who cannot keep up with all of this, it is better to use SHA-2 everywhere rather than SHA-1 or MD5. – kasperd Sep 03 '15 at 08:42
  • @SteveJessop, with software that depends on an upstream project it wouldn't even have to be an insider of the actual product. E.g. if libfoo is known to be embedded into the software, the developer of libfoo could find a collision between a legitimate patched version of libfoo and a malicious version, publish the legitimate as a new version for inclusion in the software and then be able to replace it with the malicious without a hash change. A large project may include many such libraries. – otus Sep 04 '15 at 09:03
  • @otus In practice, Linux distributors commonly recompile the software from source with their specific set of flags, sometimes dependencies, etc. Some distributions (Gentoo, I'm looking at you) affords the user (administrator) a huge degree of control over this process. So any such attack would have to be carefully tailored, *and* then the attacker (libfoo developer) would need to find a way to replace the libfoo binaries in the distribution's package repository. Possible? Technically, yes. Practical? Doubtful. – user Sep 04 '15 at 09:24
  • @MichaelKjörling, if there was no way to replace the software package, there would be no need for the hash in the first place. Yes, the attack I outlined is difficult and might not be practical, but I'd still be more comfortable with a collision resistant hash. – otus Sep 04 '15 at 09:38
7

You should use SHA-256 or SHA-512.

If you are only signing packages you have created yourself, then technically SHA-1 is still secure for that purpose. The property that is now weakened is "collision resistance" which you are not strictly relying on. However, the security of SHA-1 is only going to get worse with time, so it makes sense to move on now.

paj28
  • -1 because collisions can still be an issue even if you are the one signing it. If someone steals your private key, they will be able to create two versions of the same software with matching digests. They can distribute out a benign one that people may analyze and conclude is safe, then dish out the malicious version. – forest Jul 25 '18 at 02:04
  • @forest - If someone steals your private key they can sign anything they want. – paj28 Jul 25 '18 at 07:23
  • Yes, but they cannot release two different versions of some data with the same signature but different contents. If I verify the hash of something I download and actually look at what I download, I don't want the second time I download it necessitate me looking through it even if the hash already matches. Once I audit source code (for example) once, anything else with the same hash should not need further audits. – forest Jul 25 '18 at 07:33
  • @forest - Ok. In fact, the issue you describe could be a problem even without a stolen key - the legitimate owner could put out a benign and malicious version. You're actually confusing hashes and signatures. While a signature contains a hash, the similarity ends there. If you're doing your own auditing, you can hash the code you audited with whatever algorithm you choose. If you're using the signature, you will usually decide "I trust this key, I will trust whatever is signed by it" – paj28 Jul 25 '18 at 08:11
  • @forest - Also, why did you not downvote Tom Leek's answer above that says the same? – paj28 Jul 25 '18 at 08:12
  • Good point wrt being able to hash the code you audited with any hash. That's a solution I didn't think of and entirely limits the security requirements to that of preimage resistance. My DV was premature since I didn't consider this. I can remove the DV once the post is edited to unlock my vote. – forest Jul 25 '18 at 08:26
3

Tom Leek has a beautiful answer (which is why it is accepted). It is concerned with the mathematically provable facts behind the use of SHA-1. There is a second approach which is less fact based but may provide valuable heuristic information. I learned this approach from reading Bruce Schneier's opinions, but I cannot find the links offhand, so Bruce will have to deal with my namedropping.

In theory an algorithm is not broken until it's broken. Until someone has found a way to do X, where X is something that should be computationally infeasible, it is not considered "broken." [1] However, that proves to be of limited value in its practical application in cryptography. Cryptographers would really rather get a little notice before their products fall apart, not after.

What has been found, historically, is that algorithms are rarely broken in one big step. Yes, it can happen, and a cryptographer has to plan for that, but what has been found empirically is that they are typically whittled away over time, paper after paper. It has been found that watching the difficulty of generating a collision is a reasonably good metric for guesstimating when the algorithm will actually be broken. So when Tom points out that the collision should take 2^80 operations and now takes 2^61, it is very valid to point out, as he did, that the 2^61 operations is theoretical, because it is still too large to warrant an attempt. However, it is also valid to think of it as "the algorithm has experienced a reduction in strength of 19 bits of power," and use that as a poor man's rule of thumb to project forward and estimate when that will become an issue.
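The "19 bits" figure is just the difference of the two exponents, which corresponds to a speed-up factor of 2^19 over the generic collision search. A trivial sketch of that arithmetic (the function names are invented for the example):

```python
def bits_lost(ideal_log2, attack_log2):
    """Reduction in strength, in bits, relative to the ideal attack cost."""
    return ideal_log2 - attack_log2


def speedup_factor(ideal_log2, attack_log2):
    """How many times cheaper the attack is than the generic one."""
    return 2 ** bits_lost(ideal_log2, attack_log2)


if __name__ == "__main__":
    print(bits_lost(80, 61))       # 19
    print(speedup_factor(80, 61))  # 524288
```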

This kind of thinking is why there is now a SHA-3, even though theoretically SHA-1 is still not fully broken. The cryptographers involved in SHA-3's development and testing know that it is going to take quite a while to develop confidence in SHA-3, and they want to make sure that confidence is there before SHA-1 breaks, not after.

1. I am aware that the most technically strict definition of a "broken" hash is merely one where an attacker can do better than brute force, as opposed to when it actually becomes computationally feasible. However, this latter definition is more typically used when discussing the practical side of hashing.

Cort Ammon
2

Even if you are not doing work for the federal government, the NIST document linked below is a good reference for which hash functions and key lengths are considered acceptable, and until what dates, for specific tasks such as verifying a signature. For package authentication, I would also choose a hash size that makes it much harder to create a collision (a changed package that matches the same hash). Why not publish both SHA-1 and SHA-512 (if the compute time is not too burdensome)? Also take into account the length of the asymmetric key used to sign the hash. While it is great that you are thinking about this, there are probably more pressing integrity issues that pose a greater risk, such as how to ensure that a person does not receive a falsified hash for comparison, and the provenance of all your included libraries.
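Publishing two hashes does not have to mean reading the package twice. A minimal sketch, assuming Python's `hashlib` is available on the build machine (the helper name `multi_digest` is hypothetical):

```python
import hashlib


def multi_digest(path, algorithms=("sha1", "sha512"), chunk_size=1 << 20):
    """Compute several digests of one file in a single pass over its contents."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            # Feed the same chunk to every hash object before reading the next one.
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

The result is a dictionary such as `{"sha1": "...", "sha512": "..."}` that can be written next to the download link.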

P.S. If you are thinking about certificates for a browser, both Chrome and IE have introduced, or are going to introduce, UI changes to sunset the use of SHA-1. If you are storing passwords, look to salted (and possibly peppered) hashes with PBKDF2.

http://csrc.nist.gov/publications/nistpubs/800-131A/sp800-131A.pdf
http://googleonlinesecurity.blogspot.com/2014/09/gradually-sunsetting-sha-1.html
https://en.wikipedia.org/wiki/PBKDF2

user219861
2

For the intended purpose, the practical answer is "Yes, of course", although for reasons of prestige you should also publish a more recent hash (such as SHA-3).
If it didn't have so much "stench", you could in principle still use MD5 (apart from people yelling at you, even MD5 will work fine for this application).

Using SHA-1 to verify downloads has never been "secure" and is primarily not intended to be. The main purpose of providing a hash (of any kind) is to detect bit errors, partial downloads, or accidentally downloading the wrong executable (clicking on the row above or below in the list and not noticing).

Malicious modifications are an issue that is not very well addressed with a hash, since an attacker who has sufficient access to the server so he can replace the binary can usually also replace the hash.
Arguably, you do gain some security if you use a CDN, since someone who gains access to a node in the CDN cannot successfully replace the binary on that node without the user (who downloads the hash from the server that only you control) noticing.

If, however, you need something that must be "secure" and resilient to malicious modification, you should most definitely digitally sign your executables and include for example a GPG signature. A hash will not do.

Damon
  • SHA-3 has only been standardized for a few weeks at this point ([Wikipedia](https://en.wikipedia.org/wiki/SHA-3) puts the NIST standard release on August 5, 2015). This means software support, particularly under the name SHA-3, is extremely scarce. SHA-2 is probably better *for now*, but having a plan for how to switch hash algorithms in the future is probably a good idea regardless. – user Sep 04 '15 at 09:27
0

There are better choices than SHA2. BLAKE2, for example, is derived from BLAKE, a finalist of the SHA3 competition, and blake2-256 is as fast as SHA1 and much faster than SHA2. It can be used as the checksum algorithm in WinRAR 5. Like MD5 and SHA1, SHA2 is based on the Merkle-Damgard structure and possibly has similar vulnerabilities. SHA3 is not vulnerable, but SHA3-224 is ca. 8 times slower than SHA1.
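For anyone who wants to try these alternatives, Python's standard `hashlib` (3.6 and later) exposes both BLAKE2 and SHA-3. The sketch below assumes "blake2-256" means BLAKE2b with a 32-byte output, which is a guess at the intended variant; the helper names are made up, and the timing helper is only a rough way to compare speeds on your own machine, not a confirmation of the figures above.

```python
import hashlib
import timeit

# Each entry maps a label to a callable that builds a hash object over `data`.
CANDIDATES = {
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha3_256": hashlib.sha3_256,
    # Assumption: "blake2-256" is taken to be BLAKE2b with a 256-bit digest.
    "blake2b-256": lambda data: hashlib.blake2b(data, digest_size=32),
}


def hex_digests(data: bytes):
    """Return the hex digest of `data` under every candidate algorithm."""
    return {name: make(data).hexdigest() for name, make in CANDIDATES.items()}


def time_hash(name: str, data: bytes, repeat: int = 5) -> float:
    """Best-of-`repeat` wall-clock seconds to hash `data` once with `name`."""
    make = CANDIDATES[name]
    return min(timeit.repeat(lambda: make(data).digest(), number=1, repeat=repeat))


if __name__ == "__main__":
    payload = b"\x00" * (16 * 1024 * 1024)  # 16 MiB of dummy data
    for name in CANDIDATES:
        print(f"{name:12s} {time_hash(name, payload):.4f} s")
```

Results vary a lot with CPU, build options and hardware acceleration, so measure before relying on any relative speed claim.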

Smit Johnth
0

Can you still use it? Yes, you can, but SHA-1 is vulnerable to collision attacks and has been deprecated by a number of browsers.

If the question were whether you should still use it, then I would say no: you should move to SHA-2 and a sha256sum-type program.

TheJulyPlot