8

I store password hashes in their full value for example, $h = sha256('foo') outputs 64 characters: 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae

I store that directly in the database (along with salts, iterations, etc.).

My question is if I truncate that hash by 32 characters (or whatever length that's not too short) then store that instead, does it make it impossible to "crack" or "reverse" (in case hacker gains access to the database of hashes)? If so then I guess this is highly recommended or are there problems with it?

Rory Alsop
  • 61,367
  • 12
  • 115
  • 320
IMB
  • 2,888
  • 6
  • 28
  • 42
  • Based on the context from your previous question, would it be fair to re-word your question as "can an `n`-bit cryptographic hash be truncated to an `m`-bit cryptographic hash with just the expected 2^(n-m) loss of security, or can partial hash output not 'stand alone'?" – B-Con Aug 09 '12 at 18:20
  • @B-Con Honestly I don't quite understand what the prefix and suffix of `(n-m)` means. I think it's better if the question was not "too" technical :-) – IMB Aug 09 '12 at 18:24
  • Basically, you want to turn a hash of length `n` into a shorter hash of length `m`, but you want to know if you lose anything beyond just a shorter length.(?) – B-Con Aug 09 '12 at 18:30
  • @B-Con By this time I now understand truncating too many is not a good idea since it's likely to cause collision. So my question now is, since truncation has it's (arguable) benefits (since Google does it according to Rook's answer below), how many truncation is the maximum we can do so that it won't be prone to collision. – IMB Aug 09 '12 at 18:33

8 Answers8

15

Definitely not, if anything this practice undermines the hash function making it easier to find a collision during an exhaustive search. In the sense of a password, any plain text that produces the same hash value is a valid password.

That being said, ChromeOS uses a truncated hash of your Google password in their s2k function for the Google disk encryption scheme. I suspect that this was to make it harder to break your Google password in favor making it easier to find a password that decrypts your disk... which is kind of unsettling.

rook
  • 46,916
  • 10
  • 92
  • 181
  • Agree, I guess it depends on how many characters I truncate? The chance of collision with a different password is probably very tiny if I just truncate say 2 or 3 characters right? – IMB Aug 09 '12 at 17:10
  • 1
    @IMB the more you truncate the less resistant to collisions the hash function becomes. – rook Aug 09 '12 at 17:11
  • Source from google: http://www.chromium.org/chromium-os/chromiumos-design-docs/protecting-cached-user-data#managingkeys -- An interesting idea, though I agree with your analysis. It probably doesn't provide any real security. If your pw entropy is low (say ~30bits) and can be bruteforced with a GPU, and the salt is there and the attacker knows what part of the salt is used (reverse engineering) if they recover the key/partial hash, anything pw that works for the partial hash overwhelmingly is in the attacker's favor for working on google. – dr jimbob Aug 09 '12 at 20:24
  • This link says "SHA-256 digest of a user-specific salt concatenated with the user's passphrase. The first 128 bits of the digest are used as the user's "weak password hash." - Looking at the way it is communicated there wonder me were is the catch. – Andrew Smith Aug 09 '12 at 21:26
  • Yeah, I don't think Google's idea is very sound theoretically. Imagine you can brute-force passwords in an "optimal" order, e.g. in a way that approaches the theoretical 1.2 bits/character for an english passphrase. Then you have 6.8 bits of redundancy per character, which is a sort of ECC. Even if you halve that, it still sounds pretty high. Their description then seems overly-pessimistic to the point of ignorance - "we're willing to lose some of the entropy supplied by the passphrase" - this would only be true if your password had over 128 bits entropy, or the attacker cracked SHA256. – sourcejedi Sep 01 '12 at 10:11
  • @sourcejedi The truncated SHA256 hash is then passed to scrypt with large memory and cpu usage requirements. An FPGA or GPU won't be helpful. Its a neat s2k design. – rook Sep 01 '12 at 18:43
  • Whoosh. The hash halving is meant to provide some protection if you've brute-forced the scrypt, right? Say you succeed with one of the top N passwords (e.g. N=100,000). But in that case, the overwhelming likelyhood is that you've found the "real" password, and not a collision. (And again, I would be very skeptical of any individual designer who made such a mis-statement!) – sourcejedi Sep 01 '12 at 21:40
  • Real Chromebooks use a TPM, where this is not so obviously redundant - in case the TPM is broken and you can brute-force unimpeded. It's still very badly expressed. It's not halving the password entropy; it's truncating it to 128 bits. If you wanted to halve the password entropy, you'd have to make an estimate of it, and then truncate the hash to that many bits. With the numbers they suggest, this would be 9-15 bits. I bet they rejected that implementation without thinking! What they went with equally hilarious. It's probably fine; it just doesn't give you anything except a false sense of sec. – sourcejedi Sep 01 '12 at 22:01
  • If a person uses a password on multiple sites, and an attacker finds a password whose 256-bit hash matches the one stored on one of the sites, that password will likely work on all. If a site truncated the hash to 32 bits, then an attacker wouldn't have to work as hard to find a password that worked on that site, but such a password would be far less likely to work on any of the other sites. – supercat Jul 21 '14 at 16:54
10

What's "cracking" ? It is finding a password value which your system will accept, i.e. one which will match whatever your system stores as hash value in its database. If the attacker finds another password, distinct from the "true" password, but which matches the hash, then the attacker wins nonetheless.

For a given hash value, there are already many matching passwords (because there are "only" 2256 possible hash values, and many more possible passwords if you accept "long" passwords, say 50 characters). By truncating the hash value, you only increase the number of passwords which, when hashed-and-truncated, will match what you stored; i.e. you only make things easier for the attacker.

Now things are not necessarily so dire. SHA-256 offers 256 bits of output because it tries to offer resistance to collisions of at least 2128, and you need 256 bits of output to get that much resistance. However, collisions are not an issue for hashed password storage; what is needed is resistance to preimages. This one requires only n bits of output for 2n resistance. In other words, if you keep only 128 bits (that's 32 hexadecimal characters) then you are still "fine", or at least not substantially worse than what you would be with the full 64 characters.

Caution: a simple hash is not good. Though you do not weaken it any further (at least not in a practically meaningful way) by truncating it, you already have two problems:

  • You have no salt, which allows the attacker to apply time-wise or space-wise parallelism. Namely, attacking many passwords simultaneously, and/or using precomputed tables (rainbow tables).
  • A single hash is too fast, making it easy for the attacker to try millions, possibly billions of potential passwords per second.

So you need a better password-hashing process, one which can be configured for slowness and which uses a salt (a new random salt for each password). Usual recommendation is bcrypt. Then, and only then, you might envision a truncation (but keep at least 128 bits, and homemade alterations to cryptographic algorithms are not recommended on a general basis).


Edit: to clarify things: an attacker with unlimited computing power may try to enumerate all possible passwords and keep the ones which match the stored hash values -- these are candidates for being the "true" password. By truncating the hash, you increase the number of candidates, so, in a certain light, the attacker is farther from guessing the "true password". However, the attacker is not after the "true password", but after a password which grants access. Any password which matches the stored value will grant access, since the server grants access on the basis of "the entered password matches the stored hash value". So any candidate is good enough for the attacker.

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • Regarding caution, as stated in the question "along with salts, iterations, etc." :-) – IMB Aug 09 '12 at 17:23
  • I do agree that truncating "too many" characters makes you vulnerable to other passwords but given you have unique salt and iterations do you still not recommend truncating? Wouldn't truncating 2 or 3 characters protect you from dictionary/rainbow attacks since if the hash ever matches something the original value will never be known? – IMB Aug 09 '12 at 17:33
  • 1
    Truncating might offer some sort of resistance against rainbow tables which were computed _previously_ with full hash values, but that's an artefact of how rainbow tables work (as opposed to regular precomputed hash tables) and rainbow tables are not applicable anyway IF you use proper salts (and if you do not, then that's a more urgent problem). – Thomas Pornin Aug 09 '12 at 17:39
  • Proper salts meaning unique per user, random, at least 20 characters right? – IMB Aug 09 '12 at 17:41
  • "Proper salts" = unique per hashed passwords (you should change the salt when a user modifies his password, too). "Random with enough characters" is a cheap and easy way to get uniqueness with high enough probability. There is no _need_ for randomness or length, for a salt, but short salts and non-randomness make it more hard or expensive to maintain uniqueness. – Thomas Pornin Aug 09 '12 at 18:00
  • I now understand the risk involved with this but if we prevent duplicate/matching hash stored in the database then this fixes the issue of collision right? – IMB Aug 13 '12 at 11:37
  • 1
    @IMB: collision _of salts_ (collisions on the hash itself are a non-issue for password storage) can be avoided to some extent by looking at the database when choosing a new salt, but this is expensive. Also, it does not protect against time-wise collisions (reusing a past salt -- the attacker may have had access to the hashed passwords several times, and old passwords are still interesting to him because users reuse passwords) and trans-server collisions (the same software on another server+database uses the same salt value, again giving optimization opportunities to the attacker). – Thomas Pornin Aug 13 '12 at 12:24
6

By truncating the hash, you can only make it easier to crack. Let's say the hash is 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae and you store 2c26b46b68ffc68ff99b453c1d304134. The attacker now has an easier job: he can find any password whose hash begins with 2c26b46b68ffc68ff99b453c1d304134. If he's using a lookup table, he might have to rearrange it a little to cope with your nonstandard format, but that's an amount of work that's roughly proportional to the size of the table.

I think that by “hash function” you mean something like PBKDF2 or bcrypt, since you mention “salts, iterations, etc.”. Functions such as the SHA family are not an appropriate way to store hashes (see Why is using salt more secure?). The only way to invert a proper hash function is by straight brute force: guess the password, compute your guess's hash, compare with the reference hash. If the reference hash is a truncation of the real hash, the attacker gains a negligible amount of time doing the comparison. For a massive brute force attack, or if you truncate the hash too much, the attacker may have its job made easier by finding a collision on the truncated hash which wouldn't have been one on the original hash. However, this is not a major concern in practice, because the weak point will be the password anyway.

With a good hash function, each bit is as independent as possible from the other bits and doesn't reveal any information about the hash. Therefore a truncated hash function is still a hash function, with reduced strength. If your hash function is one of the SHA functions (or by implication an iterated hash derived from a SHA), the NIST recommendations for using hashes (NIST SP-800-107, §5.1) discuss the strength you can expect from an N-bit truncation.

Gilles 'SO- stop being evil'
  • 50,912
  • 13
  • 120
  • 179
2

I think that this question has good answers, so I just want to try to think in a different approach:

Given that the hash, salt, etc, is all done in the best known way, you get 64 characters (32 bytes) as output. If you throw 30 bytes away and just keep 2 bytes, will it be easier to find passwords that will generate that 2-bytes output? Probably yes, there'll be lots of passwords that will generate those 2 bytes.

And if you throw away 28 bytes, keeping 4 bytes? Probably less passwords will fit, but still much more passwords than the original 32 bytes hash.

And so on...

woliveirajr
  • 4,462
  • 2
  • 17
  • 26
2

It should go without saying, but this doesn't protect your site; it protects the user who re-uses their password on other sites. Kudos for thinking about your users (and doing so realistically).

Anyway, this is exactly equivalent to using a shorter hash function.

The problem is that you suggest a fixed number of bits. But password length, and strength in general, is quite variable.

It has been suggested that common password strengths are equivalent to a random key of length 18-30 bits. So, you could use that theory and truncate the hash to 15 bits - but then the 18-bit passwords would only require 23 = 8 guesses, once you've cracked the hash. Secondly, if you're worried about being called irresponsible, then doing that is going to tick both the "implementing your own custom cryptography is a really bad idea", and "will sound incredibly dumb if it comes up on the news" boxes. ("15 bit security, anyone"?).

This is not a sound idea. You should use standard key strengthening techniques like PBKDF2 or scrypt instead.

sourcejedi
  • 609
  • 4
  • 14
1

Suppose we have the hash 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae that was generated from a valid input. Lets simplify things and truncate the hash down to 1 character and see the results of our action.

Our new hash is "2".

The question is, how hard would it be to reverse the password from the truncated hash?

Many cracking algorithms would find many different passwords that match the hash! However, this may never find the original input. Also, the original input could never be proven that it created this hash.

Now suppose a different truncated hash of 2 characters.

The new hash is "2c".

A lot less passwords will be able to match the new truncated hash but there will still be many.

From this pattern you can see that the more a hash is truncated, the more passwords it could possibly be!

Obviously, this will we bad for authentication as well.

ponsfonze
  • 1,332
  • 11
  • 13
1

Truncating the hash may reduce the security of that hash by increasing the probability an attacker will be able to determine a password which generates that hash. However the question is whether there is any change to the difficulty to determine the password used to generate the hash. This is important if the same password is used to generate multiple hashes (using different salts).

Update

Thinking more about this last night, if the attacker has access to the salt then there is no impact to the difficultly in recovering the master password. The logic is thus: as a minor change in the hash input will have a large impact on the hash output, the same will be true of the partial hash. Therefore, if part of the input is known the only way to create the output is if the master password is determined.

Rory Alsop
  • 61,367
  • 12
  • 115
  • 320
ericball
  • 171
  • 2
-3

This makes absolutely no difference except the original cracking software would have to be modified to actually handle it correctly.

Andrew Smith
  • 1
  • 1
  • 6
  • 19
  • The assumption is the cracking software don't know it's truncated. Or if it knew, it doesn't know what was truncated. – IMB Aug 09 '12 at 19:48
  • 2
    If it's swapped, truncated, it is known or not known, doesnt make difference for a software, which can assume, that the hashes are not perfect, e.g. it's not completely dumb. If you want to trick with this some lame software then you can but if you can trick human I dont think so. – Andrew Smith Aug 09 '12 at 19:53
  • Well that's the goal, to make the software/human assume and do more "thinking" therefore making it extra harder to crack. – IMB Aug 09 '12 at 20:03
  • But you need to think only once, and make one release of the software, and then you can crack all the rest as normal. For 100.000 password one could make it - to recompile some crack or even edit it in hex editor is very easy. – Andrew Smith Aug 09 '12 at 21:21
  • Well it's not exactly "once" since in the first time you see the hash you can't tell at once if it was truncated and how many was truncated. If you are implying you have to make your software guess many times at "once" then yes I agree. – IMB Aug 11 '12 at 08:24