Should hashing hashed hashes colide or not?

Question

Since key stretching basically boils down to hashing hashes over and over again (where salt, pepper and password individualize the hash function, but the principle remains the same), I wonder about this question.

On the one hand, hashes should be as collision-free as possible. For hashes themselves, that is theoretically achievable due to the same length, but this would yield the existence of a bijective map, i.e. for each hash h₁ you could give another hash h₂, the hash of which is the original hash h₁ again, so you can invert the hash. Then you could "simply" use the inversion to invert the entire chain of hashes and craft a valid password - even worse, if salt/pepper and password influence the key derivation in the same way, this could potentially render the salting useless since the crafted password might compensate its influence.

On the other hand, if hashing hashes collides, this means that some hashes are safer than others - some hashes cannot be obtained by hashing another hash, but other hashes can be obtained from multiple hashes. And these hashes would be attractors and be much more likely to occur after many rounds of hashing than the unobtainable ones, effectively reducing the potential keyspace.

So, which of these two possibilities is worse, and which one is the actual case in reality? Since the choice of password and salt/pepper modify the hashing function, does the answer depend on that? Could this problem be solved by concatenating the entire chain of hashes instead of only the final one? (If that doesn't already have a name, may I propose hash cat?)

Related: [How big is the risk of hash fixed points/cycles?](http://security.stackexchange.com/q/26043/3272) — Tobias Kienzler, Mar 08 '13 at 14:33
Related: [General purpose slow/unique hash routine for dup checking of private data, without storing the data itself?](http://security.stackexchange.com/q/10659/3272) — Tobias Kienzler, Mar 13 '13 at 13:36

score 1 · Accepted Answer · edited Mar 17 '17 at 13:14

Since hash functions have a much larger input space than output space, collisions are a normal feature, and it is expected that every possible output corresponds to a lot of matching inputs.

As a rule, collisions are not a problem for password hashing. The previous paragraph means that, for every hash output (in particular the one stored in the server database) then there are many matching passwords which are all equivalent. However:

to find any single password which matches the stored output, you have to break preimage resistance of the hash function;
to find a second password which yields the same hash than a known password, you have to break second preimage resistance of the hash function.

A "cryptographically secure" hash function with an output of n bits ought to offer resistance 2ⁿ to both kind of attacks, so no real problem here. Known a collision in abstracto does not give much power to any attacker: the attacker can use a given password, and he has a "spare" with the same effect; so what ?

As for the reduction of space when iterating the function, see this previous answer. Summary: space reduction does occur, but not below an inner space of size 2^n/2, which moreover:

requires a lot of effort to be actually reached;
cannot be efficiently characterized (you cannot easily know if a given value is part of that inner space or not).

Use SHA-256 and be happy.

The input space of hashed hashes is equal to the output space (a hash is as long as a hash...) - but as you state, even that is so huge that collisions are unlikely to cause harm. But what about the other side, what if the input subset of hashes is collision-free? Again there's the enormity of the hash-space, but in this case one could consider the existing inversion another hash which may actually happen to be easier to retrieve than doing a preimage attack, or not? — Tobias Kienzler, Mar 08 '13 at 14:43
Note that having a bijection and being able to compute it in two ways are distinct things. If you take a RSA key _(N,e)_ (_N_ is the modulus, _e_ the public exponent), then raising to the _e_-th power modulo _N_ is a bijection that everybody can compute in that direction, and certainly not (in the general case) in the reverse direction. — Thomas Pornin, Mar 08 '13 at 14:54
Ah yes, good point. That's the very essence of any asymmetric encryption... — Tobias Kienzler, Mar 08 '13 at 14:58

score 1 · Answer 2 · answered Mar 08 '13 at 14:37

If any level of the hash collides, all additional levels will also collide. The space is so big however that you won't often have a problem if using a cryptographically secure hash.

For example, if you hashed ABC and got 123 and then hashed 123 to get 456. If you hashed DEF and got 123, your second round of hashing would be hashing the same thing as your first example. The keyspaces are so large that it isn't a practical problem though. The problem could also be mitigated by hashing the result of the last 2 iterations (thus the hash of ABC123 is likely to be different from the hash of DEF123).

Should hashing hashed hashes colide or not?

2 Answers2

Linked