2

First of all, Hello! I'm new to the site, don't know any english equivalent to "yoroshiku".

So, I was reading the SHA-512crypt generator documentation and found the part:

The default number of rounds for both algorithms is 5,000. To ensure minimal security and stability on the other hand minimum and maximum values for N are enforced:

minimum for N = 1,000

maximum for N = 999,999,999

Bery interesting since in my humble mind if you apply to many rounds in a hashing function you could end up with a collision bringing you back to square one of the hash.

Am I correct? or is my understanding of collisions wrong?

The way I see, you can both cause a collision with too long plain texts or with too many applications of the same hashing function.

Sorry for any grammar errors, I'm not a english-native speaker.

OscarAkaElvis
  • 5,185
  • 3
  • 17
  • 48
WOkzinhan
  • 21
  • 1
  • 2

3 Answers3

5

I believe that in this context yoroshiku can be rendered as the hope of a long and productive presence on Security.StackExchange :-). So, Welcome!

You are partially correct in your understanding of the collision problem; the risk exists, but if the algorithm has a "flat" enough output, the risk is essentially the same whatever the number of rounds.

The "security" that is increased by incrementing the number of rounds is the assurance that anyone attempting to generate large numbers of hashes (e.g. for a brute force attack) will require a suitably unlikely amount of computing power.

Let's say SHA-512 with 10,000 rounds costs ten times what SHA-512 with 1,000 rounds costs. Then, enumerating 10K round hashes for whatever purpose will take ten times as long than with 1K round hashes.

Usually, in most applications the "happy day scenario" involves only one call to the hash function (e.g. you're hashing the password supplied by the user). So we can use a high round number, and still allow the user to experience a negligible delay.

The way i see, you can both cause a collision with too long plain texts or with too many applications of the same hashing function

Yes, but you're only comparing calls with the same number of rounds. Say that for example 10K-round hash of "foo" is abcdef, and 1K-round hash of "bar" is too abcdef. This is a collision of sorts.

But when the system is run, it will not use the hash result at every round; it will wait until it has completed the 10Kth round before returning the hash. It will never even know that a "collision" was generated at round 1,000.

LSerni
  • 22,521
  • 4
  • 51
  • 60
  • That translation really was a warm welcome, best answer i could've hoped for when i read the question :) . You're absolutely right, rolling a dice a thousand times and reading only the last outcome has the same chance as rolling it once. It just takes longer on the first one. – J.A.K. Jan 24 '17 at 22:58
  • hmm, very clarifying, thanks; Buuuut, doesnt that mean that maybe there is a commom hashing path with round length of X and another with round length Y, so that X< – WOkzinhan Jan 26 '17 at 16:42
  • This might happen with some algorithms where the Poincaré recurrence theorem holds. But it's not that important, because the algorithm **will** run 400K rounds before checking its results. And your goal is to make it expend time. Of course if the attacker **knows** that there is a recurrence and knows it period, let it be T, then he can run a (N%T) hashing instead of N (you can google "rho structure"). But with well-designed hashes you should never get cycles shorter than about half its bitness, so with 256 bit you can still churn up to 2128 iterations before things go pear-shaped. – LSerni Jan 26 '17 at 17:14
  • And of course it is to be found on crypto.SE :-D - http://crypto.stackexchange.com/questions/25243/if-you-hashed-a-hash-an-infinite-number-of-times-would-you-end-up-with-a-unique – LSerni Jan 26 '17 at 17:16
2

Let's refresh your understanding of hash algorithms a bit.

Hash functions can be either cryptographically secure or not. Non-secure hash functions are useful for tasks like string mapping, error detection in communications protocols, so they still have a place. But they're not useful for cryptographic purposes, because they have certain weaknesses.

To be secure, a hashing function needs to have these three specific attributes:

  • Cascading changes: a one bit change in the message should result in an unpredictable change of approximately 50% of the bits in the digest.
  • Non reversible: you can't recreate a message given the hash except by brute force attack.
  • Collision resistance: it is infeasible to find two messages with the same hash value.

At its heart, the cascade effect is the most important attribute, as it provides the strength behind the other two attributes.

Each of these attributes protects against a certain type of attack. By executing a round of hashing, the crypt algorithm makes at least a one bit change to the message, resulting in a completely new hash. If the hash algorithm didn't have strong collision resistance, then yes, it would be possible to have multiple rounds that don't change the hash much.

Not having collision resistance has been a problem with non-cryptographic hash routines like CRC-32, where if the same message is altered only slightly, it can produce the same digest value. So if a source is pumping out data that varies by only a few bytes, it's possible that some could collide leaving you in the circle you described. That's why these attributes are so important for security purposes.

John Deters
  • 33,650
  • 3
  • 57
  • 110
0

The hashing function may have a short cycle or fixed point. A fixed point is a value for which hash(value) == value. This greatly reduces the security of an iterative hash, because any iterations done after reaching the fixed point don’t change anything anymore.

To solve this the hash can also use the iteration count as part of the hash:

hash = hmac(password, salt)
for (i = 0; i < iter_count; i++) {
    hash = hmac(password, hash + i)
}
return hash

Another solution, that PBKDF2 uses, is to XOR all intermediate hashes together in the result.

Hash cycles and fixed points are a bit of a theoretical risk. The chance of hitting one is very low, and no known cycles are known for hash functions such as SHA1.

Sjoerd
  • 28,707
  • 12
  • 74
  • 102