I understand why a hashing algorithm should be slow, but is the method used to make it slow important to the strength of the hash? Everything I've read says the algorithm should be computationally slow: hash the input over thousands of iterations, or concatenate it with huge strings to slow it down. This seems like it would put unnecessary strain on the CPU. Couldn't you just hash the password once with a good random salt and then pause the thread for a set amount of time?
2 Answers
The goal isn't to make the hash slow for you to compute. The goal is to make the hash slow for an attacker to compute. More specifically, slow for an attacker who has fast hardware and a copy of both the hash and salt, and who therefore has the ability to mount an offline attack. The attacker need not pause a thread during his computations just because you added that to your application. He is going to use software and hardware that will allow him to compute the hashes as quickly and efficiently as possible. Therefore, in order to make it computationally hard for him, with all of his fast hardware and his efficient hashing software, the hash must be computationally hard for you to compute as well.
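To make that concrete, here is a minimal sketch of key stretching using PBKDF2 from Python's standard `hashlib` (the iteration count is illustrative, not a recommendation):

```python
import hashlib
import os

def slow_hash(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    # PBKDF2 applies the underlying hash `iterations` times, so an attacker
    # must pay the same per-guess cost on his own hardware; a sleep() in
    # your application would cost him nothing.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

salt = os.urandom(16)                     # random per-password salt
digest = slow_hash("correct horse", salt)
```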
That makes sense, thanks. Given that there is an infinite number of strings that can be hashed but only a finite number of hash values (though a very large one), does it really matter that I hash the string over and over? No matter how many times I hash something, it's still going to be in that set of hash values. Couldn't an attacker just hash a thing once, compare it to the hash that he stole, and move on if it doesn't match? – Matthew Aug 08 '14 at 15:42
@Matthew No, because if he is not using the same mechanism you are, then his input will not result in the same hash when he attempts to use it as the input for your mechanism. Remember, he's not trying to find the hash. He already has that. He's trying to find the password that generated that hash, and to do that he must use the same mechanism (including the same number of iterations) that you are using. – Xander Aug 08 '14 at 15:55
@Matthew Simple example. Suppose my password is "pwd" and the hash of that using 1,000,000 iterations of hash_algo_1 is "123". The attacker has "123". Suppose he attempts to generate that hash using 1 iteration of hash_algo_1 and finds that "abcd" results in the same hash, "123". That's fine, but when he tries to use "abcd" as the password to access my account, it will be run through my mechanism with 1,000,000 iterations, and at the end of that process the hash will be something else entirely, like "456"; there will be no match, and he will have learned nothing. – Xander Aug 08 '14 at 16:00
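To illustrate the point in code, a toy sketch (plain SHA-256 iteration, purely illustrative and not a real KDF) shows that the iteration count is part of the mechanism: the same input ends at entirely different digests depending on how many rounds are run.

```python
import hashlib

def iterate(data: bytes, rounds: int) -> str:
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()

# A preimage found against the 1-round version is useless against the
# 1,000,000-round mechanism; the two computations end at different digests.
print(iterate(b"pwd", 1))
print(iterate(b"pwd", 1_000_000))
```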
I should add to the discussion that there may be a reason to artificially slow down your hashing to prevent online attacks, where the attacker is supplying multiple plaintext passwords in succession to brute-force a login. Here you could simply pause the thread before returning a result in order to slow down their progress, as in the sketch below. But this is different from slowing down the *hashing algorithm itself*, which is used to prevent offline attacks as mentioned in this very well-written answer. – lc. Aug 09 '14 at 02:19
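A minimal sketch of what lc. describes (the half-second delay is an arbitrary illustrative value):

```python
import hmac
import time

def check_password(computed_hash: bytes, stored_hash: bytes) -> bool:
    match = hmac.compare_digest(computed_hash, stored_hash)  # constant-time compare
    # This pause throttles online guessing only; an offline attacker never
    # runs this code, so the hash itself must still be slow to compute.
    time.sleep(0.5)
    return match
```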
Running a few thousand iterations or appending a very long salt is not necessarily the best way of making a hash run slow, but it is the most obvious (easiest) one.
The intent is first and foremost to slow down massively parallel attacks. An attacker will not be able to brute-force your hash in reasonable time on a single CPU. He will be using either a botnet or a machine with several GPUs (or both).
Running the same hash a thousand times is indeed not the wisest approach, since it is easy to run this task in parallel and it mostly stresses the ALU, both of which are perfect matches for GPU computation. On the other hand, the impact on your own CPU may be very noticeable (depending on how many iterations you do).
Ideally, a slow hash would involve operations that make parallelization hard and that utilize memory bandwidth in addition to raw ALU throughput, such as requiring and touching substantial amounts of RAM (which also limits the number of instances that can run on a zombie without being detected), or operations that are expensive on GPUs (e.g. random-access writes).
Of course, this is not as trivial to achieve as simply running a hash for a few thousand iterations, which is why one usually resorts to doing the latter.
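For example, here is a minimal sketch using scrypt, a deliberately memory-hard KDF available in Python's standard `hashlib` (the cost parameters are illustrative):

```python
import hashlib
import os

salt = os.urandom(16)
# scrypt's parameters force roughly 128 * n * r bytes of working memory
# per guess (about 16 MiB here), which caps how many instances can run
# in parallel on a GPU or on an undetected zombie.
digest = hashlib.scrypt(b"correct horse", salt=salt, n=2**14, r=8, p=1)
```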
How would you presume to run the iterations in parallel when the output from earlier iterations is a required input to following iterations? – Xander Aug 09 '14 at 15:20
@Xander The expected attack would target many candidate passwords; if it's possible to run ten such tests in parallel, then the brute-force scan can happen ten times faster on the same hardware. Also, the naive implementation of 'run a hash a thousand times' can be vulnerable to the issue that if A hashes to B, and I just calculated A->B->...->X with a lot of effort, then calculating a second candidate B->...->X->Y can be done with just a single hash. This can be circumvented, but all such things make a good slow hash nontrivial to design, so it's not recommended to roll your own. – Peteris Aug 09 '14 at 22:27
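A toy demonstration of that chain-reuse issue (plain SHA-256 iteration with nothing mixed into the rounds, which is exactly what a real KDF avoids):

```python
import hashlib

def chain(data: bytes, rounds: int) -> bytes:
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data

a = b"candidate-A"
b = hashlib.sha256(a).digest()  # suppose candidate B happens to equal H(A)
x = chain(a, 1000)              # the expensive part, done once for A
y = hashlib.sha256(x).digest()  # testing B now costs just one more hash
assert y == chain(b, 1000)      # H(H^1000(A)) == H^1000(H(A))
```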
@Xander: It's what Peteris said. A GPU typically has several hundred to a few thousand cores and can run hundreds of thousands of hardware threads in parallel, switching threads in and out on the fly to hide RAM latency and such, effectively giving a near-100% load. On a Titan, that would mean, for example, a roughly 5,760-fold speedup. A good "slow" hash would prevent itself from scaling that way. For example, consuming a non-trivial amount of RAM will put a limit on how many thousand tasks you can run in parallel. – Damon Aug 10 '14 at 14:01