Arguments that prove a hashing scheme is sufficient

Question

We're currently working on a new server application that is going to communicate with client applications on many different systems. For this we wanted to implement a new password hashing scheme.

I've done quite some research, including reading many questions on security.SE and crypto.SE, about the hashing of passwords. After this research I've come up with the following hashing scheme:

We will use PBKDF2 as the hashing algorithm. We chose it mainly because it is available directly in the .NET Framework. I've looked at both bcrypt and scrypt, but because both need third-party libraries we went with PBKDF2.

PBKDF2 also gives us a salt for free when a hash is generated for the first time.

The question then was how to save the data in the database. We had two options:

Save each value separately, i.e. hash, salt, iterations and algorithm.
Or save it as a single string.

We went with the single string value which looks as follows:

<algorithm>$<iterations>$<salt>$<hash>

This is similar to how, for example, Django does it.
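
To make the format concrete, here is a minimal sketch in Python (not the .NET code from the question) of building and parsing such a string. The label `pbkdf2_sha256`, the 16-byte salt, the Base64 encoding and the helper names `hash_password` and `parse_record` are assumptions for illustration only; the question does not specify them.

```python
import base64
import hashlib
import os

ALGORITHM = "pbkdf2_sha256"  # hypothetical label; any agreed-upon name works


def hash_password(password: str, iterations: int = 100_000) -> str:
    """Return '<algorithm>$<iterations>$<salt>$<hash>' for a new password."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # Standard Base64 never produces '$', so '$' is safe as a separator.
    return "$".join([
        ALGORITHM,
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def parse_record(record: str):
    """Split a stored string back into its four fields."""
    algorithm, iterations, salt_b64, hash_b64 = record.split("$")
    return (
        algorithm,
        int(iterations),
        base64.b64decode(salt_b64),
        base64.b64decode(hash_b64),
    )
```

Because every field is self-describing, the parser needs no knowledge of salt or hash lengths, which is what makes later changes to the algorithm or iteration count painless.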

I've implemented it like this and wrote up a report with our choices and arguments. But now I get questions about the design. For example, some wonder why the salt is stored so visibly: wouldn't it be better to combine the salt and hash in a single string to make it less obvious to attackers?

These objections usually assume that attackers only have the database. I claim that we should always look at the worst-case scenario and assume attackers have both the code and the database. I also claim that the salt is there to negate the effect of rainbow tables and lookup tables, and that the number of iterations is there to deter brute-force attacks.

But because I'm not a security professional, what I say can easily be ignored (although this goes for any argument on the internet ;)).

Therefore, I need anything that can show that what we created is a correct way of doing it.

I've just implemented a lot of things I found on the internet, but have no (scientific) proof that this is sufficient.

If you are wondering how many iterations to use in PBKDF2, read http://security.stackexchange.com/questions/3959/recommended-of-iterations-when-using-pkbdf2-sha256 — Matrix, Feb 20 '13 at 16:17
We've determined the number of iterations based on the time it takes to hash a password on a production server. This is also something that wasn't mentioned in the question. I did come across that question during my research, though. — Chrono, Feb 20 '13 at 17:45

score 5 · Accepted Answer · answered Feb 20 '13 at 19:30

The salt need not be secret, because if it needed to be secret we would call it a key.

Doing things correctly, and convincing other people that you did things correctly, are two distinct issues. In any case, we always assume that the attacker knows both the database contents and the code, because that's what happens in practice. Keeping code secret is very hard, since it exists as binaries and source code in many places (the server itself, the developers' machines, backup tapes, the developers' heads, and possibly the whole Internet if the code is open source). Assuming that the code is secret is not a good foundation for security.

Therefore, you may (and must) assume that the attacker will know which part of your stored string is the salt, and which part is the hash value. Hence, you may as well use a clear and precise encoding, like you suggest.

For the convincing part, quote NIST (the US federal organization which deals, in particular, with cryptography standards), Special Publication 800-132. See for instance page 3 of that document, where the salt is defined as:

A non-secret binary value that is used as an input to the key derivation function PBKDF specified in this Recommendation to allow the generation of a large set of keys for a given password.

(emphasis on "non-secret").
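
As a small illustration of that definition, the following Python sketch derives two values from the same password with two different, openly known salts. The password, the 16-byte salt size and the low iteration count are assumptions chosen only to keep the demo short.

```python
import hashlib
import os

password = b"correct horse battery staple"
iterations = 10_000  # deliberately small for a demo, not a recommendation

salt_a = os.urandom(16)
salt_b = os.urandom(16)

key_a = hashlib.pbkdf2_hmac("sha256", password, salt_a, iterations)
key_b = hashlib.pbkdf2_hmac("sha256", password, salt_b, iterations)

# Same password, different salts: the derived values differ, so a single
# precomputed table cannot cover both records.
different = key_a != key_b

# Anyone who knows the salt and the password can recompute the value;
# that is all the verifier needs, and it is why the salt is stored in clear.
reproducible = key_a == hashlib.pbkdf2_hmac(
    "sha256", password, salt_a, iterations
)
```

Knowing `salt_a` gives the attacker nothing beyond the ability to start guessing passwords for that one record, which is exactly the work the iteration count is meant to slow down.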

Nothing more convincing than a quote from NIST eh. — Feb 21 '13 at 00:12

score 2 · Answer 2 · answered Feb 20 '13 at 16:16

Simply appending the salt to the hash rather than having them clearly laid out is effectively security by obscurity and adds minimal value. The other question is whether there is any reason not to do it. If you know how to parse the string, parsing is trivial however it is formatted, so direct concatenation has no immediate downside.

That said, if there is a potential that the algorithm or length could change in the future, then having a fixed width field would be a negative. Since you are also storing the algorithm used, this is perhaps a good reason to not simply concatenate the values. Either way, there isn't really a significant security implication to spelling out that a particular value is the salt versus trying to obfuscate it. The secrecy of the salt is not required for the security of the system.

score 0 · Answer 3 · answered Feb 20 '13 at 19:08

You need to store all this information (algorithm, iterations, salt, hash) for each user individually, so it has to be in the database. The salt has to be unique for each user. While many entries will have the same algorithm and iteration count, over time there won't be a single value for all users. The reason is that as time goes on, you'll want to ramp up the number of iterations (each time you upgrade your server), and occasionally change the algorithm (e.g. switch to scrypt if a new version of .NET offers it). But you can only do that when a user logs in, because you need the original password string to build the new hash. Since this is unpredictable, when a user authenticates, you need to retrieve the algorithm, iteration count and salt, use these plus the password entered by the user to compute the hash, and compare that hash with the reference hash stored in the database. So you need to be able to parse all these fields.
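
The login flow described above could be sketched roughly as follows in Python. The function names `make_record` and `verify_and_upgrade`, the label `pbkdf2_sha256`, the Base64 encoding and the value of `CURRENT_ITERATIONS` are all hypothetical; treat this as an outline under those assumptions, not a drop-in implementation.

```python
import base64
import hashlib
import hmac
import os

CURRENT_ITERATIONS = 200_000  # hypothetical current policy, raised over time


def make_record(password: str, iterations: int) -> str:
    """Build '<algorithm>$<iterations>$<salt>$<hash>' with a fresh salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return "$".join([
        "pbkdf2_sha256",
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def verify_and_upgrade(password: str, record: str):
    """Return (ok, new_record); new_record is None unless a rehash is due."""
    algorithm, iterations, salt_b64, hash_b64 = record.split("$")
    if algorithm != "pbkdf2_sha256":
        raise ValueError("unsupported algorithm: " + algorithm)
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(hash_b64)
    # Recompute with the parameters stored alongside this user's hash.
    actual = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(iterations)
    )
    if not hmac.compare_digest(actual, expected):  # constant-time compare
        return False, None
    # The plaintext is only available now, so this is the moment to rehash.
    if int(iterations) < CURRENT_ITERATIONS:
        return True, make_record(password, CURRENT_ITERATIONS)
    return True, None
```

On a successful login the caller would store `new_record` in place of the old string whenever it is not `None`, so records migrate gradually as users authenticate.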

The security of password storage doesn't rely on the meaning of the fields being obscure to attackers. Usually, the attacker knows how the application works (he may be running the application on a test server, or even have the source code). Even for an in-house application, the attacker may have created a few accounts and can easily try various combinations.

“Make it less obvious to attackers” is very obviously security through obscurity. It's akin to putting your head in the sand and assuming that your attacker will do the same. There is absolutely no point in hiding information that is irrelevant to the security of the system.

On the other hand, there are risks and costs associated with not laying out the fields clearly. Over time, you will need to maintain your application, and you may have to rewrite the authentication module at some point. The simpler your encoding is, the easier maintenance will be. Following a standard scheme is a good idea (dollar-separated schemes aren't specific to Django). Using a complex in-house encoding increases the risk that you'll get it wrong, resulting in unreadable password hashes or, conversely, in easily cracked truncated hashes.
