
I need some help understanding how passwords are used to authenticate who you are and thereby give you, the user, access to the appropriate data.

I have read these two articles over here: https://stackoverflow.com/questions/549/the-definitive-guide-to-form-based-website-authentication

How to securely hash passwords?

and basically my questions sort of flow from these articles. I am going to try and summarize what I think it is saying as part of my attempt to understand.

1. User creates a password (say "123456" is the password; I know that is a terrible password).
2. The server uses a key derivation function with a salt and iterations to create a key for that user to encrypt that user's data. So say for example we use PBKDF2.

After the key derivation, I know we should not store the passwords themselves as plaintext. But I am not sure what to do next.

Specifically in one of the articles I read, the author said this

A cryptographic hash should not be used for password storage because user-selected passwords are not strong enough (i.e. do not usually contain enough entropy) and a password guessing attack could be completed in a relatively short time by an attacker with access to the hashes. This is why a KDF is used - these effectively "stretch the key" meaning that each password guesses an attacker makes involves iterating the hashing algorithm multiple times, for example, 10,000 times, making the attacker's password guessing 10,000 times slower.

Does this mean that instead of hashing the plaintext password (e.g. hashing 123456), the server instead hashes the key generated by running PBKDF2? So is the third step then to hash the keys?

Because the other article says something a bit different. The other article says

We need to hash passwords as a second line of defence. A server which can authenticate users necessarily contains, somewhere in its entrails, some data which can be used to validate a password.

Once the password has been created, say the user ends his session in whatever application he is using and wants to get back into the application the next day. He enters his username and password. Are the following the correct steps for authenticating him?

  1. User enters password.
  2. Server runs key-derivation on that password
  3. Server hashes the key with the particular hashing algorithm
  4. Server calls the hashed key for this particular username from the database.
  5. The output of steps 3 and 4 are compared. If they match, then the user is authenticated, otherwise denied.
user278039

1 Answer


I read through the material and reread the Wikipedia entry on PBKDF2. I think I understand your confusion. PBKDF2 is a standardized algorithm (part of PKCS #5, published as RFC 2898 and later RFC 8018) for deriving a key from a password, and its output can also be stored as a password verifier. It is an algorithm, but it leaves several parameters open to the implementer: the particular hashing function or PRF, the salt size, the iteration count and the length of the output (key length). I’ll get back to all of this below.

Next, the article mentions cryptographic hash. What they mean by this is SHA-1 or MD5 or one of their brethren. Think of a cryptographic hash as a raw building block developed by cryptographers and designed to meet a certain set of constraints, in particular robustness in the face of known or discovered attacks against prior hashes, including improvements in computing power. Also, why is the cryptographic part important? Because simple hashing that you or I might invent would be easily broken, whereas the hash algorithms cryptographers invent are resistant to various known attacks. There was a time when passwords were stored in plaintext, because people didn’t know how to break into the system files and recover them. However, admins could see the raw passwords, and sometimes certain system operations could be tricked into revealing passwords. Subsequently, passwords were hashed with any old algorithm, and later with cryptographic hashes.

The problem with cryptographic hashes is that hackers could steal the password tables/files and use rainbow tables to match passwords. I may not know your password, but if I have its hash and I run a dictionary through the hash algorithm and a result matches the hash, bingo! Now, offline I can just build a large database of every word or sequence I choose at my leisure and anytime I get any password hashes, I match them against the prebuilt table.
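To make that concrete, here is a minimal sketch of the precomputed-lookup idea, assuming the stolen table holds plain unsalted SHA-256 hashes. The wordlist and the `sha256_hex` helper are made up for illustration, and a real rainbow table uses hash chains to save space rather than a plain dictionary like this.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Plain, unsalted SHA-256 of a password (the weak scheme being attacked)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical attacker wordlist; real ones hold millions of entries.
wordlist = ["password", "letmein", "123456", "qwerty"]

# Built once, offline, and reusable against every stolen unsalted hash.
lookup = {sha256_hex(word): word for word in wordlist}

# A hash taken from a stolen password table (simulated here).
stolen_hash = sha256_hex("123456")

# Recovering the password is a single dictionary lookup.
recovered = lookup.get(stolen_hash)
print(recovered)
```

A password that is not in the wordlist simply produces no match, which is why attackers favor very large lists of common passwords.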

In comes salt. The salt is a random value generated separately for each password. One rule of thumb is to make it as long as the cryptographic hash output, though 16 random bytes is a commonly cited minimum. It kills the ability to prebuild a rainbow table. When every password is hashed with a unique salt, even if two people (unknowingly) share the same password, no one can compare their hashes and discover that fact. The salt may be kept in the clear. It is not a secret so much as a rainbow table nullifier.
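A small sketch of the effect of salting, assuming a single SHA-256 round over salt plus password purely for illustration (this is not a recommended storage scheme on its own). The `salted_hash` helper and the two user names are hypothetical.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> bytes:
    """One round of SHA-256 over salt + password (illustration only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


# Each user gets an independent random salt, stored in the clear next to the hash.
salt_alice = os.urandom(16)
salt_bob = os.urandom(16)

# Both users picked the same password...
hash_alice = salted_hash("123456", salt_alice)
hash_bob = salted_hash("123456", salt_bob)

# ...but the stored hashes differ, so a prebuilt table of unsalted hashes
# is useless and the shared password is not visible from the table.
print(hash_alice.hex())
print(hash_bob.hex())
```

The server can still verify either user because it reads that user's salt back from storage and repeats the same computation.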

Once computers became really fast, it became important to further thwart hash breaking attacks on the chance someone steals your password hash table. To do that, security folks decided that running many rounds of hashing per password would address the problem, since every guess an attacker makes then costs that many rounds too. Think of a plain cryptographic hash as a KDF with only one iteration; some folks recommend 10,000 iterations instead.

Finally, we get to the KDF (Key Derivation Function) part. In the KDF, we combine a password (plus perhaps a username) and a salt and run it through some particular cryptographic hash 10,000 times. There’s the added bit of the PBKDF2 standard which allows you to choose the output length. You might want to use PBKDF2 to generate a key for an encryption algorithm or to pass to some other cryptographic function. The standard suggests a secure way to smear entropy bits throughout the entire output without risking reversing the inputs, which is always a problem for security protocols.
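As a rough illustration of the parameters just described, here is a sketch using Python's standard `hashlib.pbkdf2_hmac`. The choice of SHA-256 as the underlying hash, the fixed salt and the 10,000 iterations are assumptions for the example; a real salt should be random per password.

```python
import hashlib

password = b"123456"
# Fixed salt so the example is reproducible; use os.urandom(16) in practice.
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
iterations = 10_000  # the figure used in this discussion, not a recommendation

# Same password, salt and iteration count, different requested output lengths.
key16 = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=16)
key32 = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
key64 = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=64)

print(len(key16), len(key32), len(key64))
```

A 16 or 32 byte output could serve as an AES key, while the same call with the default length gives a value suitable for storing as a password verifier.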

With all of this in mind, to answer your questions: (a) store and use the results of the PBKDF2 directly, there’s no need to apply any other operations on the password verifier and (b) in your list of steps at the end step 3 isn’t really needed and step 2 does all of the work including running many iterations of a cryptographic hash to create the PBKDF2 output.
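Putting (a) and (b) together, a minimal sketch of registration and login might look like the following. It assumes PBKDF2 with HMAC-SHA-256 and an in-memory dict standing in for the database; `make_record`, `check_password` and the `users` table are hypothetical names for illustration, not part of any standard.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # figure from the discussion above; tune for your hardware


def make_record(password: str) -> dict:
    """Registration: derive the verifier and keep salt and iterations with it."""
    salt = os.urandom(16)
    verifier = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return {"salt": salt, "iterations": ITERATIONS, "verifier": verifier}


def check_password(password: str, record: dict) -> bool:
    """Login: rerun PBKDF2 with the stored salt and compare, with no extra hash."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    # Constant-time comparison avoids leaking where the bytes first differ.
    return hmac.compare_digest(candidate, record["verifier"])


# Stand-in for the user table; only salt, iterations and verifier are stored.
users = {"alice": make_record("123456")}

print(check_password("123456", users["alice"]))
print(check_password("1234567", users["alice"]))
```

Storing the iteration count alongside each record lets you raise it later for new passwords without breaking existing ones.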

Andrew Philips