
I think that I understand the right application cases for the following crypto functions, but I'd like to confirm my understanding and also propagate the safe use of crypto throughout the intertubes. Cryptography and its implementation are often messed up, and an understanding both of what the algorithm does and its use cases is really important.

Here's my understanding:

Cryptographic Hash Functions (Hashes)

A cryptographic hash function maps an input of any size to a fixed-size output. The same input always produces the same output, and it should be computationally infeasible to find two different inputs with the same output, so the output is "unique" in practice. Popular examples are SHA-1 and the SHA-2 family, which includes SHA-256.

You should use a hash function to "guarantee" the integrity of a file or payload, but both hashes and the original payloads should only be transferred over trusted/authenticated/encrypted channels, because anyone can generate a hash from any input, including an attacker who has modified the payload. You shouldn't use a hash function directly for key derivation; use a KDF. You shouldn't use a plain hash as a message authentication code, because hashes are unkeyed and can be generated without knowledge of any secret. You shouldn't use a hash directly to store user-entered passwords; KDFs are for that.
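As a rough illustration of the integrity use case and its limit, here is a minimal sketch using Python's standard `hashlib`. The helper name `sha256_hex` and the sample payloads are made up for this example; they are not from any particular protocol.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical payloads, purely for illustration.
payload = b"contents of some file"
digest = sha256_hex(payload)

# The receiver recomputes the digest and compares it to the published one.
received = b"contents of some file"
integrity_ok = sha256_hex(received) == digest

# No secret is involved, so an attacker who can replace both the payload
# and the digest can produce a pair that passes the same check.
forged = b"contents of some other file"
forged_digest = sha256_hex(forged)
forged_passes = sha256_hex(forged) == forged_digest
```

Both `integrity_ok` and `forged_passes` end up true, which is why the digest only helps when it arrives over a channel the attacker cannot modify.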

Hash-based Message Authentication Codes (HMAC)

Hash-based Message Authentication Codes combine a secret key and an input to produce a tag that only someone possessing both the original input payload/file and the secret key could generate. HMAC implementations should follow RFC 2104.

You should use an HMAC to securely validate payload contents. If you encrypt a file using an unauthenticated cipher mode like AES-CBC, you can compute an HMAC over the encrypted output with the encryption key to "authenticate" the payload. This means that you can only validate the contents of the encrypted file if you also have the key. This makes the encrypted file tamper-evident; you can know with fairly high certainty that the file is what you think it is, provided that your key has stayed secret.

A simpler example of a use case for HMAC is sending a file over untrusted channels: if Bob emails a file and an HMAC to Alice and they both know that the key is "1337p@$$w0rdz", Alice can determine with high probability that the file she received is the same one that Bob sent. If Mallory also knows the key, however, and Mallory is able to modify the email that Alice receives, Mallory will also be able to generate files which Alice will trust.

You shouldn't use an HMAC for hashing a user-entered password with an internally known "secret"; use KDFs for that.
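The Bob-and-Alice exchange could look roughly like the sketch below, using Python's standard `hmac` module with SHA-256. The function names `make_tag` and `verify_tag` are hypothetical, and the key here is random bytes from `secrets.token_bytes` rather than a memorable password, which is an assumption of this example.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, tag)


# Shared secret known to Bob and Alice (assumed to be exchanged out of band).
key = secrets.token_bytes(32)

# Bob's side: tag the file contents before sending.
message = b"file contents sent by Bob"
tag = make_tag(key, message)

# Alice's side: accept the file only if the tag verifies.
accepted = verify_tag(key, message, tag)
tampered_accepted = verify_tag(key, message + b" (modified)", tag)
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading bytes of a forged tag were correct.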

Key-Derivation Functions (KDF)

Key derivation functions are cryptographic functions designed to generate a hash-like output (a key) from a secret input. The password-based ones are specifically designed to be computationally expensive, so that guessing the input by brute force is slow. Popular examples are PBKDF2, bcrypt, and scrypt. They often use hash functions or HMAC internally.

Whenever you are dealing with a user-entered password, you should use a KDF. For example, if you need to encrypt a file using AES-256 in CBC mode, you should take the password entered by the user, run it through a KDF, and then use the KDF output as the encryption key. If you need to store user-entered passwords in a database, pass them through a computationally expensive KDF, and then store the resulting output in the database instead of the password itself.
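Both uses can be sketched with PBKDF2 from Python's standard `hashlib`. This is only an illustration under assumptions: the function names are hypothetical, the random per-password salt and the iteration count are choices made for this example rather than something stated above, and bcrypt or scrypt would slot into the same shape.

```python
import hashlib
import hmac
import os

# Illustrative work factor (an assumption); tune it for your own hardware.
ITERATIONS = 600_000


def derive_key(password: str, salt: bytes, length: int = 32,
               iterations: int = ITERATIONS) -> bytes:
    """Derive length bytes from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived) for storage; the password itself is not kept."""
    salt = os.urandom(16)
    return salt, derive_key(password, salt)


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive from the candidate password and compare in constant time."""
    return hmac.compare_digest(derive_key(password, salt), stored)


# Use 1: a 32-byte key suitable for AES-256, derived from a password.
file_salt = os.urandom(16)
aes_key = derive_key("correct horse battery staple", file_salt)

# Use 2: password storage and later verification.
salt, stored = hash_password("correct horse battery staple")
login_ok = check_password("correct horse battery staple", salt, stored)
```

The salt is not secret; it is stored next to the derived value (or alongside the encrypted file) so the same derivation can be repeated later.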


Does this generally cover the different use cases of when you should and shouldn't use these three crypto operations?

Naftuli Kay
  • FYI: SHA-1 isn't a hash family, it's a specific hash algorithm. SHA-256 refers only to a specific algorithm in the SHA-2 family. Also, HMAC is "hash-*based* message authentication code", not "hashed message authentication code." – cpast May 08 '15 at 22:26
  • I was over-generalizing, you're right on all points. – Naftuli Kay May 09 '15 at 00:07
  • The KDF is not always computationally difficult. The HKDF, for instance, or the same HMAC can be used for KDF, without any additional difficult involved. The "KDF" that you have described is a PBKDF, when the input is a low entropy. :\ – Inkeliz May 18 '18 at 18:48
