41

Most of discussions involving access credentials include references to "hashing salted passwords". Is this another way to referring to the HMAC algorithm or a totally different operation? Different or not, why not using HMAC since this is easily referenced to a published standard -FIPS198?

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
Drew Lex
  • 2,013
  • 2
  • 19
  • 24
  • 1
    We use special constructions for password hashes, that are slow and salted. PBKDF2 is typically used with HMAC-SHA-x as building block. – CodesInChaos Jan 30 '13 at 14:28

2 Answers2

41

HMAC is a Message Authentication Code, which is meant for verifying integrity. This is a totally different kind of beast.

However, it so happens that HMAC is built over hash functions, and can be considered as a "keyed hash" -- a hash function with a key. A key is not a salt (keys are secret, salts are not). But the unique characteristics of HMAC make it a reasonable building block for other functions, which is what happens in PBKDF2: that's a key derivation function, commonly subverted into password hashing (a role for which it appears to be adequate). PBKDF2 includes a salt, and internally invokes HMAC (several times).

Good password hashing must be slow (in a configurable way) and include a salt. These characteristics are not easily obtained, and you will not get them out of a lone HMAC (which is fast and does not use a salt).

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • "several times" - more like several *thousand* times. Good answer though, as always. – KeithS Jan 30 '13 at 17:23
  • Why are salts not secret? Are they stored somewhere? – Minix Sep 06 '17 at 12:34
  • 1
    @Minix Salts are not secret because when salts are secret we call them "keys". Or, said otherwise, some algorithms, in particular password hashing, need the extra variation provided by the salt, but without requiring the secrecy of an actual key. This allows storing salts more easily. In practice, the salt for password hashing is normally stored in the same place (another column in the same database row) as the hash value itself, so that the same hash may be recomputed. – Thomas Pornin Sep 06 '17 at 15:37
  • What if the table is compromised, then? Is the pure act of salting a password enough of a hindrance, that keeping the hash secret wouldn't add to the complexity significantly? – Minix Sep 07 '17 at 07:18
  • 1
    @Minix The _whole and only point_ of the hash is to provide some hindrance in case of DB compromise. If you can assume that the DB is not compromise then storing the passwords themselves in plaintext would not be a problem. In general, if you have a key and can be sure that you can keep it secret (even though the machine uses it regularly) then you can use a keyed MAC instead of a password hash. The practical world is such that guaranteed secrecy even in case of total server compromise is not a real thing; hence we need a hash that stands its ground even with no secrecy. – Thomas Pornin Sep 07 '17 at 12:28
  • @ThomasPornin I meant keeping the salt secret, my mistake. My thought is simply, that also keeping the salt secret makes it even harder for an attacker to crack the hash. But I guess as long as the hash function is secure it doesn't really matter if it is even more secure. And if the hash function is broken it won't really matter, that the attacker doesn't know the salt. – Minix Sep 08 '17 at 13:41
  • @ThomasPornin You absolutely should never store the salt in the same database or any database. – Theodore R. Smith Feb 09 '18 at 16:31
  • So in PBKDF2, is the input salt used as the input key for HMAC? – cprcrack Aug 08 '18 at 16:39
30

Short answer:

Kind of, but not really. A salt is simply random data added to the message before it is hashed, with the object of making the hash produced by a salted message different from anything an attacker may have already computed on his own with the same but unsalted message (or with any other salt, for that matter). Usually, salts must be public, in order to legitimately compute the correct hash of a secret message on demand. A "secret salt" for a public message, to produce a tamper-proof authenticated hash would work in theory, but in practice our hash functions are imperfect, and when used in this way they are vulnerable to attacks that add data to the original message and change the hash in a predictable way (a "length extension attack").

An HMAC incorporates additional secret data in the form of a key. This key is combined with the message in a much deeper way than a salt, and may be used in addition to a simple salt of the message. The end result is not vulnerable to the same attack that a simple hash with a secret salt would be, and so is a much stronger method for producing an authenticated hash.

Longer answer:

The basic purpose of a hash, any hash, is to emulate a "random oracle"; a conceptual black box that can be given any message and will produce a truly random, truly unique digest value to identify it, and will produce the same random value if ever given the same message. There is a need for this type of function in many cryptographic applications, all using the random oracle in a subtly different way. Unfortunately, true random oracles don't exist, so hash functions, with known and inherent weaknesses, have to be good enough.

A "hashed salted password" is one use. The idea is that the message (the password) is a secret. The hashed value allows someone to prove to a system that they know the correct password, without the system having to remember the actual password (making it vulnerable). Theoretically, just giving the message to a random oracle and getting back a truly unique value for it would be all that's necessary, but because we don't have random oracles and must use hash functions instead, the number of possible digest values a hash function can produce is finite, so theoretically one could simply find a message, any message, that produces each desired hash, and remember all of these so they can look up an observed hash and find a message that will produce it. Salts make this theory much more difficult, by requiring an attacker to compute a hash of every possible message with every possible salt in order to guarantee that he has the correct combination of message and salt to produce the observed hash. The salt value must be public information, because the correct salt must be used when legitimately hashing a secret message; this makes it different from a key, which must be kept secret to ensure the security of the scheme.

Thomas's observation of password hashing, however, is very correct; passwords chosen by users are inherently low-entropy, and so are relatively easy to guess by brute force, even when salted. To discourage this, additional security is required, which typically comes in the form of "key-stretching"; a computationally-expensive transformation is added, to make producing a single hash slower (not so slow that it would affect legitimate users). The added work for each hash makes repeating the hash calculation hundreds of thousands of times to brute-force a password an untenable situation for an attacker. Salting is still required, because the easiest way to avoid doing this work on-demand is to do it beforehand and remember the result, as with a "faster" hash algorithm.

An HMAC has a subtly different purpose. The idea is that the message may not be secret; therefore, there may well be no secrets involved if a basic hash were used. This is not an unheard-of state of affairs; file checksums are freely available, as are the files used to generate them, but the checksum hash is typically protected against modification. However, if the digest cannot be guaranteed to be "read-only", for instance if it sent along with the message, then the hash has no value because it is trivially changed to match any message it's sent with. A secret must be added back in to the process to ensure that only someone who knows the secret could use the message (or any message) to produce a matching hash. An HMAC combines the message (which can still be salted) with a key, giving us such a secret.

Now, it's fair to ask why a salt value could not simply be kept secret as a key, and concatenated to the message before hashing it with a primitive secure hash function. The answer ties back to the "we don't have random oracles" point. If we did have a random oracle, then we could keep some of the message secret, and the oracle would never produce the same value (or even a predictable one) given the rest of the message, unless we recombined it with the exact secret we have withheld. However, with most of our real-world "secure" hash functions, there is a known vulnerability allowing an attack called "length extension". Given a hash digest of arbitrary length, and knowledge of the hashing primitive that produced it, it is possible in many cases to produce a modified hash that would correspond to appending known data onto the end of the message, without knowing the entire original message (which in our case would include a concatenated or XORed "secret salt"). HMAC was specifically designed to defeat this, by combining the key with the message multiple times in a nested hashing operation, resulting in a digest that a length extension cannot be trivially performed on.

KeithS
  • 6,678
  • 1
  • 22
  • 38
  • 1
    So can I use HMAC(password, salt) in place of Hash(password, salt)? Does it accomplish the same thing? – Arlen Beiler Sep 06 '17 at 19:43
  • @ArlenBeiler No one seems to know! From what I see, HMAC is *basically* hashing data with a secret (aka a salt), with one or more tweaks to make it exceptionally hard for someone to recompute the hash...? – Theodore R. Smith Jan 03 '18 at 15:23