Password authentication uses a pair of functions:
- Set password: takes a password as input, outputs a password hash and a salt. This is a randomized function: calling it twice on the same password returns different outputs. S(p) = h + s
- Check password: takes a password and a password hash plus salt as input, outputs “true” if the password hash is one that “set password” could have produced from the password and “false” otherwise. C(p, h + s) = b
Under the hood, these operations rely on a password-based hashing function F, which is a deterministic operation that takes a password and a salt as inputs and outputs a password hash. “Set password” generates a random salt and calls the password-based hashing function, and stores the salt together with the output: S(p) = F(p, s) + s where s is the randomly generated salt. The password hash contains the salt so that “check password” can call the password-based hashing function with it: C(p, h + s) = compare(F(p, s), h). (Note that I'm just describing the essence of what these operations do; the details depend on the algorithms and storage formats.)
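As a minimal sketch of the two operations in Python: here I'm assuming PBKDF2-HMAC-SHA256 from the standard library as F, purely for illustration. The names F, set_password and check_password, the 16-byte salt and the iteration count are my own choices for the example, not a recommendation, and a real system would normally use a maintained password hashing library and its storage format.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a tuned recommendation


def F(password: str, salt: bytes) -> bytes:
    """Deterministic password-based hashing function (assumed: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def set_password(password: str):
    """S(p) = F(p, s) + s: pick a fresh random salt and return (hash, salt)."""
    salt = os.urandom(16)
    return F(password, salt), salt


def check_password(password: str, stored_hash: bytes, salt: bytes) -> bool:
    """C(p, h + s) = compare(F(p, s), h), using a constant-time comparison."""
    return hmac.compare_digest(F(password, salt), stored_hash)
```

Calling set_password twice on the same password gives two different (hash, salt) pairs, and check_password accepts either pair for that password.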
The qualities of a good password-based hashing function are:
- Given a hash value (including the salt), there must be no way to find a matching password except trying them one by one (brute force).
- Brute force attempts must be slow.
For more information about password-based hashing functions, see How to securely hash passwords?.
Password-based encryption uses a pair of functions to turn a password into an encryption key:
- Prepare encryption: takes a password as input, outputs a key and a salt. This is a randomized function: calling it twice on the same password returns different outputs. The salt is stored with the encrypted data. The key is only loaded into memory, and wiped once the encryption is done. E(p) = k + s
- Prepare decryption: takes a password and a salt as input, outputs a key. D(p, s) = k
These two operations are very similar. “Prepare encryption” just generates a random salt, then does the same calculation as “prepare decryption”.
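The relationship between the two can be sketched like this, again assuming PBKDF2-HMAC-SHA256 as the underlying function. The function names, the 32-byte key length and the iteration count are illustrative assumptions; the actual encryption step and the wiping of the key from memory are left out.

```python
import hashlib
import os

ITERATIONS = 100_000  # illustrative work factor
KEY_LENGTH = 32       # assumed key size in bytes (e.g. for a 256-bit cipher key)


def prepare_decryption(password: str, salt: bytes) -> bytes:
    """D(p, s) = k: derive the key from the password and the stored salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_LENGTH
    )


def prepare_encryption(password: str):
    """E(p) = k + s: generate a random salt, then do the same calculation as D."""
    salt = os.urandom(16)
    return prepare_decryption(password, salt), salt
```

The salt returned by prepare_encryption is what gets stored next to the ciphertext; passing it back to prepare_decryption with the same password yields the same key.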
The qualities of a good password-based encryption function are (simplified):
- Different passwords must produce different keys, and knowing the key produced by some passwords must not help in finding the key for another password.
- Given a salt and possibly some encrypted data and the corresponding plaintext, there must be no way to find the key except trying either keys or passwords one by one (brute force).
- Brute force attempts based on the password must be slow.
This is very similar to the qualities of password-based hashing. It's possible to construct good password-based encryption functions that fail at password hashing or vice versa, but in practice these tend to be somewhat contrived examples. (For example, appending helloworld to a good password hash does not weaken it, but doing that for encryption would be bad since it would be trivial to predict that the key ends in helloworld.) As a consequence, password authentication and password-based encryption are built on the same kind of cryptographic primitive, which is called a password-based key derivation function or key stretching function. In the notation I used above, F and D are both key stretching functions.
When you point a tool like hashcat at a user database, it reads the hash value h and the salt s from the database and attempts to find a matching password by calculating F(p, s) for each candidate password p and comparing the result with the stored h. When you point it at a zip file, it reads the salt s from the encrypted file, calculates F(p, s) for each candidate password p and tries decrypting the file with the resulting key. If the result “looks good”, the decryption key is the correct one and therefore the password is a match. The definition of “looks good” depends on the data format: some contain an explicit key check value, while for others the tool has to rely on heuristics (for example, magic values at the beginning of certain file types).
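To make the symmetry concrete, here is a toy sketch of both loops. This is not how hashcat is implemented (real tools are heavily optimized and typically run on GPUs). F is assumed to be PBKDF2-HMAC-SHA256 with a deliberately low iteration count, and key_check_value is a hypothetical stand-in for whatever “looks good” test a real file format provides.

```python
import hashlib
import hmac


def F(password: str, salt: bytes) -> bytes:
    """Assumed key stretching function; iteration count kept low for the demo."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 1000)


def key_check_value(key: bytes) -> bytes:
    """Hypothetical key check value: first 4 bytes of SHA-256 of the key."""
    return hashlib.sha256(key).digest()[:4]


def crack_hash(stored_hash: bytes, salt: bytes, candidates):
    """User database case: compare F(p, s) with the stored hash h."""
    for p in candidates:
        if hmac.compare_digest(F(p, salt), stored_hash):
            return p
    return None


def crack_file(stored_kcv: bytes, salt: bytes, candidates):
    """Encrypted file case: derive the key, then test whether it 'looks good'."""
    for p in candidates:
        if hmac.compare_digest(key_check_value(F(p, salt)), stored_kcv):
            return p
    return None
```

Both functions spend almost all of their time inside F, which is why making F slow slows down both kinds of attack in the same way.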