I find it hard to believe, and figure I just missed or forgot something obvious (years since I thought about this stuff), so maybe someone can point it out. But it seems like most of what I know about salts is they're kinda loose in guidelines and only really beg for uniqueness. But if you're not following some strict guidelines it seems you can attack them using rainbow tables. And a similar attack on an unsalted password would only be possible if you allowed passwords larger than your digest size as input.
Outline:
For simplicity's sake I'm making some assumptions, they are quite a few (but I feel semi-reasonable):
- The salt is random and unique and of static length.
- And let's assume our hash function is uniform random (which theoretically a good hash should be), ie if I hash a counter I'll get a statistically good PRNG.
- The salt is applied by concatenating with the password: hash(pass+salt)
- The salt is there only to defeat a rainbow table
- The entire point of using the hash is to store a check for the password, there is no encrypted or plaintext version of your password available to be checked. And the password is never used directly (as an encryption key for example).
Since the defender has no way of validating input we'll use a password of "0" (in the event it is pre-validated for password requirement correctness before sending, then use any input that passes the pre-validation, the only thing that matters in this step is that your password goes through and is constant) Compute a rainbow table for the hash(pass+salt) for "each value" of salt and a constant password. We now have a rainbow table that tells us for which salts a password of "0" would work. We've found collisions with real passwords by leveraging the salt.
On average given a uniform hash we would expect that singular password to randomly sample the space of possibilities for each salt computed.
Example:
pass+(no salt) = 1 hash found, 1 hash valid.
pass+(1 bit salt) = 2 hashes found, one for each bit state, each selected uniformly from "digest space".
If we consider the fact that any combined pass+salt string that is 1 bit longer than the digest size must have a 50% collision rate across all possible inputs. And we also consider the fact that we're sampling only a subset of all possible inputs by having static bits (our password), then a salt greater than the digest size by N bits will give 2^N collisions for that password. The password's length is discounted from collisions by being static.
In other words, if we assume we're computing a rainbow table for all possible salts and a static password, then a salt longer than the digest is bad by magnitude N as it weakens all passwords by that amount (although you have to do more work to build the table, the point stands). Less makes you more vulnerable to a more standard rainbow table application at the very least (with a 1 bit salt you only need 2 rainbow tables for example).
So the best case is a salt of exactly digest size. Which means on average we can expect to have at most 1 collision. Given that's 1 collision per fake password rainbow table that's well within reasonable security.
Some things that may increase the utility of this for the attacker that weren't fully considered:
- Choosing a password that would pass any pre-validation check.
- Doing this for something like the top 10 passwords, assuming they're not pre-validated out of consideration this probably guarantees you a decent percentage of hits.
The assumptions were to remove the following from consideration:
- If one allows the salt to have different length then you would need a rainbow table for each length (which adds a little security, but I'm fairly certain it doesn't outpace the losses from the attacker being able to precompute that many more collisions). Possibly might be able to address that in the reduction function for the rainbow table and have chains consider different salt lengths.
- Defender must be blind for this to work.
- Some attack on the salt other than described (statistical, etc.), ie we could focus on a statistically likely hash region for a larger variety of static passwords.
- Applying salt in another way essentially is a different hash function, you need rainbow tables for each hash function under consideration. Also concatenation is not associative (associativity could completely invalidate the salt).
- The no direct password use bit is just covering my bases, the usage of a security system as a whole can potentially cover weaknesses of individual parts.
The Question:
IF the salt is not random and unique and of length exactly equal to the digest. THEN isn't there a feasible attack that can be made using rainbow tables? With a bigger problem the more you deviate from those guidelines? Or did I have an invalidating assumption or something? And if this checks out why have I never heard that the salt must be the size of the digest?