Expanding/Inverse Hash function

Question

First off, before I ask my question, let me explain what I am NOT asking. I am NOT asking for a way/method to reverse a hash output; by definition, a hashing function is one-way.

Is there such a thing as an inverse hashing function? I am likely calling it the wrong name. What I mean is a function where, for example, if the input is 100 characters long the output will be 150 characters long. So, a one-way hash where the output is always larger than the input. However, traditional hashes usually make the output smaller than the input (if the input is of a reasonable size).

I am no security expert, so I am open to the possibility that I am mistaken. However, it seems to me that this kind of hash with a longer output would be harder to crack with a rainbow table, etc. Of course, it would be less useful than a traditional hash for things like assigning an image an ID, but for security applications, might it be superior? And of course, the ratio of input size to output size needs not be fixed, making it that much harder to predict the input.

Keep in mind that a hash function does not necessarily "compress" data. All it does is to map any given variable-length input to a unique fixed-length output. (For a given value of "unique", of course.) The security is not dependent on the length of the output string (or at least not directly) but on the uniqueness of the output string. — Jonathan Garber, Apr 04 '13 at 20:20

score 3 · Accepted Answer · answered Apr 04 '13 at 20:35

Hashing is an algorithmic manipulation of data. A cryptographic hashing function is one-way function with a fixed output size regardless of the input size.

You seem to be talking about increasing security by using a different type of hash function, so no... no cryptographic hash matches properties that I didn't describe above. For other types of hashing (particular in regards to data structures), see http://en.wikipedia.org/wiki/Hash_function.

Otherwise, the answer is no and variable lengthening of input wouldn't be an improvement as it is impossible to generate more entropy by without adding more input. You can't add more input as one-way hash function must be repeatable to be useful.

score 2 · Answer 2 · edited Mar 17 '17 at 10:46

There is no security gain, for password hashing, to have a longer hash output. The attacker tries "plausible passwords" until a match is found with the stored hash value. The space of passwords that the attacker will try (regardless of how he exactly tries them, e.g. with a rainbow table) will be very small, in absolute terms (less than 2⁸⁰). Correspondingly, the attacker just needs to work on the first 80 bits of the hash value, even if you stored 2000 bits of output: 80 bits are sufficient, for the attacker, because he will not get many "false positives".

The "bigger is better" meme is strong in the industry, but in cryptography it is rather wrong.

That being said, it is possible to generate a long output from a small input; this is the working principle of stream ciphers. A simple way is to hash the password with a normal password hashing function (see this) and then use the hash output as key to a stream cipher (e.g. one of these). There was a hash function which combined both steps, called Panama (it turned out to be weak, but not for that reason; some of the principles of Panama were reused in the newly chosen SHA-3, aka Keccak).

score 1 · Answer 3 · edited Mar 17 '17 at 10:46

1

To add to Jeff Ferland's answer, it specifically would not make building rainbow tables any harder, computationally, given algorithms that are computationally equivalent. It would only require additional storage space for the output of the rainbow tables. However, this is insignificant from a security perspective because rainbox tables have largely become obsolete with the widespread usage of salted hashed, and the relatively inexpensive computing power we now have access to due to Moore's Law and modern GPUs.

So, if I need to calculate all possible hashes for a given input + a given salt in any case, the size of the output is essentially irrelevant.

edited Mar 17 '17 at 10:46

Community

1

answered Apr 04 '13 at 20:54

Xander

35,525
27
113
141

Yes, but what if the hash function was more intensive in its transformation? Also, I'd think the less collilisions in the hash the longer it would take to brute force. Sorry, did not mean to impply that the only threat was rainbow tables. Plus, there will always be programmers who forget to bring salt to the table! – TheCatWhisperer Apr 04 '13 at 21:53
@NotCodingCoder Sure, but the length of the output has mo effect on how computationally intensive the hash function is. You can make a hash function with a fixed output arbitrarily computationally intensive as it is, so there's really no value added by making the output longer. See the authoritative answer on hash function here: http://security.stackexchange.com/a/31846/12 – Xander Apr 04 '13 at 22:03
Sorry, but I do not understand how a hash with a longer output is not more secure when confronted with a brute force attack. No matter the function, the smaller the output the less possibilities, the less possibilities means more collisions. An attacker does not need to know your password, only a phrase which will result in the same hash. – TheCatWhisperer Apr 04 '13 at 22:09
@NotCodingCoder The nature of a cryptographically strong hash function is that it's unfeasible to find collisions. So while a longer output theoretically means that they would be fewer collisions, practically, it would only mean that something that's impossible is still impossible. See this answer for more: http://security.stackexchange.com/a/3145/12 – Xander Apr 04 '13 at 22:16

score 0 · Answer 4 · edited Apr 13 '17 at 12:48

Not sure what you are getting at. There are two main categories of uses of a cryptographic hash.

To check the integrity of a large file with a checksum delivered securely. E.g., the checksum is delivered via a secure channel (say https from a known site) while the file is transmitted via an unsecure channel but checked against the checksum at the end, and
To authenticate a user's identity via a password without storing the password in plaintext in a database. That is if my password is Correct Horse Battery Staple, I'd store $2a$12$5vc0XINA0MiPvgE3ffLnaOVGqrMeskLKjFvIg4BKryq3v5PXt1oIi as a 184-bit bcrypt hash in my database (with a 128-bit salt). When a user who knows the secret password logs in, the server compares the hash generated from the password (and salt) with the stored hash and only lets them in if they match.

The chance of an attacker randomly guessing the hash in one trial goes as 1 chance in 2^{{bit length of the hash}}. So for a 256 bit hash, an attacker has a 1 in 115792089237316195423570985008687907853269984665640564039457584007913129639936 chance of randomly guessing the hash in one try (and even if you let them try a billion hashes per second for a billion years on a billion computers, they would have only tried ~10³⁴ password and their chance of only successfully cracking the password is only about 1 in 10⁴² (e.g., the chance of winning powerball is about 1 in 10^8; so this is similar to your chance of playing powerball five consecutive times and winning the jackpot each of those five times). And again, that was for a 256 bit hash, a 184 or 512 or 1024 bit hash would be exponentially stronger.

Now if you had a hash that had a non-fixed size, that seems to be counterproductive. E.g,. you transfer a 1 GB file over the internet, do you really want to compute a checksum on a checksum that's the same size or larger? Also that would leak information about data that's hashed. E.g., if your hashing function increases in length by two bytes for each byte added, you can find the length of the original password; which could be useful to quickly find the shortest passwords to try to brute force via dictionary search of passwords of that length. And again, 2²⁵⁶ is already strong enough to overcome any feasible attacker, there's no need to try using a hash that's 1 GB in size to prevent an attacker that can make 2^8589934592 attempts, as there's no feasible way to make 2²⁵⁶ even with very unreasonable assumptions.

That said, key expansion -- that is taking one pseudo-random key of a given relatively-short length (like 128-bits/256-bits) and then using it as a "seed" to create many more pseudo-random keys (of total length much longer than the original key) is commonly used for many cryptographic purposes. E.g., in AES128, a 128 bit key is expanded into 11 round keys that are each 128-bits (e.g., 1408 bits in total) that are used in each round of the AES encryption process.

Similarly, a stream cipher is often essentially equivalent a key expansion; e.g., you can construct a stream cipher from a block cipher operating in counter mode. Basically, say you have a block cipher (like AES256) where you take a 256-bit block of data P, a 256-bit key K, and can get a ciphertext from an encryption function like C = E(K,P). Well if you start with a random 256bit P, you can create a pseudorandom stream of very long length using C0= E(K,P) for the first 256 bits, C1=E(K,P+1) for the next 256 bits, C2= E(K+2) for the next 256 bits. And you can do this to have a pseudorandom stream which you can XOR with your message. And you can do this for many terabytes of data before attacks become computationally feasible.

However, I wouldn't call key expansion/stream cipher the opposite of a hash. They are just different things.

So expansion might be helpful in a ciper, not not really a hash? — TheCatWhisperer, Apr 04 '13 at 22:38

score 0 · Answer 5 · edited Oct 07 '21 at 06:58

Something that closely matches this description is a KDF, or key stretching, e.g. PKCS#5 PBKDF2. The intent of such a scheme (in the absence of a more suitable password hashing scheme such as bcrypt) is to make brute-forcing of a user chosen password (dictionary-based attack) more expensive than a full brute-force attack over the entire input domain.

Although the output of the KDF may not always have more bits than an arbitrary user-chosen password, it is more "opaque", and may be longer that than the minimum password length (though it cannot add entropy, as noted in Jeff Federland's answer).

Hash output size isn't a good indicator of resistance to rainbow-table attacks, salt is.

Expanding/Inverse Hash function

5 Answers5