10

Would hashing the result of a regular random number generator produce a cryptographically secure PRNG?

For example, would sha1(rand()) effectively be a secure PRNG?

Assuming it doesn't, how would you go about attacking it?

Edit: Lets assume that by attack I mean determine the next value it will generate after seeing a series of values it has produced.

Note: I should note that I'm not choosing to implement a PRNG in this way. I'm really interested in analysing the properties of this because I have seen similar constructs in code. It doesn't look like a good idea to me, but I'm struggling to think of the best way to attack it.

Colin Newell
  • 201
  • 2
  • 5
  • 1
    1) Yes: [KDF](https://en.wikipedia.org/wiki/Key_derivation_function) 2) No: [produces hash collisions with a complexity of 2^61 operations](https://code.google.com/p/hashclash/) 3) Depends on chosen cryptographic hash function. – TildalWave Jun 08 '13 at 21:13
  • 1
    @TidalWave how are hash collisions useful in this context? – Colin Newell Jun 08 '13 at 21:15
  • 3
    with `Rand()` the seeding is the biggest problem. – CodesInChaos Jun 08 '13 at 22:05
  • So I want to say that I'm very unsatisfied with some of the answers above. Much better answers: http://crypto.stackexchange.com/questions/9076/using-a-hash-as-a-secure-prng – derekmc Mar 29 '14 at 22:14
  • 1
    @derekmc That's a completely different question. This question is about hashing the output of a bad PRNG which is usually but not always insecure mostly because those PRNGs will be badly seeded. The linked question is about constructing a stream cipher from a hash, which is easy since we simply assume that the key/seed has high enough entropy. – CodesInChaos Mar 30 '14 at 08:20

4 Answers4

8

Hashing the output of a RNG is typically a component of making a cryptographically secure RNG, but it's not magic. It can't make a crappy RNG suddenly secure.

A key component in a cryptographically secure RNG is absolute unpredictability. If you can predict the output, then you can use that prediction as part of your attack. Running the output through a hash doesn't change the predictability.

Say, for example, that my super awesome random number generator alternates between returning the number 4 and 7. You can argue that my RNG is horrible, and you'd be right. But let's say that you now run the output through SHA1. So it alternates between returning SHA1(4) and SHA1(7). Still not random. So the answer is NO.

But hashes are useful for crypto RNGs

On the other hand, lets say you have a true source of entropy (e.g. a geiger counter pointed at a block of plutonium) which generates provably random but in our case biased results. Due to our sampling apparatus, the output stream is 57% 1's and 43% 0's. It's random in that you absolutely cannot predict the results, but if you guess "1", you'll be right more often.

We can neutralize this bias by running the output through a cryptographic hash. One of the properties of a cryptographic hash is that the output is NOT inherently biased, no matter what the input is. Obviously we have to collect sample blocks of input long enough to prevent ever hashing the same value twice. But the final output will have an approximately equal number of 1's and 0's, which makes our crypto more secure.

tylerl
  • 82,225
  • 25
  • 148
  • 226
  • 1
    Note that hashing is only one of a whole list of ways in which you can remove bias from a random bit stream. See [Dealing With Bias](http://en.wikipedia.org/wiki/Hardware_random_number_generator#Dealing_with_bias) in the WP HRNG page. – tylerl Jun 09 '13 at 21:36
  • Speaking from a strictly cryptographic point of view, passing a biased but truly random number generator through a hash function will not necessarily neutralize the bias. Cryptographically secure hash functions have the requirement of being collision resistant. Collision resistance is achievable even if the output is biased. If you make the additional assumption that your hash function behaves like the random oracle model, only then the bias will be eliminated. In fact, as a counter example I can easily design pathological hash functions that are intentionally biased yet collision-resistant. – dionyziz Feb 12 '17 at 09:01
  • @dionyziz collision resistance isn't the only property of a cryptographic hash. You could argue that it's not even a particularly important one. – tylerl Feb 12 '17 at 18:14
  • In "Introduction to Modern Cryptography", a seminal textbook of cryptography by Katz et al., they write that "In the context of cryptography, we [are] concerned with an adversary who may select elements depending on the hash function with the explicit goal of causing collisions." (pp. 128 in the first edition). They then devote a whole chapter to collision-resistant hash functions. There's no mention of other desirable properties of cryptographically secure hash functions by them (other than weaker notions of collision resistance). Can you explain why this is not an important property? – dionyziz Feb 12 '17 at 20:21
  • @dionyziz All cryptographic hash functions have an infinite number of possible collisions in an absolute sense, assuming infinite possible inputs. What makes a hash secure is the fundamental unpredictability of the mapping between inputs and outputs. Being able to predict or cause collisions is an attack on that unpredictability. Even being being able to predict properties of the input or output (including bias) weakens the hash. A correlary is that each bit of output must have an independently equal chance of being 1 or 0 for every input, which causes the whitening effect from the answer. – tylerl Feb 12 '17 at 22:13
6

Hashing has no effect at all in increasing the security of a PRNG like rand(). In fact, if the output of the PRNG is larger than the output of the hashing function, it will decrease its security.

sha(rand()) is, basically, security by obscurity. You're assuming that by making the PRNG output appear more random it will make it more secure, which is incorrect. I'll not approach the subject of cryptographically secure pseudo-random number generators (CSPRNGs), as I'm not well-versed in the subject. You just need to know that if an arbitrary rand() is insecure, then no matter how you mutate its output, it will stay insecure, and it might even make it less secure.

Now, if you're talking about PHP/C's rand(), then it's important to know it is far from being a cryptographically-secure PRNG.

As CodesInChaos said, the biggest problem with rand() is its seed's sources. Looking at php-src/ext/standard/php_rand.h, you can clearly see how the seed is generated

GENERATE_SEED() (((long) (time(0) * getpid())) ^ ((long) (1000000.0 * php_combined_lcg(TSRMLS_C))))

php_combined_lcg() boils down to a linear congruential generator (LLG), and those, LLGs, are not recommended to be used for cryptography purposes.

The Attack

The value returned by php_combined_lcg() is identical to the value returned by uniqid() which could be leaked through generated file names, generated passwords, poorly implemented tokens, etc. Once the attacker acquires that value, the rest is simple. The values of time() and getpid() are reasonably brue-forcable.

Adi
  • 43,808
  • 16
  • 135
  • 167
3

Part 1

No this is not secure, it would not be Semantically Secure (because Random() isn't secure)

I would argue that it wouldn't be secure because a hash function reduces the entropy of its input. In other words a hash function usually has less than a 1:1 mapping of results to input, and it has a greater chance of colliding with prior inputs than the raw input itself does.

In this case, non random input will simply be "longer" with more bits, but doesn't make it more random, which is the basis of most cryptographic needs. (Think One Time Pad)

A better solution:

Feed the non-random input into a "HKDF". A HKDF allows you to derive many keys from one key, with the assumption that the source key/PRF is not necessarily random. The HKDF uses the "Extract then expand" paradigm that adds randomness using a non-secret "salt" that is random. Sample HMAC(salt, yourKey)

Another approach is to feed that non random input into a stream cipher such as AES or 3DES and use that as a one time pad (or whatever you intend to use it for)

How would you attack it?

I need to know how this is being used. The attack is the same as the attack on random() with the added hashing delay afterwards. The simplified version of this attack is knowing that the area you need to brute force is much much smaller than 2^51 and that is likely non-negligible.

In a stream cipher example, it means that SHA1 will cause a collision in less than 281.5 terabytes is encrypted and the rand() function reduces that even further. (if anyone knows how to calculate this I'd appreciate that info)

Part 2

Just focusing on the "hashing" aspect that doesn't have the relation to a PRNG as you suspected:

Collisions are bad because it allows the attacker to do many things such as implement a Chosen Message Attack, Existential Forgery, or a variety of other attacks depending on where you use the resulting hash output.

In terms of hashing, a collision is most likely to occur in 2^n/2 operations (roughly the square root of the size of the output space).

Name       Size    Attack Operations
SHA-1       160    2^80               (Insecure, don't use)
SHA-256     256    2^128 
SHA-512     512    2^256
Whirlpool   512    2^256              (AES based and slowest)

What is listed above is the best possible collision resistance (in theory) based solely on the number of bits in the output.

Keep in mind the the realistic security of a hash will likely be less secure than the theoretical maximum. Take SHA-1 for example, the best known collision finder requires only 2^51 hash evaluations and the theoretical best case is 2^80)

makerofthings7
  • 50,090
  • 54
  • 250
  • 536
  • I've added a little more to my question to clarify what I mean by attack. I'm still struggling to understand where the general hashing problems mentioned in part 2 play a part in attacks on a PRNG, unless you're assuming that the PRNG is then being used to do some further operation. – Colin Newell Jun 08 '13 at 23:48
  • I should probably note that I'm not choosing to implement a PRNG in this way. I'm really interested in analysing the properties of this because I have seen similar constructs in code. It doesn't look like a good idea to me, but I'm struggling to think of the best way to attack it. I guess trying to guess the seeds to rand() is probably the most feasible attack? – Colin Newell Jun 08 '13 at 23:51
  • Guessing the seeds to rand is probably the best way. What specifically are you looking at? Also think of part 2 as a tangent... and might be an additional way the PRNG could be attacked. – makerofthings7 Jun 09 '13 at 00:53
-1

That completely depends on your application's security requirements, your random number generator, and your hash function. The fact is, it could be secure, but answering that question would require careful research. "Secure" doesn't always mean the same thing in every context. How would an attacker try to capitalize on your compromised random number generator? When security researchers develop tools such as CSPRNGs they try to make it meet certain requirements that can make it a secure tool for a wide variety of applications.

That being said, it is generally a bad idea to create your own security solutions, especially when they are based on good ideas, because the devil is in the details, and a subtle bug is much more dangerous than an obvious one.

I suggest you spend some time reading how HMAC's work. At first glance, hash(message|secretkey) seems like it would be an effective way to create a message authentication code. However, because of certain unavoidable weaknesses of hash functions, among other reasons, security experts have decided that hashing the input and/or key multiple times and throwing in pads in between can create a more secure authentication code.

If you take some time to read why HMAC's are done the way they are, that may help you appreciate why recommended CSPRNGs are probably going to be better than what you can do yourself.

That being said, it appears like what you want to do is cut some corners. Instead of asking, how can I make a half-ass CSPRNG? You may want to ask if your random number generator really needs to be cryptographically secure.

If you really need that level of security, you should do it right. If you don't need it, going half-way only serves to give you a false sense of security, and makes it difficult to analyze and predict your security weaknesses.

If I am a hacker, a well hidden bug or weakness, (something like heartbleed), is going to be infinitely more valuable than something obvious, that will get recognized and fixed as soon as it becomes important enough to merit attention.

I would say that's one of the most important reasons not to invent your own when it comes to security.

If you are interested in researching more in depth the security potential of your proposed scheme, this might be a place to start. It is a related question in the cryptography stack exchange:

https://crypto.stackexchange.com/questions/9076/using-a-hash-as-a-secure-prng

derekmc
  • 1
  • 2
  • I have added a note to the question to explain that I'm not trying to implement a CSPRNG in this way, I am trying to determine the usefulness of an existing piece of code. – Colin Newell Apr 25 '14 at 22:01
  • makes sense now. sorry about my horribly unclear and long winded post. I still think you should read [this](http://crypto.stackexchange.com/questions/9076/using-a-hash-as-a-secure-prng) – derekmc Sep 21 '14 at 15:07