How are pseudorandom and truly random numbers different and why does it matter?

668

300

I've never quite got this. Just say you write a small program in any language at all which rolls some dice (just using dice as an example). After 600,000 rolls, each number would have been rolled around 100,000 times, which is what I'd expect.

Why are there websites dedicated to 'true randomness'? Surely, given the observation above, the chances of getting any number are almost exactly 1 over how ever many numbers it can choose from.

I tried it in Python: Here's the result of 60 million rolls. The highest variation is like 0.15. Isn't that as random as it's going to get?

1 - 9997653 2347.0
2 - 9997789 2211.0
3 - 9996853 3147.0
4 - 10006533 -6533.0
5 - 10002774 -2774.0
6 - 9998398 1602.0

Peter

Posted 2014-02-05T03:26:25.377

Reputation: 1 015

If your "pseudorandom number generator" just output the sequence 1-2-3-4-5-6-1-2-3-4-5-6-1-2-3-4-5-6-.... forever, your table would be even more uniform than the one you're showing here, but I doubt most people would have a very high opinion of such a PRNG. – Daniel McLaury – 2017-02-14T00:43:16.237

Take a look at wikipedia article on hardware generated random numbers Also see this - http://stats.stackexchange.com/questions/32794/are-truncated-numbers-from-a-random-number-generator-still-random

– steadyfish – 2014-02-09T04:53:03.290

21What do you mean by "rolls some dice"? Does it have a robot arm and camera attached? – starblue – 2014-02-09T19:34:31.060

while i agree with the general gist of your tone, that we often worry about this too much, but it has been exploited in real life: http://en.wikipedia.org/wiki/Ronald_Dale_Harris

– Grady Player – 2014-02-10T01:18:03.223

In the vein of this topic and the top/accepted answer: http://www.lauradhamilton.com/random-lessons-online-poker-exploit

– WernerCD – 2014-02-10T15:08:14.947

See this article about an online poker game missing true randomness for why it matters.

– Varaquilex – 2014-02-10T16:48:56.370

What happens when you run your program 100x? Do you usually get a slight overabundance of 4 and 5? Or is the error random? 2) Is there randomness to the odds of X followed by X+Y, where X,Y is member of {1...6} (and you are looping around)? You need randomness in the number that comes next, not just randomness in the distribution. 3) If you are generating a new seed each time, based on the clock, then you might actually have a truly-random (hardware-based) RNG. But I think that is cheating. A PRNG is supposed to be an algorithm that operates on a seed, if I understand it correctly.

< – dmm – 2014-02-10T19:35:17.543

see also what randomness really is [cs.se]

– vzn – 2014-02-13T05:25:17.777

1If you just keep a 0-5 counter and roll a dice accordingly, 666 gorillion times, you'd get an equal distribution as well. – jcora – 2014-02-13T20:49:54.747

@starblue, I think you're referring to the Dice-o-matic. http://hackaday.com/2009/05/26/dice-o-matic/

– JMD – 2014-02-14T20:07:09.667

Actually I wanted to highlight how hopelessly naive the question is, because it glosses over the most important aspect of randomness, the process by which the numbers are generated. – starblue – 2014-02-16T08:05:57.043

@Varaquilex Nice article! Look at the original report here, though. The problem is they used a weak PRNG with a predictable seed. At the end of the article, what the authors advise is using a stronger PRNG with an unpredictable seed. True randomness is required to provide a seed. There is little use for it once this is done.

– Erwan Legrand – 2014-02-19T14:40:35.883

Answers

1 386

Let's play some computer poker, just you, me and a server we both trust. The server uses a pseudo-random number generator which is initialized with a 32-bit seed right before we play. So there are about four billion possible decks.

I get five cards in my hand -- apparently we are not playing Texas Hold 'Em. Suppose the cards are dealt out one to me, one to you, one to me, one to you, and so on. So I have the first, third, fifth, seventh and ninth cards in the deck.

Earlier I ran the pseudo-random number generator four billion times, once with each seed, and wrote down the first card generated for each into a database. Suppose my first card is the queen of spades. That only shows up one as the first card in one in every 52 of those possible decks, so we've cut down the possible decks from four billion to around 80 million or so.

Suppose my second card is the three of hearts. Now I run my RNG 80 million more times using the 80 million seeds that produce the queen of spades as the first number. This takes me a couple of seconds. I write down all the decks that produce the three of hearts as the third card -- the second card in my hand. That's again only about 2% of the decks, so now we're down to 2 million decks.

Suppose the third card in my hand is the 7 of clubs. I have a database of 2 million seeds that deal out my two cards; I run my RNG another 2 million times to find the 2% of those decks that produce the 7 of clubs as the third card, and we're down to only 40 thousand decks.

You see how this goes. I run my RNG 40000 more times to find all the seeds that produce my fourth card, and that gets us down to 800 decks, and then run it 800 more times to get the ~20 seeds that produce my fifth card, and now I just generate those twenty decks of cards and I know that you have one of twenty possible hands. Moreover, I have a very good idea of what I'm going to draw next.

Now do you see why true randomness is important? The way you describe it, you think that distribution is important, but distribution isn't what makes a process random. Unpredictability is what makes a process random.

UPDATE

Based on the (now deleted due to their unconstructive nature) comments, at least 0.3% of people who've read this are confused as to my point. When people argue against points I haven't made, or worse, argue for points that I did make on the assumption that I didn't make them, then I know that I need to explain more clearly and carefully.

There seems to be particular confusion around the word distribution so I want to call out usages carefully.

The questions at hand are:

How do pseudorandom numbers and truly random numbers differ?
Why is the difference important?
Do the differences have something to do with the distribution of the output of the PRNG?

Let's begin by considering the perfect way to generate a random deck of cards with which to play poker. Then we'll see how other techniques for generating decks are different, and if it is possible to take advantage of that difference.

Let's begin by supposing that we have a magic box labeled TRNG. As its input we give it an integer n greater than or equal to one, and as its output it gives us a truly random number between one and n, inclusive. The output of the box is entirely unpredictable (when given a number other than one) and any number between one and n is as likely as another; that is to say that the distribution is uniform. (There are other more advanced statistical checks of randomness that we could perform; I'm ignoring this point as it is not germane to my argument. TRNG is perfectly statistically random by assumption.)

We start with an unshuffled deck of cards. We ask the box for a number between one and 52 -- that is, TRNG(52). Whatever number it gives back, we count off that many cards from our sorted deck and remove that card. It becomes the first card in the shuffled deck. Then we ask for TRNG(51) and do the same to select the second card, and so on.

Another way to look at it is: there are 52! = 52 x 51 x 50 ... x 2 x 1 possible decks, which is roughly 2²²⁶. We have chosen one of them at truly at random.

Now we deal the cards. When I look at my cards I have no idea whatsoever what cards you have. (Aside from the obvious fact that you don't have any of the cards I have.) They could be any cards, with equal probability.

So let me make sure that I explain this clearly. We have uniform distribution of each individual output of TRNG(n); each one picks a number between 1 and n with probability 1/n. Also, the result of this process is that we have chosen one of 52! possible decks with a probability of 1/52!, so the distribution over the set of possible decks is also uniform.

All right.

Now let's suppose that we have a less magic box, labeled PRNG. Before you can use it, it must be seeded with a 32-bit unsigned number.

ASIDE: Why 32? Couldn't it be seeded with a 64- or 256- or 10000-bit number? Sure. But (1) in practice most off-the-shelf PRNGs are seeded with a 32-bit number, and (2) if you have 10000 bits of randomness to make the seed then why are you using a PRNG at all? You already have a source of 10000 bits of randomness!

Anyway, back to how the PRNG works: after it is seeded, you can use it the same way you use TRNG. That is, you pass it a number, n, and it gives you back a number between 1 and n, inclusive. Moreover, the distribution of that output is more or less uniform. That is, when we ask PRNG for a number between 1 and 6, we get 1, 2, 3, 4, 5 or 6 each roughly one sixth of the time, no matter what the seed was.

I want to emphasize this point several times because it seems to be the one that is confusing certain commenters. The distribution of the PRNG is uniform in at least two ways. First, suppose we choose any particular seed. We would expect that the sequence PRNG(6), PRNG(6), PRNG(6)... a million times would produce a uniform distribution of numbers between 1 and 6. And second, if we chose a million different seeds and called PRNG(6) once for each seed, again we would expect a uniform distribution of numbers from 1 to 6. The uniformity of the PRNG across either of these operations is not relevant to the attack I am describing.

This process is said to be pseudo-random because the behaviour of the box is actually fully deterministic; it chooses from one of 2³² possible behaviours based on the seed. That is, once it is seeded, PRNG(6), PRNG(6), PRNG(6), ... produces a sequence of numbers with a uniform distribution, but that sequence is entirely determined by the seed. For a given sequence of calls, say, PRNG(52), PRNG(51) ... and so on, there are only 2³² possible sequences. The seed essentially chooses which one we get.

To generate a deck the server now generates a seed. (How? We'll come back to that point.) Then they call PRNG(52), PRNG(51) and so on to generate the deck, similar to before.

This system is susceptible to the attack I described. To attack the server we first, ahead of time, seed our own copy of the box with 0 and ask for PRNG(52) and write that down. Then we re-seed with 1, ask for PRNG(52), and write that down, all the way up to 2³²-1.

Now, the poker server that is using PRNG to generate decks has to generate a seed somehow. It does not matter how they do so. They could call TRNG(2^32) to get a truly random seed. Or they could take the current time as a seed, which is hardly random at all; I know what time it is as much as you do. The point of my attack is that it does not matter, because I have my database. When I see my first card I can eliminate 98% of the possible seeds. When I see my second card I can eliminate 98% more, and so on, until eventually I can get down to a handful of possible seeds, and know with high likelihood what is in your hand.

Now, again, I want to emphasize that the assumption here is that if we called PRNG(6) a million times we would get each number roughly one sixth of the time. That distribution is (more or less) uniform, and if uniformity of that distribution is all you care about, that's fine. The point of the question was are there things other that distribution of PRNG(6) that we care about? and the answer is yes. We care about unpredictability as well.

Another way to look at the problem is that even though the distribution of a million calls to PRNG(6) might be fine, because the PRNG is choosing from only 2³² possible behaviours, it cannot generate every possible deck. It can only generate 2³² of the 2²²⁶ possible decks; a tiny fraction. So the distribution over the set of all decks is very bad. But again, the fundamental attack here is based on our being able to successfully predict the past and future behaviour of PRNG from a small sample of its output.

Let me say this a third or four time to make sure this sinks in. There are three distributions here. First, the distribution of the process that produces the random 32-bit seed. That can be perfectly random, unpredictable and uniform and the attack will still work. Second, the distribution of a million calls to PRNG(6). That can be perfectly uniform and the attack will still work. Third, the distribution of decks chosen by the pseudo-random process I have described. That distribution is extremely poor; only a tiny fraction of the IRL possible decks can possibly be chosen. The attack depends on the predictability of the behaviour of the PRNG based on partial knowledge of its output.

ASIDE: This attack requires that the attacker know or be able to guess what the exact algorithm used by the PRNG is. Whether that is realistic or not is an open question. However, when designing a security system you must design it to be secure against attacks even if the attacker knows all the algorithms in the program. Put another way: the portion of a security system which must remain secret in order for the system to be secure is called the "key". If your system depends for its security on the algorithms you use being a secret then your key contains those algorithms. That is an extremely weak position to be in!

Moving on.

Now let's suppose that we have a third magic box labeled CPRNG. It is a crypto-strength version of PRNG. It takes a 256-bit seed rather than a 32-bit seed. It shares with PRNG the property that the seed chooses from one of 2²⁵⁶ possible behaviours. And like our other machines, it has the property that a large number of calls to CPRNG(n) produce a uniform distribution of results between 1 and n: each happens 1/n of the time. Can we run our attack against it?

Our original attack requires us to store 2³² mappings from seeds to PRNG(52). But 2²⁵⁶ is a much larger number; it is completely infeasible to run CPRNG(52) that many times and store the results.

But suppose there is some other way to take the value of CPRNG(52) and from that deduce a fact about the seed? We've been pretty dumb so far, just brute-forcing all the possible combinations. Can we look inside the magic box, figure out how it works, and deduce facts about the seed based on the output?

No. The details are too complicated to explain, but CPRNGs are cleverly designed so that it is infeasible to deduce any useful fact about the seed from the first output of CPRNG(52) or from any subset of the output, no matter how large.

OK, so now let's suppose the server is using CPRNG to generate decks. It needs a 256-bit seed. How does it choose that seed? If it chooses any value that an attacker can predict then suddenly the attack becomes viable again. If we can determine that of the 2²⁵⁶ possible seeds, only four billion of them are likely to be chosen by the server, then we are back in business. We can mount this attack again, only paying attention to the small number of seeds that can possibly be generated.

The server therefore should do work to ensure that the 256-bit number is uniformly distributed -- that is, each possible seed is chosen with probability of 1/2²⁵⁶. Basically the server should be calling TRNG(2^256)-1 to generate the seed for CPRNG.

What if I can hack the server and peer into it to see what seed was chosen? In that case, the attacker knows the complete past and future of the CPRNG. The author of the server needs to guard against this attack! (Of course if I can successfully mount this attack then I can probably also just transfer the money to my bank account directly, so maybe that's not that interesting. Point is: the seed has to be a hard-to-guess secret, and a truly random 256-bit number is pretty darn hard to guess.)

Returning to my earlier point about defense-in-depth: the 256-bit seed is the key to this security system. The idea of a CPRNG is that the system is secure as long as the key is secure; even if every other fact about the algorithm is known, as long as you can keep the key secret, the opponent's cards are unpredictable.

OK, so the seed should be both secret and uniformly distributed because if it is not, we can mount an attack. We have by assumption that the distribution of outputs of CPRNG(n) is uniform. What about the distribution over the set of all possible decks?

You might say: there are 2²⁵⁶ possible sequences output by the CPRNG, but there are only 2²²⁶ possible decks. Therefore there are more possible sequences than decks, so we're fine; every possible-IRL deck is now (with high probability) possible in this system. And that's a good argument except...

2²²⁶ is only an approximation of 52!. Divide it out. 2²⁵⁶/52! cannot possibly be a whole number because for one thing, 52! is divisible by 3 but no power of two is! Since this is not a whole number now we have the situation where all decks are possible, but some decks are more likely than others.

If that's not clear, consider the situation with smaller numbers. Suppose we have three cards, A, B and C. Suppose we use a PRNG with an 8-bit seed, so there are 256 possible seeds. There are 256 possible outputs of PRNG(3) depending on the seed; there's no way to have one third of them be A, one third of them be B and one third of them be C because 256 isn't evenly divisible by 3. There has to be a small bias towards one of them.

Similarly, 52 doesn't divide evenly into 2²⁵⁶, so there must be some bias towards some cards as the first card chosen and a bias away from others.

In our original system with a 32-bit seed there was a massive bias and the vast majority of possible decks were never produced. In this system all decks can be produced, but the distribution of decks is still flawed. Some decks are very slightly more likely than others.

Now the question is: do we have an attack based on this flaw? and the answer is in practice, probably not. CPRNGs are designed so that if the seed is truly random then it is computationally infeasible to tell the difference between CPRNG and TRNG.

OK, so let's sum up.

How do pseudorandom numbers and truly random numbers differ?

They differ in the level of predictability they exhibit.

Truly random numbers are not predictable.
All pseudo-random numbers are predictable if the seed can be determined or guessed.

Why is the difference important?

Because there are applications where the security of the system relies upon unpredictability.

If a TRNG is used to choose each card then the system is unassailable.
If a CPRNG is used to choose each card then the system is secure if the seed is both unpredictable and unknown.
If an ordinary PRNG with a small seed space is used then the system is not secure regardless of whether the seed is unpredictable or unknown; a small enough seed space is susceptible to brute-force attacks of the kind I have described.

Does the difference have something to do with the distribution of the output of the PRNG?

The uniformity of distribution or lack thereof for individual calls to RNG(n) is not relevant to the attacks I've described.

As we've seen, both a PRNG and CPRNG produce poor distributions of the probability of choosing any individual deck of all the possible decks. The PRNG is considerably worse, but both have problems.

One more question:

If TRNG is so much better than CPRNG, which is in turn so much better than PRNG, why does anyone use CPRNG or PRNG?

Two reasons.

First: expense. TRNG is expensive. Generating truly random numbers is difficult. CPRNGs give good results for arbitrarily many calls with only one call to TRNG for the seed. The down side is of course that you have to keep that seed a secret.

Second: sometimes we want predictability and all we care about is good distribution. If you're generating "random" data as program inputs for a test suite, and it shows up a bug, then it would be nice that running the test suite again produces the bug again!

I hope that is now much more clear.

Finally, if you enjoyed this then you might enjoy some further reading on the subject of randomness and permutations:

Eric Lippert

Posted 2014-02-05T03:26:25.377

Reputation: 1 816

1@Eric But the seed is not reset before each new deck draw, is it? So while you are correct that there are only relatively few trajectories we are sampling from, you don't know exactly where in the trajectory you are at the moment and trajectories intersect. – A.S. – 2016-03-12T20:33:16.743

2Someone actually did something kind of like this – EJoshuaS - Reinstate Monica – 2017-04-20T20:10:08.507

A good (but dense) treatment of related issues is in Knuth's TAOCP vol 2, section 3.5 “What Is a Random Sequence?” (p. 149), starting with illuminating definitions of equidistributed, k-distributed, and ∞-distributed sequences. Pseudorandom sequences are discussed in 3.5.F (p. 170). See also criteria of pseudorandomness from complexity theory and German BSI.

– ShreevatsaR – 2017-09-23T15:05:22.880

20Ok, boys and girls. That's enough commenting for now. If you want to discuss this any further, go grab yourself a chatroom, kthnxbye! – Ivo Flipse – 2014-02-11T23:23:48.523

162

As Eric Lippert says, it not just distribution. There are other ways to measure randomness.

One of the early random number generators has a sequence in the least significant bit - it alternated 0's and 1's. Therefore the LSB was 100% predictable. But you need to worry about more than that. Each bit must be unpredictable.

Here is a good way to think about the problem. Let's say you are generating 64 bits of randomness. For each result, take the first 32 bits (A), and the last 32 bits(B), and make an index into an array x[A,B]. Now perform the test a million times, and for each result, increment the array at that number, i.e. X[A,B]++;

Now draw a 2D diagram, where the larger the number, the brighter the pixel at that location.

If it is truly random, the color should be a uniform grey. But you might get patterns. Take for instance this diagram of the "randomness" in the TCP sequence number of the Windows NT system:

Windows NT

or even this one from Windows 98:

Windows 98

And here is the randomness of the Cisco router (IOS) implementation. Cisco ISO

These diagrams are courtesy of Michał Zalewski's paper. In this particular case, if one can predict what the TCP sequence number will be of a system, one can impersonate that system when making a connection to another system - which would allow hijacking of connections, interception of communication, etc. And even if we can't predict the next number 100% of the time, if we can cause a new connection to be created under our control, we can increase the chance of success. And when computers can generate 100,000 of connections in a few seconds, the odds of a successful attack goes from astronomical to possible or even likely.

Bruce Barnett

Posted 2014-02-05T03:26:25.377

Reputation: 251

30This is so brilliant it brings tears to my eyes. There should be an app that creates these for every OS (mobile/desktop/server) and platform (JVM/Javascript/etc). – HDave – 2014-02-09T18:54:50.887

The Windows rand() function is quite good! It produces a cloud that is don't have any apparent patterns. See my implementation to try it (and other algorithms) out: https://github.com/Zalastax/visualize_random

– Zalastax – 2014-02-17T14:56:33.960

While pseudorandom numbers generated by computers are acceptable for the majority of use cases encountered by computer users, there are scenarios that require completely unpredictable random numbers.

In security-sensitive applications such as encryption, a pseudorandom number generator (PRNG) may produce values which, although random in appearance, are in fact predictable by an attacker. Someone trying to crack an encryption system may be able to guess the encryption keys if a PRNG was used and the attacker has information on the state of the PRNG. Hence, for such applications, a random number generator which produces values that are truly unguessable is necessary. Note that some PRNGs are designed to be cryptographically secure and are usable for such security-sensitive applications.

More information about RNG attacks can be found in this Wikipedia article.

bwDraco

Posted 2014-02-05T03:26:25.377

Reputation: 41 701

9Cryptographic PRNGs exist, and are widely used. They can from a modestly sized seed generate a practically unlimited stream of random numbers. It is computationally infeasible to distinguish such a stream from true random numbers, thus no additional information can be gained from any portion of such a stream, and for any practical purpose the numbers are as good as true random numbers. – aaaaaaaaaaaa – 2014-02-05T21:20:06.923

I think the easiest way to explain this is that randomly number generator algorithms have to be programmed. That means there is set of instructions that is being followed. If there is a set of instructions, it cant be random. – Keltari – 2014-02-07T21:49:52.030

6@Keltari You're missing the element of entropy... Most RNGs (at least cryptographic ones) gather input from outside sources (eg mouse movement) and use that as part of the starting condition - thus, the transformation from A to B is programmed but the initial state of A (should) be unguessable. Linux's /dev/random will keep an approximation of how much entropy is available and stop giving out numbers if it falls too low. – Basic – 2014-02-07T23:58:40.423

Out of curiosity - why are lava lamps considered "truly random"? I understand it exhibits rather unpredictable behavior, but someone with a firm enough grasp on fluid dynamics and how those fluids interact in Earth's gravitational environment can surely produce "predictable" results, no? Sure, lava lamps are unpredictable, but to me, they're not random at all, but highly predictable. – theGreenCabbage – 2014-02-10T14:29:13.917

1@theGreenCabbage: I suspect that lava lamps are chaotic. Given a good enough computer model, and enough digits of accuracy, you could (in principle) predict the behavior for a while. But, because the system is chaotic, two lava lamps with the tiniest change in initial conditions will quickly diverge in behavior. (And this comment ignores chaotic attractors.) – dmm – 2014-02-10T19:44:01.183

@theGreenCabbage not answering your question, but "Lavarand" is patented http://www.google.com/patents/US5732138

– oberhamsi – 2014-02-11T13:30:35.477

I tried it in Python: Here's the result of 60 million rolls. The highest variation is like 0.15. Isn't that as random as it's going to get?

Actually, it's so "good" it's bad... All the existing answers focus on predictability given a small sequence of initial values. I want to raise another issue:

your distribution has much smaller standard deviation than random rolls should

True randomness just doesn't come quite that close to averaging "almost exactly 1 over how ever many numbers it can choose from" that you're using as an indication of quality.

If you look at this Stack Exchange question about probability distributions for multiple dice rolls, you'll see a formula for the standard deviation of N dice rolls (assuming genuinely random outcomes):

 sqrt(N * 35.0 / 12.0).

Using that formula, the standard deviation for:

1 million rolls is 1708
60 million rolls is 13229

If we look at your results:

1 million rolls: stddev(1000066, 999666, 1001523, 999452, 999294, 999999) is 804
60 million rolls: stddev(9997653, 9997789, 9996853, 10006533, 10002774, 9998398) is 3827

You can't expect the standard deviation of a finite sample to exactly match the formula, but it should come pretty close. Yet, at 1 million rolls you've less than half the proper stddev, and by 60 million you're under a third - it's getting worse, and that's no coincidence....

Pseudo-RNGs tend to move through a sequence of distinct numbers, starting with the seed and not revisiting the original number for a specific period. For example, implementations of the old C library rand() function commonly have a period of 2^32, and they'll visit every number between 0 and 2^32-1 exactly once before repeating the seed. So, if you simulated 2^32 dice rolls the pre-modulus (%) results would include each number from 0 to 2^32, the counts for each 1-6 outcome would be 715827883 or 715827882 (2^32 isn't a multiple of 6), and the standard deviation therefore only trivially above 0. Using the formula above, the correct standard deviation for 2^32 rolls is 111924. Anyway, as your number of pseudo-random rolls increases you converge towards 0 standard deviation. The issue can be expected to be significant when the number of rolls is a significant fraction of the period, but some pseudo-RNGs may exhibit worse problems - or problems even with fewer samples - than others.

So even if you don't care about cryptographic vulnerabilities, in some applications you may care about having distributions that don't have overly, artificially even results. Some types of simulation are quite specifically trying to work out the consequences of the uneven results that naturally occur with large samples of individually random outcomes, but they're under-represented in some pRNG's results. If you're trying to simulate how a huge population reacts to some event, this issue could radically alter your results leading to wildly inaccurate conclusions.

To give a concrete example: Say a mathematician tells a poker machine programmer that after 60 million simulated rolls - used to flicker hundreds of little "lights" around the screen, if there've been 10,013,229 or more sixes, which the mathematician expects to be 1 stddev away from mean, there should be a small payout. Per the 68–95–99.7 rule (Wikipedia) this should happen about 16% of the time (~68% fall within a standard deviation / only half outside are above). With your random number generator, this is from about 3.5 standard deviations above the mean: Under 0.025% chance - almost no customers get this benefit. See the Higher Deviations table on the page just mentioned, specifically:

| Range    | In range   | Outside range | Approx. freq. for daily event  |
| µ ± 1σ   | 0.68268... | 1 in 3        | Twice a week                   |
| µ ± 3.5σ | 0.99953... | 1 in 2149     | Every six years                |

Tony D

Posted 2014-02-05T03:26:25.377

Reputation: 61

You're comparing apples and oranges here. The two standard deviations have absolutely nothing to do with each other. – Jbeuh – 2014-02-10T20:33:21.203

I just wrote this random number generator to generate dice rolls

def get_generator():
  next = 1
  def generator():
    next += 1
    if next > 6:
      next = 1
    return next
  return generator

You use it like this

>> generator = get_generator()
>> generator()
1
>> generator()
2
>> generator()
3
>> generator()
4
>> generator()
5
>> generator()
6
>> generator()
1

etc etc. Would you be happy to use this generator for a program that ran a dice game? Remember, its distribution is exactly what you would expect from a "truly random" generator!

Pseudo-random number generators do essentially the same thing - they generate predictable numbers with the correct distribution. They are bad for the same reason that the simplistic random number generator above is bad - they are not suitable for situations where you need genuine unpredictability, not just the correct distribution.

Chris Taylor

Posted 2014-02-05T03:26:25.377

Reputation: 312

Related on Wikipedia: Kolmogorov randomness – a theoretical definition of randomness of a string (also could be string of digits).

– Palec – 2015-03-05T07:05:08.550

2"Pseudo-random number generators... generate predictable numbers with the correct distribution" -- Just because it's a PRNG doesn't guarantee that it has perfect distribution (in fact, the commercial ones by and large don't, for exactly the reasons outlined in these answers). While they may be predictable given sufficient information (the algo used, starting seed, output values, w/e), they still have variance. – Brian S – 2014-02-05T20:46:41.540

3Besides the point, I know, but get_generator = lambda: itertools.cycle(range(1,7)), generator = get_generator(), next(generator) # and so on is just too elegant not to mention :) – Janus Troelsen – 2014-02-06T22:30:16.400

2@BrianS Actually, a PRNG that failed distribution tests over time would be predictable by definition. So over some large N, if you get even a little way from N/2 heads in N coin flips, you can start betting on heads, and you can win more than you lose. Likewise, if you got a perfect distribution of heads v. tails, but heads always came in pairs, then you'd again have a recipe for winning. Distribution tests are how you know a PRNG is any good. – Jon Kiparsky – 2014-02-07T06:45:49.393

1You forgot nonlocal next :-). – Kos – 2014-02-07T09:41:24.163

5Even better example: Pi is believed to be normal, meaning that any sequence of digits of any given length in any base appears no more often than any other sequence of that length in that base. An algorithm which, when asked for n random bits, takes the next n bits of pi and returns them (the "seed" is the bit you start on), should in the long run produce a perfectly even distribution. But you still wouldn't want it for your generator - someone who knows the last bunch of bits you generated could find the first time that sequence occurs, assume your seed is there, and likely be correct. – cpast – 2014-02-10T21:31:05.863

The random number generation your computer can perform is suitable for most needs, and you're unlikely to come across a time where you need a truly random number.

True random number generation has its purposes though. In computer security, gambling, large statistical sampling, etc.

If you are interested in the applications of random numbers check out the Wikipedia article.

Alex McKenzie

Posted 2014-02-05T03:26:25.377

Reputation: 1 559

12The big issue is when you need random numbers that an attacker cannot predict for security reasons. – David Schwartz – 2014-02-05T03:45:01.587

16You're sure as hell likely to come across a time where you need a truly random number. It's enough to open a web page that starts with https://... – Jan Hudec – 2014-02-05T07:47:55.983

@JanHudec: Well, in daily use, you'll need secure random numbers the moment you open any program, well before you are typing into an address bar: see address space layout randomization. That's why stuff like this happens.

– Reid – 2014-02-05T08:31:18.157

5@JanHudec I was specifically speaking in the sense that you would need to use an online random number generator. True random numbers are used frequently, but very few people actually need to generate them themselves. – Alex McKenzie – 2014-02-06T06:02:15.627

2Slot machines also use a PRNG, not a TRNG. The generator runs all the time and a number is picked at the exact time that the spin button is pushed. The sum of the PRNG and the truly random button press time amounts to a TRNG. – Roger Dahl – 2014-02-10T19:00:25.333

@JanHudec Not true, SSL uses psuedo randoms just like everyone else. Install OpenSSL on a machine without hardware random generation and it will work just fine. Psuedorandom != insecure. – George Reith – 2014-02-11T11:12:15.697

@GeorgeReith: Pseudorandom definitely equals insecure. OpenSSL will work on machine without hardware random generation because all operating systems contain a truly random number generators that can provide limited but sufficient amount of true randomness. These work by observing very high granularity timer at various events (key presses, network packet arrivals etc.). The events can't be externally timed to the same precision, so the low bits hashed together through suitable cryptographic hash provide truly random (unguessable) numbers even in absence of hardware generator. – Jan Hudec – 2014-02-11T18:12:51.607

@JanHudec Cryptographically secure psuedo random number generators are enough for cryptography (such as SSL/TLS). True random is preferable. Those are not true random as they can be monitored and reproduced (certainly not easily but theoretically). Things like radioactive decay are true random. – George Reith – 2014-02-11T22:35:14.997

@GeorgeReith: The CSPRNG is used to generate keys from non-uniform values and from less entropy if you are short of it (which you often are without hardware random number generator), but they still have to be fed some true randomness to provide security. The properties of CSPRNG ensure that it can distil good randomness from only somewhat uncertain sources like the various timings. – Jan Hudec – 2014-02-12T06:00:47.613

@JanHudec Given further research you are correct. – George Reith – 2014-02-12T09:19:35.703

The random numbers generated by typical functions in most programming languages are not purely random numbers. They are pseudo random numbers. Since they are not purely random numbers they can be guessed with enough information on previously generated numbers. So this will be a disaster for security in cryptography.

For an example the following random number generator function used in glibc does not generate purely random number. The pseudo random number generated by this can be guessed. It is a blunder for security issues. There is a history of this becoming disastrous. This should not be used in cryptography.

glibc random():
    r[i] ← ( r[i-3] + r[i-31] )  % (2^32)
    output  r[i] >> 1

This type of pseudo random number generator should never ever be used in security sensitive places even though statistically far significant.

One of the famous attacks on pseudo random key is attack on 802.11b WEP. WEP has 104-bit longterm key, concatenated with 24-bit IV (counter) to make 128 bit key, which is in turn applied to RC4 algorithm to generate pseudo random key.

( RC4( IV + Key ) ) XOR (message)

The keys were closely related with one another. Here, only IV increased by 1 in each step and all others remained same. Since this was not purely random, it was disastrous and easily broken down. The key could be recovered by analyzing about 40000 frames, which is the matter of minutes. If the WEP used purely random 24-bit IV, then it could be safe until about 2^24 (nearly 16.8 million) frames.

So one should go with pure random number generator in security sensitive issues when possible.

Prabhu

Posted 2014-02-05T03:26:25.377

Reputation: 759

3I'd blame the WEP stuff on a badly designed protocol using a weak cipher. With modern stream ciphers you can use a counter as IV. – CodesInChaos – 2014-02-05T09:05:33.200

2The main problem with WEP was repeating the key in 2^24 (nearly 16 million) frames. It was even worse with related keys which made it possible to crack the code in about 40000 frames. The main point here is that the key is not random. It is closely related, so that is that easy to crack. – Prabhu – 2014-02-05T18:18:27.070

1Pseudo-randomness is bad in cryptography only when generating cryptographic keys. It is perfectly fine beyond that. Indeed, RC4 is little more than a pseudo-random number generator seeded with the 128-bit expansion of the key XORed onto the plaintext of the message. – Matt – 2014-02-11T05:14:18.943

The difference is that pseudorandom generated numbers are predictable (repeating) after some time where true random numbers aren't. The length it takes to repeat depends on the length of the seed which is used for its generation.

Here is a pretty nice video about that topic: http://www.youtube.com/watch?v=itaMNuWLzJo

Fatal705

Posted 2014-02-05T03:26:25.377

Reputation: 1

Predictability != Repeating. Mersenne Twister is a good example of that. On most implementation after 624 Int32 you can predict all next number, but the Mersenne Twister sequence are much longer than that (2 ^19937 - 1). – HoLyVieR – 2014-02-08T01:58:12.450

I don't understand why this answer is not pushed up the stack, as this seems to me that this is the accurate and concise answer to the question, at least partially. Pseudo random numbers can be easily predicted after some draws, the number of draws varying with the pseudo random algorithm "quality". Selecting a "good" algorithm is looking at to aspects: 1. every value is drawn in equal frequency (distribution), 2. it takes a "long time" to restart the sequence at the beginning and start drawing again the same numbers in the same order. – mins – 2014-02-08T16:06:32.330

"true random numbers aren't [predictable]". For today this is true. Now if we believe in the Big Bang theory, and we have a lot of power to compute the state of the Universe at any time after the BB, based on physics then... we are able to predict the future, including the fact that I'm writing this very exact comment. Right? – mins – 2014-02-08T16:14:10.483

That is hypothetically true, however, considering the vast degree of entropy involved in the actual actions of real bodies, the computing power required would be ridiculously huge. Think continents covered in computers. Plus, because of dependence on previous state, the state of every body in the universe at every point in time would need to be stored, which by definition would require more space than is available in the universe, completely filled with memory apparatus – TheEnvironmentalist – 2014-02-09T13:43:35.600

@TheEnvironmentalist - Ah! "Continents covered in computers"... isn't it what "The Hitchhiker's Guide to the Galaxy" all about? ;-) – ysap – 2014-02-15T08:16:28.510

Assume that a pseudo random number can be guessed by anyone before it is generated.

For trivial applications a pseudo randomness is fine, as with your example, you'll get approximately the correct percentage (approximately 1/6th of the total result set) with some minor variation (Which you would see if you were to roll a dice 600k times);

However, when it comes to things like computer security; True randomness is required.

For example the RSA algorithm begins with the computer choosing two random numbers (P and Q) and then doing several steps to those numbers to generate the special numbers known as your public and private keys. (The important part of a private key is that it is private, and nobody else knows it!)

If an attacker can know what the two 'random' numbers that your computer is going to pick, They can do the same steps to calculate your private key (The one that nobody else is supposed to know!)

With your private key an attacker can do things like a) Talk to your bank pretending to be you, b) Listen to your 'secure' internet traffic and be able to decode it, c) Masquerade between you and other parties on the internet.

That's where true randomness (ie. not being able to be guessed/calculated) is required.

DoubleFission

Posted 2014-02-05T03:26:25.377

Reputation: 151

The first random number that I ever used had the excellent property that of any two consecutive random numbers, the second one was larger with a probability of 0.6. Not 0.5. And the third was larger than the second with probability 0.6, and so on. You can imagine how that plays havoc with a simulation.

Some people wouldn't believe me this was even possible with the random numbers being equally distributed, but it's obviously possible if you look at the sequence (1, 3, 5, 2, 4, 1, 3, 5, 2, 4, ... ) where the second of two numbers is larger with probability 0.6.

On the other hand, for simulations it can be important to be able to reproduce random numbers. Let's say you do a traffic simulation and want to find out how some actions you might take could improve traffic. In that case you want to be able to re-create the exact same traffic data (like people trying to enter a town) with different actions you tried to improve traffic.

gnasher729

Posted 2014-02-05T03:26:25.377

Reputation: 277

The short answer is that usually people require "true randomness" for a bad reason, namely that they have no understanding of cryptography.

Cryptographic primitives such as stream ciphers and CSPRNGs are used to produce huge streams of unpredictable bits once they have been fed a few unpredictable bits.

The careful reader will now have realised there is a bootstrapping issue here: we must gather a few bits of entropy to start it all. Then be can feed them to a CSPRNG which will in turn happily provide all the unpredictable bits we need. Thus a hardware RNG is required to seed a CSPRNG. This is the sole case where entropy is required in truth.

(I think this should have been posted in Security or Cryptography.)

Edit: In the end, one must select a random number generator which is good enough for the task envisioned and as far as random number generation is concerned, hardware does not necessarily equate good. Just like bad PRNGs, hardware random sources usually have biases.

Edit: Some folks here assume a threat model in which an attacker could read the internal state of a CSPRNG and from there go to the conclusion that CSPRNGs are not a safe solution. This is an example of poor thread modelling. If an attacker owns your system, the game is over, plain and simple. It does not make any difference whether you use a TRNG or a CSPRNG at this point.

Edit: So, to sum all of this up... Entropy is required to seed a CSPRNG. Once this is done, a CSPRNG will provide all the unpredictable bits we need for security applications much faster than we can (usually) collect entropy. If unpredictability is not required, such as for simulation, a Mersenne Twister will provide numbers with good statistical properties at a much higher rate.

Edit: Anyone willing to understand the problem of secure random number generation should read this: http://www.cigital.com/whitepapers/dl/The_Importance_of_Reliable_Randomness.pdf

Erwan Legrand

Posted 2014-02-05T03:26:25.377

Reputation: 181

2It's not necessarily a security question. I think there are be reasons to use truly random numbers that don't involve security. If I were doing some scientific research that depends on random numbers and it was for whatever reason critical that the numbers be as random as possible, I'd certainly take advantage of a hardware RNG so I can rest assured that any properties observed are not due to quirks of the RNG. – Kef Schecter – 2014-02-05T14:13:44.890

3@KefSchecter It their heard hardware PRNGs generally have biased and/or correlated output. They need a post processing step to turn it into uniform independent output. There is no reason to believe that this post processing step is more reliable than a modern stream cipher. I certainly would trust the stream cipher more. As an extra bonus it's reproducible, which is valuable in science. – CodesInChaos – 2014-02-05T17:45:40.960

OK, fair enough. But wouldn't the same apply equally to cryptography applications? Even the answer gievn here says you need a hardware RNG to seed the CSPRNG. – Kef Schecter – 2014-02-05T20:41:29.403

2@KefSchecter Yes, crypto applications need true random numbers to seed the CSPRNG. But for everything else we can use that CSPRNG. – CodesInChaos – 2014-02-06T16:56:45.073

@KefSchecter: Cryptographic applications require that the stream not be reproducible by the world at large. By contrast, in scientific applications, being able to show that the "random" numbers one is using weren't simply chosen to show one's analysis in a good light is helpful. For example, if one announces after announcing one's methods that one will generate data in a certain fashion using the next day's state lottery numbers, readers can be somewhat confident that one hasn't fudged one's results even if the weekday drawing only has a couple dozen bits of entropy. – supercat – 2014-02-15T04:03:57.123

Not all PRNGs are suitable for all uses. For example, Java.util.SecureRandom uses the SHA1 hash, which has an output size of 160 bits. That means there are 2¹⁶⁰ possible streams of random numbers that can come from it. Simple as that. You cannot get more than 2¹⁶⁰ values of the internal state. Thus you cannot get more than 2¹⁶⁰ unique streams of random numbers from a single seed, no matter where your seed came from. Windows CryptGenRandom is believed to use a 40-byte state, it has 2³²⁰ possible streams of random numbers.

The number of ways to shuffle a standard 52-card deck is 52!, which is approximately 2²²⁶. Thus, regardless of seeding, you could not use Java.util.SecureRandom to shuffle a deck of cards. There are approximately 2⁶⁶ possible shuffles that it cannot produce. Of course, we don't know which ones they are...

So, if I had a source of, say, 256-bits of true randomness (e.g., from a Quantis RNG card), I could seed a PRNG like CryptGenRandom() with that seed and then use the PRNG to shuffle up a deck of cards. If I reseed with true randomness each shuffle, this will be fine: unpredictable and statistically random. If I did the same thing with Java.util.SecureRandom, there would be shuffles that could not possibly be produced, because it cannot be seeded with 256 bits of entropy, and its internal state cannot represent all the possible shuffles.

Note that the java.util.SecureRandom results would be both unpredictable and statistically random. No statistical test would ever identify a problem! But the output of the RNG isn't large enough to cover the full domain of all possible outputs needed to simulate a deck of cards.

And remember, if you add the jokers in, it's 54! that you have to cover, which requires about 2²³⁸ possibilities.

Paco Hope

Posted 2014-02-05T03:26:25.377

Reputation: 111

2Why do you care that some shuffles can't happen? That restriction has no observable effect. – CodesInChaos – 2014-02-06T16:20:12.163

2I'm sorta gobsmacked at the question. For heavily-regulated gaming companies, such a bias would mathematically prove that your chances of winning the card game are different with the computer than with a paper deck of cards. It doesn't matter whether the chances are better or worse. They're DIFFERENT. The computer is not morally equivalent to a real deck. Moreover we can't characterise the difference. The gaming company facing stiff regulatory fines would care a lot. – Paco Hope – 2014-02-06T16:26:24.580

My point is that a deviation which is impossible to detect by any known process is irrelevant. In any case it's much smaller than the chance of random hardware errors or bugs. The seed needs to be big enough to avoid repeating a seed, ever. A repeated seed could be detected with long outputs. But it doesn't need to be big enough to reach all possible output sequences. A PRNG seeded with 256 bits is indistinguishable from true random numbers unless you either 1) observe close to 2^128 different streams or 2) brute force 2^256 seeds. Neither of which is feasible. – CodesInChaos – 2014-02-06T16:39:46.687

1But it is detectable. I detect it using a known process: source code review and knowledge of the problem domain. That's what is remarkable. I can NOT use automated statistical analysis. It is as detectable as someone using java.util.Random or the Mersenne Twister. Statistical analysis is not the only valid detection mechanism for RNG/problem domain mismatch. Failures that pass that detector are not, by definition, successes. – Paco Hope – 2014-02-06T16:40:11.690

Study the source code of a PRNG. Then you're given as many samples of the output as you'd like of something that may be a true rng or the PRNG you studied. With MT and similar bad PRNGs you will easily be able to tell if it was the PRNG. With a good PRNG you won't be able to tell this, even after observing huge amounts of output. – CodesInChaos – 2014-02-06T16:50:30.483

1I never disagreed with that statement. What I said is that statistical analysis is not infallible proof that the RNG/PRNG is correct. This is an example of a false negative. It should be incorrect, but the statistical output test will pass it. If I use SHA1(1), SHA1(2), SHA1(3)... SHA1(n) as my "RNG" that will also pass statistical tests. It is also wrong. The definition of correct extends beyond the definition of "passes statistical tests." Passing statistical tests is necessary, but not sufficient. – Paco Hope – 2014-02-06T17:09:06.760

1Paco I strongly agree with your basic premise but I am not following your arithmetic. There are 2^226 possible decks, and 2^160 of them can be produced by an algorithm with a 160 bit seed. How from that do you deduce that there are 2^66 decks not produced? 2^226 - 2^160 is not 2^66. 2^226 / 2^160 is 2^66 but why is division being used to compute a difference? – Eric Lippert – 2014-02-06T23:59:09.497

4@CodesInChaos: The argument "we don't know of an attack which can take advantage of the fact that the vast majority of possible-IRL-shuffles will never be produced" does not imply that such an attack is impossible, just that we don't know what it is or how to defend against it. The right attitude in that case is to eliminate the possibility of attack by eliminating the condition: make an RNG of sufficient quality that it actually can generate every possible deck. – Eric Lippert – 2014-02-07T00:02:56.183

@CodesInChaos said "Why do you care that some shuffles can't happen? That restriction has no observable effect". It can be very important depending on the usage. For example, imagine security sweep, which moves sensors to pseudorandom places. If your PRNG is known not to produce some values, that means that the guys trying to break in will know that there are completely safe places in "secured" building (where sensors will never be). And if timer for moving sensors is also pseudorandom, they will also know there are safe times when sensors will not be moved, thus allowing undetected break-in – Matija Nalis – 2014-02-08T21:31:57.000

Pseudorandom numbers are generated using a mathematical function and an initial value (called the seed), while random numbers are not. Their predictability makes them incredibly useful for game replays, as you only need to save the seed and the player input – the AI will respond in the exact same "random" way every time.

BonzaiThePenguin

Posted 2014-02-05T03:26:25.377

Reputation: 201

The difference between "true" random and "pseudo" random number is the predictability. This answer has already been provided.

However, the predictability is not necessarily a bad thing like most examples are showing. Here is a practical example of one of the rare cases where the predictability is good: The Global Positioning System.

Each satellite uses a distinct PRN code (the Gold codes) suitable for the auto correlation or cross correlation which is necessary for the measurement of the signal propagation time. For these Gold codes the correlation among each other is particularly weak, making an unequivocal identification possible of the satellite, but allowing for the distance computation by the correlation between the emitted sequence and the receiver.

radouxju

Posted 2014-02-05T03:26:25.377

Reputation: 101

For fast check of randomness, you take points with random coordinates in [0;1) then put them in k-dimensional cube. Then you do procedure to slice this cube into subcubes - each volume of subcube (or subsphere) must be measured correctly by this procedure with fluctuations according to well known theorem.

Quality of randomness is important where you meet...

security purposes. When you generate number for using as parameter for your key generation, and it is well predictable - enemy will find out it with 100% probability and make field for search much smaller.
scientific purposes. In science you must not only have average mean in good condition, but also correlations between various random numbers must be eliminated. So if you take (a_i - a)(a_{i+1}-a) and find its distribution it must correspond to statistics.

Pair correlation is so called "weak randomness". If you want real randomness, you must have high order correlation with more then 2 variances.

Today only quantum mechanics generators provide true randomness.

sanaris

Posted 2014-02-05T03:26:25.377

Reputation: 161

Why is true randomness important?

There are basically two main reasons why true randomness is necessary:

If you are using the RNG for cryptography (including things like real-money gambling and running a lottery), then a PRNG will make you cypher much weaker than mathematical analysis of it (which assumes a TRNG) would have you believe. The PRNG won't actually be random, but have a pattern - adversaries can exploit the pattern to crack a cipher that should have been uncrackable.
If you are using the RNG to simulate "random" inputs, for instance for bug testing or simulation, then a PRNG makes your approach weak. When you discover no bugs, there will always be that nagging doubt: Is there a bug that is not noticeable with my PRNG's pattern, but would have showed up if I only used a TRNG? Do my simulation's finding accurately describe reality, or is the phenomenon I discovered simply an artifact of the PRNG's pattern?

Outside of these areas, it doesn't really matter. Caveat: If your PRNG is very, very bad, it might be unsuitable still - you don't want to make a Craps game where the dice always come up even, your players wouldn't like it.

How is Python's PRNG not good enough?

It is very unlikely that you will be able to detect the pitfalls of a real PRNG by using such simple methodology. Statistical analysis of RNGs is a field of science in its own right, and some very sophisticated tests are available to benchmark an algorithm's "randomness". These are much more advanced than your simple attempt.

Every software developer who creates real-world libraries, such as the Python developers, use these statistical tests as a yardstick to see if their PRNG implementation is good enough. So, except for instances of actual developer oversight, it is very unlikely that you will be able to easily detect a pattern in a real-world PRNG. That doesn't mean there is no pattern - a PRNG has a pattern by definition.

Superbest

Posted 2014-02-05T03:26:25.377

Reputation: 1 739

Basically, you cannot prove a source is random by math analysis of the output, you need e.g. a physical model which says the source is random (as in radioactive decay).

You can just run batch tests to find statistic correlation in the output data, in that case the data is proven to be non random (but also a random source can have non random outputs, or it will not be truly random if it cannot give specific output). Otherwise if tests are passed, you can say the data is pseudo random.

Passing some randomness tests only means you have a good PRNG (pseudo random number generator), which can be useful for applications where security is not involved.

If security is involved (i.e. encryption, generating a key salt, random number generation for gambling...) it is not enough to have a good PRNG it needs to have additional qualities, like the function output not being easily guessed from previous outputs, the function needs to have a desirable computational cost (limited enough to be usable, but high enough to defeat brute forcing attempts), the hardware that runs the function - or the device, in the today odd case it is an analog device - should not be easily tampered, etc.

Having a good PRNG can be useful in games to create new and unpredictable patterns, and in encryption - too cumbersome to explain in a single post, just think as a role of thumb what exit the encryption procedure should be pseudo-random, not showing patterns that could relate previous encrypted data with following encrypted data, or relate plain text data to encrypted data, or relate two different ciphertexts each other (so guesses can be made on the plain texts)....

Dice9

Posted 2014-02-05T03:26:25.377

Reputation: 191

-5

Short story:

Generates a random seed by using the current microsecond of the system.

This trick is quite old and is still functional.

Excluding the force brute factor, where I can determine every combination by "betting" in all the possible numbers and it is not the point of this question, specially when most random numbers are rounded prior his use.

Let's say an example, I can determine the seed used using only 10 values. So, knowing the seed, I can guess the next value.

If I would use the seed=1 then I could obtain the next sequence:

1, 2, 3, 4, 5, 6, 7, 8, 9... (and I deduct that the seed used id 1 and the next value 10)

But, what will happens if if change the send every the "nth" values?. Changing the seed by the current microseconds is a cheap trick (that is, it does not requires many CPU cycles).

So the sequence now is: (seed=1) 1, 2, 3, 4, 5, (seed=2), 7, 9, 11, 13... (15?)

In this case:

a) I can't deduct which seed was used.

b) Ergo, I can't guess the next value.

c) The only guess that I can do is to deduct that the next seed could be a major number.

Anyway, most modern random generator algorithms already uses this trick under the hood.

The true fact is that, we don't need a quantum computer for create a "true" random number, the imprecision of our quartz crystal of our computer acts as an random generator, also the random efficiency of our CPU is also variable without considering that the CPU usually does several tasks at the same time.

magallanes

Posted 2014-02-05T03:26:25.377

Reputation: 461

2This is a rather bad idea and it's a source of vulnerability for thing which need trully unpredictable sequence. If you take microseconds, you only have 10^6 possibilities of seed which is rather low. – HoLyVieR – 2014-02-08T01:50:30.867

@HoLyVieR: it's certainly a bad idea if you care about security, but not quite as bad as you make out: you would normally use microseconds since system start (or unix epoch....) which significantly increases the range of possible values. – mikera – 2014-02-09T02:12:12.517

1@mikera It's not any better, the time at which the request was processed is predictable. It's a vector of vulnerability for a good number of password reset functionality. Those scripts generated "random" token with your technique and the attacker could find the token generated since finding the time at which it was executed is rather trivial ... it's the same time that the request for password reset was sent +- 150ms. – HoLyVieR – 2014-02-09T02:53:16.977

Sure, that situation is very bad. But the situation where the state was seeded at system startup, and the attacker doesn't have a good way of guessing the startup time is not quite as bad. You might easily have 10^12 possible microsoconds to choose from, which can make some types of attack infeasible. To be clear: all these solutions are pretty bad from a crypto perspective, but constants matter. – mikera – 2014-02-09T04:27:28.693

For online servers, system uptime information is sometimes offered publicly. Or you can get it from a status page "Incidents. Server up again.". Or you can ping, wait a big downtime, and note it could be a machine reboot (which would give some hundreds of millions of time to check, which is rather low). – Dereckson – 2014-02-09T08:07:18.890

@HoLyVieR meh, you can re-seed it regularly so the chances to determine the next sequence is close to zero. – magallanes – 2014-02-09T13:37:55.300

@magallanes reseeding it regularly is predictable too, isn't it? – gronostaj – 2014-02-10T10:34:15.437