Generate a (completely deterministic) pseudorandom bit stream

11

2

Inspired by Random with your hands tied:


The Goal

The goal of this challenge is to write a program that generates a pseudorandom bit stream, which is a string of 1s and 0s that appears to be purely random, but is actually generated in a deterministic way. Your program should output a string of 1 and 0s (with optional whitespace) and should pass the following requirements:

  1. Given unlimited time and memory, your program must continue to output a string of 1s and 0s forever
  2. Your program must output more than 1000 random bits in about one minute, on a reasonable machine. If this requirement is impossible, then I will lessen it.
  3. The string of bits can repeat, but the length of the repeating section must be more than 1000 bits.
  4. The string of bits must pass as many of the randomness tests (described below) as possible.
  5. The program must not take any input from any external sources or use any built-in rand()-like function.
  6. Because of the above requirement, the program must output the same exact string of bits each time that it is run.

Randomness Test #1

The string of pseudorandom bits must not include any obvious pattern upon visual inspection.

Randomness Test #2 (subject to change based on comments)

The string of bits must contain an equal distribution of 1s and 0s. To test this (and other things too), the stream of bits is broken into segments that are 3 bits long, such as 101|111|001.

Out of all of these segments, 1/8 of them should have three 1s and no 0s, 3/8 of them should have two 1s and one 0, 3/8 of them should have one 1 and two 0s, and 1/8 of them should have no 1s and three 0s.

Randomness Test #3

A "run" is defined as a consecutive series of bits that all have the same value. The string 1001001110 has three runs of size 1 (1..1.....0), two runs of size 2 (.00.00....) and one run of size 3 (......111.). Notice that runs do not overlap.

Out of a string of 1000 random bits, there should be about 250 runs of size 1, 125 runs of size 2, 62 runs of size 3, etc. In general, for run size R, there should be about 1000/(2**(R+1)) runs of that size.

Randomness Test #4

The first 840 bits are split into two halves of 420 bits each. Each bit on the first half is compared to the corresponding bit on the second half. The two bits should match about fifty percent of the time.


Here is the source code of a Perl program that performs tests 2 through 4. As of now, it requires that the string of bits not contain any whitespace.


Objective Winning Criterion Time!

The winner is the program that passes all of the 6 requirements and all of the randomness tests to the degree that it is indistinguishable from randomness. If multiple programs accomplish this, then the one that takes the longest amount of time to repeat will win. If multiple programs accomplish this, then I might have to find more randomness tests to act as tie-breakers.

PhiNotPi

Posted 2012-08-02T21:36:38.547

Reputation: 26 739

#2 and #3 aren't really very good criteria for randomness. For #2 especially, a random sample will probably not exhibit this characteristic. Maybe you can do a larger sample size? I would suggest something between 100 and 300. – Joel Cornett – 2012-08-02T22:00:32.953

A better measurement method would be a moving average, as the mean over a large window on the bitstream will not change much (and should be around 0.5) – Joel Cornett – 2012-08-02T22:02:19.580

@JoelCornett Thanks for the advice. I don't know much about randomness tests. I'll change #2 to something else, and I'm reading about moving averages. – PhiNotPi – 2012-08-02T22:28:52.627

1No problem. Random sequences tend to clump and not be uniformly distributed, this is a fact that is sometimes used in accounting to detect fraud. (Fraudulent numbers will often be too evenly distributed, because the people inventing them mistake uniformity for randomness) – Joel Cornett – 2012-08-02T22:45:56.663

Can I use built in crypto functions (Such as AES or SHA-2)? – CodesInChaos – 2012-08-06T12:05:14.273

There are standard randomness suites, such as diehard and dieharder. But I haven't used any of them.

– CodesInChaos – 2012-08-06T13:07:03.253

thanks for fixing up my question; much better :) - I did want to see some of the fun ideas on this one! – NRGdallas – 2012-08-09T17:30:54.557

@JoelCornett there was an early episode of Numb3rs about that i think... – acolyte – 2012-09-20T15:40:52.273

Answers

8

C, 61

main(s,n){for(n=1u<<31;putchar((s%=n)/(n/2)&1|48);s*=65539);}

Yeah, I know it's not code golf. This is obviously rather an anti-solution... but it sure enough fulfills your criteria.

$ ./a.out | head -c840 000000001000111101100110110011100100011101110001101001101110011101100111010111010010001001110001111110010001010101101000110101000011110010100101001101110100011111000101000110000111100100010101010100110101101010001110101000011011001010111011110110101000001110011000010100001111110110001000000101100011000110000010110011000010101101011100010011011111111100001010111110111010110001000010110111000010101001000011001001110110001001101011011110111011010100000000001001010100111000000100010011110001010111001110101111100110110111010001101001100000101000111000101000110100001111010110100001100110011011110001011011100010111011100001110100110001010011001110001010110100111111111010110101100110000101111111111000111101100011000111010111010100110001001110101110100011101110001100110110000001100101110100101000101011100000110111101101100110000000101110
$ ./a.out | head -c840 | perl tester.pl
Test 2: 1 (1) 2.93333333333333 (3) 3.1 (3) 0.966666666666667 (1)
Test 3: 214 99 71 24 7 5 1 1 2 2
Test 4: 0.495238095238095

Period length is 2²⁹.

ceased to turn counterclockwis

Posted 2012-08-02T21:36:38.547

Reputation: 5 200

6This goes to show how hard it is to tell randomness from something that is widely known to be one of the worst random number generators in existence. +1. – PhiNotPi – 2012-08-03T03:32:13.840

8

Mathematica 78 53 chars

The digits of the binary representation of Pi seem to behave as if they are chaotically produced although this is unproven.

The following simple routine deterministically returns as a string the binary digits of pi, corresponding to d decimal digits:

f[d_]:=ToString@FromDigits@RealDigits[N[Pi,d],2][[1]]

Usage

If we request the counterpart of 301 decimal digits of Pi, we receive 1000 binary digits.

f[301]
StringLength[%]

(* out *)
1100100100001111110110101010001000100001011010001100001000110100110001001100011001100010100010111000000011011100000111001101000100101001000000100100111000001000100010100110011111001100011101000000001000001011101111101010011000111011000100111001101100100010010100010100101000001000011110011000111000110100000001001101110111101111100101010001100110110011110011010011101001000011000110110011000000101011000010100110110111110010010111110001010000110111010011111110000100110101011011010110110101010001110000100100010111100100100001011011010101110110011000100101111001111110110001101111010001001100010000101110100110100110001101111110110101101011000010111111111101011100101101101111010000000110101101111110110111101110001110000110101111111011010110101000100110011111101001011010111010011111001001000001000101111100010010110001111111100110010010010010100001100110010100011110110011100100010110110011110111000010000000000111110010111000101000010110001110111111000001011001100011011010010010000011011000011100011

1000 (* characters *)

Because Pi is an irrational number, there is no period. However, there will be practical constraints due to the hardware one is running.

Test 1 Looks good to me.

Test 2

d=301;
Partition[RealDigits[N[Pi,d],2][[1]],{3}];
Tally[%]
(* out *)
{{{1,1,0},35},{{0,1,0},45},{{0,0,0},41},{{1,1,1},40},
{{0,1,1},50},{{1,0,1},32},{{1,0,0},43},{{0,0,1},47}}

More thorough check:

d=10^6;
Partition[RealDigits[N[Pi,d],2][[1]],{3}];
Tally[%]

{{{1,1,0},138565},{{0,1,0},138146},{{0,0,0},138260},{{1,1,1},138427},
{{0,1,1},139119}, {{1,0,1},138404},{{1,0,0},137926},{{0,0,1},138462}}

Test 3: Runs

d=10^6;
res3=SortBy[Tally@Split@RealDigits[N[Pi,d],2][[1]],Last]/.{a_,b_}:> {Length[a],b}
ListPlot[res3 ,AxesLabel-> {"Run Length","Runs"},AxesOrigin->{0,0}]

I ran a large number of cases to systematically check out the distribution of runs. In approximately 3 million binary digits, there were 830k runs of 1, 416k runs of 2, 208k runs of 3, 104k runs of 4, etc.

runs 2 Test 4: Matching of first and second half of data

The matches are the 212 cases of 0 and 2; the mismatches are the 208 cases where the sum of the respective digits is 1.

d=301;
Tally[Plus@@Partition[Take[RealDigits[N[Pi,d],2][[1]],840],420]]

(* out *)
{{1,208},{0,108},{2,104}}

Timing

It takes under two seconds to calculate 3321928 binary digits (corresponding to 10^6 decimal digits).

(r=f[10^6]);//AbsoluteTiming
StringLength[r]

(*out*)
{1.785928,Null}    
3321928

DavidC

Posted 2012-08-02T21:36:38.547

Reputation: 24 524

1I knew someone would do this... – ceased to turn counterclockwis – 2012-08-09T12:01:38.590

1Low-hanging fruit, right? – DavidC – 2012-08-09T12:19:48.397

Couldn't you use e instead of pi to save one byte? – pppery – 2015-12-03T02:15:53.677

Is e distributed chaotically? – DavidC – 2015-12-03T02:49:26.730

3

Python, 90

g=[19]
print(''.join("01"[(g.append((11*g[-1]+13)%1024)or g[-1])>512]for i in range(1000)))

g is the seed value. Random sampling exhibits a remarkably normal distribution repeated random sampling of sample means yielded a mean of 0.506 and a a standard deviation of .0473 (sample size of 1000). Unfortunately, randomness is highly sensitive to initial seed. The seed in the above code gave me the best randomness :p

UPDATE

Let's see how this code holds up to the OP's tests:

Test #1

This one's a bit subjective... but it looks pretty irregular to me.

Test #2

Three 1's: 0.141
Two 1's: 0.371
One 1: 0.353
Zero 1's: 0.135

Test #3

Runs by size:

8: 11
7: 3
6: 7
5: 13
4: 32
3: 67
2: 119
1: 216

Test #4

Ratio of equalities: 0.94 This is a typo. Will update with the correct number soon.

Joel Cornett

Posted 2012-08-02T21:36:38.547

Reputation: 361

1You can remove the whitespace before 'for'. – daniero – 2012-08-05T19:01:53.583

2

Haskell 74 58

main=print$iterate(read.take 9.show.(^3))7>>=show.(`mod`2)

Thanks to shiona for the simplification. Results:

$ ~/golf$ ./pseudorandom | head -c 1000 111110110101011110001011110101011110001111011001110101100100111000100100000011000100011101010111000000001110100111100010110000100000001001111011100001101101001111110100111011111110011011101111010110001110011100010000001011011100010111011110011110100110100000110110010011101010001111101011111110000110110110011111011101000011110010000011011010011110110100010011100011011111011010110001100111100100100001110100001110111011111000001101001010100001100111010111011011011001110000100000100000010111011010111011111011011101001100100011111101110101101010111101001111110101110101001011111011111100111110001001001010001011100100101000010101011001101011011001100111101010001110001011110001011111110110100110011000010101100010011011010100000110100001000010110110111101101010000001110001100011010100000011111010111100101110101001000111001000000101111110010110001011011000011111011001100001101000101001111000101011101010001010110010111111110111111101000111010001001010010011101011100101011101000101101000010101100

./pseudorandom | head -c 1000 | perl test.pl

Test 2: 0.966666666666667 (1) 2.4 (3) 3.3 (3) 1.33333333333333 (1)

Test 3: 260 108 66 33 15 11 5 2

Test 4: 0.495238095238095

This is also a terrible pseudo-random generator (similar to one used by von-Neuman). For those that were not aware concatMap == (=<<) == flip . (>>=) (for lists)

walpen

Posted 2012-08-02T21:36:38.547

Reputation: 3 237

You can replace \x->if odd x then"1"else"0" with show.(`mod`2). – shiona – 2012-08-06T17:48:55.190

1

Java, 371 317

Based on a 128 bit LFSR (bit taps are from xilinx app note 52)

EDIT: I was not satisfied with using BigInteger so this version does not. Saved some characters. Output might be a little less random as I could not think of a good 'seeding' method.

New Code: Arguments: BITS_TO_PRINT

class R{public static void main(String[]a){int L=65536;int[]v={0,128,126,101,99};int[]b=new int[L];for(int x=0;x<L;x++)b[x]=(x*x)&1;for(int i=0;i<Integer.parseInt(a[0])+L;i++){if(1!=(b[v[1]]^b[v[2]]^b[v[3]]^b[v[4]]))b[v[0]]=1;else b[v[0]]=0;if(i>L)System.out.print(b[v[0]]);for(int j=0;j<5;j++)v[j]=(v[j]-1)&(L-1);}}}

Old Version: Arguments: SEED, BITS_TO_PRINT

import java.math.BigInteger;class R{public static void main(String[]a){BigInteger v=new BigInteger(a[0]);BigInteger m=new BigInteger("ffffffffffffffffffffffffffffffff",16);for(int i=Integer.parseInt(a[1]);i>0;i--){v=v.shiftLeft(1);if(!(v.testBit(128)^v.testBit(126)^v.testBit(101)^v.testBit(99))){v=v.setBit(0);}v=v.and(m);java.lang.System.out.print(v.testBit(0)?1:0);}}}

New Version: Example output, bits=100:

011001100111000110010100100111011100100111000111001111110110001001100000100111111010111001100100011

Noah

Posted 2012-08-02T21:36:38.547

Reputation: 11

1

BTW, I assume that both Noah accounts from this post are the same person. If that's so, you can ask a moderator to merge them at http://meta.codegolf.stackexchange.com/

– Peter Taylor – 2012-09-20T07:05:46.120

1

The question is essentially equivalent to "implement a stream cipher". So I implement RC4, since it's relatively simple.

I use no key, and drop the first 100000 bits, because the beginning of RC4 is a bit biased, especially since I skipped the key schedule. But I'd expect it to pass your test even without that (saving 20 chars of code).

Normally one would output a full byte per cycle, but converting to binary is rather ugly in C#, so I simply discard everything except the least significant bit.

var s=Enumerable.Range(0,256).ToArray();
byte i=0,j=0;
for(int k=0;;k++)
{
    i++;
    j+=(byte)s[i];
    var t=s[i];s[i]=s[j];s[j]=t;
    if(k>99999)
        Console.Write(s[i]+s[j]&1);
}

Or without spaces:

var s=Enumerable.Range(0,256).ToArray();byte i=0,j=0;for(int k=0;;k++){i++;j+=(byte)s[i];var t=s[i];s[i]=s[j];s[j]=t;if(k>99999)Console.Write(s[i]+s[j]&1);}

C#, 156 chars, works in LinqPad's statement mode. For a full C# program add the usual boilerplate.


We could also use built in crypto primitives(Cheater solution):

var h=SHA256.Create();for(BigInteger i=0;;i++){Console.Write(h.ComputeHash(i.ToByteArray())[0]%2);}

(C#, 99 chars, works in LinqPad's statement mode. For the normal C# compiler you'll need to add a bit of boilerplate)

The output of cryptographic hash functions is designed to be indistinguishable from random data, so I expect it to pass all randomness tests(die harder,...) you throw at it, but I'm too lazy to test.

CodesInChaos

Posted 2012-08-02T21:36:38.547

Reputation: 111

1

C, 52 chars

main(a){for(a=1;putchar(48+a%2);a=a/2^-(a%2)&576);}

This is a 10 bit LFSR, test results:

$ ./a.out |head -c 1000 | perl randtest.pl
Test 2: 1.13333333333333 (1) 2.86666666666667 (3) 3.16666666666667 (3) 0.833333333333333 (1)
Test 3:  251 122 64 32 16 8 4 2  1
Test 4: 0.466666666666667

Hasturkun

Posted 2012-08-02T21:36:38.547

Reputation: 1 206

a should start out as 1, (assuming it's called with no arguments). Also you could stick the a= in the middle, something like a=a/2^-!putchar(49-a%2)%576 (taking some liberties with the algorithm) – walpen – 2012-08-09T01:47:29.857

@walpen: My initial implementation didn't set a, I changed it because of "The program must not take any input from any external sources" – Hasturkun – 2012-08-09T09:26:09.390

1

Sage/Python

This program prints the rightmost binary digits that are common to every sufficiently tall exponentiation tower of form 3333... For all that could ever be feasibly generated, these are the rightmost binary digits of Graham's number. The digit sequence is infinite, and is not periodic.

m = 1; x = 3; last = 0
while True:
    m *= 2; x = pow(3,x,m); l = len(bin(x))
    print '1' if l > last else '0',
    last = l

For 1000 digits, this took less than 2 seconds; however, the time will increase much faster than linearly in the number of digits.

The test results using the OP's program are

Test 2: 1.26666666666667 (1) 3.16666666666667 (3) 2.8 (3) 0.766666666666667 (1)
Test 3:  268 126 61 30 20 7 2  1 1
Test 4: 0.466666666666667

(See Are the rightmost digits of G random? for more than 32000 digits and additional statistical tests.)

r.e.s.

Posted 2012-08-02T21:36:38.547

Reputation: 2 872

0

perl, 44 bytes

I know this isn't code golf, but I've always been a fan of taking the low order bits of a simple quadratic function, eg:

$x=1/7;print substr($x*=4-4*$x,9,1)%2while 1

Period is longer than 3 billion, but I've run out of disk space to calculate more.

skibrianski

Posted 2012-08-02T21:36:38.547

Reputation: 1 197

1you can save 3 chars by juxtaposing numeric constants and keywords and also distributing that 4: $x=1/7;print substr($x*=4-4*$x,9,1)%2while 1 – ardnew – 2014-04-21T15:49:20.723

0

Python

import hashlib
x=''
while 1:
    h=hashlib.sha512()
    h.update(x)
    x=h.digest()
    print ord(x[0])%2

Should have a period of about 2^512.

Keith Randall

Posted 2012-08-02T21:36:38.547

Reputation: 19 865

0

JavaScript - 1ms to 2ms for 1000 pseudo-random bits (139ms to 153ms for 100000 bits)

This solution uses the fact that square roots are irrational, and thus pretty much random. Basically, it takes the square root of 2 to start, converts it to binary, throws out the leading part that matched the previous root, appends that to the random string, repeats with the next higher number (or back to 2 if the number repeated and was at least 30 bits long), and returns the random string once it is long enough.

var getDeterministicPseudoRandString = function(length){
    var randString = '';

    var i = 2;
    var prevRand = '';

    outerLoop:
    while(randString.length < length){
        var nextRand, nextFullRand = Math.sqrt(i++).toString(2).substring(1).replace('.', '');
        nextRand = nextFullRand;
        for(var j = prevRand.length; j > 0; j--){
            var replaceString = prevRand.substring(0, j);

            nextRand = nextFullRand;

            if(nextFullRand.indexOf(replaceString) == 0){
                if(j == prevRand.length && j > 30){
                    //start i over at 2
                    console.log('max i reached: ' + i);

                    i = 2;
                    continue outerLoop;
                } else {
                    nextRand = nextFullRand.replace(replaceString, '');
                }

                break;
            }
        }
        prevRand = nextFullRand;

        randString += nextRand;
    }

    return randString.substring(0, length);//Return the substring with the appropriate length
};

I haven't ran it through the tests yet, but I imagine it will do well on them. Here's a fiddle so you can see it in action. For my times, I just ran the program several times and took the fastest and slowest values as the ranges.

Briguy37

Posted 2012-08-02T21:36:38.547

Reputation: 2 616