Rational Number RNG

27

5

The Objective

Build a random number generator whose range is \$\left[0,1\right) \cap \mathbb{Q} .\$

This is, build a random number generator that can produce any value that's:

  1. at least \$0 \,;\$

  2. not \$1\$ or greater; and

  3. expressible as the ratio of two integers.


What's the fuss?

This RNG is weird. Its range is nowhere connected, but is dense in the interval!


The probability mass function

We define an ordering for all values that the RNG can return, such that each return value \$ x\$ can be associated with an index \$ i .\$ Then, the probability of returning \$x_i\$ should be \$P\left(x_i\right) = {\left(\frac{1}{2}\right)}^{i} .\$ Since $$ \sum\limits_{i=1}^{\infty}{{\left(\frac{1}{2}\right)}^{i}} = 1 \,, $$ this yields a total probability of \$1 ,\$ where the numbers earlier in the ordering are significantly more likely than numbers later in the ordering.

The ordering is:

  1. Start with the first value, \$ \frac{0}{1}.\$

  2. Increment the numerator.

  3. If the numerator equals the denominator, then:

    • reset the numerator to \$ 0 ;\$ and

    • increment the denominator.

  4. Add the value to the ordering if it's not equivalent to a value already in the ordering.

    • For example, \$\frac{2}{4}\$ is constructed after \$\frac{1}{2} ,\$ so it's not added to the ordering.
  5. Go back to Step (2).

For example, the first few values and their corresponding probabilities are: $$ \begin{array}{r|c|c|c|c|c|c|c|c|c|c} \textbf{Index}~\left(i\right) & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \cdots \\\hline \textbf{Value}~\left(x\right) & \frac{0}{1} & \frac{1}{2} & \frac{1}{3} & \frac{2}{3} & \frac{1}{4} & \frac{3}{4} & \frac{1}{5} & \frac{2}{5} & \frac{3}{5} & \frac{4}{5} & \cdots \\\hline \textbf{Probability}~\left(P\right) & \frac{1}{ 2} & \frac{1}{ 4} & \frac{1}{ 8} & \frac{1}{ 16} & \frac{1}{ 32} & \frac{1}{ 64} & \frac{1}{ 128} & \frac{1}{ 256} & \frac{1}{ 512} & \frac{1}{1024} & \cdots \end{array} _{\large{.}} $$


Rules

  1. There is no input. The RNG outputs a number upon invocation.

  2. You must use rational number arithmetic. In particular, X must be represented as a rational number. If your language doesn't natively support it, use a pair of integers. For example, in Haskell, Rational is a valid type.

  3. As long as the numerator and the denominator are distinguishable, the output format doesn't matter.

  4. As this is a code-golf, the shortest code in bytes wins.

Dannyu NDos

Posted 2019-10-16T08:59:27.203

Reputation: 2 087

6

Please specify how to compute the probability for each given X, not just an examples from which we are supposed to infer that. See Things to avoid when writing challenges.

– flawr – 2019-10-16T09:14:36.820

@Night 0 is the infimum. If you are refering to the denominator, there is no limit. – Dannyu NDos – 2019-10-16T09:14:39.193

5@Arnauld The challenge is solvable without using FP at all, so that is forbidden. – Dannyu NDos – 2019-10-16T10:47:18.430

@Arnauld Though it is forbidden only when it relates to X's precision. – Dannyu NDos – 2019-10-16T10:48:36.450

2/7 and 4/14 are guaranteed to give the same (imprecise) floating point value. So floating-point values could be safely used for deduplication – Luis Mendo – 2019-10-16T15:09:06.143

3@LuisMendo Right, but (2^100)/(2^100+3) and (2^100+1)/(2^100+3) are hardly distinguishable through single- or double-precision FP. – lisyarus – 2019-10-17T12:08:05.823

2@lisyarus Even if you use uint64, there is a value above which integer numbers are indistinguishable. Using float instead of uint64 simply moves down the threshold from 2^64 to 2^53 – Luis Mendo – 2019-10-17T14:46:49.460

This has a low nonzero probability of taking an exponentially long time to generate since deduplication requires factoring large numbers. It also has a nonzero probability of crashing the program with an out of memory error. – Beefster – 2019-10-17T23:03:39.177

@Beefster That's inevitable anyway. In Haskell, even though Rational has unbounded precision, that problem will occur. – Dannyu NDos – 2019-10-17T23:07:20.740

@LuisMendo: Fun fact: x87 80-bit float (64-bit significand) can exactly represent every int64_t. (But not every uint64_t, IIRC). Bonus fun fact: gcc -m32 -mno-sse will use x87 fild / fistp to implement atomic<int64_t> load/store because it's a 64-bit copy that preserves any possible integer. – Peter Cordes – 2019-10-18T04:08:34.063

Answers

8

APL (Dyalog Unicode), 27 bytes

i⊃⍸1=∘.(>×∨)⍨⍳2+i←1+⍣{?2}¯1

Try it online!

Thanks to @ngn for this function. Since this is not strictly my answer, I'll keep the older answer below this one. Uses ⎕IO←0.

The function returns a pair in the form denominator numerator.

How:

i⊃⍸1=∘.(>×∨)⍨⍳2+i←1+⍣{?2}¯1 ⍝ Dfn
                         ¯1 ⍝ Literal -1
                     {?2}   ⍝ Roll. Since IO is 0, this will roll either 0 or 1
                    ⍣       ⍝ Power. Repeats the function to the left until the right argument is 0.
                  1+        ⍝ 1 plus. The whole expression above will result in 1+1+1+...-1; 
                            ⍝ each iteration has a 50% chance of being the last one.
                i←          ⍝ Assign the result to i
             ⍳2+            ⍝ Generate the vector [0..2+i]
            ⍨               ⍝ Use it as both arguments (⍺ and ⍵) to the fork:
     ∘.(>×∨)                ⍝ For each element of the left argument (⍺), apply: (⍺ > ⍵) × (⍺ ∨ ⍵); 
                            ⍝ This will result in a matrix of 0s (if ⍵>⍺) and the GCDs between ⍺ and ⍵.
   1=                       ⍝ Check which elements are 1 (where ⍺ is coprime with ⍵)
  ⍸                         ⍝ Find the coordinates (which will be denominator and numerator) of truthy values
i⊃                          ⍝ pick the i-th element.

Old answer:

APL (Dyalog Unicode), 55 bytes

{(?0)<2÷⍨∨/⍵:∇∊((⊂0,2⊃⍵+1),⊂1 0+⍵)[1+(⊃⍵)<¯1+2⊃⍵]⋄⍵}0 1

Try it online!

Statistics

Terribly inefficient port of @Arnauld's answer, so go upvote him.

J. Sallé

Posted 2019-10-16T08:59:27.203

Reputation: 3 233

Actually 45 bytes because of the unicode characters. Still glorious! – Mason – 2019-10-19T21:39:13.057

@Mason Actually, we use Adám’s SBCS to count APL’s Unicode characters as a single byte each.

– J. Sallé – 2019-10-19T22:22:04.857

I see, my mistake! – Mason – 2019-10-19T22:33:00.710

20

JavaScript (ES6), 81 bytes

Returns \$p/q\$ as [p,q].

f=(p=0,q=1)=>Math.random()<(g=(a,b)=>b?g(b,a%b):a)(p,q)/2?f(++p-q?p:!++q,q):[p,q]

Try it online!

Or Try some statistics!

How?

We recursively build the fractions \$p/q\$ in the relevant order, but without any explicit deduplication, until a random number picked in \$[0,1[\$ is greater than or equal to \$gcd(p,q)/2\$.

If \$p\$ and \$q\$ are not coprime, we have \$gcd(p,q)/2\ge1\$, which always triggers the recursive call. This is the expected behavior because the corresponding fraction must be discarded.

If \$p\$ and \$q\$ are coprime, we have \$gcd(p,q)/2=1/2\$, which is the desired probability of triggering the next call.

Commented

f = (                 // f is a recursive function taking:
  p = 0,              //   p = numerator, starting at 0
  q = 1               //   q = denominator, starting at 1
) =>                  //
  Math.random() <     // pick a random number in [0,1[
  (g = (a, b) =>      // g is a recursive function computing the gcd of 2 integers
    b ?               //   if b is not equal to 0:
      g(b, a % b)     //     recursive call with (b, a mod b)
    :                 //   else:
      a               //     return a
  )(p, q) / 2 ?       // if our random number is less than gcd(p, q) / 2:
    f(                //   do a recursive call:
      ++p - q ? p     //     increment p if p < q - 1
              : !++q, //     otherwise: set p to 0 and increment q
      q               //
    )                 //   end of recursive call
  :                   // else:
    [p, q]            //   stop recursion and return [p, q]

Arnauld

Posted 2019-10-16T08:59:27.203

Reputation: 111 334

2This is a brilliant insight. Nice job. – Jonah – 2019-10-16T16:43:53.453

10

Ruby, 74 67 bytes

Probably not the golfiest solution for Ruby, but it was the first thing that sprung to mind. Increment the counter $. (starts at 0) each time rand is at least 0.5 to determine the index, then determine the list of rationals and output the right one.

-7 bytes from histocrat.

d,*a=1r
$.+=1until rand<0.5
a|=(0...d+=1).map{|n|n/d}until p *a[$.]

Try it online!

Value Ink

Posted 2019-10-16T08:59:27.203

Reputation: 10 608

I'm not that knowledgable But I'm assuming a |= ... appends a list and removes duplicates? – Kroppeb – 2019-10-16T10:24:45.957

@Kroppeb yes, absolutely. | when operating on arrays more-or-less acts as a set union operator. – Value Ink – 2019-10-16T10:33:47.213

2Elegant solution! You can get an empty array a little more tersely with d,*a=1r or using the arguments variable $*, and you can save having to write a[$.] twice with either an inline assignment until o=a[$.] or moving the print itself into the conditional with until p *a[$.]. The * converts the a[$.] into an argument list, which will be empty at first so p will do nothing and return a falsey value, then when it actually outputs something it becomes truthy. – histocrat – 2019-10-17T16:23:59.800

1@histocrat thanks for the tips! I did think about using $* earlier, but it errors when I try to run $* |= ... because it's a read-only variable so that's a no-go. – Value Ink – 2019-10-17T19:48:38.313

9

Julia, 61 49 71 bytes

p>q=rand()<.5 ? (p<q-1 ? (while gcd(p+=1,q)!=1 end;p>q) : 1>q+1) : p//q

In un-golfed form, this reads

function f(p,q)                      # define the function
    if rand() < 0.5                  # flip a coin. 
        if p < (q - 1)               # if the coin gives heads, check if p < (q - 1)
            while gcd(p+1, q) != 1   # if the greatest common divisor or p+1 and q is not equal to 1
                p = p + 1            # we increment p, because otherwise we'd get a fraction we already saw.
            end             
            return f(p, q)           # recursively call f(p, q) (where p has been incremented)
        else
            return f(1, q+1)         # if p is not greater than q - 1, we recursively call f(1, q+1)
        end
    else
        return p//q                  # If the coin flip gives tails, we return the Rational number p//q
    end
end

Now let's check to make sure we did this right:

julia> using UnicodePlots; data = [0>1 for _ in 1:10_000_000]; histogram(data)
                ┌                                        ┐ 
   [0.0 , 0.05) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5000106   
   [0.05, 0.1 ) ┤ 0                                        
   [0.1 , 0.15) ┤ 1168                                     
   [0.15, 0.2 ) ┤▇ 82600                                   
   [0.2 , 0.25) ┤ 0                                        
   [0.25, 0.3 ) ┤▇▇ 313684                                 
   [0.3 , 0.35) ┤▇▇▇▇▇▇▇▇ 1250427                          
   [0.35, 0.4 ) ┤ 39124                                    
   [0.4 , 0.45) ┤ 310                                      
   [0.45, 0.5 ) ┤ 0                                        
   [0.5 , 0.55) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2498798                   
   [0.55, 0.6 ) ┤ 168                                      
   [0.6 , 0.65) ┤ 19403                                    
   [0.65, 0.7 ) ┤▇▇▇▇ 625484                               
   [0.7 , 0.75) ┤ 77                                       
   [0.75, 0.8 ) ┤▇ 166131                                  
   [0.8 , 0.85) ┤ 2473                                     
   [0.85, 0.9 ) ┤ 47                                       
                └                                        ┘ 
                                Frequency

julia> for frac in [(0//1), (1//2), (1//3), (2//3), (1//4), (3//4), (1//5), (2//5), (3//5), (4//5), (1//6), (5//6)]
           freq = count(x -> x == frac, data)/length(data)
           println("P($(frac)) ≈ $freq")
       end
P(0//1) ≈ 0.5000106
P(1//2) ≈ 0.2498798
P(1//3) ≈ 0.1250427
P(2//3) ≈ 0.0625484
P(1//4) ≈ 0.0313065
P(3//4) ≈ 0.0156477
P(1//5) ≈ 0.0077772
P(2//5) ≈ 0.0039116
P(3//5) ≈ 0.0019398
P(4//5) ≈ 0.0009654
P(1//6) ≈ 0.0004828
P(5//6) ≈ 0.0002473

Notice that P(5//6) ≈ 0.5 * P(1//6) as required since 2//6, 3//6, and 4//6 are reducible fractions.

Try it online!

Mason

Posted 2019-10-16T08:59:27.203

Reputation: 191

save 1 byte by replacing rand(Bool) with 2rand()>1 or rand()>.5 – Lyndon White – 2019-10-19T11:05:53.780

Save 6 bytes by redefining an infix operator. Hard part is you need on that binds less tightly than + which mean you need a relation optator. so: >(p=0,q=1)=2rand()<1 ? (p<q-1 ? p+1>q : 1>q+1) : (p,q) – Lyndon White – 2019-10-19T11:13:19.603

@LyndonWhite You're right!. Can be even shorter if I use infix definition syntax as well: p>q = .... Not sure if it's considered cheating to get rid of the p=0, q=1. – Mason – 2019-10-19T21:04:26.923

3

PHP, 139 bytes

for($r=rand()/(getrandmax()+1),$b=$p=1;$r<1/$p*=2;)for($f=0;!$f;)for(++$a-$b?:++$b&&$a=1,$f=$i=1;++$i<=$a;$f&=0<$b%$i+$a%$i);echo+$a."/$b";

Try it online!

Night2

Posted 2019-10-16T08:59:27.203

Reputation: 5 484

3

J, 41 bytes

[:({[:~.@,(i.%x:)"0@i.@+&2)>:^:(?@2)^:_@0

Try it online!

At a high-level, I stole ValueInk's idea for first flipping a coin until it comes up heads, and using that number to index into the ordered list of rationals.

Jonah

Posted 2019-10-16T08:59:27.203

Reputation: 8 729

3

MATL, 31 bytes

`rEk}@EU:2M&\v1M/t1<fw6#uX&@)Z)

Output is numerator, then denominator. Try it online!

Or check relative frequencies in 1000 realizations (header and footer added; one more byte necessary in main code; TIO running time about 30 seconds). Format is: num den freq, each row corresponding to a number. Note that relative frequencies are increasingly imprecise down the table, because the corresponding numbers occur less often.

Here is a set of relative frequencies obtained from 100000 realizations (offline compiler):

0       1 0.49828
1       2 0.25022
1       3 0.12643
2       3  0.0611
1       4 0.03128
3       4 0.01657
1       5 0.00814
2       5 0.00383
3       5 0.00206
4       5 0.00099
1       6 0.00047
5       6 0.00024
1       7  0.0002
2       7   6e-05
3       7   6e-05
4       7   5e-05
5       7   2e-05

Luis Mendo

Posted 2019-10-16T08:59:27.203

Reputation: 87 464

2

WDC 65816 machine code, 77 bytes

Hexdump:

00000000: 6407 6405 e607 3b85 09e2 31a2 039a 6869  d.d...;...1...hi
00000010: b248 6300 ca10 fac2 30a5 091b a501 102c  .Hc.....0......,
00000020: e605 a505 c507 d006 6405 e605 80d6 a607  ........d.......
00000030: 8609 c509 f00f 9004 e509 80f4 488a 38e3  ............H.8.
00000040: 01aa 6880 ebc9 0100 d0d6 80ba 60         ..h.........`

Assembly:

    STZ $07
    STZ $05
earlier_loop:
    INC $07
loop:
    TSC
    STA $09
    SEP #$31
    LDX #$03
    TXS
    PLA
    ADC #$B2
rngloop:
    PHA
    ADC $00,s
    DEX
    BPL rngloop
    REP #$30
    LDA $09
    TCS
    LDA $01
    BPL done
after_rng:
    INC $05
    LDA $05
    CMP $07
    BNE someplace
    STZ $05
    INC $05
    BRA earlier_loop
someplace:
    LDX $07
gcd:
    STX $09
    CMP $09
    BEQ gcd_end
    BCC gcd_else
    SBC $09
    BRA gcd
gcd_else:
    PHA
    TXA
    SEC
    SBC $01,s
    TAX
    PLA
    BRA gcd
gcd_end:
    CMP #$0001
    BNE after_rng
    BRA loop
done:
    RTS

I/O:

  • RNG seed in $01-$04 (32 bits, anything but 0)
  • output is in $05 (numerator), $07 (denom.)
  • overwrites $00, $09
  • assumes 16-bit accumulator on entry
  • messes with the stack so best called with all interrupts disabled

This was more difficult than I first expected. I misread the challenge description and skipped the part about repeats not being counted. The 65816 doesn't have any built-in RNG or floating point math, and it doesn't even have integer multiply/divide. I used the RNG algorithm from cc65, a 6502 C compiler, and adapted it a bit (made it smaller). Then I used the z80 GCD algorithm from Rosetta Code and translated it into 65816 assembly.

Here's a sample of outputs after running for some number of iterations:

0/1: 0x60e4 = 24804
1/2: 0x3122 = 12578
1/3: 0x187c = 6268
2/3: 0x0c7d = 3197
1/4: 0x05bd = 1469
3/4: 0x0313 = 787

As you can see, each next fraction is about half as likely as the previous one. There isn't a good online 65816 "IDE", but I'm working on making one. For now, if you want to test this, you can write a small bit of wrapper code which logs outputs of the subroutine into a table in RAM and view that table with a debugging SNES emulator (I recommend bsnes-plus).

Big thanks to @p4plus2 and @Alcaro for help with optimizing the code.

randomdude999

Posted 2019-10-16T08:59:27.203

Reputation: 789

1

Javascript ES6, 98 chars

g=(x,y)=>y?g(y,x%y):x
for(x=y=1;y;++x)for(y=0;y<x;++y)if(Math.random()<.5==g(x,y))y=alert(y+"/"+x)

Demo with console.log instead of alert:

g=(x,y)=>y?g(y,x%y):x
for(x=y=1;y;++x)for(y=0;y<x;++y)if(Math.random()<.5==g(x,y))y=console.log(y+"/"+x)

Probability check:

g=(x,y)=>y?g(y,x%y):x
f=()=>{for(x=y=1;y;++x)for(y=0;y<x;++y)if(Math.random()<.5==g(x,y))return y+"/"+x}
r={}
for(q=0;q<32768;++q)r[x=f()]=~~r[x]+1
console.log(Object.keys(r).sort((x,y)=>r[y]-r[x]).map(x=>x+" "+r[x]).join`
`)
.as-console-wrapper.as-console-wrapper{max-height:100vh}

Qwertiy

Posted 2019-10-16T08:59:27.203

Reputation: 2 697

1

ink, 137 bytes

=r
~temp e=0
-(d)~e++
~temp n=-1
-(l)~n++
{e-n:{RANDOM(0,c(e,n)<2):{n}/{e}->->}->l}->d
==function c(a,b)
{b:
~return c(b,a%b)
}
~return a

Try it online!

Okay, strap in, this was an adventure.

My first attempts at solving this with a full program rather than a function (well, it's a stitch rather than a function, but the point still stands) failed horribly. The logic was sound, but for some reason I was getting 1/3 more than twice as often as 1/2, and that pattern held for more than 2000 test runs - that wasn't the only place where it failed, but it was the most noticable.
So I broke the code down into parts. I ungolfed it, I added debug messages, it didn't work. I ported the core logic to python, as faithfully as I possibly could, and it worked fine. To simplify the testing process, I changed a minor detail to allow me to call my code multiple times during a single run of a program (used a separate variable rather than a readcount for some logic, because readcounts can't be reset). I ran my code a few hundred times as part of a single program, rather than running the program a few hundred times.
And suddenly it worked.

So I figured the problem was related to the readcount I was using - those have been known to act up at times. So I went back to having my code be a full program, but kept the new variable. And it went back to not working.

So I figured, okay, maybe the RNG isn't uniform over the course of multiple runs of a program, but tends towards uniformity if called repeatedly in a single program. That sounded vaguely reasonable - well it kinda didn't, but I was getting desperate. So I wrote two small programs to test my hypothesis - one that generated 100 random values (which I ran once), and one which generated a single random value (which I ran 100 times). And they both gave me about 50 0s and 50 1s.

At my wits' end at this point, I had no idea what was going on, and frankly I still have no idea where to even begin when it comes to figuring it out.
So I'm posting a stitch. It would be 18 bytes shorter if I could have it be a full program (code is here), but even though the logic would be entirely identical, that appears to not work as it should, and any change I try to make only serves to break it in a different way. Only the gods know why, and they surely envy my ignorance.

With that rant out of the way, there are some neat tricks in the code, so let's look at it.

The basic idea is that after each number is generated, we have a 50% chance of outputting it, and a 50% chance of continuing to the next number - this strategy wasn't obvious to me at first, but looking at it it becomes obvious that each output will be half as likely as the one before, twice as likely as the next one, and equally as likely as the all the subsequent values combined - which is exactly what we want.

Now that RANDOM call doesn't obviously look like a 50/50 thing. Well, let's look a bit deeper. You'll notice that its second parameter is actually a boolean expression - meaning it will only ever be 0 or 1. If it's 1, RANDOM will output either 0 or 1 with equal probability, meaning we have a 50% chance of outputting the current numerator and denominator (it happens if RANDOM would generate a non-zero value), while if it's 0 we effectively skip a number.
So what is that c function then? Well, if you look further down, you'll find it's Euclid's algorithm for finding the greatest common divisor of two numbers. If that's greater than 1, we won't want to output the number - there's another representation of that number with a strictly smaller numerator and denominator, so we've already skipped that number. But if it is 1, it's a number we may want to output, so we check if the GCD is 1, plug the result of the comparison into the RANDOM call, and call it a day.
conveniently enough, anything is a divisor of 0, so if the numerator is 0 we will skip the number if the denominator isn't 1. Which happens to be exactly what we want.

Sara J

Posted 2019-10-16T08:59:27.203

Reputation: 2 576

1

05AB1E, 29 28 bytes

®1‚[0L+Ð5°D<Ýs/Ωs¿;@#`αi>®0ǝ

Inspired by @Arnauld's approach.

Try it online or see the counts of 100 outputs (it's a bit too slow to do more than 100 in TIO.. >.>).

Explanation:

NOTE: 05AB1E doesn't have a builtin to get a decimal in the range \$[0,1)\$, so instead I create a list in the range \$[0,1)\$ in increments of \$0.00001\$ manually, and get a random value from this list.

®1‚           # Push -1 and 1, and pair them: [-1,1]
   [          # Start an infinite loop:
    0L+       #  Increase only the first value by 1 (by adding [1,0])
       Ð      #  Triplicate this pair
    5°        #  Push 10 to the power 5: 10,000
      D<      #  Duplicate it, and decrease the copy by 1 to 9,999
        Ý     #  Create a list in the range [0,9999]
         s/   #  Swap to get the duplicated 10,000, and divide each item by it:
              #   [0, 0.00001, 0.00002, 0.00003, ..., 0.99997, 0.99998, 0.99999]
           Ω  #  Pop and push a random item from this list
    s         #  Swap to get one of the pairs at the top again
     ¿        #  Get the greatest common divisor (gcd) of the values in this pair
      ;       #  Halve it
    @         #  If the random value is larger than or equal to this:
     #        #   Stop the infinite loop
    `αi       #  If the absolute difference between the two values in the pair is exactly 1:
       >      #   Increase both values in the pair by 1
        ®0ǝ   #   And then replace the first value with -1
              # (after the loop, output the pair at the top of the stack implicitly)

Kevin Cruijssen

Posted 2019-10-16T08:59:27.203

Reputation: 67 575