Random numbers with fixed sum

33

4

Your task is to write a program or a function that outputs n random numbers from interval [0,1] with fixed sum s.

Input

n, n≥1, number of random numbers to generate

s, s>=0, s<=n, sum of numbers to be generated

Output

A random n-tuple of floating point numbers with all elements from the interval [0,1] and sum of all elements equal to s, output in any convenient unambiguous way. All valid n-tuples have to be equally likely within the limitations of floating point numbers.

This is equal to uniformly sampling from the intersection of the points inside the n-dimensional unit cube and the n-1-dimensional hyperplane that goes through (s/n, s/n, …, s/n) and is perpendicular to the vector (1, 1, …, 1) (see red area in Figure 1 for three examples).

Examples with n=3 and sums 0.75, 1.75 and 2.75

Figure 1: The plane of valid outputs with n=3 and sums 0.75, 1.75 and 2.75

Examples

n=1, s=0.8 → [0.8]
n=3, s=3.0 → [1.0, 1.0, 1.0]
n=2, s=0.0 → [0.0, 0.0]
n=4, s=2.0 → [0.2509075946818119, 0.14887693388076845, 0.9449661625992032, 0.6552493088382167]
n=10, s=9.999999999999 → [0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999,0.9999999999999]

Rules

  • Your program should finish under a second on your machine at least with n≤10 and any valid s.
  • If you so wish, your program can be exclusive on the upper end, i.e. s<n and the output numbers from the half-open interval [0,1) (breaking the second example)
  • If your language doesn't support floating point numbers, you can fake the output with at least ten decimal digits after the decimal point.
  • Standard loopholes are disallowed and standard input/output methods are allowed.
  • This is , so the shortest entry, measured in bytes, wins.

Angs

Posted 2018-05-18T12:11:48.530

Reputation: 4 825

Related. – Martin Ender – 2018-05-18T12:15:48.580

When you say This is equal to uniformly sampling from the intersection - i can see a program choosing randomly from just the corners of that intersection. Would that be valid ? – JayCe – 2018-05-18T12:43:12.120

Because not all languages have floating point support, (at least) how many digit of accuracy/precision must be supported? (the suggested test case above implies 13, but you may want a softer restriction) – user202729 – 2018-05-18T12:47:24.610

2@KevinCruijssen No, that's only true for s==0 or s==3. For all other values of s, the plane has nonzero area and you have to uniform-randomly choose a point on that plane. – user202729 – 2018-05-18T13:05:49.133

@JayCe that would be invalid, since all valid n-tuples have to be equally likely. – Angs – 2018-05-18T13:08:24.740

One way to do this is to use this approach mirrored at the centre and then using rejection sampling

– Angs – 2018-05-18T14:07:39.943

"Integer"? But the question ask for floating point output. – user202729 – 2018-05-18T14:15:31.037

All valid n-tuples have to be equally likely within the limitations of floating point numbers. do you mean same probably for same size? (usually such definition, but not always) – l4m2 – 2018-05-18T16:08:11.023

@l4m2 when the parameters n and s are fixed, the valid outputs should be about equally likely. The order matters, so for instance (0.2, 0.3, 0.1) and (0.3, 0.2, 0.1) both should be as common as (0.6, 0.0, 0.0) for n=3, s=0.6 – Angs – 2018-05-18T16:37:15.677

@Angs so any distribution is allowed as the possibility are all zero? – l4m2 – 2018-05-18T16:47:34.287

@l4m2, no, equally likely == uniform distribution, of course with the understanding that there isn't a total uniformness with floating point errors and the like. – Angs – 2018-05-18T16:54:49.287

@Angs or should same size area near zero have higher possibility as float precision is higher near zero? – l4m2 – 2018-05-18T17:10:16.380

3Requiring the interval to be closed or half-closed (as opposed to open) is a theoretically unobservable requirement. Many random number generators give the output in (0,1). How to test that the output interval is [ 0,1) and not (0,1)? The value 0 "never" occurs anyway – Luis Mendo – 2018-05-18T22:29:24.183

2Is it OK if our code uses rejection sampling, and so takes very long for test cases likes=2.99999999999, n=3? May we generate random reals in multiples of, say, 1e-9? – xnor – 2018-05-19T00:11:35.140

@xnor Only if you have a supercomputer ("1 second"). | OP said 10 digits. – user202729 – 2018-05-19T03:27:26.740

@xnor none of the test cases may take longer than a second on your machine, so at least the most naive rejection sampling is out of the question. The implementation details don't matter if the output is correct. And like user202729 mentioned, 10 digits of precision is needed so multiples of 1e-10 or smaller work. – Angs – 2018-05-19T07:10:41.320

@LuisMendo the same applies to any number really, not just for 0. For instance, Java.Random has a 48-bit seed, so it can't produce all double-precision floating-point numbers ∈[0,1] since they have 52 bits of precision. Some common sense is required since it's hard to make a rule that would cover all cases. Maybe rounded to 10 digits after the comma, all cases should be equally common with an error margin of 1%. – Angs – 2018-05-19T07:23:04.397

Answers

1

Wolfram Language (Mathematica), 92 90 bytes

If[2#2>#,1-#0[#,#-#2],While[Max[v=Differences@Sort@Join[{0,#2},RandomReal[#2,#-1]]]>1];v]&

Try it online!

Un-golfed code:

R[n_, s_] := Module[{v},
  If[s <= n/2,             (* rejection sampling for s <= n/2:                        *)
    While[
      v = Differences[Sort[
            Join[{0},RandomReal[s,n-1],{s}]]];         (* trial randoms that sum to s *)
      Max[v] > 1           (* loop until good solution found                          *)
    ];
    v,                     (* return the good solution                                *)
    1 - R[n, n - s]]]      (* for s > n/2, invert the cube and rejection-sample       *)

Here is a solution that works in 55 bytes but for now (Mathematica version 12) is restricted to n=1,2,3 because RandomPoint refuses to draw points from higher-dimensional hyperplanes (in TIO's version 11.3 it also fails for n=1). It may work for higher n in the future though:

RandomPoint[1&~Array~#~Hyperplane~#2,1,{0,1}&~Array~#]&

Try it online!

Un-golfed code:

R[n_, s_] :=
  RandomPoint[                           (* draw a random point from *)
    Hyperplane[ConstantArray[1, n], s],  (* the hyperplane where the sum of coordinates is s *)
    1,                                   (* draw only one point *)
    ConstantArray[{0, 1}, n]]            (* restrict each coordinate to [0,1] *)

Roman

Posted 2018-05-18T12:11:48.530

Reputation: 1 190

8

JavaScript (Node.js), 122 115 bytes

N=>W=S=>2*S>N?W(N-S).map(t=>1-t):(t=(Q=s=>n?[r=s-s*Math.random()**(1/--n),...r>1?[++Q]:Q(s-r)]:[])(S,n=N),Q?t:W(S))

Try it online!

l4m2

Posted 2018-05-18T12:11:48.530

Reputation: 5 985

6

Python 2, 144 128 119 bytes

from random import*
def f(n,s):
 r=min(s,1);x=uniform(max(0,r-(r-s/n)*2),r);return n<2and[s]or sample([x]+f(n-1,s-x),n)

Try it online!


  • -20 bytes, thanks to Kevin Cruijssen

TFeld

Posted 2018-05-18T12:11:48.530

Reputation: 19 246

@LuisMendo Should be fixed now – TFeld – 2018-05-18T14:27:14.880

they still seem not uniform – l4m2 – 2018-05-18T15:15:44.243

@l4m2 I ran g(4, 2.0) 1,000 times to get 4,000 points and the results look like this which appears fairly uniform.

– Engineer Toast – 2018-05-18T15:45:27.467

122 bytes? – Giuseppe – 2018-05-18T15:47:56.420

@EngineerToast I ran g(3,1.0) and find the first coord strongly near 1 – l4m2 – 2018-05-18T16:02:16.213

and their averages are 0.5, 0.25, 0.25 – l4m2 – 2018-05-18T16:02:49.800

Yeah, chaining uniform for each coordinate will never be uniform in later dimensions – michi7x7 – 2018-05-18T18:01:21.660

@l4m2 Good catch. I got 9,000 data points and it is definitely skewed.

– Engineer Toast – 2018-05-18T18:02:34.253

You don't need the variable l: r=min(s,1);x=uniform(max(0,r-(r-((s-s/n)/~-n))*2),r); (-4 bytes). – Kevin Cruijssen – 2018-05-18T20:20:55.853

And ((s-s/n)/~-n) can be s/n for another -10 bytes. Try it online: 114 bytes.

– Kevin Cruijssen – 2018-05-18T21:12:58.203

One more thing, you can combine both return-statements: Try it online: 108 bytes.

– Kevin Cruijssen – 2018-05-18T21:15:10.473

2Still wrong – l4m2 – 2018-05-23T01:51:18.307

I fell you are shuffling your old answer? – l4m2 – 2018-05-23T01:52:11.023

4

Java 8, 194 188 196 237 236 bytes

n->s->{double r[]=new double[n+1],d[]=new double[n],t;int f=0,i=n,x=2*s>n?1:0;for(r[n]=s=x>0?n-s:s;f<1;){for(f=1;i-->1;)r[i]=Math.random()*s;for(java.util.Arrays.sort(r);i<n;d[i++]=x>0?1-t:t)f=(t=Math.abs(r[i]-r[i+1]))>1?0:f;}return d;}

+49 bytes (188 → 196 and 196 → 237) to fix the speed of test cases nearing 1, as well as fix the algorithm in general.

Try it online

Explanation:

Uses the approach in this StackoverFlow answer, inside a loop as long as one of the items is still larger than 1.
Also, if 2*s>n, s will be changed into n-s, and a flag is set to indicate we should use 1-diff instead of diff in the result-array (thanks for the tip @soktinpk and @l4m2).

n->s->{              // Method with integer & double parameters and Object return-type
  double r[]=new double[n+1]
                     //  Double-array of random values of size `n+1`
         d[]=new double[n],
                     //  Resulting double-array of size `n`
         t;          //  Temp double
  int f=0,           //  Integer-flag (every item below 1), starting at 0
      i=n,           //  Index-integer, starting at `n`
      x=             //  Integer-flag (average below 0.5), starting at:
        2*s>n?       //   If two times `s` is larger than `n`:
         1           //    Set this flag to 1
        :            //   Else:
         0;          //    Set this flag to 0
  for(r[n]=s=        //  Set both the last item of `r` and `s` to:
       x>0?          //   If the flag `x` is 1:
        n-s          //    Set both to `n-s`
       :             //   Else:
        s;           //    Set both to `s`
      f<1;){         //  Loop as long as flag `f` is still 0
    for(f=1;         //   Reset the flag `f` to 1
        i-->1;)      //   Inner loop `i` in range (n,1] (skipping the first item)
      r[i]=Math.random()*s;
                     //    Set the i'th item in `r` to a random value in the range [0,s)
    for(java.util.Arrays.sort(r);
                     //   Sort the array `r` from lowest to highest
        i<n;         //   Inner loop `i` in the range [1,n)
        ;d[i++]=     //     After every iteration: Set the i'th item in `d` to:
          x>0?       //      If the flag `x` is 1:
           1-t       //       Set it to `1-t`
          :          //      Else:
           t)        //       Set it to `t`
      f=(t=Math.abs( //    Set `t` to the absolute difference of:
            r[i]-r[i+1])) 
                     //     The i'th & (i+1)'th items in `r`
        >1?          //    And if `t` is larger than 1 (out of the [0,1] boundary)
         0           //     Set the flag `f` to 0
        :            //    Else:
         f;}         //     Leave the flag `f` unchanged
  return d;}         //  Return the array `d` as result

Kevin Cruijssen

Posted 2018-05-18T12:11:48.530

Reputation: 67 575

Time out for test(10, 9.99); – l4m2 – 2018-05-18T15:33:48.287

@l4m2 Yeah, noticed the same with 10, 9.0 right after I edited to fix the n=10, s=9.999999999999 test case.. Not sure if there is a fix in Java while still holding its uniformly randomness.. Will have to think about it for a while. For now I'll edit it to state it times out. – Kevin Cruijssen – 2018-05-18T16:52:39.053

If n-s<1 you can call f(n,n-s) and flip every number over 1/2 (i.e. replace x with 1-x) like l4m2 did. This might solve the issue for numbers where s is close to n. – soktinpk – 2018-05-19T02:58:09.767

@soktinpk Thanks for the tip. It's actually s+s>n instead of n-s<1, but when I looked at the other JavaScript answers it indeed made sense. Everything is fixed now, including another bug that was still present. Bytes went up quite a bit, but every works now. Will work the byte-count down from here. :) – Kevin Cruijssen – 2018-05-19T11:31:37.457

I don't know of a general proof but I believe this algorithm works because an N-dimensional hypercube can be cut into N N-dimensional hyperpyramids. – Neil – 2018-07-22T10:48:24.717

3

JavaScript (Node.js), 153 bytes

(n,s)=>s+s>n?g(n,n-s).map(r=>1-r):g(n,s)
g=(n,s)=>{do(a=[...Array(n-1)].map(_=>Math.random()*(s<1?s:1))).map(r=>t-=r,t=s);while(t<0|t>=1);return[...a,t]}

Try it online!

Neil

Posted 2018-05-18T12:11:48.530

Reputation: 95 035

3

C++11, 284 267 bytes

-17 bytes thanks to Zacharý
Uses C++ random library, output on the standard output

#include<iostream>
#include<random>
typedef float z;template<int N>void g(z s){z a[N],d=s/N;int i=N;for(;i;)a[--i]=d;std::uniform_real_distribution<z>u(.0,d<.5?d:1-d);std::default_random_engine e;for(;i<N;){z c=u(e);a[i]+=c;a[++i]-=c;}for(;i;)std::cout<<a[--i]<<' ';}

To call, you just need to do that :

g<2>(0.0);

Where the template parameter ( here, 2 ) is N, and the actual parameter ( here, 0.0 ) is S

HatsuPointerKun

Posted 2018-05-18T12:11:48.530

Reputation: 1 891

I think you can remove the space between <z> and u – Zacharý – 2018-05-23T12:34:56.983

I got it down further: typedef float z;template<int N>void g(z s){z a[N],d=s/N;int i=N;for(;i;)a[--i]=d;std::uniform_real_distribution<z>u(.0,d<.5?d:1-d);std::default_random_engine e;for(;i<N;){z c=u(e);a[i]+=c;a[++i]-=c;}for(;i;)std::cout<<a[--i]<<' ';}. A newline doesn't have to be the separator between items – Zacharý – 2018-05-23T13:36:25.860

1Suggest get rid of d completely by changing d=s/N to s/=N Suggest reworking the second loop for(z c;i<N;a[++i%N]-=c)a[i]+=c=u(e); instead of for(;i<N;){z c=u(e);a[i]+=c;a[++i]-=c;} (note the added %N in order to make the program calculate the first number correctly) – Max Yekhlakov – 2018-09-07T19:21:18.507

3

C, 132 127 125 118 110 107 bytes

-2 bytes thanks to @ceilingcat

i;f(s,n,o,d)float*o,s,d;{for(i=n;i;o[--i]=d=s/n);for(;i<n;o[++i%n]-=s)o[i]+=s=(d<.5?d:1-d)*rand()/(1<<31);}

Try it online!

OverclockedSanic

Posted 2018-05-18T12:11:48.530

Reputation: 51

Unfortunately, this answer does not meet the challenge specification. The output random numbers are not restricted to [0,1], and their joint distribution is not uniform. – Nitrodon – 2019-07-16T21:24:32.260

@Nitrodon hey, could you please provide an input for which the output is not restricted to [0,1]? I tried a couple different examples and they all seemed to be correct, unless I misunderstood the objective. – OverclockedSanic – 2019-07-17T12:16:50.967

With the RNG state on TIO, and keeping your n=4, the values s=3.23 and s=0.89 gives outputs outside of the range. More to the point, the distribution of X-s/n should depend on s, but it doesn't. – Nitrodon – 2019-07-17T18:12:30.197

@Nitrodon Oops, my bad. I was converting some parts from the C++ answer above and forgot to add a thing. It should be fixed now? Also golfed a few bytes in the process. – OverclockedSanic – 2019-07-18T00:01:51.990

2

Clean, 221 201 bytes

Clean, code-golf, or random numbers. Pick two.

import StdEnv,Math.Random,System._Unsafe,System.Time
g l n s#k=toReal n
|s/k>0.5=[s/k-e\\e<-g l n(k-s)]
#r=take n l
#r=[e*s/sum r\\e<-r]
|all((>)1.0)r=r=g(tl l)n s

g(genRandReal(toInt(accUnsafe time)))

Try it online!

Partial function literal :: (Int Real -> [Real]). Will only produce new results once per second.
Accurate up to at least 10 decimal places.

Οurous

Posted 2018-05-18T12:11:48.530

Reputation: 7 916

2

R, 99 bytes (with gtools package)

f=function(n,s){if(s>n/2)return(1-f(n,n-s))
while(any(T>=1)){T=gtools::rdirichlet(1,rep(1,n))*s}
T}

Try it online!

We wish to sample uniformly from the set \$\tilde{\mathcal A}=\{w_1,\ldots,w_n: \forall i, 0<w_i<1; \sum w_i=s\}\$. I'll divide all the \$w_i\$ by \$s\$ and instead sample from \$\mathcal A=\{w_1,\ldots,w_n: \forall i, 0<w_i<\frac1s; \sum w_i=1\}\$.

If \$s=1\$, this is easy: it corresponds to sampling from the \$Dirichlet(1, 1, \ldots, 1)\$ distribution (which is the uniform over the simplex). For the general case \$s\neq 1\$, we use rejection sampling: sample from the Dirichlet distribution until all entries are \$<1/s\$, then multiply by \$s\$.

The trick to mirror when \$s>n/2\$ (which I think l4m2 was the first to figure out) is essential. Before I saw that, the number of iterations in the rejection sampler exploded for the last test case, so I spent a lot of time trying to sample efficiently from well-chosen truncated Beta distributions, but it is not necessary in the end.

Robin Ryder

Posted 2018-05-18T12:11:48.530

Reputation: 6 625

1

Haskell, 122 217 208 bytes

import System.Random
r p=randomR p
(n#s)g|n<1=[]|(x,q)<-r(max 0$s-n+1,min 1 s)g=x:((n-1)#(s-x)$q)
g![]=[]
g!a|(i,q)<-r(0,length a-1)g=a!!i:q![x|(j,x)<-zip[0..]a,i/=j]
n%s=uncurry(!).(n#s<$>).split<$>newStdGen

Try it online!

Sometimes the answers are slightly off due, I assume, to floating point error. If it's an issue I can fix it at a cost of 1 byte. I'm also not sure how uniform this is (pretty sure it's fine but I'm not all that good at this kind of thing), so I'll describe my algorithm.

Basic idea is to generate a number x then subtract it from s and recur until we have n elements then shuffle them. I generate x with an upper bound of either 1 or s (whichever is smaller) and a lower bound of s-n+1 or 0 (whichever is greater). That lower bound is there so that on the next iteration s will still be less than or equal to n (derivation: s-x<=n-1 -> s<=n-1+x -> s-(n-1)<=x -> s-n+1<=x).

EDIT: Thanks to @michi7x7 for pointing out a flaw in my uniformity. I think I've fixed it with shuffling but let me know if there's another problem

EDIT2: Improved byte count plus fixed type restriction

user1472751

Posted 2018-05-18T12:11:48.530

Reputation: 1 511

3Chaining uniform samples will never lead to a uniform distribution (the last coordinate is almost always bigger than 0.99 in your example) – michi7x7 – 2018-05-18T18:05:58.547

@michi7x7 I see your point. What if I shuffled the order of the list after generating it? I should've taken more stats classes – user1472751 – 2018-05-18T18:35:12.487

The numbers doesn't look very uniform. Here, 8 results are >0.99, 1 is 0.96, and the last is 0.8. This is what it looks like.

– Stewie Griffin – 2018-05-18T20:33:18.290

@user1472751 there are several good answers here: https://stackoverflow.com/q/8064629/6774250

– michi7x7 – 2018-05-19T11:02:00.447

1

Uniformness still has some problem, see here - there are too many zeroes generated (plot of sorted values from 1000%500)

– Angs – 2018-05-29T06:45:16.543

1

Haskell, 188 bytes

import System.Random
import Data.List
n!s|s>n/2=map(1-)<$>n!(n-s)|1>0=(zipWith(-)=<<tail).sort.map(*s).(++[0,1::Double])<$>mapM(\a->randomIO)[2..n]>>= \a->if all(<=1)a then pure a else n!s

Ungolfed:

n!s
 |s>n/2       = map (1-) <$> n!(n-s)       --If total more than half the # of numbers, mirror calculation 
 |otherwise   = (zipWith(-)=<<tail)        --Calculate interval lengths between consecutive numbers
              . sort                       --Sort
              . map(*s)                    --Scale
              . (++[0,1::Double])          --Add endpoints
              <$> mapM(\a->randomIO)[2..n] --Calculate n-1 random numbers
              >>= \a->if all(<=1)a then pure a else n!s   --Retry if a number was too large

Try it online!

Angs

Posted 2018-05-18T12:11:48.530

Reputation: 4 825