Sample the Pareto Distribution

22

2

The Pareto Distribution is a probability distribution that comes up a lot in nature. It has lots of special properties, such as an infinite mean. In this challenge, you will output a number sampled from this distribution.

The Pareto Distribution is defined to be greater than or equal to x with probability 1/x, for all x greater than or equal to 1.

Therefore, a number sampled from this distribution is greater than or equal to 1 with probability 1, greater than or equal to 2 with probability exactly 1/2, greater than or equal to 3 with probability exactly 1/3, greater than or equal to 11.4 with probability exactly 1/11.4, and so on.

Since you will sample this distribution, your program or function will take no input, and output a random number, with the above probabilities. However, if your program doesn't perfectly match the above probabilities due to floating-point impression, that's OK. See the bottom of the challenge for more details.

(This is called the Pareto Distribution with alpha 1 and lower bound 1, to be exact)

Here's 10 example draws from this distribution:

1.1540029602790338
52.86156818209856
3.003306506971116
1.4875532217142287
1.3604286212876546
57.5263129600285
1.3139866916055676
20.25125817471419
2.8105749663695208
1.1528212409680156

Notice how 5 of them are below 2, and 5 are above 2. Since this is the average result, it could have been higher or lower, of course.

Your answer only needs to be correct up to the limits of your floating point type, real number type, or whatever else you use, but you must be able to represent numbers at at least 3 decimal digits of precision, and represent numbers up to 1,000,000. If you're not sure whether something is OK, feel free to ask.

This is code golf.


Details about imprecision:

  • For each range [a, b], where 1 <= a < b, the ideal probability that the sample would fall in that range is 1/a - 1/b. The probability that your program produces a number in that range must be with 0.001 of 1/a - 1/b. If X is the output of your program, it is required that |P(a <= X <= b) - (1/a - 1/b)| < 0.001.

  • Note that by applying the above rule with a=1 and b sufficiently large, it is the case that your program must output a number greater than or equal to 1 with at least probability 0.999. The rest of the time it may crash, output Infinity, or do whatever else.

I'm fairly certain that the existing submissions of the form 1/1-x or 1/x, where x is a random float in [0, 1) or (0, 1) or [0, 1], all satisfy this requirement.

isaacg

Posted 2017-12-15T06:56:59.820

Reputation: 39 268

Let us continue this discussion in chat.

– isaacg – 2017-12-16T05:46:29.183

2Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. [sorry for comment abuse too, but that is what would happen when OP change question significantly] – user202729 – 2017-12-16T15:16:53.140

Answers

6

MATL, 3 bytes

1r/

Try it online! Or estimate the resulting probabilities by running it 10000 times.

Explanation

1    % Push 1
r    % Push random number uniformly distributed on the open interval (0,1)
/    % Divide. Implicitly display

Luis Mendo

Posted 2017-12-15T06:56:59.820

Reputation: 87 464

After the clarification and edit in the OP, this answer conforms to the rules of the challenge – Luis Mendo – 2017-12-17T04:11:46.647

5

Actually, 4 bytes

G1-ì

Try it online!

Explanation:

G1-ì
G     random()
 1-   1-random()
   ì  1/(1-random())

Mego

Posted 2017-12-15T06:56:59.820

Reputation: 32 998

5

R, 10 bytes

1/runif(1)

Pretty straightforward.

plannapus

Posted 2017-12-15T06:56:59.820

Reputation: 8 610

2

Note that runif never returns 0 or 1 in the default case so there are no problems with this.

– Giuseppe – 2017-12-15T12:31:58.083

Yes, thanks. And I didn't thought of it when entering this answer but you can indeed verify the distribution if needed.

– plannapus – 2017-12-15T12:50:33.037

This doesn't work - a Pareto distribution produces 1 with positive probability. Since runif(1) won't ever produce a 1, 1/runif(1) will never be 1. – Mego – 2017-12-15T13:34:54.080

2@Mego that is incorrect. The Pareto distribution is absolutely continuous and thus has measure 0 for any number. – Therkel – 2017-12-15T13:38:23.450

@Therkel Theoretically, yes. However, because we are in the realm of floats, it's a discrete distribution over the set of floats in [1, +Inf). – Mego – 2017-12-15T13:40:34.290

@Mego Am I misreading OP "This random number must be above 1 with probability 1"? – plannapus – 2017-12-15T14:12:46.190

@plannapus I commented on the OP with a request for clarification. By the definition of a Pareto distribution with x_m = alpha = 1, the support (the range for which the probability density function has positive values, aka the possible outputs for the distribution) is [1, +Inf). – Mego – 2017-12-15T14:15:07.890

3@Mego OK that may be quicksand for me (given i know close to nothing about floating point), but i actually think that while the probability of runif giving 1 is null, the probability of 1/runif to give 1 is not, because of floating point accuracy (i. e. typically 1/0.9999999 returns 1 in R). – plannapus – 2017-12-15T14:59:13.753

1@plannapus Hmm... That's a good point. Floats make this entirely too complicated. – Mego – 2017-12-15T15:00:41.427

@Mego 1 must appear with probability zero: The Pareto Distribution is defined to be above x with probability 1/x, for all x greater than or equal to 1. i.e., Pr(X>x)=1/x which implies Pr(X>1)=1, and in turn Pr(X==1)=0. – Giuseppe – 2017-12-15T19:47:47.483

@Giuseppe For any continuous distribution, Pr(X=x) = 0. That is not the same thing as "x will never be output". However, since we're working with floats, this is a discrete distribution, and so all floats x in [1, +Inf) have Pr(X=x) > 0. – Mego – 2017-12-16T02:09:25.887

4

TI-Basic, 2 bytes

rand^-1      (AB 0C in hex)

For anyone who is wondering, rand returns a random value in (0,1]. "Due to specifics of the random number generating algorithm, the smallest number possible to generate is slightly greater than 0. The largest number possible is actually 1 ..." (source). For example, seeding rand with 196164532 yields 1.

Timtech

Posted 2017-12-15T06:56:59.820

Reputation: 12 038

Strangely, the equivalent code wouldn't work on a TI-89 series calculator. Even though their random number generators are nearly identically implemented, a TI-89 will return 0 whenever a TI-83+ would return 0.99999999999889. – Misha Lavrov – 2017-12-15T17:17:26.457

2TI-Basic developers knew in advance this challenge will happen...? It seems to win this time. – user202729 – 2017-12-16T06:13:58.290

@user202729 Avoiding 0 and 1 makes rand more useful as a subroutine for the calculator's other commands, which is probably why TI made this design decision. For example, randNorm(0,1 returns -7.02129... with seed 196164532. Using the RNG algorithm without the adjustment would give a value of 1e99, which is an unreasonable value for a normally-distributed variable to have. – Misha Lavrov – 2017-12-16T14:55:44.720

@user202729 Yeah, actually I just time traveled a bit to get it all done. Definitely worth it for these upvotes. – Timtech – 2017-12-16T15:01:13.650

4

R, 12 bytes

exp(rexp(1))

Try it online!

Verify the distribution

This takes a different approach, exploiting the fact that if Y~exp(alpha), then X=x_m*e^Y is a Pareto with parameters x_m,alpha. Since both parameters are 1, and the default rate parameter for rexp is 1, this results in the appropriate Pareto distribution.

While this answer is a fairly R- specific approach, it's sadly less golfy than plannapus'.

R, 14 bytes

1/rbeta(1,1,1)

Try it online!

Even less golfy, but another way of getting at the answer.

Another property of the exponential distribution is that if X ~ Exp(λ) then e^−X ~ Beta(λ, 1), hence 1/Beta(1,1) is a Pareto(1,1).

Additionally, a keen observer would recall that if X ~ Beta(a,b) and a=b=1, then X~Unif(0,1), so this truly is 1/runif(1).

Giuseppe

Posted 2017-12-15T06:56:59.820

Reputation: 21 077

I have no idea. But the reality is, there is a huge confusion about what is allowed and what is not on this challenge. – user202729 – 2017-12-15T15:22:21.417

@user202729 that's fair, but those who have been raising concerns with regards to that would at least have commented, so the downvote is (in my opinion) unlikely to be related to that. EDIT: mystery downvoter has removed the downvote. – Giuseppe – 2017-12-15T15:25:13.897

I downvoted because I thought using R on a challenge like this was trivial, but I was a little trigger-happy. I realize that this uses a different method than most of the other answers, so I removed my downvote. – KSmarts – 2017-12-15T15:27:31.553

@KSmarts The "trivial" answer in R wasn't used by anybody actually: actuar::rpareto(1,1,1), because it is longer :) – plannapus – 2017-12-15T15:31:14.727

For info, there are ca. 20 distributions hard-coded in base R, but Pareto is not one of them, hence the need to either use a work-around or an additional package.

– plannapus – 2017-12-15T15:34:40.607

@KSmarts ah, thanks. It would have been trivial if R had a pareto distribution built-in, but apparently the developers didn't feel the need to include one, since this is a perfectly good workaround, for anyone who bothers to learn distributions. :) – Giuseppe – 2017-12-15T15:39:20.550

3

Charcoal, 10 bytes

I∕Xφ²⊕‽Xφ²

Try it online!

Link is to the verbose version:

Print(Cast(Divide(Power(f, 2), ++(Random(Power(f, 2))))));

Comments:

  • Charcoal only has methods to get random integer numbers, so in order to get a random floating-point number between 0 and 1 we have to get a random integer between 0 and N and divide by N.
  • Previous version of this answer that used the 1/(1-R) formula: In this case, N is set to 1000000 as the OP asks it to be the minimum. To get this number Charcoal provides a preset variable f=1000. So just calculating f^2 we get 1000000. In the event that the random number is 999999 (the maximum), 1/(1-0.999999)=1000000.
  • Neil's tip (saving 3 bytes): If I have 1/(1-R/N) where R is a random number between 0 and N, it is the same as just calculate N/(N-R). But considering that the random integers N-R and R have the same probability to occur, that is the same as just calculating N/R (being R in this last case a number between 1 and N inclusive to avoid division by zero).

Charlie

Posted 2017-12-15T06:56:59.820

Reputation: 11 448

10 bytes – Neil – 2017-12-15T09:23:01.170

@Neil please wait a moment while I try to understand what your code does... :-) – Charlie – 2017-12-15T09:25:24.870

Actually I don't need MapAssignRight any more, 10 bytes! works.

– Neil – 2017-12-15T09:27:24.943

@Neil assimilation of your code completed! Answer edited. :-D – Charlie – 2017-12-15T09:43:11.150

3

Mathematica, 10 bytes

1/Random[]

Try it online!

-4 bytes from M.Stern

J42161217

Posted 2017-12-15T06:56:59.820

Reputation: 15 931

This may be invalid... – user202729 – 2017-12-15T09:47:35.330

2This has the potential to fail, because RandomReal outputs a real number in the closed range [0, 1]. Thus, division by 0 is possible. You’ll need to manipulate the random value in order to remove that possibility. – Mego – 2017-12-15T11:42:45.610

2@Mego where exactly did you find that info? – J42161217 – 2017-12-15T11:45:52.993

@Jenny_mathy The linked Mathematica.SE question in the first comment. – Mego – 2017-12-15T12:09:56.673

1@Mego what is the probability to get 0? – J42161217 – 2017-12-15T12:20:09.857

4

Jenny_mathy: According to the proposal on meta, the burden of proof should be on the person claiming to have a valid answer - it's your work to prove that it's valid, not to ask @Mego to provide an invalid test case. Also because float are discrete the probability to get 0 is nonzero.

– user202729 – 2017-12-15T12:52:54.463

@Jenny_mathy Mathematica uses floating-point numbers for Reals, according to the documentation. The precision of the floats is unspecified, but assuming double-precision (the norm for many langauges), there is a 1/(2^62 - 2^52 - 1) = 1/4607182418800017407 chance of failure (as I computed in chat). The reason this answer appears singled out is because other answers that use randoms from [0, 1] have been fixed already (like the JS one). I've commented on all others that can fail to my knowledge.

– Mego – 2017-12-15T13:30:40.887

Let us continue this discussion in chat.

– Mego – 2017-12-15T13:56:31.107

1Back to the topic, I do not believe there is a possibility of getting a zero using this function. Mathematica will in fact produce numbers less than $MinMachineNumber. Try this: Table[RandomReal[{0, $MinMachineNumber}], 100]. Turns out Mathematica is smart enough to abandon machine numbers and switch over to arbitrary precision numbers. LOL. – Kelly Lowder – 2017-12-15T15:11:28.830

Min@Table[RandomReal[{0,$MinMachineNumber}],10^7] yields 3.750405396*10^-315 – Kelly Lowder – 2017-12-15T15:18:05.790

What do you think about Random? It's superseded but still working. – M. Stern – 2017-12-15T19:34:13.940

3

Haskell, 61 56 bytes

The function randomIO :: IO Float produces random numbers in the interval [0,1), so transforming them using x -> 1/(1-x) will produce pareto realizations.

import System.Random
randomIO>>=print.(1/).((1::Float)-)

Try it online!

flawr

Posted 2017-12-15T06:56:59.820

Reputation: 40 560

Moving the type annotation saves a few bytes: randomIO>>=print.((1::Float)/) – Laikoni – 2017-12-15T11:55:49.173

And as functions are allowed, I'd say you can drop the main=. – Laikoni – 2017-12-15T11:57:02.900

Appranetly the range is [0,1) according to this answer

– flawr – 2017-12-15T13:30:29.637

@flawr Whoops, you're right! I forgot how floats work temporarily. – Mego – 2017-12-15T13:32:53.223

Well anyway, thanks for commenting, I would not have had any idea:) – flawr – 2017-12-15T13:35:44.200

3

Excel, 9 bytes

=1/rand()

Yay, Excel is (semi-)competitive for a change!

Therkel

Posted 2017-12-15T06:56:59.820

Reputation: 131

Also works in LibreOffice Calc :) – ElPedro – 2017-12-16T07:59:19.007

You can change this to google sheets for -1 Bytes (=1/Rand() – Taylor Scott – 2017-12-18T15:04:04.230

2

Ruby, 14 8 bytes

p 1/rand

Trivial program, I don't think it can get any shorter.

G B

Posted 2017-12-15T06:56:59.820

Reputation: 11 099

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:20:45.817

2

Excel VBA, 6 Bytes

Anonymous VBE immediate window function that takes no input and outputs to the VBE immediate window

?1/Rnd

Taylor Scott

Posted 2017-12-15T06:56:59.820

Reputation: 6 709

1

JavaScript REPL, 15 19 bytes

1/Math.random()

l4m2

Posted 2017-12-15T06:56:59.820

Reputation: 5 985

3

This will not yield correct results if Math.random() returns 0

– Mr. Xcoder – 2017-12-15T07:33:15.293

1Probably 1/(1-Math.random())? – user202729 – 2017-12-15T07:35:22.517

Fixed using u*29's solution – l4m2 – 2017-12-15T07:38:18.027

You need _=> at the start to make this a function; snippets aren't allowed. – Shaggy – 2017-12-15T08:07:33.380

It's a full program using console running – l4m2 – 2017-12-15T08:19:40.673

@Mr. Xcoder be above x with probability 1/x the 'above' not sure if include itself. If not when rnd=0 it return Inf, seems allowed – l4m2 – 2017-12-15T09:53:15.893

@l4m2 According to the definition of a Pareto distribution, the support is [1, +Inf). Therefore, all reals >= 1 must have a positive probability. – Mego – 2017-12-15T13:36:22.273

Changed back ^^^ – l4m2 – 2017-12-17T04:14:48.800

1

Python, 41 bytes

lambda:1/(1-random())
from random import*

Try it online!


Using the builtin is actually longer:

Python, 43 bytes

lambda:paretovariate(1)
from random import*

Try it online!

Both solutions work in both Python 2 and Python 3.

Mego

Posted 2017-12-15T06:56:59.820

Reputation: 32 998

1Full programs are shorter for tasks which don't use input, using print saves a byte. – Erik the Outgolfer – 2017-12-15T13:57:31.837

1

J, 5 bytes

%-.?0

How ot works:

?0 generates a random value greater than 0 and less than 1

-. subtract from 1

% reciprocal

Try it online!

Galen Ivanov

Posted 2017-12-15T06:56:59.820

Reputation: 13 815

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:18:47.727

1

Japt, 6 bytes

1/1-Mr is the same length but this felt a little less boring!

°T/aMr

Try it


Explanation

Increment (°) zero (T) and divide by (/) its absolute difference (a) with Math.random().

Shaggy

Posted 2017-12-15T06:56:59.820

Reputation: 24 623

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:17:58.210

1

Red, 19 bytes

1 /(1 - random 1.0)

Try it online!

Galen Ivanov

Posted 2017-12-15T06:56:59.820

Reputation: 13 815

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:18:30.447

1

Java 8, 22 18 bytes

v->1/Math.random()

(Old answer before the rules changed: v->1/(1-Math.random()))

Try it here.

Kevin Cruijssen

Posted 2017-12-15T06:56:59.820

Reputation: 67 575

1

APL (Dyalog), 5 bytes

÷1-?0

Try it online!

How?

 ÷   1-     ?0
1÷  (1-  random 0..1)

Uriel

Posted 2017-12-15T06:56:59.820

Reputation: 11 708

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:18:03.700

1

Jelly, 5 bytes

Jelly also doesn't have random float, so this uses x/n where x is an random integer in range [1, n] (inclusive) to emulate a random float in range (0, 1]. In this program n is set to be 108.

ȷ8µ÷X

Try it online!

Explanation

ȷ8     Literal 10^8.
  µ    New monad.
   ÷   Divide by
    X  random integer.

Enlist, 3 bytes

ØXİ

Try it online!

Enlist beats Jelly! (TI-Basic not yet)

Explanation

  İ    The inverse of...
ØX     a random float in [0, 1)

Of course this has nonzero probability of take the inverse of 0.

user202729

Posted 2017-12-15T06:56:59.820

Reputation: 14 620

Would the Enlist solution not fail if ØX returned 0? (Disclaimer: I don't know Enlist at all!) – Shaggy – 2017-12-16T23:34:24.267

@Shaggy your program must output a number greater than or equal to 1 with at least probability 0.999. The rest of the time it may crash (from the challenge rules) – user202729 – 2017-12-17T02:49:55.623

1

IBM/Lotus Notes Formula, 13 bytes

1/(1-@Random)

Sample (10 runs)

enter image description here

ElPedro

Posted 2017-12-15T06:56:59.820

Reputation: 5 301

Note to everyone: issacg has added some rules that allow some imprecisions, therefore most answers here are longer than necessary. – user202729 – 2017-12-16T15:17:03.113

Not sure I could make this much shorter whatever rule changes are made :) – ElPedro – 2017-12-16T22:41:32.763

1

Pyt, 2 bytes

ṛ⅟

Explanation:

ṛ           Random number in [0,1)
 ⅟          Multiplicative inverse
            Implicit print

Try it online!

mudkip201

Posted 2017-12-15T06:56:59.820

Reputation: 833

0

Pyth, 4 bytes

c1O0

Try it here!

Alternative: c1h_O0.

Mr. Xcoder

Posted 2017-12-15T06:56:59.820

Reputation: 39 774

c1tOZ is 5, does it not work? – Dave – 2017-12-15T12:56:33.387

@Dave Doesn’t work, that returns negative values. I need 1-n not n-1 – Mr. Xcoder – 2017-12-15T12:58:41.360

Does Pyth not have a constant for 100? – Shaggy – 2017-12-17T00:00:52.213

@Shaggy I wish it did. Unfortunately, no constant for 100 AFAIK – Mr. Xcoder – 2017-12-17T06:20:33.593

0

J, 9 Bytes

p=:%@?@0:

I couldn't figure out how to make it take no input, since p=:%?0 would evaluate immediately and remain fixed. Because of this its sort of long.

How it works:

p=:        | Define the verb p
       0:  | Constant function. Returns 0 regardless of input.
     ?@    | When applied to 0, returns a random float in the range (0,1)
   %@      | Reciprocal

Evaluated 20 times:

    p"0 i.20
1.27056 1.86233 1.05387 16.8991 5.77882 3.42535 12.8681 17.4852 2.09133 1.82233 2.28139 1.58133 1.79701 1.09794 1.18695 1.07028 3.38721 2.88339 2.06632 2.0793

Bolce Bussiere

Posted 2017-12-15T06:56:59.820

Reputation: 970

0

Clean, 91 bytes

import StdEnv,Math.Random,System.Time
Start w=1.0/(1.0-hd(genRandReal(toInt(fst(time w)))))

Clean doesn't like random numbers.

Because the random generator (a Mersenne Twister) needs to be given a seed, I have to take the system timestamp to get something that differs passively per-run, and to do anything IO-related I need to use a whole Start declaration because it's the only place to obtain a World.

Try it online!

Οurous

Posted 2017-12-15T06:56:59.820

Reputation: 7 916