41

I was reading up on the documentation for Math.random() and I found the note:

Math.random() does not provide cryptographically secure random numbers. Do not use them for anything related to security. Use the Web Crypto API instead, and more precisely the window.crypto.getRandomValues() method.

Is it possible to predict what numbers a call to random will generate? If so - how could this be done?

Antonius Bloch
  • 507
  • 2
  • 9
Abe Miessler
  • 8,155
  • 10
  • 44
  • 72

3 Answers3

41

Indeed, Math.random() is not cryptographically secure.


Definition of Math.random()

The definition of Math.random() in the ES6 specification left a lot of freedom about the implementation of the function in JavaScript engines:

Returns a Number value with positive sign, greater than or equal to 0 but less than 1, chosen randomly or pseudo randomly with approximately uniform distribution over that range, using an implementation-dependent algorithm or strategy. This function takes no arguments.

Each Math.random function created for distinct code Realms must produce a distinct sequence of values from successive calls.

So let's have a look at how the most popular JavaScript engines implemented it.


Xorshift128+ is one of the XorShift random number generators, which are among the fastest non-cryptographically-secure random number generators.

I don't know if there's any attack on any of the implementations listed above, though. But those implementations are very recent, and other implementations (and vulnerabilities) existed in the past, and may still exist if your browser / server haven't been updated.

Update: douggard's answer explains how someone can recover the state XorShift128+ and predict Math.random() values.


V8's MWC1616 algorithm

On November 2015, Mike Malone explained in a blog post that V8's implementation of the MWC1616 algorithm was somehow broken: you can see some linear patterns on this test or on this one if you're using a V8-based browser. The V8 team handled it and released a fix in Chromium 49 (on January 15th, 2016) and Chrome 49 (on March 8th, 2016).

This paper pulished in 2009 explained how to determine the state of the PRNG of V8's based on the previous outputs of Math.random() (the MWC1616 version).

Here's a Python script which implements it (even if the outputs are not consecutive).

This has been exploited in a real world attack on CSGOJackbot, a betting site built with Node.js. The attacker was fair enough to just make fun of this vulnerability.


Lack of compartmentalization

Before ES6, the Math.random() definition didn't specify that distinct pages had to produce distinct sequences of values.

This allowed an attacker to generate some random numbers, determine the state of the PNRG, redirect the user to a vulnerable application (which would use Math.random() for sensitive things) and predict which number Math.random() was going to return. This blog post presents some code about how to do it (Internet Explorer 8 and below).

The ES6 specification (which had been approved as a standard on June 17, 2015) makes sure that browsers handle this case correctly.


Badly-chosen seed

Guessing the seed chosen for initializing the sequence can also allow an attacker to predict the numbers in the sequence. It's also a real world scenario, since it has been used on Facebook in 2012.


This paper published in 2008 explains different methods to leak some information thanks to the browsers' lack of randomness.


Solutions

First of all, always make sure that your browsers / servers are updated regularly.

Then, you should use cryptographic functions if needed:

Both rely on OS-level entropy, and will let you get cryptographically random values.

Benoit Esnard
  • 13,942
  • 7
  • 65
  • 65
27

You can attack these using the Z3 theorem prover. I've implemented such an attack in Python in order to predict values in a lottery simulator.

As mentioned previously, XorShift128+ is used in most places now, so that is what we are attacking. You begin by implementing the normal algorithm so you can understand it.

def xs128p(state0, state1):
    s1 = state0 & 0xFFFFFFFFFFFFFFFF
    s0 = state1 & 0xFFFFFFFFFFFFFFFF
    s1 ^= (s1 << 23) & 0xFFFFFFFFFFFFFFFF
    s1 ^= (s1 >> 17) & 0xFFFFFFFFFFFFFFFF
    s1 ^= s0 & 0xFFFFFFFFFFFFFFFF
    s1 ^= (s0 >> 26) & 0xFFFFFFFFFFFFFFFF 
    state0 = state1 & 0xFFFFFFFFFFFFFFFF
    state1 = s1 & 0xFFFFFFFFFFFFFFFF
    generated = (state0 + state1) & 0xFFFFFFFFFFFFFFFF

    return state0, state1, generated

The algorithm takes in two state variables, XORs and shifts them around, then returns the sum of the updated state variables. What is also important is how each engine takes the uint64 returned and converts it to a double. I found this info by digging through each implementation's source code.

# Firefox (SpiderMonkey) nextDouble():
# (rand_uint64 & ((1 << 53) - 1)) / (1 << 53)

# Chrome (V8) nextDouble():
# ((rand_uint64 & ((1 << 52) - 1)) | 0x3FF0000000000000) - 1.0

# Safari (WebKit) weakRandom.get():
# (rand_uint64 & ((1 << 53) - 1) * (1.0 / (1 << 53)))

Each is a little different. You then can take the doubles produced by Math.random() and recover some lower bits of the uint64 produced by the algorithms.

Next is implementing the code in Z3 so that it can be symbolically executed and the state can be solved for. See the Github link for more context. It looks pretty similar to the normal code except that you tell the solver that the lower bits must equal the recovered lower bits from the browser.

def sym_xs128p(slvr, sym_state0, sym_state1, generated, browser):
    s1 = sym_state0 
    s0 = sym_state1 
    s1 ^= (s1 << 23)
    s1 ^= LShR(s1, 17)
    s1 ^= s0
    s1 ^= LShR(s0, 26) 
    sym_state0 = sym_state1
    sym_state1 = s1
    calc = (sym_state0 + sym_state1)

    condition = Bool('c%d' % int(generated * random.random()))
    if browser == 'chrome':
        impl = Implies(condition, (calc & 0xFFFFFFFFFFFFF) == int(generated))
    elif browser == 'firefox' or browser == 'safari':
        # Firefox and Safari save an extra bit
        impl = Implies(condition, (calc & 0x1FFFFFFFFFFFFF) == int(generated))

    slvr.add(impl)
    return sym_state0, sym_state1, [condition]

If you supply 3 consecutively generated doubles to Z3 it should be able to recover your state. Below is a snippet from the main function. It is calling the symbolically executed XorShift128+ algorithm on two of Z3's 64 bit integers (the unknown state variables), providing the lower (52 or 53) bits from the recovered uint64s.

If that is successful the solver will return SATISFIABLE and you can get the state variables that it solved for.

    for ea in xrange(3):
        sym_state0, sym_state1, ret_conditions = sym_xs128p(slvr, sym_state0, sym_state1, generated[ea], browser)
        conditions += ret_conditions

    if slvr.check(conditions) == sat:
        # get a solved state
        m = slvr.model()
        state0 = m[ostate0].as_long()
        state1 = m[ostate1].as_long()

There is a slightly more detailed writeup here that focuses on using this method to predict winning lottery numbers in a powerball simulator.

douggard
  • 367
  • 3
  • 5
  • 2
    This is not an area I have much/any skill in, so sorry if the answer is obvious, but would this allow solving for an implementation that uses it in an algorithm such as: `Math.floor(Math.random()*9000) + 1000`, or would it lose too much to be retrievable? – Glenn 'devalias' Grant Aug 07 '17 at 06:44
  • Hey! I actually had an awesome opportunity to do research into this problem and modified this solver to work with Math.floor(CONST * Math.random()). In my testing, if CONST is 10,000 it’ll take ~14 values to produce the solution. – d0nut Jan 17 '20 at 01:54
  • @d0nut could you provide the link of your work? – kelalaka Mar 24 '20 at 19:49
  • 2
    Sure @kelalaka! Here's a recording of the talk I ended up giving for a virtual conference earlier this month: https://www.youtube.com/watch?v=_Iv6fBrcbAM – d0nut Apr 18 '20 at 11:18
  • Can I use this to predict a coin flipper, because it only outputs 2 values. Please help me! You said you used it to predict a lottery simulator but a lottery simulator doesn't return a double/number. So how did you recover the state then? – Sujal Motagi Jul 03 '20 at 04:16
  • How can I use this to predict a coin flipper? – Sujal Motagi Jul 03 '20 at 09:22
  • I've just checked on v8 and it's [now](https://chromium.googlesource.com/v8/v8/+/refs/heads/master/src/base/utils/random-number-generator.h#111) doing `bit_cast((rand_uint64 >> 12) | 0x3FF0000000000000) - 1` which seems much better. this drops the lower entropy lower order bits of XorShift128+, rather than dropping the high entropy high bits as was previously done. this apparently changed back in [2018](https://chromium.googlesource.com/v8/v8/+/ac66c97cfddc1e9fd89b494950ecf8a1a260bc80) – Sam Mason Apr 28 '21 at 18:17
  • The repo should still be up to date then :) [relevant commit at line 43 in the update](https://github.com/TACIXAT/XorShift128Plus/commit/f2ae4a3f9e5db51555d3633dc3bab08ae9aa2120#diff-ffbed23d6bda0b3e20169a683b1f76e55ab5c36b8b990d2476c963e4c9b5b176R43). – douggard Apr 29 '21 at 06:51
  • @douggard I was pointing this out to update the "digging through each implementation's source code" in the answer. wasn't sure about editing your answer – Sam Mason Apr 29 '21 at 10:58
3

Math.random (and similar other functions) start from a seed and create a new number. It seems random to the user because of course the algorithm is tuned in such a way that it appears so. But the matter of fact is that there is no real source of randomness anywhere. If you know the internal state of the generator (which is 100% deterministic software and nothing special at all), you know all future (and depending on the algorithm maybe also past) numbers generated by it.

A "real" random source would be stuff like measuring decay of a radioactive particle, or in more real-world terms, any kind of electrical white noise, or more practically, things like the user moving the mouse around or minute deviations between key presses and such things. Nothing whatsoever is in Math.random.

Math.random is (like the random function of most languages/libraries) designed in this way, and the property that you can recover a string of "random" numbers from the same seed is actually a useful feature in many cases. Just not for security.

AnoE
  • 2,370
  • 1
  • 8
  • 12