237

The JavaScript Math.random() function is designed to return a single IEEE floating point value n such that 0 ≤ n < 1. It is (or at least should be) widely known that the output is not cryptographically secure. Most modern implementations use the XorShift128+ algorithm which can be easily broken. As it is not at all uncommon for people to mistakenly use it when they need better randomness, why do browsers not replace it with a CSPRNG? I know that Opera does that*, at least. The only reasoning I could think of would be that XorShift128+ is faster than a CSPRNG, but on modern (and even not so modern) computers, it would be trivial to output hundreds of megabytes per second using ChaCha8 or AES-CTR. These are often fast enough that a well-optimized implementation may be bottlenecked only by the system's memory speed. Even an unoptimized implementation of ChaCha20 is extremely fast on all architectures, and ChaCha8 is more than twice as fast.

I understand that it could not be re-defined as a CSPRNG as the standard explicitly gives no guarantee of suitability for cryptographic use, but there seems to be no downside to browser vendors doing it voluntarily. It would reduce the impact of bugs in a large number of web applications without violating the standard (it only requires the output be round-to-nearest-even IEEE 754 numbers), decreasing performance, or breaking compatibility with web applications.


EDIT: A few people have pointed out that this could potentially cause people to abuse this function even if the standard says you cannot rely on it for cryptographic security. In my mind, there are two opposing factors that determine whether or not using a CSPRNG would be a net security benefit:

  1. False sense of security - The number of people who otherwise would use a function designed for this purpose, such as window.crypto, decide instead to use Math.random() because it happens to be cryptographically secure on their intended target platform.

  2. Opportunistic security - The number of people who don't know any better and use Math.random() anyway for sensitive applications who would be protected from their own mistake. Obviously, it would be better to educate them instead, but this is not always possible.

It seems safe to assume that the number of people who would be protected from their own mistakes would greatly exceed the number of people who are lulled into a false sense of security.

* As CodesInChaos points out, this is no longer true now that Opera is based off of Chromium.


Several major browsers have had bug reports suggesting to replace this function with a cryptographically-secure alternative, but none of the suggested secure changes landed:

The arguments for the change essentially match mine. The arguments against it vary from reduced performance on microbenchmarks (with little impact in the real world) to misunderstandings and myths, such as the incorrect idea that a CSPRNG gets weaker over time as more randomness is generated. In the end, Chromium created an entirely new crypto object, and Firefox replaced their RNG with the XorShift128+ algorithm. The Math.random() function remains fully predictable.

Glorfindel
  • 2,235
  • 6
  • 18
  • 30
forest
  • 64,616
  • 20
  • 206
  • 257
  • 1
    Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/74736/discussion-on-question-by-forest-why-is-math-random-not-designed-to-be-cryptog). – Jeff Ferland Mar 19 '18 at 08:55
  • 15
    Saying that a feature has certain qualities implies obligation. Obligation incurs cost, first to implement the obligation, second to keep it up to date, and third when you find out that you have not lived up to your obligation. This is especially so when you have obligated to deliver **secure** functionality. – MichaelK Mar 19 '18 at 12:42
  • @MichaelK But there are so many other examples where that is not true. You do not have to say that a feature has certain qualities anywhere. Keep the standard as it is and opportunistically improve security. A good example is any modern C compiler. Would you claim that it is foolish for the compiler to support `FORTIFY_SOURCE`? Why not just educate people so they don't make vulnerable programs? Why have GCC protect them? I don't know of anyone who is sloppy in their code because they think GCC will protect them, but I know of many people who have been saved by GCC's security measures. – forest Mar 20 '18 at 03:00
  • In other words, you are saying that _fail-safe design_ is a bad thing. – forest Mar 20 '18 at 03:05
  • 1
    @forest Opportinistic security? Words I never thought anyone would say. – MichaelK Mar 20 '18 at 05:10
  • 9
    Performance maybe? Obtaining a cryptographically secure random value is way more CPU heavy than a pseudo random value. While designing a game (in C++) I specifically had to choose a random algorithm that was offering decent performance. – Rolf ツ Mar 20 '18 at 14:12
  • Blink is a rendering engine. V8 is the JavaScript implementation used by Chrome and Opera. – ArrowCase Mar 20 '18 at 16:27
  • @Rolfツ This issue has been discussed extensively in answers and comments. The performance of XorShift128+ compared to, say ChaCha8 as used for returning individual floating point numbers. – forest Mar 21 '18 at 01:41
  • 1
    Comparing the performance XorShift128+ to ChaCha8 is only part of the performance question. A CSPRNG must collect sufficient entropy before it is ready to emit any secure random bits. This can take significant wall-clock time. – President James K. Polk Mar 21 '18 at 19:41
  • @JamesKPolk You only need 128 bits, once (ChaCha takes 256 bits but it's perfectly acceptable to repeat the key twice, as long as you change the constant). By the time any browser loads, plenty of entropy will be available to the system. On microcontrollers that run a JavaScript interpreter for whatever reason and have no source of good entropy, they could just use XorShift128+ (since I'm not suggesting a change to the standards). – forest Mar 22 '18 at 00:05
  • 1
    @forest: fair enough, it should indeed not be a problem on any modern platform, desktop or mobile. – President James K. Polk Mar 22 '18 at 02:18

11 Answers11

456

I was one of the implementers of JScript and on the ECMA committee in the mid to late 1990s, so I can provide some historical perspective here.

The JavaScript Math.random() function is designed to return a floating point value between 0 and 1. It is widely known (or at least should be) that the output is not cryptographically secure

First off: the design of many RNG APIs is horrible. The fact that the .NET Random class can trivially be misused in multiple ways to produce long sequences of the same number is awful. An API where the natural way to use it is also the wrong way is a "pit of failure" API; we want our APIs to be pits of success, where the natural way and the right way are the same.

I think it is fair to say that if we knew then what we know now, the JS random API would be different. Even simple things like changing the name to "pseudorandom" would help, because as you note, in some cases the implementation details matter. At an architectural level, there are good reasons why you want random() to be a factory that returns an object representing a random or pseudo-random sequence, rather than simply returning numbers. And so on. Lessons learned.

Second, let's remember what the fundamental design purpose of JS was in the 1990s. Make the monkey dance when you move the mouse. We thought of inline expression scripts as normal, we thought of two-to-ten line script blocks as common, and the notion that someone might write a hundred lines of script on a page was really very unusual. I remember the first time I saw a ten thousand line JS program and my first question to the people who were asking me for help because it was so slow compared to their C++ version was some version of "are you insane?! 10KLOC JS?!"

The notion that anyone would need crypto randomness in JS was similarly insane. You need your monkey movements to be crypto strength unpredictable? Unlikely.

Also, remember that it was the mid 1990s. If you were not there for it, I can tell you it was a very different world than today as far as crypto was concerned... See export of cryptography.

I would not have even considered putting crypto strength randomness into anything that shipped with the browser without getting a huge amount of legal advice from the MSLegal team. I didn't want to touch crypto with a ten foot pole in a world where shipping code was considered exporting munitions to enemies of the state. This sounds crazy from today's perspective, but that was the world that was.

why do browsers not replace it with a CSPRNG?

Browser authors do not have to provide a reason to NOT do a change. Changes cost money, and they take away effort from better changes; every change has a huge opportunity cost.

Rather, you have to provide an argument not just why making the change is a good idea, but why it is the best possible use of their time. This is a small-bang-for-the-buck change.

I understand that it could not be re-defined as a CSPRNG as the standard explicitly gives no guarantee for suitability for cryptographic use, but there seems to be no downside to doing it anyway

The downside is that developers are still in a situation where they cannot reliably know whether their randomness is crypto strength or not, and can even more easily fall into the trap of relying on a property that is not guaranteed by the standard. The proposed change doesn't actually fix the problem, which is a design problem.

Eric Lippert
  • 4,386
  • 2
  • 16
  • 12
  • 23
    (Bit off topic) Would you mind providing a reference for "the .NET Random class can trivially be misused in multiple ways to produce long sequences of the same number is awful"?: I've not heard of this before, or are you referring to the classic "create a thousand Random instances in a tight loop"? – VisualMelon Mar 15 '18 at 18:12
  • 47
    @VisualMelon: Create a thousand instances in a tight loop is the classic. But there are also failure modes when you use one instance of Random on two threads at the same time. Random is not threadsafe and there is a scenario where a race can cause it to return zero forever! – Eric Lippert Mar 15 '18 at 18:36
  • 17
    @VisualMelon: There are also more subtle scenarios. Suppose you have two instances of Random with different seeds. Seems fine, right? But suppose you then combine those two instances of Random in some way. Maybe you are using them to each produce a sequence of die rolls and add them together pairwise. **Are the two sequences correlated with each other in some non-random way**? It seems plausible. After all, they're running the same algorithm "in parallel", just with a different seed. – Eric Lippert Mar 15 '18 at 18:41
  • Thanks for elaborating: I suppose I'm wary enough to not try using _any_ source from multiple threads unless it is clearly documented for that purpose so I'd never notice. I'm not sure that the last point is a specific concern for any 'implementation' rather than the algorithm behind it: I don't think we can hold the API designers accountable for that! – VisualMelon Mar 15 '18 at 18:48
  • 10
    I think this is the best answer here so far, especially as it's from such an authoritative source. I'll mark this answer as accepted for now unless a better answer comes along. – forest Mar 16 '18 at 03:35
  • @EricLippert I don't understand your 2 streams with different keys added together example. If it is a CSPRNG, then I would think that should be ok. If it is not a CSPRNG (just a plan PRNG) then you should never use it for security, even if you just have a single instance. – Buge Mar 16 '18 at 10:52
  • 2
    @Buge: It is quite easy to imagine a (not very good) RNG which generates integers, and where the bottom bit of the output is random, but independent of the seed. If you add the output of two such instances (independently seeded), the result would always be even. (Eric's point had nothing to do with security - I don't know what made you think it had.) – Martin Bonner supports Monica Mar 16 '18 at 11:19
  • 1
    Yes! Why are *all* random APIs on *all* platforms so messed up? I guess teams go "let the intern do this". – usr Mar 16 '18 at 11:40
  • 13
    @EricLippert I think it's worth adding a concrete example to this answer of C#'s abusable `Random`. Something like this (pseudo)code: `while (something) { int rand = new Random().nextInt(); doSomething(rand); }` -- That fails because C#'s `Random` uses the current time with course resolution as the seed, so if `doSomething` is fast, you'll get the first number in the sequence with whatever that time's seed is, over and over again, rather than different ones. (I know you know this; I'm explaining for the people who didn't work on C# with Microsoft) – Nic Mar 16 '18 at 21:39
  • @Buge Actually, sometimes mixing multiple bad PRNGs together can actually give you a decent cipher, at least if done correctly. For example E0 (the cipher used by in all but the newest Bluetooth protocols) involves four LFSRs. Each individual LFSR is trivial to break, but E0 itself is a good bit stronger (but still not great). – forest Mar 17 '18 at 00:31
  • @MartinBonner I thought Eric's comment was talking about CSPRNGs, because this is security.stackexchange.com and both the question and Eric's answer were talking about CSPRNGs. – Buge Mar 18 '18 at 04:22
  • 1
    @forest I should have limited my statement by saying "If it is not a CSPRNG (just a plain PRNG) then you should never use it for security unless you are a cryptography expert." Just as a crypto expert can create a new cipher using insecure primitives such as addition, xor, and multiplication, a crypto expert can also create a new CSPRNG using PRNGs. And just as a layperson should never create a new cipher using insecure primitives, a layperson should never use a PRNG for security. – Buge Mar 18 '18 at 04:28
  • 1
    An excellent, informative answer. The “opportunity cost” is probably the only real explanation needed, but everything else is a real treat –  Mar 20 '18 at 14:33
  • I was going to provide a very similar answer that played on the same themes mentioned here, basically the short, unsatisfying version of which is "the specification doesn't require it, so implementations don't need to make it cryptographically random" – Patrick Roberts Mar 21 '18 at 11:08
  • @Buge The reasoning Martin uses applies to CSPRNGs. Not particularly good ones mind you, but then most CSPRNGs used by mainstream languages are anything but amazing. – Voo Mar 22 '18 at 10:12
  • @Voo If " the bottom bit of the output is random, but independent of the seed" then it isn't a CSPRNG. A CSPRNG [requires](https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator) that if you have no knowledge of the seed, then the output of the CSPRNG is indistinguishable from true randomness. But this bad PRNG is fairly easily distinguishable from true randomness even with no knowledge of the seed. Simply check if the bottom bit follows the pattern. – Buge Mar 23 '18 at 08:18
  • @Buge I was thinking of a shared source of randomness (say a hardware module). But you're right that that wouldn't work if you had two sequences. – Voo Mar 23 '18 at 09:16
118

Because there actually is a cryptographically secure alternative to Math.random():

window.crypto.getRandomValues(typedArray)

This allows the developer to use the right tool for the job. If you want to generate pretty pictures or loot drops for your game, use the fast Math.random(). When you need cryptographically secure random numbers, use the more expensive window.crypto.

Philipp
  • 48,867
  • 8
  • 127
  • 157
  • 10
    Why wouldn't you use a secure random function for loot drops? – xDaizu Mar 15 '18 at 11:15
  • 54
    @xDaizu Because not every loot drop is a loot box where someone is going to make a real life lottery out of it. – DonFusili Mar 15 '18 at 11:16
  • 276
    If your loot needs to be cryptographically secure, *and* you're using a client-side RNG to generate the loot (rather than asking the server "what did I get?"), you're very likely doing something wrong. – Soron Mar 15 '18 at 12:51
  • 12
    This doesn't answer the question...forest is asking about why `Math.Random()` is implemented the way it is, not how to use a CSPRNG in Javascript – Qwerty01 Mar 15 '18 at 16:02
  • 29
    To be more explicit, `window.crypto` is largely useless if you don't trust the user. In this specific case they (or their user-agent, or any proxies they trust) can simply replace it with `() => 0.42` before it is executed. These are utilities to enable the user to establish security, not you. – OrangeDog Mar 15 '18 at 16:10
  • @xDaizu your question is answered in the post you replied to: it’s a matter of “fast vs [slow and] expensive” –  Mar 20 '18 at 14:36
  • @OrangeDog, that is why you always have to use the `const` keyword with it. – NH. Mar 23 '18 at 15:32
  • 2
    @NH. nothing that you write in the script is going to help, the user has complete control of it – OrangeDog Mar 23 '18 at 15:43
  • @OrangeDog, is that a CVE? If not, can you post your concerns as a separate post (we are beginning to diverge from the subject...) – NH. Mar 23 '18 at 15:45
  • @OrangeDog That's true, but it's true because the question is about client side scripting. It's unrelated to the fact that `Math.random` is not cryptographically secure. – jpmc26 Mar 24 '18 at 01:21
72

JavaScript (JS) was invented in 1995.

  1. Potentially illegal: cryptography was still under tight export control in 1995, so a good CSPRNG might not even have been legal to distribute in a browser.
  2. Performance: historically, CSPRNGs (cryptographically secure pseudo-random number generators) are much slower than PRNGs, so why use a CSPRNG by default?
  3. No security mindset: in 1995, SSL had just been invented. Virtually no servers supported it yet, the internet consisted of phone lines and it was used for public forums (BBSes) and MUDs. Encryption was something for intelligence agencies.
  4. No need: web applications did not exist yet because JavaScript was just being invented. It was designed as an interpreted (thus slow) language to aid pages in being more dynamic. It wouldn't have crossed anyone's mind to use a slow (and barely-existent) CSPRNG by default for a random function.
  5. So little need, in fact, that there was no alternative: JavaScript did not even have a generally supported API for a CSPRNG until December 2013, so proper cryptography in web applications was hardly possible until a few years ago.
  6. Consistency: rather than changing an existing function to have a different meaning, they made a new function with a different name. You can access the CSPRNG now through crypto.getRandomValues.

In summary: legacy, but also speed and consistency. Insecure PRNGs are still much faster because you can't assume all hardware has AES support, nor depend on the availability or security of RDRAND.

Personally, I think it's time to swap out all random functions with CSPRNGs and rename the faster, insecure functions to something like fast_insecure_random(). They should only be needed by scientists or other people who do simulations that need lots of random numbers, but where predictability of the RNG is not an issue. But for a function with two decades of history, where an alternative has existed for only four years now (in 2018), I can understand why we are not at that point just yet.

Luc
  • 31,973
  • 8
  • 71
  • 135
  • 4
    Also Javascript is quite a bit older than most people think. and the math.random function has been in it since atleast 1997 (version 1.1) A time at wich no one would even attempt something like cryptographics in javascript (SSL it self did not see public light untill 1995 a only a few years before) – LvB Mar 15 '18 at 11:18
  • 8
    I disagree with the last sentence in this. While `crypto.getRandomValues` might have been a better name in isolation, with `math.Random` already in existence for a number of years something like `math.RandomCrypto` would have been a much better choice because it'd be listed next to the insecure call in any API listings and good autocomplete would bring both versions up whenever a developer typed `math.rando` making it far more likely they'd realize they should be using the new one when they need to. – Dan Is Fiddling By Firelight Mar 15 '18 at 13:56
  • 2
    semantically your proposal is wrong. you want to get Random values that are cryptographically sound. not get a random crypto function / ciper / etc. Also, the Math library is only about Mathematical functions. a CSPRNG is a Cryptographic function and therefor belongings in the crypto library. (and do nto forget that adding to the standard requires consensus off all the different engine builders.) – LvB Mar 15 '18 at 14:29
  • @DanNeely I see your point but am unsure how best to change it. Feel free to edit! – Luc Mar 15 '18 at 15:31
  • 1
    @LvB I'm open to different names for the function (`math.RandomSecure` ?) or even having the implementation in `crypto.*` with just an alias under `math.*` (is modern js able to inline the alias to avoid overhead from an extra functional call to implement the alias?). But with the huge number of different tools people use to edit javascript putting the old name as the first part of the new one maximizes the likelihood that an editor will be able to suggest the secure random source to developers. – Dan Is Fiddling By Firelight Mar 15 '18 at 15:48
  • 1
    @DanNeely you are trying to solve a human problem by technical means here. e.a. the human should learn what tool to use where, not the IDE using some convenience feature to suggest the correct one. (it would be beter to have the IDE than suggest the proper one but that's IDE Dependant). It also would not help all those people that write javascript in browser / vim / nano / notepad / stone tablet / punch card / or any other text editor or direct input methodology. Also you would need to get a lot of the engine developers on your hand, people that are not easy to convince. – LvB Mar 15 '18 at 16:15
  • @DanNeely you can disagree all you want, you're not on the TC39 committee, nor is this the correct forum in which to argue with their decisions. FWIW, you can polyfill your own suggestion by simply assigning `Math.randomCrypto = function () { const u = new Uint32Array(1); crypto.getRandomValues(u); return u[0] / 2 ** 32; }`. Of course there are possible optimizations, but this is just an example implementation. – Patrick Roberts Mar 21 '18 at 11:21
19

This is too long for a comment.

I believe there is a flawed premise in your question:

on modern (and even not so modern) computers, it would be trivial to output hundreds of megabytes per second using ChaCha8 or AES-CTR

You are thinking of a desktop browser on an AC connected machine or a laptop with a big honkin 10Ah battery.

We live in an increasingly mobile-first world, and while mobile devices these days are quite powerful they have two important constraints: heat and battery life. Unlike a desktop processor which can easily reach 100C, you can't burn the hand off the smartphone user. And phone batteries typically only hold maybe 1/3 as much as a laptop (if you're lucky). There's simply no good reason to add the extra heat-generation/power-consumption if you don't need it.

Luc
  • 31,973
  • 8
  • 71
  • 135
Jared Smith
  • 1,978
  • 1
  • 10
  • 12
  • 6
    I doubt you'd notice the performance/power consumption difference between crypto and non crypto RNG for typical code, even on mobile. You'd have to write a monte carlo simulation in javascript. We're talking about something like 30 CPU cycles for each number. – CodesInChaos Mar 15 '18 at 13:51
  • 5
    @CodesInChaos that may be true. I'd want to see analytics (especailly on e.g. a port of a CRPG) before making that call. But it's the implicit premise of "computers are fast now so it doesn't matter" that is an issue. – Jared Smith Mar 15 '18 at 14:21
  • ChaCha8 can provide around 3 cycles per byte on even old processors, and under 1 cycle per byte with AVX512. XorShift128+ is under 1 cycle per byte on a high-end Haswell processor, so the difference is not large. I imagine the mere act of turning the random value into a proper floating point and returning it takes more cycles than generating the random value in the first place. – forest Mar 16 '18 at 03:49
  • 2
    @CodesInChaos People write Monte Carlo Simulations in JavaScript. – Derek Elkins left SE Mar 16 '18 at 05:35
  • Come to think about it, I'm pretty sure a desktop processor reaching 100°C would be _very bad_. Laptop processors can typically handle it (if just barely), but desktops? – forest Mar 17 '18 at 02:37
  • @CodesInChaos Technically you don't need cryptographically strong RNG for monte carlo. Any decent RNG will work (like Mersenne Twister). – jb. Mar 19 '18 at 16:08
  • 6
    @jb. My point is that crypto RNGs are fast enough for practically any use-case and thus should be the default. Monte Carlo simulations are one of the few cases where you might want to trade quality for performance. – CodesInChaos Mar 19 '18 at 16:13
  • 1
    @CodesInChaos Given that `Math.random()` only returns individual floating point values and has to be called for each new value, it would be a rather terrible way to get randomness for Monte Carlo simulations, regardless of whether or not XorShift128+ or ChaCha is used. Also the LSB of XorShift128+ is an LFSR which is probably not very good for such simulations. – forest Mar 20 '18 at 05:40
  • @CodesInChaos: The difference in *generating* bytes is probably small, but no one seems to have mentioned the harder part of a CSPRNG: collecting entropy. – President James K. Polk Mar 21 '18 at 19:38
  • browser can happily use the random data produced by OS – akostadinov Mar 22 '18 at 13:13
  • @JamesKPolk: If there's any possibility a CSPRNG's state has been exposed, it's necessary to reseed it with a source of entropy independent of any previous one, but if one has a good CSPRNG with e.g. 4096 bits of state seeded with e.g. 160 bits of "pure" entropy, and never gets exposed, are there any attacks that would be practical that would not be practical if it were seeded with 4096 bits of entropy? – supercat Mar 22 '18 at 18:14
  • 1
    @forest: It isn't that a desktop CPU can safely tolerate more temperature (it can't, although the limit is usually similar to a mobile CPU, say 90, 95, or 100C). It's that most desktop CPUs are installed with efficient cooling such that they can run full tilt running cryptographic operations and still stay relatively cool, perhaps 60C. You simply can't fit that cooling solution in a handheld device, among other reasons because it weighs more and is larger than the rest of the phone and only works with air vents and airflow, unacceptable for a water resistant handheld that may be in a bag. – Ben Voigt Apr 02 '18 at 04:33
  • @BenVoigt Oh huh, I could have sworn that the Tj Max for desktop CPUs is much lower than that of laptops. I must have remembered incorrectly. – forest Apr 02 '18 at 11:24
16

The larger reason is that there is an alternative to Math.random(): see Philipp's answer. So whoever needs strong crypto may have it, and those who don't can save time and (battery) power.

But supposing you had asked, "why, even if there is a stronger alternative, did the developers not update Math.random() just the same -- i.e., made random() a derivative of getRandomValues() - in order to automatically strengthen a lot of apps out there?" - then I don't think this is really answerable with any confidence, except by those developers who took the decision (update: and as Fate would have it, we do have such an answer).

In principle - as you already said - there is no strong reason.

Moreover, most development teams have a significant backlog of things that are more urgent to do; and even an apparently small change like this requires testing, regression, and going against the golden "If it ain't broken, don't fix it" rule, a stronger form of the YAGNI criterion.

LSerni
  • 22,521
  • 4
  • 51
  • 60
  • I'm sure there has been discussion on this on the Mozilla and Google trackers. Perhaps you could provide some links to show this? I would be content with the answer simply being "because no one has done it yet", but I would like to see some examples of this having come up before and developers shooting the idea down as unnecessary. – forest Mar 15 '18 at 07:18
  • Downvote because I think it is answerable and it's silly to state that, in bold on top, as the summary of your answer. – Luc Mar 15 '18 at 11:08
  • @Luc I've amended the answer to clarify my meaning - as well as upvoting Philipp's answer, which I believe is the correct answer to the question as asked. – LSerni Mar 15 '18 at 11:58
14

Random numbers and cryptorandom bits are completely different animals. They are not even used for the same purpose. If you want a random number evenly distributed between 0 and 42, then you want an even distribution without an obvious pattern. Note that if you mod a larger number with a smaller one, then it's not exactly an even distribution. This example is easy to see for a random number from 0 to 31 taken mod 27. 0 through 4 show up twice as often as 5 to 31.

Until you talk about cryptorandom, the notion of entropy isn't even discussed. A bit of entropy doubles the search space to guess the number (for the intended user of the numbers).

When you ask for cryptorandom bits, you are asking for N bits of entropy. It isn't good enough to have a non-obvious pattern, because if a function that generates it is discovered (no matter how convoluted), then there are actually 0 bits of entropy from the point of view of whoever knows this function.

A good example of this is a Fortuna-like pseudo-random number generator. You encrypt a number 1 with a key for the first random number (where the cipher block is a big number), then encrypt number 2 with a key for the second random number, and so on. With respect to the user who does not know the key (of K bits) to the cipher, a perfect N-bit cipher block will have N bits of entropy for that block.

If you expand out to a million bits of pseudo-random data from it, then you still only have K bits of entropy if you keep going with the same key K. In other words: if you had a book of 1 million bits that you know were generated with a single cipher under K, then don't try to guess all cipher stream bits. Just stick to guessing the key and generate the cipher stream from it.

So a random number generator is often a cipher that keeps getting re-seeded with more randomness, as it can be attained. By comparison, a simple [0,1] random number generator can't have more entropy than the number of bits in the number; and will typically have an odd distribution that isn't exactly what you want as well. Crypto needs hundreds of bits, when floating point numbers are only 32 or 64 bit, and the algorithm itself takes away much of the entropy .... presuming that you want something evenly distributed from [0..1], rather than say a floating point representation made of random bits. I don't even know what distribution that would even have.

Rob
  • 639
  • 3
  • 9
8

Youve kind of answered the question yourself:

the standard explicitly gives no guarantee for suitability for cryptographic use

So rather than change the implementation, the focus should be on educating developers to choose the 'right tool for the job' ™.

Given that, and the technical overhead of altering the implementation of a commonly used function, aswell as the fact that there are already specific solutions to this problem (see @Philipps answer), there is no compelling reason to make the change.

richzilla
  • 189
  • 3
  • Why the DV? Anonymous downvotes are no use to anyone... – richzilla Mar 19 '18 at 10:34
  • 1
    I downvoted because this answers a different question. Naturally, the standard doesn't guarantee cryptographic security. However my question was specifically about opportunistic security. The standards would not need to be touched. The technical overhead should be low, as browser already link against a TLS library. I could probably edit a browser to use the library for `Math.random()` in less than 5 lines worth of diff. – forest Mar 22 '18 at 05:17
5

Programming language design needs to consider many things. Browsers are very powerful and optimize javascript a lot today. But when you consider embedded systems, you may not have any good source of randomness. For example there are microcontrollers running a nodeJS (like) environment.

Such a microcontroller does not have random sources which guarantee cryptographically secure random numbers. So you would need to require connecting some device which can provide random input to a pin to be able to implement a programming language which makes strong guarantees about random numbers. And you would need quite some knowledge to build a device which provides enough randomness and to process the input from the device in an appropriate way as well.

allo
  • 3,173
  • 11
  • 24
  • A browser running JavaScript will usually have a good source of randomness in order to provide TLS. And as I said that the standard would not need to be changed, there _does not need to be_ a guarantee that the API will output cryptographically secure randomness. – forest Mar 16 '18 at 04:21
  • 1
    If the API does not guarantee true randomness, you should as a developer not rely on it. If the developers do not rely on it, you don't need to implement it. Imagine most browsers would have true randomness there. Websites would build stuff on it, because it is there. And then the next browser does not have true randomness. oops. Another reason may be that you exhaust the entropy pool more and need more time if you output true random values in each API call which does not require them. – allo Mar 16 '18 at 08:28
  • But some developers _do_ rely on it, not because they think it is cryptographically secure, but because they don't know how randomness works. Also there is no such thing as "exhausting entropy". A single 128 bit seed can output an effectively unlimited amount of unpredictable data. – forest Mar 16 '18 at 08:32
  • 1
    @forest While it's true that a sufficiently random 128-bit key combined with sufficiently good cryptographic primitives can generate more cryptographically secure pseudorandom data than anybody is likely to need in our lifetimes, there's a lot of caveats in there. The simple fact is that, in some CSPRNG implementations including many Linux kernels, entropy pools are *not* an unlimited resource and can be depleted/exhausted if you query them hard enough. Encourage people to use things like `/dev/urandom` and not `/dev/random`, if you want, but don't pretend `/dev/random` can't run out. – CBHacking Mar 17 '18 at 01:03
  • @CBHacking They actually are unlimited. And in fact `/dev/random` never "runs out". Rather, it intentionally blocks for rather silly legacy reasons. Read the code at [`drivers/char/random.c`](https://github.com/torvalds/linux/blob/master/drivers/char/random.c) to understand this in more detail (it's literally a call to `wait_event_interruptible()` for the estimate count). As long as ChaCha20 (which `/dev/urandom` uses) is not fatally broken in many ways, then there is no reason to use the blocking pool over the non-blocking pool. Note that browsers do not use the blocking pool. – forest Mar 17 '18 at 01:12
  • 1
    @forest Pedantry does not become you. The entropy pool is self-refilling, but to claim it "never runs out" is like saying I can't "run out" of milk so long as there's a reachable grocery store. Similarly, claiming the reasons `/dev/random` may block are "silly" and "legacy" doesn't change the fact that reads from it may block for arbitrarily long times, and indeed have been observed to do so on things like VMs without user interaction or real hardware (though today the host will often feed entropy to the VM). As for browsers, is the non-blocking entropy source required by any spec? – CBHacking Mar 17 '18 at 02:08
  • @CBHacking When you drink milk, the milk vanishes from the jug. When you use entropy, even from the blocking pool, the pool data is still there. The _only_ thing that changes is a single integer which keeps track of an estimate of entropy. As for browsers, yes it is required. Many operations that require an unpredictable stream of random numbers are required to occur within a certain period of time according to the specifications. A TCP implementation which risks waiting an indefinite period of time for randomness for the sequence number would be a broken implementation. – forest Mar 17 '18 at 02:13
  • If you were to patch the kernel to remove the blocking behavior of `/dev/random` by force, it would not choke up or turn into `/dev/zero` like you would expect if the entropy were actually used up. It would still be a perfectly secure CSPRNG, albeit one that relies on the preimage resistance of SHA1 (the blocking pool and hidden input pool use SHA1 mixing, whereas a modern non-blocking pool uses ChaCha20 seeded with 48 bytes of random data from the input pool). So I suppose you can use the term "runs out", but only in a very liberal sense. It is technically inaccurate and promotes myths. – forest Mar 17 '18 at 02:16
  • 1
    @forest Fair enough, but the behavior still resembles "running out" of entropy to a client application, and I wouldn't count on the preimage resistance of SHA1 to last too much longer, at which point a non-blocking `/dev/random` would not be a "perfectly secure CSPRNG" any more. As for browsers, I was talking about the JS API, not the stuff that the kernel networking stack (not the browser!) does with TCP sequence numbers. Obviously for this question the browser would not implement `Math.random()` using the blocking pool, but only because it would be foolish, not impossible. – CBHacking Mar 17 '18 at 02:34
  • @CBHacking The way SHA1 is used in the random driver is simply to mix the data up and to hash the entire 4096 bit buffer. A preimage against SHA1 as used in the kernel would be... very unexpected (I'd wager that ChaCha20 will be broken before then). As for the JS specifications, I don't know. It probably doesn't have any real requirement for cryptographic randomness, but I'm sure it does for non-blocking behavior. The point is that all it needs is 128 bits of randomness to make `Math.random()` opportunistically secure, without making that part of the standard. – forest Mar 17 '18 at 02:41
  • The point is, that when the state of your CSPRNG leaks, you will need entropy to seed it to a random value again. And when you now look at your microcontroller, it is very determinstic. Its memory will look always the same after booting, depending on the chip it will even have had the same number of clock cycles at the point the random number is requested. So you need to seed it, else your random sequence is always the same. – allo Mar 19 '18 at 09:50
  • And with too little entropy, someone can find the state of your CSPRNG by brute-force using the random sequence for validation if you do not seed it often enough. Keep in mind, that not only the state of the MC is very limited but the program itself cannot be too complex as there is little space for your program and you probably need most of it for the rest of your code. – allo Mar 19 '18 at 09:52
  • If you use `getrandom()`, it will not return unless the entropy pool has been initialized. Once that is done, you can be sure that the pool cannot be brute forced. – forest Mar 22 '18 at 05:20
  • @forest We're not talking about linux here. Your microcontroller with a simple javascript implementation does not have a kernel (in the sense like a linux kernel) at all. It boots directly into its nodejs (like) environment, which does not have a ``getrandom()`` call. You would need to implement the call yourself, with all the mentioned considerations. – allo Mar 22 '18 at 10:31
  • In that case, you can just fall back to whatever crappy PRNG you want (XorShift128+, LFSR, LCG, whatever). No need to change the standard to provide opportunistic security on one platform. – forest Apr 02 '18 at 11:44
5

Like others, I'd point out that Math.random() is not cryptographically secure because its typically not needed. But I'd go further and argue that it's not wise to write a cryptographically secure algorithm into a spec unless you have a very good reason.

What does it mean to be cryptographically secure? Well, there's always the boring definition of "nobody knows how to break it yet." But what happens when someone does break it? If you have specified a CSPRNG, you also have to include a way to query which algorithm is in use, or otherwise make it so that the end user can be certain of what they are getting.

This also likely leads to the need to be able to support multiple generators, so that the user can select one they trust. This adds enormous complexity. Suddenly a 1 line function in an API became a suite.

Also, when you start talking crypto, you start talking about trying to be secure in the generator. You mention using AES to generate random numbers: does your AES implementation need to be immune to side channel attacks? When you're writing a library with the dedicated purpose of providing cryptographic guarantees, it's not all that unreasonable to have to ask that question. For a spec, it could be terribly unreasonable. Immunity to side channel attacks is a very difficult thing to phrase in the language of specifications.

And what have you accomplished by putting it in a spec? Most users of PRNGs do not need cryptographic guarantees at all, so you just wasted CPU cycles for them. Those who want cryptographic guarantees are likely going to seek out a library which supports the full suite of functionality needed to be comfortable with such cryptogaphy, so they wont be trusting Math.random() anyways. All that is left is the demographic you mention: people who made a mistake and used a tool when they shouldn't. Well, I can tell you from experience, major programming languages are not a place to look for an API that you can't use incorrectly by mistake. They're full of "if you do this, it's your own fault" phrasings.

Also, consider this: if you use Math.random() and assume cryptographic guarantees, what are the odds that you're going to make another fatal crypto mistake somewhere in your algorithm? A CSPRNG Math.random() may provide a false sense of security, and we may find even more mistakes!

Cort Ammon
  • 9,206
  • 3
  • 25
  • 26
  • Obviously you should not use it an _assume_ cryptographic properties. If you need that, there are dedicated crypto APIs (that's also why I suggested ChaCha8 instead of the full-round ChaCha20. Note that a cryptographic break can be performed against a 7 round ChaCha). – forest Mar 17 '18 at 00:29
  • See my edit to the original question which clarifies this. – forest Mar 19 '18 at 04:26
  • 2
    @forest I see your edit, and I think my last paragraph still holds. If one thinks their software is secure because they got lucky and someone else thought about security for them, they're in for a rude awakening. – Cort Ammon Mar 19 '18 at 15:05
3

Everyone seems to have missed a bit of a nuance here: Cryptographic algorithms require a number to be mathematically and statistically random over all executions of the algorithm. This means for example during a game or an animation, that you could use a psuedorandom sequence of numbers and this would be perfectly fine for a "kind of random" number.

However if this number could be manipulated or predicted, for instance a seeded random number (Which is the default behaviour of the windows random functions) then this seed is actually predictable. If I can manipulate your application to restart and then use a seeded random number I can predict what "random" number you will choose. If this is possible, then the cryptography can be defeated. A secondary concern may also be that some algorithms require a guaranteed distribution of numbers across the spectrum, which certain psuedorandom generators cannot guarantee.

Cryptographically random number generators have a large set of inputs to create entropy, for example measuring shot noise from the microphone input, time of day ticks, checksum of ram registers, serial numbers etc. As many inputs as possible to make it if not impossible, then incredibly hard to manipulate and predict. In a cryptographic sense performance is not the objective, but "True" randomness.

So depending on your use case you might want a reasonably random, performant implementation of a random number, but if you are doing a diffie-hellman key exchange you need a cryptographically secure algorithm.

Spence
  • 131
  • 2
-1

Another consideration that I did not see anyone mention (or I just overlooked) is that some uses of the Math.random() function actually depend on repeatability when seeded the same as it was a previous time. Changing it now would break those uses.

forest
  • 64,616
  • 20
  • 206
  • 257