6

So, I have this:

I know that some code was used to generate a random sequence, and it looked roughly like this:

#include <iostream>
#include <string>

int main() {
    const std::string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::string temp = "1234567890";
    srand(MAGICNUMBER);
    for (int i = 0;; ++i) {
        for (int j = 0; j < 10; ++j) {
            temp[j] = alphabet[rand() % alphabet.size()];
        }
        std::cout << temp << std::endl;
    }
}

It was used to generate a sequence of 124660967 strings, the last of which was "2lwd9JjVnE", and then stopped. The compilator used was 64-bit g++ 4.8 for Linux. What I want is to find the 124660968th string – that is, the one that would have been printed next. The caveat, of course, is that I don't know the MAGICNUMBER. I'm pretty sure that it's possible to brute-force all possibilities, but it would take millennia, it seems. I've tried to snoop around in rand() source code, but I don't really understand it, much less exploit it. Is it possible to find that string in more or less reasonable time?

UPD Is it even possible to generate what is supposed to go after my string without finding out the seed?

Akiiino
  • 161
  • 5
  • The compiler is (or *should be*) irrelevant in this case. The interesting part is the libc, because that's where rand() is implemented. And the seed is "only" an `unsigned int`... – user Mar 30 '16 at 14:20
  • @MichaelKjörling Well, the standard does _not_ provide any details on how random should be implemented, so the compiler seems to be quite relevant, or am I wrong? – Akiiino Mar 30 '16 at 14:25
  • You are confusing the compiler with the standard library. The compiler only cares about (in this case) C++ *keywords*, whereas the standard library builds on top of that to make it easier to implement useful functionality by providing things like `cout`, `rand()`, `std::string` and so on. – user Mar 30 '16 at 14:30
  • @MichaelKjörling, ah, I see, okay; learned something new today, I guess – Akiiino Mar 30 '16 at 14:31
  • Related, except about C, not C++: [How to predict C rand()?](http://security.stackexchange.com/q/31643/2138) – user Mar 30 '16 at 14:33
  • **Why** do you want to find the one that comes next? Why can't you just generate a new random value? Note that the method isn't completely secure anyways, since `rand()` doesn't return crypto-level results, and the mod afterwards will skew to one side (since the range of the random is not a multiple of the alphabet size). – Clockwork-Muse Mar 30 '16 at 18:32
  • Repost? https://stackoverflow.com/questions/35299865/is-there-a-way-to-find-the-next-item-in-random-sequence Anyway, `srand()` takes an unsigned integer, so why not just search the binary or memory for the number? Once you know the *MAGICNUMBER*, you can easily predict what comes next because it follows a set formula to generate random numbers. However, if you are wanting to reverse the PRNG without *MAGICNUMBER*, using only the known outputs, I suppose it is possible, but difficult. You would need to determine the internal state when the last random number was generated – Spencer D Apr 11 '16 at 21:23

2 Answers2

7

The compiler itself is irrelevant; the rand() function is implemented in libc.

The glibc implementation uses a linear congruential generator (LCG) or a linear feedback shift register (LFSR) for its rand(). These can be quite easily cracked given some of the outputs (which it seems you have). The details can be found in the answer to another question already asked here, but the crux is that LCGs and LFSRs are not secure and some simple maths and probability can gain you the right answer pretty quickly.

You'll need to work out which version of glibc was used, but the compiler version should give you a rough idea of the patch state of the system, so you can infer it from that.

You can find more resources on Google, but here's a quick set of links to help:

Polynomial
  • 132,208
  • 43
  • 298
  • 379
  • ...and this is why you should use a cryptographically secure RNG for any situation where people guessing the output is a problem. This doesn't just apply to cryptography, but also to online gaming and especially online gambling. – Philipp Mar 30 '16 at 14:32
  • I suspect this is a challenge of sorts, which is why I didn't provide a "this is how you win" walkthrough answer. And also because that's a lot of effort. But regardless, you're right - LCG/LFSR designs simply aren't of the right quality for security purposes. – Polynomial Mar 30 '16 at 14:36
1

Google says:

In order to generate random-like numbers, srand is usually initialized to some distinctive runtime value, like the value returned by function time (declared in header <ctime\>). This is distinctive enough for most trivial randomization needs.

If this technique is used and you know a more or less close estimate of the time (like, if you can narrow it down to a day), a brute force attack on the left input space will probably be feasible.

Other than that, if you follow google some more you find a series of blog entries that might lead you down the right track.

Tobi Nary
  • 14,302
  • 8
  • 43
  • 58
  • Nope, I know for granted that the seed was not time-related; so that won't work. Secondly, as I understood, `rand()` can use Mersenne Twister, but by default it is "Linear feedback shift register" as google and source code tell me, so that blog post doesn't seem to help much as well – Akiiino Mar 30 '16 at 14:24
  • The other posts in that series handle the linear congruential PRNG. – Tobi Nary Mar 30 '16 at 14:30