
I'm using php, but the general question applies to any confirmed cryptographically secure pseudo-random string concatenated with a non-cryptographically secure string.

I know random_bytes generates a cryptographically secure string
I know converting with bin2hex results in a usable cryptographically secure string
But what happens when I shuffle in my own poorly generated, non-cryptographically secure strings?

Sample code:

<?php
    
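    // Returns $len special characters. mt_rand() is NOT cryptographically secure.
    function randspec($len){
        // Single quotes so PHP never interpolates "$" or interprets escape sequences.
        $chars = str_split('!@#$%^&*()_+=-`~<>?:{}|\\][;/.,');
        $result = "";
        for ($i=0; $i<$len; $i++) {
            $result .= $chars[mt_rand(0, count($chars)-1)];
        }
        return $result;
    }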
    
    $count = 1;
    
    for($i = 0; $i < $count; $i++){
        $var_len = mt_rand(2,8);
        echo str_shuffle(bin2hex(random_bytes(12)).randspec($var_len));
        echo "\r\n";
    }
    die;

?>

Breakdown for my non-php friends: get a random length between 2 and 8 [mt_rand()], generate a 24-character cryptographically secure hex string [bin2hex(random_bytes())], concatenate 2-8 special characters [randspec()], and shuffle the special characters into the hex string [str_shuffle()]. In the entire process, random_bytes() is the only function returning cryptographically secure pseudo-random anything.

My gut tells me that because a large portion of the string is cryptographically secure, and that the addition should only increase complexity, it should only make it harder for a third party to determine the result of the pseudo-random operation - but I also know cryptography is... let's say 'like karma', and I'm no expert. Could the addition of the poorly-generated special characters create a pattern that is (relatively) trivial to predict, or does the quality of the "base" string keep it as random as pseudo-random gets?

Adding an example of Gh0stFish's recommendation for good measure - this is a far better (and simpler) method to generate the same type of varied length string - using this makes my question irrelevant. Open to any input on expanding the possible char set, and whether that's worthwhile, practically.

<?php

    function rand_string($len){
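        // Each character appears exactly once so that none is favored.
        $chars = str_split("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()-_=+[]}{\\|;:/?.>,<`~");
        $r_string = "";
        for($i = 0; $i < $len; $i++){
            // Upper bound derived from the array so every character can be selected.
            $r_string .= $chars[random_int(0, count($chars) - 1)];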
        }
        return $r_string;
    }

    $strlen = random_int(26,32);
    echo rand_string($strlen);

    die;

?>
TCooper
  • Your new string will have at least as much entropy as the cryptographically secure part of it. Whether or not that is sufficient depends on what you are doing with it. – nobody Aug 28 '21 at 08:36
  • @nobody I think one part that is unclear is why he is doing this in the first place. Another thing that is unclear is the question in the title of the post: "Does concatenating/shuffling a non-cryptographically secure string to a cryptographically secure string result in a cryptographically secure?" To me, that doesn't even parse as an English sentence... it's missing a noun at the end? Also, even if I try to make sense of that sentence it doesn't seem possible without context. "Secure" for what purpose? – hft Aug 30 '21 at 18:36
  • @hft, I recently modified the title, it was actually grammatically correct until a few minutes ago - will update again. Thanks for the input. As mentioned in below comments, I do this for personal passwords, which I have no doubt it's fine for - but was curious about the implications if it were to be used for something more broadly. In that sense, it's a general question/theoretical. – TCooper Aug 30 '21 at 18:40
  • @hft I should say if it were used for encryption, not "more broadly". – TCooper Aug 30 '21 at 18:46
  • For you generating your own personal passwords, sure there is not much of a problem. The password is seemingly at least as strong as the random password from random_bytes, and if that is strong enough for you, then go for it. But for "encryption," no--no, it doesn't make sense to use this scheme for encryption. – hft Aug 30 '21 at 19:22

1 Answer


This type of approach can reduce the overall randomness of the string, but whether or not this will be significant depends on exactly how you're using the string.

We know that str_shuffle() is not cryptographically secure (the documentation says so), so imagine it has a bias that means the first character in the output string will always be one of the added symbols. And we know that mt_rand() is also not cryptographically secure, so imagine that the first element (the ! symbol) is far more likely to be chosen than any other. This would mean that the first character of your generated string is more likely to be a ! than any other character - so you have lost randomness by doing this compared to the original string generated from random_bytes().

It would still be difficult for an attacker to guess the entire string (because parts of it are securely generated), but there are cases where knowing the first character is still very useful (such as if you were using it as a key to XOR something).
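
To put a rough number on "difficult to guess": random_bytes(12) contributes 12 x 8 = 96 bits, and hex encoding does not change that. The sketch below only does the arithmetic, assuming characters are drawn uniformly and independently and that the mt_rand() state is unrelated to the random_bytes() output; entropy_bits() is a made-up helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: entropy in bits of $len characters drawn uniformly
// and independently from an alphabet of $alphabetSize symbols.
function entropy_bits(int $len, int $alphabetSize): float {
    return $len * log($alphabetSize, 2);
}

// 24 hex characters from bin2hex(random_bytes(12)): 24 * 4 bits each.
$secure_part = entropy_bits(24, 16);
printf("secure hex part: %.1f bits\n", $secure_part);

// Worst case for the added symbols and the shuffle: fully predictable,
// so they contribute 0 bits and the total stays at the secure part.
printf("lower bound for the whole string: %.1f bits\n", $secure_part + 0.0);
```

The total never falls below the securely generated part, but the symbols may add far less than their length suggests.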

Obviously the biases in these two functions aren't that simple, but it should give an illustration of why adding less-random data in a less-random way can reduce the overall entropy. Cryptography is really hard to get right, and you can get all kinds of subtle and unintuitive issues, so it's really best to avoid rolling your own wherever you can.
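
One concrete weakness of mt_rand(), separate from the imagined bias above, is that its output is fully determined by its seed. A small sketch of that property; symbols_from_seed() is a hypothetical helper that mirrors randspec() from the question with an explicit seed, and the seed value is arbitrary.

```php
<?php
// Hypothetical helper mirroring randspec() from the question, but with an
// explicit seed so the determinism of mt_rand() is visible.
function symbols_from_seed(int $seed, int $len): string {
    $chars = str_split('!@#$%^&*()_+=-`~<>?:{}|\\][;/.,');
    mt_srand($seed); // same seed, same sequence of mt_rand() values
    $result = '';
    for ($i = 0; $i < $len; $i++) {
        $result .= $chars[mt_rand(0, count($chars) - 1)];
    }
    return $result;
}

$first  = symbols_from_seed(1234, 8);
$second = symbols_from_seed(1234, 8);

// Anyone who learns or guesses the seed can reproduce the symbols exactly.
var_dump($first === $second);
```

In real use the seed is not chosen by hand, but the point stands: the symbol part is only as unpredictable as the generator's internal state.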


If you're trying to securely generate a random string with a mixture of hex characters and symbols, a much simpler approach would be:

  • Create an array of the characters you want (a-f, 0-9 and the symbols).
  • Use random_int() to select elements from the array and concatenate them into a string.
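
A minimal sketch of those two steps. The function name secure_string() and the exact symbol set are assumptions for illustration, not part of the recommendation.

```php
<?php
// Hypothetical helper: builds a string by picking every character with
// random_int(), which is cryptographically secure.
function secure_string(int $len): string {
    // Step 1: the allowed characters (hex digits plus a few symbols).
    $chars = str_split('0123456789abcdef' . '!@#$%^&*()_+=-');
    $last = count($chars) - 1; // derive the bound instead of hard-coding it

    // Step 2: pick $len elements uniformly and concatenate them.
    $result = '';
    for ($i = 0; $i < $len; $i++) {
        $result .= $chars[random_int(0, $last)];
    }
    return $result;
}

echo secure_string(random_int(26, 32)), "\n";
```

Because every character comes from the same secure source, there is no separate shuffle step and no weaker generator in the mix.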
Gh0stFish
  • *"can reduce the overall randomness of the string"* - reduce it compared to what? Compared to a 24 byte secure random string or a 26 byte secure random string? – nobody Aug 28 '21 at 08:21
  • @nobody the overall entropy will be lower than a 26 character random string, and the entropy per character will be lower than both. Which of those is more important will depend on what OP is doing with the string. – Gh0stFish Aug 28 '21 at 09:02
  • @nobody if (for example) the insecure symbol generation/shuffling always adds `!!` at the start of the string (followed by the 24 securely random characters) then the overall entropy of the string is significantly less than it would be for 26 random characters, and the entropy of those first two characters would be zero. Although obviously the PHP functions aren't actually that bad. – Gh0stFish Aug 28 '21 at 09:35
  • Wait, sorry, I got confused between the 24 and the 26 by the time I was writing my second comment. :) – nobody Aug 28 '21 at 09:45
  • Ah, that makes more sense. You're correct that the *total* entropy will never drop below that of the 24 characters random string, but the presence of the non-random characters might end up being dangerous in interesting ways. – Gh0stFish Aug 28 '21 at 09:54
  • Thanks for the explanation, my gut told me one thing, but still had that uneasy feeling something was off... luckily I've just been updating this lil script to generate personal passwords for myself, so I don't doubt they're fine for my purposes, but curious about the implications for more serious applications. Will definitely use the random_int() suggestion if I need this in the future, and will probably swap it in for mt_rand() in any/all use cases. – TCooper Aug 30 '21 at 18:16