13

One of our applications uses SQL Server for the back end, and that application requires a 4 digit numerical PIN for employees to be created that works with a key (more or less). The system itself defaults all PINs to 0000 because that's the magic value for "don't require a PIN" if the device you're using doesn't have a keypad.

We had an issue with employees not updating their PIN (and grew tired of waiting for the vendor to correct the issue) so we generate our own PINs with SQL Server using a scheduled job to update any PINs of 0000.

The PINs don't need to be secure (how secure is a 4-digit PIN, after all) they just need to not be trivial. However, there's no reason not to think about the problem with security in mind, and I like to take the time to understand what I'm doing at least semi competently:

The current solution (not by me!) was:

replace(str(round(rand(checksum(newid()))*10000,0),4), ' ','0')

There's a couple problems with this. Of course, neither rand() nor newid() are cryptographic functions, but the issue that actually requires the change is that this will occasionally generate a pin of **** because of quirks with the str() function.

Now, SQL Server does have the crypt_gen_random() function, which returns a fixed number of bytes of secure cryptographically generated numbers. That makes me think something like this should work:

format(cast(crypt_gen_random(2) as int) % 10000,'0000')

The problem that I see is that crypt_gen_random(2) is basically, "Pick a number between 0 and 65535." In that space, however, there's a bias for modulo 10000. There's 7 instances of every result between 0 and 5535, but only 6 instances of every result between 5536 and 9999. That's a 16% difference.

Now, I can think of a possible easy fix, but I'm not sure if there's a preferred method. I'm not a security programmer. Can simply increase the number of bytes generated?

format(cast(crypt_gen_random(7) as bigint) % 10000,'0000')

[Note: I'm using bigint instead of int because it supports numbers of up to 8 bytes long. I'm only generating 7 bytes instead of 8 to prevent negative numbers.]

If I generate 7 bytes instead of 2, I'm generating a number between 0 and 72057594037927935 (16^14). The problem is still there -- there's 7205759403793 values between 0 and 7935, but only 7205759403792 between 7936 and 9999 -- but that's so much dilution that it doesn't really matter. We're at like 10^-12.

Like I said, though, I'm not a security programmer. I'm a DBA, analyst, and administrator, and all I know about security programming is that I'm not qualified to know when I'm making a security mistake.

Is there a better way of doing this?


My solution for T-SQL essentially ends up looking something like this:

declare @candidate_pin int = cast(crypt_gen_random(2) as int)
while @candidate_pin not between 1 and 9999
    set @candidate_pin = cast(crypt_gen_random(2) as int)

select format(@candidate_pin, '0000') as [pin]

I was hoping to avoid using procedural logic instead of a simpler declarative statement, but I'm not going to try to force it into a recursive CTE or other declarative solution.

I'm not particularly concerned with performance here; if we need 100 new PINs in a quarter something strange is going on.

Bacon Bits
  • 233
  • 1
  • 6

2 Answers2

14

Yes, you need at least 14 bits to generate a number between 0 and 9999.

In order to avoid bias, you can generate 14 bits of random data using a good source of randomness. If the result is <= 9999, then you can use it as your PIN. If the result is bigger, throw it away and try again.

How much does this waste?

In the best possible case, your bits would align perfectly with your answer space, and you would not waste anything. You could generate a random value between 0 and 255 and all you had to do was to generate 8 random bits. Any result would be valid, assuming your random generator is good.

In the worst possible case, you would require a random number between 0 and 256, which means you need 9 bits, and thus have 512 possible values. In approximately 50% of these cases you would have to throw the result away.

In our example, our lower bound with 13 bits would only give us 8196 possible values, and our next possible target is 16384 possible values. Given that you want 10000 possible values, your possible space is more than one and a half times bigger than your target space.

But fear not, you are working with relatively small amounts of data. If you query 7 random bytes, you get 56 random bits, which means 4 "tries" to get a "good" value. Even if we would assume that you had a 50% chance (the worst case) to get a good value, in 4 tries you have a 93.75% chance of generating a good PIN. If you would be unlucky, you would have to query another 7 bytes, which would boost your chance to 99.6% to pick a good PIN.

Cool! Can I have a demo?

function GeneratePIN()
{
    // The chance to fail after 1000 rounds is 7.5 * 10^-1203
    for(int round = 0; round < 1000; round++)
    {
        // We generate 7 random bytes, which would give us 4 "tries"
        ulong randomness = SecureRandom.GenerateBytes(7);

        // We can iterate this 4 times before our 7 bytes are exhausted
        for (int try = 0; try < 4; try++)
        {
            ushort possiblePIN = randomness & 0x3FFF; //0x3fff is 0011 1111 1111 1111 in binary, so 14 bits set to 1

            // If we succeed, we return the PIN. Else we try again
            if (possiblePIN < 10000)
            {
                return possiblePIN;
            }
            else
            {
                // =>> is the right shift, assign operator
                // This will shift randomness 14 bits to the right
                randomness =>> 14;
            }
        }

        // If we reached this part, we were unlucky and need to try again.
    }

    // If we reach this part, we flipped a coin 1000 times and always got heads.
    // Maybe buy a lottery ticket
    throw new Exception("Buy a lottery ticket!");
}
  • Gotcha. Generate the minimum number of bits you need to fill the address space, and if your result is invalid, try a new set of bits. That does make more sense. I'll updated my question with my solution in T-SQL. – Bacon Bits Jul 17 '19 at 19:36
  • 2
    It would probably be better to remove the exception and keep this a (theoretically) infinite loop, rather than setting the upper limit at 1000 rounds. The chance that it will appear to lock up for any appreciable amount of time is infinitesimally small, but that's still better than throwing an exception. – forest Jul 18 '19 at 01:37
  • 6
    If you leave it in, you might consider changing the exception text to "This will only happen once every 50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 times" to avoid confusing folks who (eventually) get the exception. – A C Jul 18 '19 at 05:37
  • @forest Yes, probably. I just felt bad at making an infinite loop. It's all theoretical anyways, i doubt that this code should be used for anything prod-related and just purely demonstrates the idea. But if it leaves you no rest, feel free to make it an infinite loop :D –  Jul 18 '19 at 07:19
  • `=>>` --> `>>=` – RiaD Jul 18 '19 at 08:44
  • 3
    @RiaD Not in my pseudo-language :D –  Jul 18 '19 at 08:49
  • 1
    @AC `throw new Exception("Buy a lottery ticket!");` – smitelli Jul 18 '19 at 11:51
  • I think there is a typo: You need to shift "randomness" and not "possiblePIN" – Falco Jul 18 '19 at 12:40
  • @Falco Fixed it –  Jul 18 '19 at 12:50
9

First of all, I totally agree with MechMK1 on the proper algorithm to generate cryptographically secure uniformly distributed random number between 0 and 10000. However:

there's no reason not to think about the problem with security in mind

Lets. Random number between 0 and 10000 contains 13.29 bits of entropy. Random number between 0 and 2^14 % 10000 contains 13.22 bits of entropy, we lost 0.067. This is equivalent to reducing the number of pins from 10000 to 9546. Continuing the calculations:

  • CSPRNG(0, 2^14) % 10000 equivalent to CSPRNG(0, 9546), 0.067 bits of entropy lost
  • CSPRNG(0, 2^16) % 10000 equivalent to CSPRNG(0, 9971), 0.004 bits of entropy lost
  • CSPRNG(0, 2^24) % 10000 equivalent to CSPRNG(0, 9999.99964) [now the difference is less than 1 key, so purely statistical in nature], 0.00000005 bits of entropy lost
  • CSPRNG(0, 2^32) % 10000 equivalent to CSPRNG(0, 9999.999999995), 0.0000000000008 bits of entropy lost [order of magnitude accuracy here, I'm afraid float rounding could've played larger role than calculations]

In other words, use format(cast(crypt_gen_random(3) as int) % 10000,'0000') and regain the lost 51 nanobits of security by using saved implementation time to check that admin's password isn't somehow left blank.

All of this to say that:

  • 13.29 bit keys offer laughable security in the first place. I'd say just leave them as 0000 and spend that saved time and mental capacity to perform a tiiiiny internal security audit? I bet you'll find less secure passwords in more important places.
  • KISS really does work
Andrew Morozko
  • 1,759
  • 7
  • 10
  • It's really not about security at all, is the thing. The PIN almost a *username*, not a password. You scan the keycard/fob, enter your PIN, the system confirms that that PIN and that keycard are authorized for that device, and then you're in. It's intended to limit the problem of people finding lost fobs and using them. However, the devices don't log anything about the fob, but they do log the PIN. Certain PINs are approved to work with certain devices, and we can't disable PIN 0000. PINs also aren't unique (remember employees can reset their PIN) but it does help narrow down who used what. – Bacon Bits Jul 17 '19 at 18:01
  • There are other systems in place that cover security, this just helps with auditing to some degree. We audit authentication at the authentication system; we just audit the device logs to see what PINs are being used, if anybody is trying to brute force PINs, etc. Truthfully, I'm just trying to think about the problem and wondering what I'm missing. – Bacon Bits Jul 17 '19 at 18:11
  • To think about the problem in security terms, that is. It's obviously complete overkill to go much beyond what we've already got. – Bacon Bits Jul 17 '19 at 18:19
  • @BaconBits OK, I got it. Well, there's no real reason beyond academic interest to use secure uniform random numbers for your purposes, that's my point. – Andrew Morozko Jul 17 '19 at 19:20
  • I'm more disturbed about the security implications of a system where the users can reset their PIN to something they choose, combined with logging of that PIN. Inevitably, someone is going to use their bank/credit PIN, and there will be a log of that, linked to a specific person. – Miral Jul 18 '19 at 05:17
  • 1
    @Miral and what *is* the security impact of that? Now coworkers can pickpocket each other to try the pin from system on a credit card? – Andrew Morozko Jul 18 '19 at 12:36
  • Or a visitor, or drive-by network hacker, yes. – Miral Jul 19 '19 at 00:24
  • @Miral I’d argue that if a visitor has access to your databases or a network hacker has access to your coworker's personal belongings then credit card theft is the least of your problems. Protecting users credit card PINs is the job of the users and their banks, not third parties. – Andrew Morozko Jul 19 '19 at 01:20