6

I'm wondering about some of the semantics and security implications of using something like scrypt or bcrypt to "enhance" a password protecting a PGP private key. Essentially, I'm asking about the implications of using the scrypt of a password as the password for my PGP private key.

From my understanding, a PGP keypair is essentially just two large primes and their product, and the private key is symmetrically encrypted.

My current thoughts are that this is a very good idea, for two main reasons. First, this makes your password unsusceptible to dictionary attacks, as the output of scrypt is VERY unlikely to be a dictionary word (and you could check it).

The second, and more complex reason, is the computational power required in a brute-force attempt.

Let's assume that the cost of checking a single guess for the (decrypted) prime stored in the private key is A. Assume that, given a compromised encrypted private key, the cost of checking a single guess is B. Assume that the cost of computing scrypt is C.

Brute-forcing a 2048-bit prime would cost, at most, pi(2^2048) * A, where pi(x) is the number of primes less than x. Assuming the private key is compromised, and assuming a not-unreasonable upper-limit of a 20-character (160 bit) password, the maximum cost to brute force the password is 2^160 * B. Assuming a compromised key with a password known to be the result of a N-character (N*8 bit) scrypt, the maximum cost to brute force the password is 2^160 * C * B OR 2^(N*8) * B (because the attacker could either attempt to use scrypt or just crack the hash itself).

Note that using standard estimators, pi(2^2048) ~= 2.28*10^613.

Let's fix A = 1 for the purpose of computation, and we'll assume B >= 2 which seems reasonable given the low-cost of integer multiplication.

With scrypt, C is variable. If we consider C = 10 (it is 10 times harder to compute scrypt than to multiply two large numbers), then:

  • Cost to brute force the key: 2.28*10^613
  • Cost to brute force the password: 2^160 * 10 * 2 = 2.9 * 10^49 OR 2^(N*8) * 2 = 2^(8N +1)

Here, 2^(8N +1) = 2.9 * 10^49 at around N = 21, which means that if C = 10, you only need a 21-character hash to make it easier to crack the input to the hash than the hash. Of course, it would still be beneficial to have the private key itself.

However, scrypt has varying difficulty, and 10 is probably a SUPER-conservative estimate. If we instead fix N = 128:

  • Cost to brute force the key: 2.28 * 10^6131
  • Cost to brute force the password: 2^160 * C * 2 or 2^1024 * 2

Here, 2^160 * C * 2 = 2^1024 * 2 at around C = 1.23 * 10^200, which is obviously completely unreasonable. This means that, for a reasonable hash length, it'll probably always be better to crack the input to the hash than the hash itself.

Obviously, if I use a value of N = 256, then as long as C > 1, it'll actually be easier to crack the key than it would be to crack the hash.

I think it would be reasonable to assume C >= 1000. With this assumption, and fixing N=256, and assuming our password is P characters:

  • Cost to brute force the key: 2.28 * 10^6131
  • Cost to brute force the password: 2^(8P) * 1000 * 2 or 2^2048 * 2

One would need around P = 255 to make cracking the password harder than cracking the key, which seems unreasonable.

So it seems that using a tough hash function like scrypt or bcrypt could greatly improve key-guessing difficulty, even with conservative estimates for their toughness. It also appears that hashing a password could help in a key-compromised situation, but not all that much.

Are there any other concerns I should consider? I'm currently considering a sort of "dual password" approach, where the key password is passphrase + hash(another passphrase). This seems like it would mitigate any bizarre mathematical properties of encrypting a prime with the result of a hash I may have overlooked.

Edit: Just preemptively, I'm not (entirely) an idiot. I am aware that PGP defines it's own hashing mechanism, S2K, but almost no one uses the "good" version of the hash (it isn't enabled by default), it suffers from a lack of good configurability, and it is easily crackable on accelerators like GPUs. So, yes, I'm suggesting "double hashing" here to a degree -- but scrypt uses a specific hash-and-grab approach with preset data, and S2K uses the good ol' fashion exponent-and-bitwise approach. If I remember my cryptology classes, there's no issue with the combining the functions as long as the output of one is within the domain of the other (which, in this case, it is).

Ryan Marcus
  • 163
  • 4
  • The biggest problem with this is that you will have to store your PGP password somewhere in a reversible form - it's *very* unlikely that you will be able to memorize and reliably reproduce the scrypt or bcrypt output you used as the password. Chances are you'll end up storing that password on a device that is regularly connected to the same system that houses your PGP keys. Then, if they have access to your PGP keys, attackers will just as easily have access to the file containing your PGP password... – Iszi Jul 25 '13 at 18:27
  • Oops, I wasn't clear super clear about this. I did not mean to suggest that one would memorize the output of `bcrypt` or `scrypt` -- that would serve no purpose, as memorizing random characters would be more effective. I met that a PGP frontend would take in one (or two, in my potential case) password and calculate the hash and use it as the password. – Ryan Marcus Jul 25 '13 at 18:29
  • ...Now, the protection on your PGP keys effectively boils down to whatever password or key you used to protect the PGP password - and *this* has to be a human-memorable one. At this point, you may as well have just used *that* password/key for the PGP keys in the first place. – Iszi Jul 25 '13 at 18:29
  • @Iszi - right, that makes complete sense. Above, I show that it will still (generally) be easier to crack the input to the hash than the hash itself. But the slowness of algorithms like `scrypt` should add additional cost the brute-force, I believe. – Ryan Marcus Jul 25 '13 at 18:30
  • Okay, I guess I was a bit confused (perhaps by an overdose of maths in your post) as to what sort of protection you're expecting to get from this method. You're not actually adding any *entropy* to your PGP password by this method. All you're doing is increasing the time required to test each possible password. Not necessarily a bad idea, but I thought the PGP symmetric key was *already* a hashed version of the password? – Iszi Jul 25 '13 at 18:44

1 Answers1

5

The usual problem with cascading password hashing functions is that they compete for CPU. Each password hashing function with a configurable iteration count should be set as high as possible, the limit being the combination of your available processing power and your patience, as a user. If you combine bcrypt (or scrypt) with what PGP will do (its S2K transform), then you cannot crank both settings as high as you would have been able to do with either one alone, because you have to share your CPU and patience resources between both functions. However, if you use a very imbalanced setup (e.g. a very small iteration count for S2K, or a non-iterated S2K, reporting the whole protection on the bcrypt/scrypt layer) then there is nothing wrong with using the output of bcrypt or scrypt as "password" for PGP, as long as no unduly space reduction occurs (not a problem here: PGP accepts very long passwords without truncating them).

There can be practical issues, though. In particular, bcrypt and scrypt are salted, which means that their output is a deterministic function of both the password and the salt. You need to keep the salt value somewhere, otherwise you won't be able to recompute your encryption key, even if you remember the password. The OpenPGP format does not include an easy-to-use emplacement for your extra salt.

I suppose you could take an opensource OpenPGP library (like GnuPG) and modify it to integrate bcrypt or scrypt, as a new kind of "S2K" (so not in addition to the usual S2K, but as a replacement). The format has room for 256 types of S2K with 11 slots reserved precisely for that kind of private experiment.


Your lengthy computations on RSA primes and so on are wildly off, for two main reasons:

  1. You don't "brute force" a RSA key, especially by trying possible prime divisors. A RSA key has a lot of mathematical structure, and it is broken by unravelling that structure, namely with integer factorization. This is widely more efficient than enumerating all primes (that would be a very crude factorization algorithm, known as "trial division", and we know much better). That's the very reason why we use RSA keys with more than 1000 bits, instead of sticking to smaller keys of 200 bits or so.

  2. There is no security level beyond "uncrackable with existing and foreseeable technology". The current threshold (assuming the whole of Mankind collaborating in a huge worldwide cracking effort) is probably around 2100 operations, i.e. close to 1030. Energy is the bottleneck here. Costs beyond such levels are just meaningless numbers. It makes no sense, or at least no practical sense, to paly with cost values of 10631 or 22048.

What people usually do is to aim at a "reasonable" security level, i.e. something which will not be crackable by Mankind, with some comfortable margin, for the next 40 years -- but not beyond. This points to 2048-bit RSA keys and password entropies of 80 bits (with suitable slowdown factors, i.e. what bcrypt or scrypt are about).

See this site to explore the cross-estimates of key strengths, depending on type and size, made by various bodies as a guidance for selecting key size. Oversized keys or passwords may incur serious overheads (if you double the size of a RSA key, private key operations are eight times slower; if you double the size of a password, then you have to remember twice as many characters).

Tom Leek
  • 168,808
  • 28
  • 337
  • 475
  • Excellent response! Thank you! I might take a look at modifying some PGP implementations to use `scrypt` instead of simply a wrapper. The main reason I'm not satisfied with "maxing out" `S2K`is that it doesn't seem to scale in *space*, only in *compute*. Or am I mistaken? – Ryan Marcus Jul 25 '13 at 21:00
  • 1
    Indeed, the "iterated and salted S2K" is not a memory-hard function, so it is comparable to PBKDF2 in that respect. Attackers with GPU can get a boost. I could argue, however, that the main defence against attacks is to have a strong password; no space-hard functions with salt will save a weak password. A strong password with S2K is preferable over a weak password with scrypt. – Tom Leek Jul 25 '13 at 21:03
  • As for the math, I was attempting a bit of a back-of-the-envelope style calculation -- there obviously aren't 2^2048 possible 2048-bit RSA keys. However, I believe the overall premise -- that it'll always be easier to brute force the `scrypt` output rather than the input -- is correct. – Ryan Marcus Jul 25 '13 at 21:04
  • thanks again, that's definitely something I'll consider. – Ryan Marcus Jul 25 '13 at 21:06