18

I'm considering encrypting some small files (hundreds of kb apiece) using the AES Crypt reference implementation. Looking at the source, it seems that the encryption key is derived from the IV and the password by concatenating the two into a buffer which is then repeatedly hashed. So the IV is acting as the "salt" for the hash. By repeatedly, I mean 8192 times.

I understand that the benefit of this is to increase the time required to generate the key, making it more expensive to perform brute-force attacks to discover the password. I also understand that the drawback is that it takes longer to do the legitimate encryption and decryption tasks for the real users. Further, as both the users and attackers buy faster machines over time, the benefit and drawback will tend to zero.

So my question is, given current computer capability, and assuming a motivated attacker who doesn't own a dedicated cluster, is 8192 iterations insufficient, overkill, or "just right"? Also, have a missed anything in my analysis of this key generation: is there some other reason for choosing such a number of iterations that makes it a good choice?

kalina
  • 3,354
  • 5
  • 20
  • 36
  • 3
    What led you to aescrypt? The format is non-standard. It is really hard to get all the details right when doing crypto. Are you aware of any expert vetting of aescrypt? I don't know anything good or bad about it, but I wonder, have you considered pgp/gpg, which has gone thru the IETF to design a standard format http://www.ietf.org/rfc/rfc4880.txt and is flexible enough to handle other needs that are likely to arise? – nealmcb Jan 08 '11 at 01:50

5 Answers5

19

RSA's standard for password-based cryptography recommends "at least 1000" iterations, so the factor seems to be in the ballpark.

With 8192 iterations, you are essentially increasing the time needed to complete a brute-force attack by a factor equivalent to adding 13 bits of entropy to the password or 2 alphanumeric characters (with respect to brute-force attacks only). This is a good way of thinking about it because it is independent of increasing computational power.

The question is: given the security of the passwords you or your users will be using, is it feasible to brute-force it? If you add 13-bits to it, is it feasible to brute-force it?

If your users are using strong passwords, the answer is probably no to both. If your users are using a dictionary word, 13-bits is probably not enough to push it into secure territory. Your answers to these questions will change over time as computers become more powerful.

PulpSpy
  • 2,204
  • 15
  • 19
  • Thanks @PulpSpy, do you have a reference for calculating the effective extra entropy? –  Nov 20 '10 at 12:58
  • 5
    Just take the logarithm of the number of iterations to the base 2 (for bits) or base 62 (for alphanumeric). Essentially, you are assuming that in the time taken to complete 8192 iterations, you could have instead searched through 8192 possible passwords if there were no iterations (this is slightly wrong because the hash is only one part of the decryption--running decrypt also contributes a one-time cost). – PulpSpy Nov 20 '10 at 20:49
  • BTW, I realize this doesn't answer your request for a reference. It is a short computation and so I am not sure if there really is a reference or if it is just taken as standard. I checked Handbook of Applied Cryptography which includes a short discussion of iteration but doesn't do this computation. – PulpSpy Nov 20 '10 at 21:02
  • 3
    It is a rare user that uses alphanumerics that are so well distributed and independent as to add 13 bits of entropy with two characters. I'd say it adds more like 8 characters-worth, according to the NIST study cited at http://en.wikipedia.org/wiki/Password_strength#Human-generated_passwords – nealmcb Dec 23 '10 at 00:05
5

In case the password is a dictionary word, rainbow tables can be generated quickly and efficiently. With salted passwords, the size/time requirements to generate useful rainbow tables goes up. Multiple iterations of hash(salt+pass) is an easy way to greatly diminish the usefulness of rainbow tables. I think Apache's salted MD5 scheme does it 1000 times. The number is completely arbitrary, it's just significantly raising the bar for the amount of hardware you'd have to throw at the problem to defeat the hashing mechanism. Using a better hashing mechanism (and that means going beyond SHA-1, notice that's not even in NSA's Suite B anymore) is one way. There are also algorithms that let you pick how long you want it to take to hash something, that'd be directly addressing the 'crypto defeating speed creep' as a result of Moore's Law.

Marcin
  • 2,508
  • 1
  • 15
  • 14
4

If you are choosing encryption packages, I would recommend GPG or PGP over AES Crypt. GPG and PGP have received a lot of careful vetting from the cryptography community. As far as I know, AES Crypt has not. Therefore, I would trust GPG or PGP more highly than AES Crypt.

As for the number of times to iterate the password hashing, that's based upon how long you are willing to wait during encryption/decryption and how fast the hash function is. First, decide how long you are willing to wait: say, 100 msec. Then, choose a number of iterations that will cause the iterated-hashing process to take 100 msec on the computers you use. Done.

As @PulpSpy has accurately explained, if you iterate the hashing N times, then this increases the effective entropy of the secret by approximately lg(N) bits, where lg represents the base-2 logarithm. For instance, 8192 = 2^13 iterations corresponds to 13 more bits of effective entropy. It's not easy to translate this into an effective passphrase length. (Contrary to what @PulpSpy wrote, 13 bits of entropy does not translate directly into 2 characters, since not every character of every passphrase has the same amount of entropy. As @nealmcb correctly explains, the entropy per character of a passphrase varies, but is probably typically in the range of 1-2 bits per character, so this might correspond to an increase in passphrase length somewhere in the rough vicinity of 8 characters.)

Lastly, I will warn you that choosing your cryptographic key from a passphrase is generally regarded as risky practice. Human-chosen passphrases often do not provide sufficient entropy to protect against brute-force attacks on the encryption. Therefore, I do not recommend human-chosen passphrases for high-security measures (or, if you must use a human-chosen passphrase, use software to help you choose a high-entropy passphrase). The best cryptographic software tends to use true randomness for its crypto keys where ever possible, rather than human-chosen passphrases.

D.W.
  • 98,420
  • 30
  • 267
  • 572
  • 1
    Why specifically do you not recommend the aescrypt implementation? It does offer integrity checking. –  Jan 08 '11 at 10:33
  • 1
    Graham, in the average case, the security of a scheme is a monotonically increasing function of the number of smart people who've tried to break it. Since "aescrypt" returns 5,000 hits on google, and "gpg" returns 3,000,000, the conservative conclusion is that gpg is more secure. – user502 Jan 10 '11 at 19:44
  • @Graham, I don't know of any specific problem with aescrypt. However, crypto is tricky. GPG/PGP are well-audited. As far as I know, the aescrypt reference implementation is not. In my experience, crypto packages that are not well audited often have subtle security problems. (Even the well-audited ones sometimes have security problems, but the risk is lowest if you use a well-audited security package that has been vetted extensively by knowledgeable cryptographers and security experts. GPG/PGP fall into that category; aescrypt doesn't.) – D.W. Jan 12 '11 at 02:07
4

The number of iterations should be adjusted depending on your available power, so that processing time still fits under the hard constraint of limited user patience; you still want to make the number as high as possible.

With mathematics: if processing a password takes time T on your system (e.g. 1 second), but the attacker is ready to spend T' to crack a password (e.g. 3 weeks) and has access to machines which are, collectively, P times as powerful as yours, then the attacker has an advantage of a factor T'*P/T over you. The defence against that is the password entropy: if a password has entropy e bits, then the attacker is (on average) defeated if and only if T'*P/T < 2e-1. Assuming T = 1 second, T' = 3 weeks, and P = 100 (the attacker has five PC with some GPU, for a budget under 3000$), then the password must have entropy at least 27.4 bits. You probably want to have a bit of a security margin and go for 30 bits. You can, as an informed individual, choose a password which is stronger than that (there is a bit of a discussion on that subject in this question); but it seems quite optimistic to expect most users to do as well.

In the equations above, the number of iterations is used to alter T. While the appropriate number ultimately depends on your available power and average user patience, chances are that 8192 iterations way too little; based on your description of the password transformation process (repeated hashing -- this looks like a reduced version of PBKDF2), the number of iterations on a typical PC should be counted in millions.

Speaking of which, a custom password-processing technique should be a strong red warning: this is not good. It is much better to rely on proven standardized algorithms such as PBKDF2 or bcrypt (the latter should be preferred since it lowers the "processing power factor" P of an attacker who uses GPU -- see this answer for details).

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
2

It's not "necessary", but it does make cracking passwords harder. It's called "key strengthening" -- it's logically like adding two characters (uppper/lower num symbol) to the end of the password. It's quite common. For example, the WPA that protects your access point hashes the password 4000 times.

Robert David Graham
  • 3,883
  • 1
  • 15
  • 14