I have written some software that encrypts and decrypts files using AES. Is it safe to write a hash of the password to the file so that the program can check whether a password is correct? I am worried because, without the hash, for a file that takes ten seconds to decrypt, one million attempts at cracking the encryption take ~115 days, whereas with the hash, regardless of the size of the file, one million attempts take roughly 6 days even if one attempt takes 0.5 seconds (much more than in reality). This is a huge difference, hence my concern: is it less secure to write a hash of an encrypted file's password to the file?
If you just add the hash of the password to your file, you are vulnerable to rainbow table attacks. Consider storing the password hash with a salt. Then you might be as safe as with encryption alone (an attacker could still decrypt the first bytes and look for the magic numbers of common file formats). I'm no expert on this topic, but I think you could also use an HMAC for this. – Sirac Jul 05 '15 at 17:53
@Sirac The encryption process does involve making an Rfc2898 hash with a salt. The hash I will write to the file will also be salted. Why would the magic numbers be important? – DividedByZero Jul 05 '15 at 18:20
See Thomas' answer for an explanation about the magic numbers. – Sirac Jul 05 '15 at 18:28
1 Answer
For password-based encryption, you need to:
- transform the password into a key suitable for the encryption algorithm (a process called key derivation);
- use that key to encrypt the file.
Assuming that everything about the encryption phase was done properly, and that the algorithm used is not weak, the most direct attack route is the password: the attacker will try potential passwords, applying key derivation and then decryption, to see if the result "makes sense". The important point here is that the attacker does not need to decrypt the whole file to see whether it looks like the expected cleartext; for instance, if the attacker infers that the file is a JPEG picture, then he just needs to decrypt the first four bytes to see if these yield the expected starter sequence for such a file (FF D8 FF E0). Even if the honest user wants the complete file, and thus will need to decrypt it whole (implying I/O for reading the file and writing it back), the attacker is in no way constrained to do the same during his attack.
Therefore, you MUST NOT assume that the whole-file decryption speed makes the attack slower or faster. In that sense, providing a "fast" test for password correctness will not make things easier for the attacker either.
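To make the point concrete, here is a minimal sketch of the attacker's loop in Python. The names `find_password`, `derive_key` and `decrypt_prefix` are hypothetical, and the two callables stand in for whatever key derivation and cipher the file format really uses; only the shape of the loop matters. Each guess costs one key derivation plus the decryption of a few bytes, whatever the size of the file.

```python
from typing import Callable, Iterable, Optional

# Starter sequence of a JPEG/JFIF file, as mentioned above.
JPEG_MAGIC = bytes.fromhex("FFD8FFE0")


def find_password(
    candidates: Iterable[str],
    derive_key: Callable[[str], bytes],              # hypothetical: password -> key
    decrypt_prefix: Callable[[bytes, bytes], bytes],  # hypothetical: (key, data) -> plaintext
    ciphertext_prefix: bytes,
) -> Optional[str]:
    """Return the first candidate whose decrypted prefix looks like a JPEG."""
    for guess in candidates:
        key = derive_key(guess)
        # Only the first few bytes are decrypted; the rest of the file is never read.
        if decrypt_prefix(key, ciphertext_prefix).startswith(JPEG_MAGIC):
            return guess
    return None
```

The per-guess cost is dominated by `derive_key`, which is why the strength of the key derivation step matters and the whole-file decryption time does not.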
However...
You must take care to do things correctly. First and foremost, the easiest way to design a secure format for password-based file encryption is to use an existing format for password-based file encryption, that has already been designed, analysed, implemented and tested. A reasonable candidate is then OpenPGP, and (in particular) its open-source implementation GnuPG.
If you still insist on doing your own design and then implementation, then you must understand that the core of your security will reside in the key derivation process, also called password hashing. There is a lot of theory and practice on the subject, so I suggest you start by reading this answer.
Now, if you got your password hashing right, with all the iterations and the random salts, then you could envision a process where the password is derived into a key K of, say, 256 bits (32 bytes). Then split that key into two halves K1 and K2 (128 bits each). Add a copy of K1 to the encrypted file header, to test for password correctness, and use K2 (not K1!) for actually encrypting the file. This is safe as long as the password-based key derivation function is secure as a key derivation function, since knowing K1 then reveals nothing about K2.
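The K1/K2 split could be sketched as follows in Python, using PBKDF2 (the function the question refers to as Rfc2898) from the standard library. The function names, the 16-byte salt, the header layout (salt followed by K1) and the iteration count are all assumptions made for illustration, not a recommended file format.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware
SALT_LEN = 16         # assumed salt size in bytes


def derive_keys(password: str, salt: bytes, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Derive K (32 bytes) with PBKDF2-HMAC-SHA256 and split it into (K1, K2)."""
    k = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=32)
    return k[:16], k[16:]


def make_header(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (header, K2): the header stores salt + K1, K2 is the encryption key."""
    salt = os.urandom(SALT_LEN)
    k1, k2 = derive_keys(password, salt, iterations)
    return salt + k1, k2


def check_password(password: str, header: bytes, iterations: int = ITERATIONS) -> Optional[bytes]:
    """Return K2 if the password matches the stored K1, otherwise None."""
    salt, stored_k1 = header[:SALT_LEN], header[SALT_LEN:SALT_LEN + 16]
    k1, k2 = derive_keys(password, salt, iterations)
    # Constant-time comparison of the verifier half.
    return k2 if hmac.compare_digest(k1, stored_k1) else None
```

Only the salt and K1 ever reach the file; K2 stays in memory and is handed to the AES code.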
Of course, if you need encryption, then (presumably) there are attackers, and attackers are, on the whole, known to be naughty. They don't play by the rules. Most notably, if they can see encrypted files, then chances are that they can modify encrypted files. To reliably detect such modifications, you will need a MAC, and then things become complex. Potential for catastrophic design error is high.
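As a rough illustration of what detecting modifications involves, here is a sketch in Python that appends an HMAC-SHA256 tag computed over the header and the ciphertext, and checks it before anything is decrypted. The function names are hypothetical, and the sketch assumes a fixed-length header and a `mac_key` that is independent of the encryption key (for instance, extra output bytes from the key derivation). It covers only the tag computation and check, not a complete design.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def append_tag(mac_key: bytes, header: bytes, ciphertext: bytes) -> bytes:
    """Return header + ciphertext + tag, where the tag covers everything before it."""
    body = header + ciphertext
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()
    return body + tag


def verify_and_strip(mac_key: bytes, blob: bytes) -> bytes:
    """Check the trailing tag and return header + ciphertext, or raise ValueError."""
    if len(blob) < TAG_LEN:
        raise ValueError("file is too short to contain a tag")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("file was modified or the key is wrong")
    return body
```

Verifying the tag before decrypting means that a tampered file is rejected without any decrypted data being produced.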