Practical steps for encrypting an encryption key for data stored in MySQL

Question

So I'm working on my first project where user data is stored encrypted, using the user's password as (part of) the key. I have read many similar things about best practices, but the details seem to be glossed over for someone like me who isn't as familiar with the topic.

I have read that it's best to:

1. Have a strong hash (bcrypt) of the user's password for validation. Stored in DB.
2. Encrypt private data with AES using a random 256-bit key ("KEY1"). Not stored.
3. Encrypt KEY1 with AES ("Enc_KEY1") using a strong hash derived from the user's password as "KEY2". Different salt than Step 1. Store Enc_KEY1 in DB. Do not store KEY2 anywhere.

There are two issues I'm not able to find answers about:

The string length of a bcrypt hash isn't compatible as an AES key because of length and illegal characters such as periods and dollar signs. How does one deal with this problem?
When it comes to decrypting Enc_KEY1, I have to first generate KEY2 from a hash (bcrypt) of the user's password... how exactly do I recreate the hash? Am I supposed to store the salt? It seems like bcrypt wants to create a random salt each time... but unless the salt is stored, there's no way to recreate the same hash... Is it bad to store the salt? What am I missing here?

Thanks for the help.

score 2 · Accepted Answer · answered Aug 04 '15 at 06:50

The string length of a bcrypt hash isn't compatible as an AES key because of length and illegal characters such as periods and dollar signs. How does one deal with this problem?

Base64 encoding is your friend. Base64-encoded characters are all legal characters with AES.

When it comes to decrypting Enc_KEY1, I have to first generate KEY2 from a hash (bcrypt) of the user's password... how exactly do I recreate the hash? Am I supposed to store the salt? It seems like bcrypt wants to create a random salt each time... but unless the salt is stored, there's no way to recreate the same hash... Is it bad to store the salt? What am I missing here?

Yes, you need to store the salt (and optionally pepper if you are using AES to encrypt the password hash). Without it there is no way to re-make the hash.
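To make the point about stored salts concrete, here is a minimal sketch. It assumes PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in for bcrypt, and the row fields (`auth_salt`, `auth_hash`, `key2_salt`) and function names are hypothetical. It keeps one salt for the validation hash and a different salt for KEY2, stores both salts, and never stores KEY2.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def kdf(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 is a stand-in for bcrypt here; it returns 32 raw bytes.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


def sign_up(password: str) -> dict:
    """Return the values that would be stored in the user's row (hypothetical fields)."""
    auth_salt, key2_salt = os.urandom(16), os.urandom(16)
    return {
        "auth_salt": auth_salt,
        "auth_hash": kdf(password, auth_salt),  # used only to check the password
        "key2_salt": key2_salt,                 # the salt is stored, KEY2 is not
    }


def log_in(password: str, row: dict) -> Optional[bytes]:
    """Return KEY2 if the password checks out, otherwise None."""
    candidate = kdf(password, row["auth_salt"])
    if not hmac.compare_digest(candidate, row["auth_hash"]):
        return None
    return kdf(password, row["key2_salt"])


row = sign_up("correct horse battery staple")
key2_first = log_in("correct horse battery staple", row)
key2_again = log_in("correct horse battery staple", row)
```

Because both salts are read back from the row, `key2_first` and `key2_again` are the same 32 bytes, and they differ from the stored validation hash because the salts differ.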

Can you elaborate on the pepper? I'm not planning on encrypting the user's password hash... seems like overkill to me, the newb. — thequeue, Aug 04 '15 at 14:05

score 1 · Answer 2 · edited Oct 07 '21 at 06:58

If you are looking for details,

RFC 4880 describes all the details for implementing OpenPGP, which as far as I can tell is the most common method of encrypting data with one key (KEY1), encrypting that key with a second key (KEY2) to generate an encrypted key ("Enc_KEY1"), storing both the encrypted data and the encrypted key. Later only people who have (or can guess) the second key (KEY2) are able to decrypt the first key (KEY1) and recover the plaintext.

Because there is no need for the salt to be secret, the most popular method for storing hashes of pass phrases -- the Modular Crypt Format -- always stores the salt. There seem to be lots of people who talk about "not storing the salt".

You might find Jacco's essay educational: "What is a salt? Why is salting a hash useful? There is no need for the salt to be secret. Why does the salt have to be random?".

Taking a pass-phrase typed by a human and converting that into an encryption key is called "password-based key derivation". Functions that do that, such as bcrypt, are called key derivation functions.

As you have already discovered, most modern bcrypt implementations have a "hash" function that, when given a plaintext pass phrase, pulls a fresh random number, does a bunch of work with that random number and the pass phrase to generate a hash value, and spits out a string that is the concatenation of: "$2b$" (which indicates "bcrypt"), a cost parameter, another "$", a 22-character salt string (the base-64 encoded random number), and a 31-character checksum (the base-64 encoded hash value). Every time you call that "hash" function, even with exactly the same plaintext pass phrase, it pulls a fresh new random number -- so the salt string and the checksum are almost certainly going to be different.
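As a small illustration of that layout, the sketch below splits a bcrypt string into its fields using plain string handling. The function name `split_bcrypt` and the sample string are assumptions for illustration only; the sample is there to show field positions, not as a hash of any particular pass phrase.

```python
from typing import NamedTuple


class BcryptFields(NamedTuple):
    version: str
    cost: int
    salt: str      # 22 characters, stored in the clear
    checksum: str  # 31 characters


def split_bcrypt(mcf: str) -> BcryptFields:
    # Layout: $<version>$<cost>$<22-char salt><31-char checksum>
    parts = mcf.split("$")
    if len(parts) != 4 or parts[0] != "" or len(parts[3]) != 53:
        raise ValueError("not a bcrypt-style string")
    _, version, cost, rest = parts
    return BcryptFields(version, int(cost), rest[:22], rest[22:])


# Sample string in the bcrypt layout, used only to show where the fields sit.
EXAMPLE = "$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy"
fields = split_bcrypt(EXAMPLE)
```

The salt is recovered directly from the stored string, which is why a "verify" function can recompute the same hash without any separate salt column.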

Concatenating all those values together gives a long string that is convenient for storing all at once in a password-verification database. Those last 31 characters represent 23 bytes (184 bits) of data, encoded in bcrypt's own variant of base-64, which uses a different alphabet from the standard one. Encryption keys for the raw, underlying AES128 are a series of 128 bits (16 bytes); those bytes can be any byte (including 0x00); there are no "illegal characters". However, some implementations of AES128, in order to be "helpful", want those 128 bits in the form of 32 hexadecimal digits -- see https://stackoverflow.com/questions/14368374/how-to-turn-64-character-string-into-key-for-256-aes-encryption for some details. Feel free to ask on StackOverflow if you need help converting from a base64 string to a string of hexadecimal digits. Also, other implementations of AES128, in order to be helpful, include a key-derivation function that converts any arbitrary plain-ASCII string into the 128 bits needed for the raw, underlying AES128 algorithm.
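For the hex-versus-base64 question, a short sketch using only Python's standard library may help. The helper name `b64_to_hex` is hypothetical, and the conversion shown is for standard base-64 text, not for the alphabet bcrypt uses inside its hash strings.

```python
import base64
import secrets


def b64_to_hex(text: str) -> str:
    """Convert standard base-64 text to the equivalent hexadecimal digits."""
    return base64.b64decode(text).hex()


key = secrets.token_bytes(16)  # 128 random bits; any byte value may occur
hex_form = key.hex()                               # 32 hexadecimal digits
b64_form = base64.b64encode(key).decode("ascii")   # 24 characters with padding

# All three forms describe the same 16 bytes.
same_key_from_hex = bytes.fromhex(hex_form)
same_key_from_b64 = base64.b64decode(b64_form)
```

Which form to pass depends on what the particular AES library expects; the underlying key is the 16 raw bytes in every case.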

Because most library maintainers want to help people Do the Right Thing, they emphasize the above "hash" function and a "verify" function, which is everything most programmers need to properly implement pass phrase hashing. For your application, however, you also need a function that takes a stored salt value and a stored iteration count and a pass phrase and gives you the actual hash value -- many implementations of bcrypt include such a function, including jBCrypt, bcrypt-nodejs, OpenBSD bcrypt, Openwall bcrypt, etc., generally as an internal function used to build up the above essential two functions. The function that you need is generally de-emphasized in the documentation.
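The shape of such a function can be sketched as follows. This is not any bcrypt library's API: it assumes PBKDF2-HMAC-SHA256 from Python's standard library in place of bcrypt, and the stored parameter layout (`iterations$salt_hex`) and the names `make_params` and `derive_key2` are made up for illustration.

```python
import hashlib
import os


def make_params(iterations: int = 100_000) -> str:
    """Build a parameter string with a fresh random salt (hypothetical layout)."""
    return f"{iterations}${os.urandom(16).hex()}"


def derive_key2(passphrase: str, stored_params: str) -> bytes:
    """Recompute the 32-byte KEY2 from the pass phrase and the stored parameters."""
    iterations_text, salt_hex = stored_params.split("$")
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_text),
        dklen=32,  # 256 bits of raw key material
    )


params = make_params()  # stored in the DB next to Enc_KEY1
key2 = derive_key2("a long pass phrase", params)
key2_later = derive_key2("a long pass phrase", params)  # same bytes on a later login
```

Storing the iteration count alongside the salt means the work factor can be raised for new users without breaking key derivation for existing rows.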

You may also be interested in "Deriving Keys for Symmetric Encryption and Authentication" and "What's the most secure way to derive a key from a password repeatably?"

Your application sounds like it's using the same key for both authentication and encryption -- you may be interested to know that many people recommend using completely independent keys for these two purposes: see "Why should one not use the same asymmetric key for encryption as they do for signing?"

Could you clarify what kind of timing attack vulnerabilities you are referring to in the "verify" function, given that it is comparing hashes, rather than passwords? — Shaun the Sheep, Aug 11 '15 at 16:11
@LukeTaylor: You are right. I was mis-applying the advice from ["A Lesson In Timing Attacks (or, Don’t use MessageDigest.isEquals)"](http://codahale.com/a-lesson-in-timing-attacks/). That essay says comparing hashes is vulnerable to a timing attack, but I'm pretty sure you are right that it doesn't apply to bcrypt-hashed passwords. — David Cary, Aug 11 '15 at 18:30
