I have been tasked with designing a system to store sensitive data securely (it should be HIPAA compliant in the future). It is just a draft and will not be used in production in the foreseeable future. I have a prototype inspired by TrueVault and want to know whether it has any semantic security flaws or violates any security principles.
The system consists of four subsystems:
Encryptor/Decryptor (Cryptor) is responsible for random key/iv generation and for encrypting and decrypting binary data with AES-256-GCM (OpenSSL implementation). This server performs only in-memory operations, stores the results in the three other subsystems, and is connected to them via IPsec or an SSL VPN. The three other subsystems have no direct connection to each other. External clients use only the Cryptor public interface and are not directly connected to the other subsystems.
Public interface:
- dump(client_binary_data) -> external_uuid
- load(external_uuid) -> client_binary_data
DataStore stores a [data_store_uuid, encrypted_data, auth_tag] triplet.
- dump(encrypted_data, auth_tag) -> data_store_uuid
- load(data_store_uuid) -> [encrypted_data, auth_tag]
KeyStore stores a [key_store_uuid, key, iv] triplet.
- dump(key, iv) -> key_store_uuid
- load(key_store_uuid) -> [key, iv]
MapStore stores the mapping between a DataStore triplet, a KeyStore triplet and the external_uuid: [external_uuid, data_store_uuid, key_store_uuid].
- dump(external_uuid, data_store_uuid, key_store_uuid)
- load(external_uuid) -> [data_store_uuid, key_store_uuid]
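To make the three storage interfaces concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the dump/load contracts listed above: the dict-backed storage, the use of `uuid4` for the store-side identifiers, and the string type of the uuids are my assumptions, not part of the prototype.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DataStore:
    """Holds [data_store_uuid, encrypted_data, auth_tag] triplets."""
    _rows: dict = field(default_factory=dict)

    def dump(self, encrypted_data: bytes, auth_tag: bytes) -> str:
        data_store_uuid = str(uuid.uuid4())  # assumption: random v4 uuid
        self._rows[data_store_uuid] = (encrypted_data, auth_tag)
        return data_store_uuid

    def load(self, data_store_uuid: str) -> tuple:
        return self._rows[data_store_uuid]


@dataclass
class KeyStore:
    """Holds [key_store_uuid, key, iv] triplets."""
    _rows: dict = field(default_factory=dict)

    def dump(self, key: bytes, iv: bytes) -> str:
        key_store_uuid = str(uuid.uuid4())  # assumption: random v4 uuid
        self._rows[key_store_uuid] = (key, iv)
        return key_store_uuid

    def load(self, key_store_uuid: str) -> tuple:
        return self._rows[key_store_uuid]


@dataclass
class MapStore:
    """Holds [external_uuid, data_store_uuid, key_store_uuid] triplets."""
    _rows: dict = field(default_factory=dict)

    def dump(self, external_uuid: str, data_store_uuid: str,
             key_store_uuid: str) -> None:
        self._rows[external_uuid] = (data_store_uuid, key_store_uuid)

    def load(self, external_uuid: str) -> tuple:
        return self._rows[external_uuid]
```

None of the three classes references another, which mirrors the requirement that the stores have no direct connection to each other.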
Workflow:
- Cryptor.dump(client_binary_data)
  - generate external_uuid
  - generate random key
  - generate random iv
  - use external_uuid as AAD for AES-256-GCM
  - encrypt client_binary_data -> encrypted_data
  - derive auth_tag
  - KeyStore.dump(key, iv) -> key_store_uuid
  - DataStore.dump(encrypted_data, auth_tag) -> data_store_uuid
  - MapStore.dump(external_uuid, data_store_uuid, key_store_uuid)
  - Return external_uuid to the client
- Cryptor.load(external_uuid)
  - MapStore.load(external_uuid) -> [data_store_uuid, key_store_uuid]
  - KeyStore.load(key_store_uuid) -> [key, iv]
  - DataStore.load(data_store_uuid) -> [encrypted_data, auth_tag]
  - Decrypt data and return it to the client
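The workflow above can be sketched as follows. This is a rough illustration, not the prototype itself: it assumes the Python `cryptography` package (which is backed by OpenSSL) instead of calling OpenSSL directly, a 96-bit iv, and plain dicts standing in for the three remote subsystems. `AESGCM.encrypt` returns the ciphertext with the 16-byte tag appended, so the sketch splits the two to match the [encrypted_data, auth_tag] layout.

```python
import os
import uuid

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

TAG_LEN = 16  # AESGCM appends a 16-byte authentication tag
IV_LEN = 12   # assumption: 96-bit iv


class Cryptor:
    def __init__(self):
        # Plain dicts stand in for the three remote subsystems.
        self.key_store = {}
        self.data_store = {}
        self.map_store = {}

    def dump(self, client_binary_data: bytes) -> str:
        external_uuid = str(uuid.uuid4())
        key = AESGCM.generate_key(bit_length=256)
        iv = os.urandom(IV_LEN)
        aad = external_uuid.encode()  # external_uuid is used as AAD

        sealed = AESGCM(key).encrypt(iv, client_binary_data, aad)
        encrypted_data, auth_tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]

        key_store_uuid = str(uuid.uuid4())
        self.key_store[key_store_uuid] = (key, iv)
        data_store_uuid = str(uuid.uuid4())
        self.data_store[data_store_uuid] = (encrypted_data, auth_tag)
        self.map_store[external_uuid] = (data_store_uuid, key_store_uuid)
        return external_uuid

    def load(self, external_uuid: str) -> bytes:
        data_store_uuid, key_store_uuid = self.map_store[external_uuid]
        key, iv = self.key_store[key_store_uuid]
        encrypted_data, auth_tag = self.data_store[data_store_uuid]
        # Raises cryptography.exceptions.InvalidTag if the ciphertext,
        # the tag, or the AAD (external_uuid) does not match.
        return AESGCM(key).decrypt(
            iv, encrypted_data + auth_tag, external_uuid.encode()
        )
```

With external_uuid as AAD, a MapStore row that is re-pointed at another record's data and key fails decryption, because the tag was computed over a different external_uuid.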
The main questions I have doubts about:
- Is there a better, more common, or more trusted way to encrypt and store data? It should be as fast as possible. Blobs of up to 50 MB are expected.
- Should the iv be stored in the KeyStore subsystem or in the DataStore subsystem? Is there any difference between these two approaches? NIST says here (page 16) that the iv is part of the message. I think the term "message" is closer to the information stored in the DataStore than to what is stored in the KeyStore.
- Is it safe to use external_uuid as AAD in this scheme, or should I add another random uuid to MapStore for that purpose?
- Should I encrypt the keys in KeyStore with the client's public key or with some master key? This approach seems to be used in the Oracle TDE scheme. I think encryption with the client's public key would make it impossible to restore the data even if all three subsystems are stolen.
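To make the last question concrete, the master-key variant could look roughly like this. It is only a sketch, assuming the Python `cryptography` package and AES key wrap (RFC 3394); `master_key`, `wrap_for_key_store` and `unwrap_from_key_store` are hypothetical names, and where the master key would actually live is not decided.

```python
import os

from cryptography.hazmat.primitives.keywrap import (
    InvalidUnwrap,
    aes_key_unwrap,
    aes_key_wrap,
)


def wrap_for_key_store(master_key: bytes, key: bytes) -> bytes:
    """Encrypt a per-record key before it is sent to KeyStore."""
    return aes_key_wrap(master_key, key)


def unwrap_from_key_store(master_key: bytes, wrapped_key: bytes) -> bytes:
    """Recover the per-record key; raises InvalidUnwrap on a wrong
    master key or a modified wrapped key."""
    return aes_key_unwrap(master_key, wrapped_key)


# Hypothetical master key, known to Cryptor only and never stored
# in KeyStore, DataStore or MapStore.
master_key = os.urandom(32)
key = os.urandom(32)  # per-record AES-256 key

wrapped_key = wrap_for_key_store(master_key, key)
recovered = unwrap_from_key_store(master_key, wrapped_key)
```

In this variant KeyStore would hold [key_store_uuid, wrapped_key, iv], so a stolen KeyStore alone would not reveal the keys. The client-public-key variant would replace the symmetric wrap with asymmetric encryption and is not shown here.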