I have been tasked with designing a system to store sensitive data securely (it should be HIPAA compliant in the future). It's just a draft and will not be used in production in the foreseeable future. I have a prototype inspired by TrueVault and want to know whether it has any semantic security gaps or violates any security principles.

So the system consists of 4 subsystems:

Encryptor/Decryptor (Cryptor) is responsible for random key/iv generation and for encrypting and decrypting binary data with the AES-256-GCM algorithm (OpenSSL implementation). This server performs only in-memory operations, stores the results in the 3 other subsystems, and is connected to them via IPsec or SSL VPN. The three other subsystems have no direct connection to each other. An external client uses only the Cryptor's public interface and is not directly connected to the other subsystems.

Public interface:

  • dump(client_binary_data) -> external_uuid
  • load(external_uuid) -> client_binary_data

DataStore stores [data_store_uuid, encrypted_data, auth_tag] triplet.

  • dump(encrypted_data, auth_tag) -> data_store_uuid
  • load(data_store_uuid) -> [encrypted_data, auth_tag]

KeyStore stores [key_store_uuid, key, iv] triplet.

  • dump(key, iv) -> key_store_uuid
  • load(key_store_uuid) -> [key, iv]

MapStore stores the mapping between a DataStore triplet, a KeyStore triplet and the external_uuid: [external_uuid, data_store_uuid, key_store_uuid].

  • dump(external_uuid, data_store_uuid, key_store_uuid)
  • load(external_uuid) -> [data_store_uuid, key_store_uuid]

Workflow:

  • Cryptor.dump(binary)
    1. generate external_uuid
    2. generate random key
    3. generate random iv
    4. use external_uuid as AAD for AES-256-GCM
    5. encrypt client_binary_data -> encrypted_data
    6. derive auth_tag
    7. KeyStore.dump(key, iv) -> key_store_uuid
    8. DataStore.dump(encrypted_data, auth_tag) -> data_store_uuid
    9. MapStore.dump(external_uuid, data_store_uuid, key_store_uuid)
    10. Return external_uuid to the client
  • Cryptor.load(external_uuid)
    1. MapStore.load(external_uuid) -> [data_store_uuid, key_store_uuid]
    2. KeyStore.load(key_store_uuid) -> [key, iv]
    3. DataStore.load(data_store_uuid) -> [encrypted_data, auth_tag]
    4. Decrypt data and return it to client
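
The workflow above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the three stores are in-memory dictionaries rather than separate servers, and `ToyAead` is a hypothetical standard-library stand-in (HMAC-SHA256 keystream plus HMAC tag) used so the sketch runs without extra packages. It is not AES-256-GCM and must not be used to protect real data; a real Cryptor would call AES-256-GCM from a vetted library at that point.

```python
import hashlib
import hmac
import secrets
import uuid


class ToyAead:
    """Stand-in for AES-256-GCM (assumption, NOT for real data)."""

    @staticmethod
    def _keystream(key, iv, length):
        out = bytearray()
        counter = 0
        while len(out) < length:
            block = b"enc" + iv + counter.to_bytes(8, "big")
            out += hmac.new(key, block, hashlib.sha256).digest()
            counter += 1
        return bytes(out[:length])

    @staticmethod
    def _tag(key, iv, aad, ciphertext):
        # The tag covers the AAD, so changing the AAD invalidates it.
        msg = b"mac" + iv + len(aad).to_bytes(8, "big") + aad + ciphertext
        return hmac.new(key, msg, hashlib.sha256).digest()[:16]

    def encrypt(self, key, iv, plaintext, aad):
        stream = self._keystream(key, iv, len(plaintext))
        ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
        return ciphertext, self._tag(key, iv, aad, ciphertext)

    def decrypt(self, key, iv, ciphertext, tag, aad):
        if not hmac.compare_digest(tag, self._tag(key, iv, aad, ciphertext)):
            raise ValueError("authentication failed")
        stream = self._keystream(key, iv, len(ciphertext))
        return bytes(c ^ s for c, s in zip(ciphertext, stream))


class Store:
    """In-memory stand-in for DataStore or KeyStore."""

    def __init__(self):
        self._rows = {}

    def dump(self, *fields):
        row_id = str(uuid.uuid4())
        self._rows[row_id] = fields
        return row_id

    def load(self, row_id):
        return self._rows[row_id]


class MapStore:
    """In-memory stand-in for the external_uuid mapping."""

    def __init__(self):
        self._rows = {}

    def dump(self, external_uuid, data_store_uuid, key_store_uuid):
        self._rows[external_uuid] = (data_store_uuid, key_store_uuid)

    def load(self, external_uuid):
        return self._rows[external_uuid]


class Cryptor:
    def __init__(self, aead, key_store, data_store, map_store):
        self.aead = aead
        self.key_store = key_store
        self.data_store = data_store
        self.map_store = map_store

    def dump(self, client_binary_data):
        external_uuid = str(uuid.uuid4())
        key = secrets.token_bytes(32)  # 256-bit key
        iv = secrets.token_bytes(12)   # 96-bit iv, the usual GCM size
        aad = external_uuid.encode()   # external_uuid is the AAD
        encrypted_data, auth_tag = self.aead.encrypt(
            key, iv, client_binary_data, aad)
        key_store_uuid = self.key_store.dump(key, iv)
        data_store_uuid = self.data_store.dump(encrypted_data, auth_tag)
        self.map_store.dump(external_uuid, data_store_uuid, key_store_uuid)
        return external_uuid

    def load(self, external_uuid):
        data_store_uuid, key_store_uuid = self.map_store.load(external_uuid)
        key, iv = self.key_store.load(key_store_uuid)
        encrypted_data, auth_tag = self.data_store.load(data_store_uuid)
        return self.aead.decrypt(
            key, iv, encrypted_data, auth_tag, external_uuid.encode())


cryptor = Cryptor(ToyAead(), Store(), Store(), MapStore())
record_id = cryptor.dump(b"patient record")
assert cryptor.load(record_id) == b"patient record"
```

Because external_uuid is authenticated as AAD, remapping one external_uuid to another record's DataStore and KeyStore entries makes decryption fail in this sketch instead of returning the other record.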

The main questions I'm unsure about:

  1. Is there a better, more common, or more trusted way to encrypt and store data? It should be as fast as possible; blobs of up to 50 MB are expected.
  2. Should the iv be stored in the KeyStore subsystem or in the DataStore subsystem? Is there any difference between the two approaches? NIST says here (page 16) that the iv is part of the message. I think the term "message" is closer to the information stored in the DataStore than to what is stored in the KeyStore.
  3. Is it safe to use external_uuid as AAD in this scheme, or should I add another random UUID to the MapStore for that purpose?
  4. Should I encrypt the keys in the KeyStore with the client's public key or with some master key? An approach like this seems to be used in the Oracle TDE scheme. I think encrypting with the client's public key would make it impossible to restore the data even if all three subsystems are stolen.

1 Answer

You're actually best off using an existing product. There's plenty of risk in rolling your own crypto storage system: there are so many ways to make an error in security or data integrity that it's not funny. One commercial product off the top of my head is StorageSecure by SafeNet. They specialize in this stuff. There are also quite a few academic and open source projects you can draw on if you insist on homebrew.

Fabric Project
http://www.cs.cornell.edu/Projects/fabric/
Has a secure programming language, distributed architecture, protection of computations, and protection of data. Free IIRC.

Tahoe-LAFS
https://tahoe-lafs.org/trac/tahoe-lafs
Distributed, fault-tolerant, encrypted storage architecture that allows clients to ensure the security of data despite compromised storage nodes. Free.

Securing data in storage: A review of current research
http://arxiv.org/pdf/cs/0409034.pdf
A whole bunch of schemes and comparison of protections.

Encrypted storage of medical data on a grid
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.163.9593&rep=rep1&type=pdf
Right up your alley, eh?

I'm leaving off the various full system, file system, and file level encryption tools, as I figure you know of them. However, I will mention that a poor man's solution to app-level data encryption is to give each app and/or classification category an encrypted volume a la eCryptfs or TrueCrypt, and then restrict the app to that partition using permissions, mandatory access controls, etc. If you know how to use the tool and read/write to a file, you have encrypted storage. You also know it will work reliably, as each tool will have been battle tested plenty.
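
As a rough sketch of that last point, assuming the encrypted volume is already mounted and the operating system handles encryption transparently, the application code is just ordinary file I/O. The helper names and the file name are hypothetical, and a temporary directory stands in for the real mount point.

```python
import os
import tempfile


def write_blob(volume_root, name, data):
    """Write data to a new owner-only file under volume_root."""
    path = os.path.join(volume_root, name)
    # O_EXCL refuses to overwrite an existing file; 0o600 requests
    # read/write access for the owning user only (POSIX permissions).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


def read_blob(volume_root, name):
    """Read a blob back; the volume layer decrypts transparently."""
    with open(os.path.join(volume_root, name), "rb") as handle:
        return handle.read()


# Assumption: in real use volume_root would be the mount point of the
# encrypted volume; a temporary directory is used here so the sketch runs.
with tempfile.TemporaryDirectory() as volume_root:
    write_blob(volume_root, "record-0001.bin", b"patient record")
    assert read_blob(volume_root, "record-0001.bin") == b"patient record"
```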

Nick P