
I have a client who wants to hold personal information such as driving licences and insurance documents in order to verify that a user of the site is who they say they are and lives where they say they live (the site is a sort of brokerage).

We were looking at storing this information in an offsite solution such as Amazon S3. It would be encrypted before it leaves our server, then pulled down and decrypted when we need it. Is this enough? Are there extra levels of compliance I need to meet?

Forgive my ignorance with this; I'm by no means a security expert and just want to know if this is something we should even be considering.

petedermott
  • Compliance with who? Which regulations? In order to be compliant with my regulations, you need to share your root password. "Compliance" is a term that _ONLY_ has meaning for a specific authority. Is your client healthcare? Government? A third world gang? – MCW Jul 16 '14 at 11:54
  • You must absolutely point out the countries where the client resides and where all the expected users reside. In the EU that information would be considered sensitive personal information and is subject to very strict rules (it must be stored inside the EU, to start with, which limits the number of cloud contractors you can rely on). I would actually advise consulting a lawyer for those things. – Steve Dodier-Lazaro Jul 16 '14 at 21:40

3 Answers


From a purely technical perspective, if you encrypt it properly before you upload it, and it stays encrypted all the time it is in the cloud, then the data is very safe.

However, there are very probably extra levels of legal compliance you need to meet. This is "Personally Identifiable Information" and there are a lot of laws and regulations that apply to handling it. You need to take legal advice on what you need to do.

Note that you need to worry not only about the jurisdiction where your client is, but also the jurisdictions where the data is and the jurisdictions where the users are. (That alone might make using S3 more of a challenge, since you'll have no way of controlling whether the data you put in there is stored in Newark or Tokyo or some other place Amazon hasn't even told us about yet.)

Graham Hill
  • Thanks guys, this all makes sense. We are only going to be working in the UK at the moment, this could obviously change at a later date but it wouldn't be the case for the foreseeable future. – petedermott Jul 16 '14 at 10:26
  • Working from within the EU with a US-based provider is especially problematic because of the EU's more restrictive data privacy laws. You really should ask your company's attorney and/or data privacy officer. (Or get either one if you don't have them yet.) – Teetrinker Jul 16 '14 at 11:17
  • Try also to consult with Amazon on the matter; I'm sure a company spanning their cloud network across several continents has come across these questions before. –  Jul 16 '14 at 16:10
  • +1 to bringing in an expert. From the description it sounds like the OP wants to store a lot of data that the DPA and ICO may not consider necessary. – James Snell Jul 16 '14 at 20:51
  • Commented before I saw your answer. You summed it all up indeed! – Steve Dodier-Lazaro Jul 16 '14 at 21:41

Amazon S3 allows you to specify where to store your data:

http://docs.aws.amazon.com/AmazonS3/latest/dev/LocationSelection.html

"Objects stored in an AWS region never leave that region unless you explicitly transfer them to another region. For example, objects stored in the EU (Ireland) Region never leave it."
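As a rough sketch of what pinning a bucket to one region can look like with the AWS CLI: this assumes the CLI is installed and configured with credentials. The function names and the bucket name are hypothetical; eu-west-1 is the region code for EU (Ireland).

```shell
# Create a bucket whose objects are stored in the given region.
# Function name and arguments are illustrative, not part of any AWS tooling.
create_bucket_in_region() {
    bucket=$1
    region=$2
    aws s3api create-bucket \
        --bucket "$bucket" \
        --region "$region" \
        --create-bucket-configuration LocationConstraint="$region"
}

# Ask S3 which region an existing bucket lives in.
show_bucket_region() {
    aws s3api get-bucket-location --bucket "$1"
}

# Usage (hypothetical bucket name):
#   create_bucket_in_region example-identity-docs eu-west-1
#   show_bucket_region example-identity-docs
```

Whether a given region satisfies your legal obligations is still a question for whoever advises you on compliance; the command only controls where the bytes are stored.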

As for how secure is "enough", that is for you to decide. However, assuming that you do all the encryption before uploading to Amazon, you have full control over (and full responsibility for) how secure it is.

Do note that it is generally advisable to keep copies of important data in different locations to prevent any potential loss / disruption.

user2813274

As @Graham Hill pointed out in his answer, encrypt it properly before it goes up the wire.

Amazon does allow you to specify that you'd like them to encrypt your S3 objects, but as they admit in their documentation, you can (and should) encrypt your information before it gets to them. The encryption they add to your objects before/during save only potentially protects you against someone malicious accessing Amazon's servers: think of an attacker gaining access to Amazon's internal data centers. Even in this case it's not clear that you'd be protected, because an attacker with root access to a machine can easily extract the encryption keys.
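A minimal sketch of layering the two, assuming the file has already been encrypted locally and the AWS CLI is installed and configured. The function name, bucket, and object key are hypothetical; `--sse AES256` asks S3 to apply its own server-side encryption on top of yours.

```shell
# Upload an already-encrypted file and additionally request S3
# server-side encryption. The local encryption is what you rely on;
# the S3 layer is a bonus. Function name is illustrative only.
upload_encrypted() {
    encrypted_file=$1
    bucket=$2
    key=$3
    aws s3 cp "$encrypted_file" "s3://$bucket/$key" --sse AES256
}

# Usage (hypothetical names):
#   upload_encrypted things example-identity-docs some-random-object-name
```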

What Most (Security-Conscious) People Do

The best advice would be to encrypt your files using well-known and proven methods, most likely PGP/GPG encryption in this case. A pitfall of PGP encryption, however, is that it is simple to tell that a file is PGP-encrypted by inspecting its contents:

# encrypt myfile.jpg using PGP encryption to a new file called "things"
$ gpg --output things --encrypt --recipient me@myemail.com myfile.jpg
# what kind of file is "things"?
$ file things
things: GPG encrypted data

However, just because you know that it's a PGP-encrypted file doesn't mean that you know what it contains, especially if you give it a random filename.
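One way to combine the encryption step with a random filename, as a sketch only: the function name is made up for illustration, and it assumes `gpg` already has the recipient's public key. The name is 32 bytes from /dev/urandom rendered as 64 hex characters.

```shell
# Encrypt a file to a recipient and write the result under a random
# 64-character hex name. Prints the chosen name on success.
# Function name is illustrative, not a gpg feature.
encrypt_with_random_name() {
    src=$1
    recipient=$2
    # 32 random bytes, hex-encoded, with od's spaces and newlines removed
    name=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    gpg --output "$name" --encrypt --recipient "$recipient" "$src" \
        && printf '%s\n' "$name"
}

# Usage:
#   encrypt_with_random_name myfile.jpg me@myemail.com
```

You would need to record the mapping from the random name back to the original document somewhere on your own side, since the name itself carries no information.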

For the Truly Paranoid

An even better[1] way of doing things would be to create a file-backed container using TrueCrypt[2], which is for all intents and purposes a binary blob that doesn't really reveal what exactly it is. Plus, you can use TrueCrypt's hidden volume feature to gain yourself plausible deniability. Give it a fun name like this:

$ uuid | sha256sum - | cut -b 1-64
a42815e0a68efac65903cdbbbbf2875ee082d3e5f4f8ee831cbbe27606a8399f

And you'll have a binary blob with an incoherent, randomly-generated name which is a TrueCrypt volume (and possibly contains an additional hidden volume). Anyone who downloaded this file wouldn't be able to guess what the file even is, and if you use a sufficiently strong passphrase, you should be safe.

Want to be paranoid? Generate a ton of these volumes, each with its own name and its own sufficiently strong passphrase, and put your PGP/GPG encrypted files inside the volumes. The volumes will have fixed sizes, and their contents will change significantly as you modify the files inside them.


[1] Possibly.
[2] Scared about the security of TrueCrypt volumes? Read the security audit, and know that after reading the Phase I audit, Bruce Schneier still uses it.

Naftuli Kay
  • TrueCrypt is not completely audited yet. More importantly, a couple of months ago the TrueCrypt developer(s) removed the binary, put up a notice saying «Using TrueCrypt is not secure as it may contain unfixed security issues» and recommended using BitLocker. This has spawned lots of discussion in the security community, including suspicions that it is a warrant canary, that the NSA was behind the project, that they were sent a National Security Letter, and other theories. – Ángel Jul 16 '14 at 21:33
  • Regulatory requirements may disallow giving such data to Amazon, no matter how well they're encrypted. Jurisdiction is an important part here. – Peteris Jul 16 '14 at 21:37
  • @Ángel, this is true, but nothing has been substantiated yet. Reading the audit reveals that there are some things that don't help much (like file-based keys as an augment to the passphrase) but the overall security makes sense and is based on cryptographically sound algorithms. On Linux, TrueCrypt simply defers to dm-crypt for many operations. – Naftuli Kay Jul 16 '14 at 21:44
  • @Peteris This is also true, therefore follow [another answer's advice](http://security.stackexchange.com/a/63263/2374) of limiting _where_ your data goes. – Naftuli Kay Jul 16 '14 at 21:44
  • @naftuli-tzvi-kay, IMHO that shutdown by TC developers is enough reason to avoid it for new developments. You may still trust TrueCrypt, however mentioning the Phase I audit but not the later controversy seemed incomplete. ☺ – Ángel Jul 16 '14 at 22:20
  • If you'd like to contribute an edit, feel free :) – Naftuli Kay Jul 16 '14 at 22:32
  • Barring a complete breach of Amazon S3, anyone who's able to download files from your S3 bucket likely already knows what they're downloading. The random naming is good for its own reason (not accidentally naming your files, e.g., "customer-account-number.txt"), but it doesn't provide much of its own, it just prevents you from making other mistakes around the filename. – Chris Hayes Jul 17 '14 at 03:45
  • No. Do not put a TrueCrypt volume on S3. At least not if it's something you're ever going to modify; that is *not* secure. See http://sockpuppet.org/blog/2014/04/30/you-dont-want-xts/ – derobert Jul 22 '14 at 21:01