3

One of the things that has always bothered me about simple database data encryption: If the server is compromised, the database is effectively compromised. The attacker can use the same code as the app to query out the data as desired. A simple review of the app code will show where/how the key is stored and what the database connection parameters are.

Of course, encrypting the data can be helpful when storing off-server backups, etc., but for this purpose the backup files could be encrypted entirely, instead of fields, saving some headaches with queries, etc.

Generally, making sure the server is secure is the best practice. But, what if the data must be protected from everyone, sys-admins included (much like the way we treat user passwords, through hashing)?

Ultimately, I'm wondering, how can an encryption key be protected, so that it is not accessible by anyone without the secret, even the application code. The only approach I can think of is the key must be provided by the user at run-time (it isn't persisted on the host machine, period).

I imagine the first issue is that if a user has access to the server, there is probably a way that the secret could be sniffed at the point of entry, or if the key is stored in memory by the app (e.g. in a session variable) this could probably be exploited as well.

What are the other (likely glaring) problems with this approach? Is this just reinventing the wheel? If so, what is the best practice when data security is critical, even from super-user roles?

This question may be somewhat similar to: How to login and encrypt data with the same password/key, except that I'm asking if the concept is even a good one, not how to implement.

RogerRoger
  • 1
    The best you can get is password based key generation. The user enters a password for the blob of data they want to encrypt and send to the server. Through some cryptographically secure scheme an encryption key is generated, and the data is encrypted. The encrypted blob is uploaded to the server via any means, preferably HTTPS/SCP/FTPS etc (for an added layer). If you have multiple blobs of data the user has multiple passwords. Yes this leaves things open for password attacks, but if you're already this paranoid it shouldn't be a problem. – RoraΖ Aug 25 '14 at 18:21
  • Hi Raz, I'm specifically interested in encrypting data that is stored in a database, but your points are relevant. I'd considered password based key generation, however, there will be multiple users accessing the same encrypted data, so there will still need to be a single central 'key'. There would likely need to be multiple factor authentication, with a second factor being a secret question in order to gain access to the data (essentially a shared password). This password would also likely change frequently (every week), which would also require all the data to be re-encrypted at each change. – RogerRoger Aug 26 '14 at 01:03
  • 3
    Please note that we're not a discussion forum but a Q&A. If you have additional questions, please ask them in new ones and link to this one in them, if it helps provide context. Answerers are not required to expand on their answers as you present new questions to the thread with edits. In fact, they're not even notified that you edited your question. More is explained in the [Help]. – TildalWave Aug 26 '14 at 03:16

4 Answers

2

In order to make a live database that encrypts its data, the database itself would have to have access to the keys. By that token, any admin could su into the account and find them. Your concerns are valid.

Your idea of having the user present a secure key (an RSA key, for example) over an SSH- or SSL-protected connection is not a bad one. @Eric G mentioned Mylar in the comments below, which does this type of thing and more. It would be well worth examining.

If instead of encrypting the database, you encrypt the data before it gets entered into the database, then you would have a different possibility -- this is meant as a possible alternative only:

The data would be encrypted before posting and decrypted by a client program afterwards. The database itself would not be encrypted, only the data within it.

...or for the truly security paranoid-- do both: database encryption and data encryption, each with different methods.

More is described and discussed below in comments.

Jeff Clayton
  • 1
    Encrypting the data on the application defeats the main purpose of a database. How can you issue a *SELECT* on an encrypted database? All queries will be full scans anyway. – ThoriumBR Aug 25 '14 at 21:50
  • Correct, it would limit searches. ID or timestamp but not many other options. – Jeff Clayton Aug 25 '14 at 22:03
  • It would be important using this method to decide which fields would be encrypted. Not all may be necessary. – Jeff Clayton Aug 25 '14 at 22:08
  • External applications outside of the machine and additional security measures within the machine in question would still be required to do more. An example would be intrusion detection systems. – Jeff Clayton Aug 25 '14 at 22:12
  • @ThoriumBR, maybe you don't need to issue a SELECT on those fields. If for example it deals with employees, you may do searches by ids or names, but never by SSN, which is the field you are encrypting. – Ángel Aug 25 '14 at 22:36
  • @Ángel, great example. Exactly the type of thing I was driving at. – Jeff Clayton Aug 25 '14 at 22:39
  • I'm not concerned about querying the data. There are ways to address this at the application tier. However, this answer doesn't address my original question, as it simply describes ways of encrypting data in the database, not ways of limiting access. A database can be encrypted, but my requirement is for it to be accessed from an application, without storing the key on the server at all (the key would be presented by the application user at run-time). – RogerRoger Aug 26 '14 at 00:58
  • Your question appeared to be more of a concept - as in what you said at the end: if the concept is a good one. The reality is that it is very hard to protect keys on the server, but may really require the keys be off-server if you don't want people to hack in and see them - or sysadmins to find them, and also security via intrusion detection methods to track problems if they occur. Does that make sense? – Jeff Clayton Aug 26 '14 at 01:04
  • 1
    @ThoriumBR - you can do queries on encrypted data in some cases, if you use homomorphic encryption. There are also different use cases depending on which subjects/users will need full data access. See: http://css.csail.mit.edu/cryptdb/ and http://css.csail.mit.edu/mylar/ – Eric G Aug 26 '14 at 01:15
  • I made a few clarity edits. – Jeff Clayton Aug 26 '14 at 01:26
  • @ThoriumBR - great info, more great possibilities. – Jeff Clayton Aug 26 '14 at 01:31
  • @Eric G - Mylar is a great option for security key presentation – Jeff Clayton Aug 26 '14 at 01:50
  • @JeffClayton Yes, my original concept was for the key never to be stored (persisted) on the server. If the key is on the server, it is impossible to protect. If the key is presented at run-time, it gets better. Mylar is the most promising suggestion so far, but some sort of care would need to be taken to ensure the decrypted data isn't round-tripped back to the server via malicious client side script. – RogerRoger Aug 26 '14 at 02:36
  • Wouldn't the client side script then also need the key to get in? The one thing you cannot really protect against is someone giving away the key. You would have to find other ways to test the user is who the user is supposed to be. That AND the use of a key is much more 'safe'. Adding a captcha would help verify it is not a script. – Jeff Clayton Aug 26 '14 at 02:40
  • @JeffClayton Once Mylar decrypts the data in the browser, it is presumably rendered in the browser too (presented to the user). The app could easily be hacked to also send a malicious client side script (javascript) that forwards the decrypted content to a third-party. That said, I need to look into Mylar further, maybe they have a way of isolating the decrypted data in the browser, though I doubt it. – RogerRoger Aug 26 '14 at 02:49
  • Then what you are referring to is a hacked Mylar... different security issue. That may well be a good separate question for this forum. – Jeff Clayton Aug 26 '14 at 02:54
  • @JeffClayton True! I may ask it, after looking under Mylar's hood a bit. – RogerRoger Aug 26 '14 at 02:59
1

I think that the other answers are wrong or at best impractical.

In order to protect data in the database and have it accessible only by user request we can do the following.

Upon registration of the user we generate two hashes.

  • Hash A: Our password, bcrypt hash, stored in the DB
  • Hash B: Our encryption key, not the same password hash, not stored anywhere

We use Hash B to encrypt user data.

When a user logs in, we can create Hash B. We can use this hash to decrypt their data upon request. This hash can be kept in server memory, or a secure client side cache.

This way we ensure that all data is encrypted and only accessible by the user.

If a user wishes to change their password, we decrypt with Hash B, recreate Hash B, and then re-encrypt with the new hash in memory and update the database.

If the user loses or forgets their password they will lose their data. There are ways to guard against this.

The only way to prevent admins picking up unencrypted data or keys at some point is for the client to encrypt/decrypt on their end. The process I have outlined above is the most practical approach to protecting data.
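The two-hash idea above can be sketched roughly as follows. This is a minimal illustration, not the answer's actual implementation: PBKDF2-HMAC-SHA256 from the Python standard library stands in for bcrypt, and the names `register`, `login`, `auth_salt`, `enc_salt`, and `ITERATIONS` are hypothetical. The point it shows is that Hash A and Hash B come from the same password but different salts, and only Hash A (plus the salts) is ever stored.

```python
import hashlib
import hmac
import os
from typing import Optional

# Assumption: PBKDF2 stands in for bcrypt here so the sketch needs only the
# standard library. The iteration count is illustrative, not a recommendation.
ITERATIONS = 200_000


def register(password: str) -> dict:
    """Build the record that goes into the DB. Hash B is never stored."""
    auth_salt = os.urandom(16)
    enc_salt = os.urandom(16)  # separate salt, so Hash B is unrelated to Hash A
    hash_a = hashlib.pbkdf2_hmac("sha256", password.encode(), auth_salt, ITERATIONS)
    return {"auth_salt": auth_salt, "enc_salt": enc_salt, "hash_a": hash_a}


def login(password: str, record: dict) -> Optional[bytes]:
    """Check the password against Hash A, then rebuild Hash B in memory."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), record["auth_salt"], ITERATIONS
    )
    if not hmac.compare_digest(candidate, record["hash_a"]):
        return None  # wrong password: no encryption key is produced
    # Hash B: the 32-byte encryption key, held only for the session.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode(), record["enc_salt"], ITERATIONS
    )
```

The returned key would then be handed to whatever cipher encrypts the user's rows. As the thread notes, anyone who can read the server's memory or alter its code while the user is logged in can still capture it.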

  • 2
    It would be more efficient to use Hash B to encrypt a random key that is stored in the database and used to encrypt the data. This way, when the password is changed, you just have to re-encrypt this random key, and not a potentially large number of database entries. Your scheme also seems to assume that users will never have to share any data with each other. – WhiteWinterWolf Jun 29 '16 at 14:08
  • @WhiteWinterWolf this was my initial suggestion but felt it was over complex. How would one use this key to share data between users? Admittedly I was only really considering small text-based data. A key would make more sense for large amounts of data. – Callam Delaney Jun 29 '16 at 14:19
  • With the scheme you described, users cannot share any data, hence my comment. For this you would need the users to have a means of sharing a common data encryption key securely; the exact means will depend heavily on what your actual goal is. If you're interested in existing approaches, you can take a look at how Darknets work, since they propose some implementations allowing people to share data, sometimes without even the servers storing the data being able to decipher it. – WhiteWinterWolf Jun 29 '16 at 14:52
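The random-key suggestion from the comments above (often called key wrapping or envelope encryption) might look like the sketch below. It is a toy under stated assumptions: the standard library has no AES, so an HMAC-derived pad plus an HMAC tag stands in for a real key-wrap or AEAD primitive, which is what a production system would use. The function names and the blob layout are hypothetical.

```python
import hashlib
import hmac
import os


def _pad(kek: bytes, nonce: bytes) -> bytes:
    # 32 bytes of keystream derived from the key-encryption key and a nonce.
    return hmac.new(kek, b"wrap" + nonce, hashlib.sha256).digest()


def wrap_key(kek: bytes, data_key: bytes) -> bytes:
    """Toy wrap of a 32-byte data key. Layout: nonce || ciphertext || tag."""
    if len(data_key) != 32:
        raise ValueError("data key must be 32 bytes")
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(data_key, _pad(kek, nonce)))
    tag = hmac.new(kek, b"tag" + nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def unwrap_key(kek: bytes, blob: bytes) -> bytes:
    """Recover the data key, rejecting a wrong KEK or a modified blob."""
    nonce, ct, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(kek, b"tag" + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered blob")
    return bytes(a ^ b for a, b in zip(ct, _pad(kek, nonce)))


def rewrap_key(old_kek: bytes, new_kek: bytes, blob: bytes) -> bytes:
    """Password change: only the small wrapped key is re-encrypted."""
    return wrap_key(new_kek, unwrap_key(old_kek, blob))
```

Here `kek` plays the role of Hash B, and only the wrapped blob is stored in the database. For data shared between users, the same data key could be wrapped once per user under each user's own Hash B, though how the key first reaches each user securely is a separate problem.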
1

If you want to protect against a key or other secret being exposed as the result of a system compromise, you would look to an HSM to perform cryptographic activities. Since the HSM is essentially its own little computer, the HSM itself would then have to be compromised, which is much more difficult.

You may also use a distributed or n-tier architecture between the application, database, and decryption such that the compromise of one system does not result in a compromise of the other components (e.g., put the DB on its own server with fewer services and less access, as a bastion host).

Other measures would depend on who actually needs the plaintext. E.g., if it is just the subject/user who put the data in, you can use some type of local decryption via a password or a local security device / cryptocard.


Edit: I also posted this in the comments on another answer, but relevant to this conversation is homomorphic encryption. Based on your comment below, if you only want specific subjects to have access, look at a system like Mylar from MIT. However, you change the landscape if you want to talk about data which needs to be accessed by both humans and service accounts.

Eric G
  • Hi Eric, I'm fairly certain that HSM or n-tier would only obfuscate access to encrypted data. I am seeking a way to ensure that the sys-admin (or anyone who gains access to the server) would have no ability to access the data. – RogerRoger Aug 26 '14 at 00:55
  • With an HSM, the encryption/decryption happens on the HSM. In the case of n-tier, if you have multiple sys-admins, don't give them all access. Your original question was about compromise of the systems. For the sys-admin risk, you want to restrict sys-admin access and monitor it using Privileged Identity Management with credential check-in/check-out and constant password rotation. You need to apply defense in depth; one countermeasure may not address all vectors. – Eric G Aug 26 '14 at 01:11
  • The trouble is that if the app can access the data, be it via HSM or separate database server, simple coding can get the data out. In reality, even if the key is presented by the user (my proposed approach) the data is still vulnerable: if a user with access amended the app code to output any reports to plain-text, the data is compromised. Which, in fact, may be the biggest weakness of this approach. I suppose the codebase could be verified via a hash value, to ensure it wasn't tampered with... yuck. – RogerRoger Aug 26 '14 at 01:56
  • @RogerRoger - Again... defense in depth, multiple layers, balance of risk versus business need. These are all really different questions you are posing. If you want to address unauthorized changes, file integrity monitoring. Unauthorized use of data, DLP. – Eric G Aug 26 '14 at 01:59
  • @RogerRoger - some verification keys are very good, like the fingerprint from PGP. There are ways. What you have to do as an admin then is to decide what level of security is needed for the data, and to what extreme the user(s) must work to get through to the data. – Jeff Clayton Aug 26 '14 at 02:09
  • @EricG, Mylar is quite interesting! It moves the decryption process entirely off of the server to the browser. It also highlights the trouble with all of this: Even if the decryption happens in the browser, if the app is compromised, a client side script could be written that just feeds the decrypted data back down to the server to be captured (e.g. using AJAX). Still, at this point, Mylar is the most compelling answer - or, it gets to the crux most clearly, in that a compromised server leads to a compromised app which very likely leads to compromised data. – RogerRoger Aug 26 '14 at 02:25
  • Mylar really does impress, the more I read about it the more I want to try it myself. – Jeff Clayton Aug 26 '14 at 02:27
  • I suggest moving to the chat room, the comments on SE sites are not intended for these types of conversations. – Eric G Aug 26 '14 at 02:32
  • Good point, it seems there are many questions and answers here, far beyond the original post. – Jeff Clayton Aug 26 '14 at 02:33
  • Ok, but I think Eric G's reply is close to an answer. My initial question was whether the concept of not persisting the key on the server, ie. present at run-time, was problematic. Mylar takes this entirely off the server, yet, IMO, still shows that a vulnerability remains, because a client side script could just send the decrypted data back to the server (or elsewhere for that matter). Thus, ultimately showing that a compromised server is going to result in trouble. – RogerRoger Aug 26 '14 at 02:43
  • This is great info, though I have to ask, originally you asked if it were simply a good direction, and that you did not ask for an implementation. Not judging, but it looks like Eric and I both have answered that for you, with ideas on implementations, which seem to be what you are really asking for after all, a way to do it? – Jeff Clayton Aug 26 '14 at 02:48
  • We could implement the original concept fairly easily. This was intended to be a sanity check/sounding board question, and the feedback has been great. Thank you! – RogerRoger Aug 26 '14 at 02:56
  • Your concerns are all valid, finding new methods is not a problem with me. A chat direction might be wise as @Eric G suggested. – Jeff Clayton Aug 26 '14 at 02:58
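The idea raised in the comments above of verifying the codebase via a hash value (file integrity monitoring, in Eric G's terms) can be sketched as below. This is a minimal illustration with hypothetical function names, not a replacement for a real integrity-monitoring tool. It also assumes the baseline manifest is kept somewhere the server's admins cannot rewrite, since otherwise it could simply be regenerated after tampering.

```python
import hashlib
from pathlib import Path


def build_manifest(root: str) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    base = Path(root)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def changed_files(baseline: dict, current: dict) -> list:
    """Paths added, removed, or modified since the baseline was taken."""
    return sorted(
        path
        for path in baseline.keys() | current.keys()
        if baseline.get(path) != current.get(path)
    )
```

A deployment step would call `build_manifest` on the released code and store the result off-server, and a periodic check would compare a fresh manifest against it. This only detects changes to files on disk; it does nothing about code altered in memory or about a check that is itself disabled.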
-2

Databases are especially bad for this.

If the data is so critical, you shouldn't be storing the info in a DB. Save an encrypted blob and decrypt it client-side (gpg maybe?)

Ángel
  • Possibly, though maybe the original plan-- a database of encrypted blocks with at least a sort by date and/or timestamp-- might be better. I have a db of the US zip codes, for example. Searching all of them at once and doing distance comparisons takes 6 seconds on my server, which is too long for web visitors, but pre-blocking out unneeded locations dropped the search to 1/6 of a second. A form of organization, even if it's only a small one, like the name idea in your other comments, may help in such a manner. The size of the data would then be the issue. – Jeff Clayton Aug 25 '14 at 22:45
  • 1
    This answer misses the point. I'm specifically interested in ways of not storing the key with the data, regardless of the kind of encrypted data (though, a database is the likely persistence method in this case). – RogerRoger Aug 26 '14 at 01:05
  • @RogerRoger I don't understand what you mean. I never suggested that you store the encryption key with the blob. Given that I suggested `gpg` I don't know how you got that idea. Maybe you were referring to the database primary key instead? (but I don't see what would be your point, either). – Ángel Aug 28 '14 at 08:45
  • The problem with the db is that you can store the info in an encrypted blob, but at that point you are leaving the db concept (it can be useful when only encrypting some fields, though). Maybe an option would be to have a (non-trivial!) db wrapper that, when you do SELECT foo, bar, automatically detects that foo is encrypted and maps it transparently for you. – Ángel Aug 28 '14 at 08:46
  • @Ángel, by storing the key with the data (blob), I mean only that both the key and data are available to the app. As soon as you require this combination (which is common), it takes only a simple code review to find how the app uses the key, and then revise the code to decrypt the data. It doesn't matter how the data is persisted (blob, db, db blob, etc). – RogerRoger Aug 28 '14 at 12:08
  • @RogerRoger, I don't think you should store the key with the data (and I didn't say that). The key could be a locally installed certificate, the user gpg keyring, a password typed by the user… Although this is more viable with a fat client. And if your IT staff controls both the database server and the client PCs, there's little hope if they wanted to compromise it. – Ángel Aug 29 '14 at 07:58