
I am working on a web project and I want to handle user data, as far as possible, in a way that limits the damage to users' privacy if someone compromises our servers or databases.

Of course we only hold user data that the website needs to do its job, but because of the nature of the project we have quite a bit of information on our users (part of the functionality is applying for jobs and sending your CV along with the application).

We thought about encrypting/decrypting sensitive data with a private/public key pair, with the private key encrypted using the user's password, but found some security and implementation problems with that :P

The question is: how do you implement user privacy and protection against data theft on a centralised web server, using browser-compatible protocols, when the functionality requires that users can exchange sensitive data?

To give some additional insight: this project is not yet in production stage so there is still time to make things right.

We are already doing some basic things, such as:

  • serving HTTPS
  • enforcing HTTPS for pages that may handle sensitive data
  • hashing salted passwords
  • some hardening of our server and the services on it
  • encrypted hard drives, to prevent someone from reading all client information after stealing our servers / hard drives

But that's about it. Besides the password hashes, there is no mechanism that would stop, or at least slow down, someone who managed to get into (part of) the server from obtaining all data on all our users. Nor do we see a way to encrypt user data so that we ourselves cannot read it, because we need the data (we wouldn't have collected it otherwise) for some of the functionality the website is meant to provide.

Even if we somehow managed (maybe with some JavaScript) to have all data reach us already encrypted by the client's browser, and we served the client their private key encrypted with some passphrase (for example their login password), we could not, for example, scan user-uploaded files for viruses and the like. Client-side encryption, at least within the browser/web server model, would also leave some security issues as far as we can see (you are welcome to prove me wrong), and it seems a lot like reinventing the wheel. Since this project is not primarily about privacy, and privacy is rather a preferable property, we might not want to reinvent the wheel for it.

I strongly believe I am not the first web developer to think about this, am I? So what have other projects done? What have you done to try to protect your users' data?

If relevant: we are using Django and PostgreSQL for most things, and JavaScript for some of the UI.

PS: this article describes some other reasons we are hesitant about client-side encryption.

Freebejan

1 Answer


One standard method used for something like this is application layer encryption.

Your project consists of a web tier (Django) and a database tier (PostgreSQL). If you implement your encryption in Django, then you can write encrypted blobs to the PostgreSQL data store. This is not, by the way, "client side encryption" as described in the article you link to.

The advantage of this method is that you separate the encrypted data from the keys. An attacker who is able to dump tables out of your database can get at the encrypted data, but not the keys. An attacker who is able to compromise your application and extract the key still needs to get the data to use the key on.
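As a rough illustration of the idea above, here is a minimal sketch of encrypting a field in the application tier before it is written to the database. It assumes the third-party `cryptography` package and its Fernet recipe. The environment variable name `FIELD_ENCRYPTION_KEY` and the `FieldCipher` helper are hypothetical names chosen for this example; they are not part of Django or of the original answer.

```python
import os

from cryptography.fernet import Fernet, InvalidToken


def load_key() -> bytes:
    """Load the key from outside the database (hypothetical variable name)."""
    key = os.environ.get("FIELD_ENCRYPTION_KEY")
    if key is None:
        # Demo fallback only: a throwaway key so the sketch runs as-is.
        return Fernet.generate_key()
    return key.encode("ascii")


class FieldCipher:
    """Encrypts values in the application before they reach the data store."""

    def __init__(self, key: bytes) -> None:
        self._fernet = Fernet(key)

    def encrypt(self, plaintext: str) -> bytes:
        # The returned token is the opaque blob that would be stored in PostgreSQL.
        return self._fernet.encrypt(plaintext.encode("utf-8"))

    def decrypt(self, token: bytes) -> str:
        # Raises InvalidToken if the token was tampered with or the key is wrong.
        return self._fernet.decrypt(token).decode("utf-8")


if __name__ == "__main__":
    cipher = FieldCipher(load_key())
    stored_blob = cipher.encrypt("Jane Doe, CV text ...")
    print(cipher.decrypt(stored_blob))
```

In a Django project the same two calls could sit in a model method or a custom field, with the column stored as binary data. The point of the sketch is only that the database sees `stored_blob` and never the key.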

The disadvantage of this method is that the encrypted data can't be searched or manipulated purely at the database level. If it's passwords, that's fine; you're not going to be SELECTing all the passwords with certain criteria, you'll usually just pull one and compare it. But for other sorts of data that can be onerous.
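To make the search limitation concrete, here is a small sketch, again assuming the `cryptography` package's Fernet recipe. The in-memory `rows` list stands in for a database table and is purely hypothetical. Because each encryption uses a fresh random IV, the same plaintext never produces the same stored bytes, so an equality match on the ciphertext finds nothing and the application has to decrypt rows to filter them.

```python
from cryptography.fernet import Fernet

# Demo key only; a real key would live outside the database.
fernet = Fernet(Fernet.generate_key())

# Hypothetical table: the id is stored in the clear, the email is encrypted.
rows = [
    {"id": 1, "email_enc": fernet.encrypt(b"alice@example.com")},
    {"id": 2, "email_enc": fernet.encrypt(b"bob@example.com")},
]


def find_by_ciphertext(table, email: bytes) -> list:
    """Mimics a database-level equality match on the stored bytes."""
    needle = fernet.encrypt(email)  # fresh random IV, differs from the stored token
    return [row["id"] for row in table if row["email_enc"] == needle]


def find_by_decrypting(table, email: bytes) -> list:
    """What the application has to do instead: decrypt each row, then compare."""
    return [row["id"] for row in table if fernet.decrypt(row["email_enc"]) == email]


if __name__ == "__main__":
    print(find_by_ciphertext(rows, b"alice@example.com"))
    print(find_by_decrypting(rows, b"alice@example.com"))
```

The second function works, but it touches every row, which is the "onerous" part for anything beyond fetching a single record by an unencrypted key.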

Update to address comment below in proper length:

Yes, if the attacker compromises the application, it is possible for them to compromise the key, and possibly even to leverage the application's database access to take advantage of the key. So why is this better?

First of all, it removes some common single-vector touchdown attacks. If the web app suffers from SQL injection, for example, an attacker can usually extract full tables from the database. But they can't extract the key from the application in the same way, and will then need another separate successful attack to achieve their goal.

The attacker can wait for something interesting - but if they compromise the app, they can probably intercept inputs and compromise info as it comes in before it's sent to the database; access to the key is irrelevant at that point. And that requires a long-term compromise and patience, which is less of a concern unless you're a target in the APT space.

Compensating controls can be put in place to limit the impact of application layer compromise. For example, if the application only has access to certain stored procedures, and not the ability to execute arbitrary SQL against the database, then even if the attacker compromises the application and gets the key they may not be able to leverage it to get bulk data. Any iteration to get data piecemeal is much more likely to get caught by logging, alerting, or SQL monitoring tools.

And, while less common, the separate storage of the key prevents backup attacks which gain access to database backup files.

There is no perfect security, but defense in depth is a proven method to improve security. Separating the encryption key from the encrypted data in a multi-tier application is an excellent contribution to defense in depth.

gowenfawr
  • Thank you very much for your answer. I still have a question about what you wrote: if an attacker were able to compromise the application, wouldn't they be able to access the database from there as well? Or by "compromising the application" do you mean that an attacker manages to acquire the source code? If not, couldn't one eavesdrop on the data accessed by the application and wait for something interesting to be decrypted? – Freebejan Jun 14 '14 at 22:49