The goal here is to prevent identification of the users and their data. Is it a good idea to partition my database into multiple ones, one for each kind of sensitive data, hiding the links between them?

At first it seems that the answer is yes, because an attacker would need to gain access to all databases (potentially served from different servers/VMs/containers) in order to build the relationships between data and users, and identify them.

Of course doing so adds a lot of complexity to the application layer so I wonder if this is a good idea at all.

EDIT: here is a more concrete example.

The project is a website. The application code runs on server A. The application connects to databases db1, db2 and db3, hosted respectively on B, C and D. No encryption at all (except for passwords, of course). The application code holds credentials for all three databases. So I do have a "security bottleneck", and it's server A. Besides, relationships between data are not stored in a separate database, but split across the three databases.


Is this bad design? Would I be better off setting up

  1. only one database (in server E) and strictly securing server A and E? or
  2. another database to hold the links between data, to further delay possible identification of users?

Other solutions to consider?

Would any of this even be useful considering server A is a single point of failure / security bottleneck?


3 Answers


The entity being able to combine the data later will probably be your "security" bottleneck.

You can of course have a relationship:

     /      \
    A        B

and hope that an attacker gains access to only A or only B, or potentially even both. But somebody, somehow, needs to know how to piece this together again, and that is where your bottleneck is going to be. However, if someone manages to attack your bottleneck, they might only be able to retrieve the linking of datasets to users, not necessarily the data itself.

Let's say you have three servers A, B and R, where A and B store raw data and R stores the relationships. How would you do access control? Would you encrypt the data so that only the users themselves can decrypt the information and piece it together? A and B don't know who you are, so you can't really do access control on those data sets: if you store who has access to what data on A and B, you're already linking it to a user again.

How does your application then access the data? Do you need to run computation on the data? How do you do that if you don't know the relationships?
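To make the A/B/R layout concrete, here is a minimal sketch. It is only an illustration under assumptions: the three in-memory SQLite databases stand in for servers A, B and R, and the table names, column names and the `save_user` / `load_user` helpers are hypothetical, not something from the question.

```python
import secrets
import sqlite3

# Three independent stores standing in for servers A, B and R
# (hypothetical schema, in-memory only for illustration).
store_a = sqlite3.connect(":memory:")
store_b = sqlite3.connect(":memory:")
store_r = sqlite3.connect(":memory:")

store_a.execute("CREATE TABLE records (record_id TEXT PRIMARY KEY, payload TEXT)")
store_b.execute("CREATE TABLE records (record_id TEXT PRIMARY KEY, payload TEXT)")
store_r.execute("CREATE TABLE links (user_id TEXT PRIMARY KEY, a_id TEXT, b_id TEXT)")


def save_user(user_id, data_a, data_b):
    """Store raw data on A and B under random IDs; only R knows the mapping."""
    a_id = secrets.token_hex(16)  # random, carries no information about the user
    b_id = secrets.token_hex(16)
    store_a.execute("INSERT INTO records VALUES (?, ?)", (a_id, data_a))
    store_b.execute("INSERT INTO records VALUES (?, ?)", (b_id, data_b))
    store_r.execute("INSERT INTO links VALUES (?, ?, ?)", (user_id, a_id, b_id))


def load_user(user_id):
    """Piece the data back together: this needs access to R, A and B."""
    row = store_r.execute(
        "SELECT a_id, b_id FROM links WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    a_id, b_id = row
    query = "SELECT payload FROM records WHERE record_id = ?"
    data_a = store_a.execute(query, (a_id,)).fetchone()[0]
    data_b = store_b.execute(query, (b_id,)).fetchone()[0]
    return data_a, data_b
```

Whatever runs `load_user` has to reach all three stores, so that component is the bottleneck described above. Note also that A and B cannot check who is asking for a record without learning something about the user.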

For me... this is just way too broad. You might get a meaningful answer for a specific scenario but probably not in the generic case.

  • Thank you for your comment. Indeed it's a bit too broad, I'll narrow it down and give a concrete example. – pawamoy Jun 15 '18 at 11:21

The threat isn't defined clearly enough. You said you want to prevent identification of the users and their data, but you didn't say by whom. Your teammates? Your hosting provider? A random attacker? Somebody interested in your data and willing to try a targeted attack?

Basically you admitted there's a bottleneck, which is your application and the server running it. So your idea of splitting the database will only work against threats that successfully attack your databases but fail to attack your application. What kind of threats are these? I can't really think of many examples right now. Maybe the threat of a thief stealing the database backups: if the backups are in different places, the chances of stealing them all are lower, and therefore the thief is less likely to get all the data. But the backups should be encrypted anyway, so if good encryption practices are used, why waste time trying to implement a system that uses multiple databases? Also consider that adding complexity to the application increases the probability of bugs, some of which might lead to additional vulnerabilities.

So, in my opinion, the question is, again: what are the threats that are going to compromise your databases, but not your application? After answering this question, the whole situation will be clearer.
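As a rough illustration of that question, here is a sketch of one possible way to split data so that stolen databases alone cannot be relinked. This is an assumption for the sake of the example, not something the question describes: each database keys its records by a keyed hash (HMAC) of the user ID, and only the application holds the key. The names `pseudonym`, `relink` and `LINK_KEY` are hypothetical.

```python
import hashlib
import hmac


def pseudonym(link_key: bytes, user_id: str, database: str) -> str:
    """Derive a per-database record key that cannot be recomputed without link_key."""
    message = f"{database}:{user_id}".encode("utf-8")
    return hmac.new(link_key, message, hashlib.sha256).hexdigest()


# Hypothetical key held only by the application (server A in the question).
LINK_KEY = b"example-key-held-only-by-the-application"

# Dictionaries standing in for two of the split databases.
db1 = {pseudonym(LINK_KEY, "alice", "db1"): "home address"}
db2 = {pseudonym(LINK_KEY, "alice", "db2"): "medical record"}


def relink(link_key: bytes, user_id: str):
    """Whoever holds link_key (the application, or its attacker) can rejoin the data."""
    return (
        db1.get(pseudonym(link_key, user_id, "db1")),
        db2.get(pseudonym(link_key, user_id, "db2")),
    )
```

Someone who steals only `db1` and `db2` sees unrelated-looking keys, but anyone who compromises the application gets `LINK_KEY` and can call `relink` for every user. The split therefore only helps against threats that reach the databases and not the application.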

  • I accepted this as the answer because this is what should be done in every similar situation: a threat / risk analysis BEFORE trying to implement such a system. It seems the most reasonable and obvious thing to do. – pawamoy Feb 06 '19 at 12:06

I'd say this falls under security through obscurity, which is never a foolproof method of security. Who are you trying to hide the data from? To an end user it makes no difference.

Developers and sysadmins would presumably be able to access all databases. If, as you say, you serve them from different VMs or containers, a hacker who gains root on your hypervisor or container host has root access to them all.

If the DB servers are hosted by different providers, you would presumably run the same OS and the same DB system on each of them, so an exploit that a hacker finds in one DB server applies to all the others. You would just be increasing the number of servers you have to patch. Otherwise, with different OSes and DB versions on each provider, you could run into some horrible version-specific dependency behaviors, and you would increase the number of different servers you have to manage security patches for.
