
Let's say we have a large company X that stores private user information, for example an e-mail provider or a social network; I mainly have Gmail and Facebook in mind. So the company has thousands of employees. Let's further suppose that X uses Linux on all their machines (from my understanding this is indeed the case for Google's and Facebook's server machines).

Now, from my understanding, X would usually have some data-centers which contain many server machines, and the user data would be stored on those machines. Let's say we have some machine with user-data on it. It may be encrypted when stored, but it should be decryptable if it is going to be returned to the user at some point. So it would seem like a sysadmin with sufficient access rights could access that user-data.

Now, I understand that in such a company most employees would not have access rights to user information, and those who do would be subject to logging of their access to resources. However, wouldn't there always be at least one person/sysadmin who has superuser access to a given machine? And couldn't such a person access user information and, since they also control the logging functionality on that machine, do so without leaving traces?

Is my analysis correct? I am mostly wondering whether such companies, which store user data, would have sysadmins with superuser access to both the user data and the logging, and who would therefore be able to retrieve user data without leaving traces.

If this analysis is correct, who would usually have such permission? Would it be just some low-ranking sysadmin? Maybe only the Chief Information Security Officer?

If the analysis is incorrect, how do they set up a situation where there is no single person/insider who is able to abuse user data in the way I described? Some ideas I have for how this could perhaps be done: (1) require authentication from more than one person; (2) keep the user data on one machine, with one sysadmin, and the decryption key on a separate machine with another sysadmin (so again it would require collusion between two insiders to abuse the data).

proggie165

3 Answers


First off, most companies do not employ (or follow) all the necessary procedures for preventing users from accessing data that they shouldn't be able to access.

Assuming that company X does follow the necessary procedures, these procedures would be built around a few fundamental principles:

Need-to-know access
Every user has access only to the information required for their job duties. If this principle is followed properly, even sysadmins should not have access to user data beyond what their duties require.

Separation of duties
Every user is assigned duties that do not overlap, so that conflicts of interest are avoided. For example, a sysadmin who can alter user permissions can view log files but cannot delete log entries.

Four-eye principle or Two man rule
Critical procedures must be reviewed by two individuals before taking place. This ensures that a malicious insider cannot act alone. For example, if a sysadmin wishes to alter user permissions, they would need authorisation from an additional sysadmin or user to do so.
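To make that concrete, here is a minimal sketch of such a gate. The data model and the rule that the requester needs one independent approver are illustrative assumptions, not any particular product's workflow:

```python
from dataclasses import dataclass

@dataclass
class Approval:
    change_id: str
    user: str
    approved: bool

@dataclass
class Change:
    id: str
    requested_by: str
    description: str

def may_execute(change: Change, approvals: list[Approval], required_reviewers: int = 1) -> bool:
    """Allow a sensitive change only if enough *independent* reviewers approved it."""
    reviewers = {a.user for a in approvals if a.change_id == change.id and a.approved}
    reviewers.discard(change.requested_by)  # the requester never counts as their own reviewer
    return len(reviewers) >= required_reviewers

# The requester plus one independent reviewer gives the "four eyes".
change = Change(id="CHG-42", requested_by="alice", description="grant db-read to bob")
print(may_execute(change, [Approval("CHG-42", "carol", True)]))  # True
```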

Data pseudonymization or anonymization
If for any reason sensitive data must be presented to a user (e.g. a spreadsheet that contains both sensitive and non-sensitive data), the data that is not relevant to that user's duties will be pseudonymized (i.e. replaced with pseudonyms) or anonymized (i.e. redacted or encrypted).
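As a rough illustration, a keyed hash gives a stable pseudonym while plain redaction anonymizes; the column names and key below are made up for the example:

```python
import hashlib
import hmac

# The key must live outside the environment where the pseudonymized data is used.
PSEUDONYM_KEY = b"example-key-kept-elsewhere"

def pseudonymize(value: str) -> str:
    """Stable pseudonym: the same input always maps to the same token,
    but it cannot be reversed without the key."""
    return hmac.new(PSEUDONYM_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

def anonymize(_value: str) -> str:
    """Irreversible redaction for fields the viewer has no need to see."""
    return "[REDACTED]"

row = {"user_id": "u-1029", "email": "jane@example.com", "login_count": 17}
shared_view = {
    "user_id": pseudonymize(row["user_id"]),  # still joinable across tables
    "email": anonymize(row["email"]),         # not needed for this analysis
    "login_count": row["login_count"],        # non-sensitive, passed through
}
print(shared_view)
```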

jonna_983

So basically your question is "What actions/policies can a company enact to limit abuse and misuse of private (user) data?"

Well, there are in fact several industry-standard approaches to this. I will list a few below, each with a short explanation of what it is for:

  • Four-eyes principle
  • Shamir's Secret Sharing
  • Multi-factor access control
  • Smart card keys
  • NDAs
  • Ethical rules

    1. The four-eyes principle (or two-man rule) basically means that in order to carry out an action you need two people: one to perform it and one to sanction it.

    2. Shamir's Secret Sharing is a way to split up a shared secret so that no single person holds the full secret and a threshold of the secret keepers must work together to reconstruct it (see the sketch after this list).

    3. Multi-factor access control simply means basing authentication and authorization on more than one factor (2FA is commonly used, but more factors can be employed: something you have, something you know, something you are, somewhere you are, something you do).

    4. Smart card keys or other secret-storage devices can be used to limit leaking of the secret. Nitrokey and Yubico, to name two vendors, have several good products that can do this.

    5. NDAs are more a legal limit than a technical one; they bind people not to use the private data in any way other than what is specified as proper use.

    6. Ethical rules exist in some legal systems worldwide that limit what you are legally allowed to do with someone else's data. While not as specific as the other measures, they can be just as strictly enforced by courts when they are part of the legal system.
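For item 2, here is a minimal sketch of Shamir's scheme over a prime field, purely to illustrate the k-of-n idea; a real deployment should use a vetted library rather than hand-rolled crypto:

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime larger than any secret we split here

def split(secret: int, n: int, k: int):
    """Split `secret` into n shares; any k of them can reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from any k shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(secret=123456789, n=5, k=3)    # five key holders, any three suffice
print(reconstruct(random.sample(shares, 3)))  # -> 123456789
```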

As for your idea to segregate encryption keys from storage systems: this is already commonly done, but not to limit abuse like this (it is of limited use for such a case, since an admin could still extract the data from the system through some means). HashiCorp's Vault is one open-source project that enables it.
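For what that segregation looks like, here is a rough sketch of envelope encryption under your model (2); the class names are hypothetical and this is not Vault's actual API, and note the caveat above still applies, since the storage admin sees decrypted data whenever it is served:

```python
from cryptography.fernet import Fernet

class KeyService:
    """Runs on machine B, administered by sysadmin B; never stores user data."""
    def __init__(self):
        self._master_key = Fernet.generate_key()

    def new_wrapped_dek(self) -> tuple[bytes, bytes]:
        dek = Fernet.generate_key()                       # per-record data key
        return dek, Fernet(self._master_key).encrypt(dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        return Fernet(self._master_key).decrypt(wrapped_dek)

class StorageNode:
    """Runs on machine A, administered by sysadmin A; never sees the master key."""
    def __init__(self, key_service: KeyService):
        self._ks = key_service
        self._blobs = {}

    def put(self, user_id: str, plaintext: bytes) -> None:
        dek, wrapped = self._ks.new_wrapped_dek()
        self._blobs[user_id] = (wrapped, Fernet(dek).encrypt(plaintext))

    def get(self, user_id: str) -> bytes:
        wrapped, ciphertext = self._blobs[user_id]
        dek = self._ks.unwrap(wrapped)   # requires the key service's cooperation
        return Fernet(dek).decrypt(ciphertext)

store = StorageNode(KeyService())
store.put("u-1029", b"private mailbox contents")
print(store.get("u-1029"))
```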

LvB

This will vary heavily by organization.

I used to work as a developer for a large technology-based corporation in the financial sector. To access a production database in read-only mode there was a special tool. It allowed you to pull pretty much anything from any database you had access to (which was only the databases you had a reason to need access to), but it logged absolutely everything you did.

If you needed to run a command on a production machine you had to put in a ticket selecting what you thought the risk involved was and listing exactly what you wanted to run and why. When I say exactly, I mean there was a field to type the commands into. This ticket would go to your manager to approve or adjust (usually raising or lowering your selected risk). Depending on the risk you selected, one or more people both higher in your management chain and in the datacenter operations room might have to approve the ticket. For lower-risk actions the commands would then be run automatically against the machine. For higher-risk actions a datacenter operator would manually allow each line to run and watch the output. Again, absolutely everything would be logged.
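A toy model of that workflow, purely illustrative (the risk tiers and approver counts are made up, not my former employer's actual policy):

```python
from dataclasses import dataclass, field

APPROVALS_NEEDED = {"low": 1, "medium": 2, "high": 3}  # manager, then higher chain / DC ops

@dataclass
class CommandTicket:
    requester: str
    commands: list[str]              # the exact commands, typed in up front
    risk: str                        # "low", "medium" or "high"
    approvers: set[str] = field(default_factory=set)

    def approve(self, user: str) -> None:
        if user != self.requester:   # you cannot approve your own ticket
            self.approvers.add(user)

    def may_run(self) -> bool:
        return len(self.approvers) >= APPROVALS_NEEDED[self.risk]

ticket = CommandTicket("dev1", ["systemctl restart app"], risk="medium")
ticket.approve("manager")
ticket.approve("dc-ops")
print(ticket.may_run())  # True; every command and its output would still be logged
```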

There were tools in place to obtain a root shell on a production machine, but these would only be approved if absolutely necessary, usually an "it's seriously broken" situation. And again everything would be logged via the systems that could get you that access.

I imagine that inside the datacenters staff would have better options. However, these are extremely secure facilities with cameras everywhere. Very few staff would have both physical access to the machines and the credentials needed for root access to them, and multiple people would be involved in obtaining physical access. Meanwhile you'd need a very good excuse for accessing a production machine which is still in communication with the data store, or for accessing both a storage rack and a place where the encryption keys are available. And the datacenter staff are also not ordinarily going to know which databases contain which information or how it is structured.

There were loopholes. For example, as a developer it was relatively easy to get a production database copied into a beta environment for testing/debug purposes, and you had more tools at your disposal there, although there would still be a log of the request. Then you have the challenge of getting large amounts of data out of the server environment and onto your local PC, which would not be trivial. Nor would getting it out of the corporate environment: USB ports and disk drives were disabled and all internet traffic was monitored, including HTTPS.

So could you get large amounts of data out? Yes. But it would be traceable back to you.

Hector