Question

Suppose you design a data lake that holds sensitive data. Because the tooling is immature, dynamic data masking is unavailable. You have MFA, encryption at rest, audit logging, ETL jobs that process the data, and people who develop, support, and debug that ETL.

The standard answer is a separation of concerns between the developer and the support engineer: the developer has no access to sensitive data; only the support engineer does.

But how do you protect sensitive data from the support engineer(s)?

Considerations

Support engineers need to debug issues, run jobs, etc. If the ETL logic depends on sensitive data (e.g. a join by IP address or a filter by medical status), this inherently means the support engineer must have access to sensitive data.

Column-level encryption (besides being a lot of burden) does not seem to solve the issue: deterministic encryption isn't secure (equal plaintexts produce equal ciphertexts, so value frequencies leak), and moreover the support engineer must have indirect access to the decryption keys in order to run the jobs.
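
The concern about deterministic encryption can be illustrated with a minimal sketch. It assumes HMAC-SHA256 as a stand-in for any deterministic scheme; the key, the function name `deterministic_token`, and the sample column are hypothetical and not taken from any particular column-encryption product.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key; a real system would fetch it from a key management service.
KEY = b"demo-key-not-for-production"


def deterministic_token(value: str, key: bytes = KEY) -> str:
    """Keyed deterministic pseudonym: the same input always gives the same output."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical sensitive column with a skewed value distribution.
statuses = ["healthy"] * 6 + ["diabetes"] * 3 + ["hiv"] * 1
tokens = [deterministic_token(s) for s in statuses]

# Equality is preserved, so joins and filters on the token still work...
same_value_same_token = deterministic_token("healthy") == tokens[0]

# ...but the frequency distribution is preserved too, so someone who only
# sees the tokens can match counts against known statistics without the key.
token_counts = sorted(Counter(tokens).values(), reverse=True)
plain_counts = sorted(Counter(statuses).values(), reverse=True)
```

Here `token_counts` and `plain_counts` come out identical, which is exactly the property that makes joins possible and frequency analysis feasible. A randomized scheme would hide the frequencies but would also break the join.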

Auditing queries may not solve the issue either, because query results can be downloaded.

VB_
  • Solving these types of problems at this point is much too late. All these should've been part of the system design and the troubleshooting procedures should've been specified. You will run the risk of violating very serious HIPAA regulations (if in the US). – Nelson Sep 13 '21 at 09:20
  • @Nelson thanks for the answer! I'm at the stage of system design and troubleshooting procedures. Yes, HIPAA is my case. Could you please clarify your answer? – VB_ Sep 13 '21 at 10:36
  • Why do you want to protect the data from the support engineer? – schroeder Sep 13 '21 at 11:37
  • I'm thinking that you have a human/policy situation and not a technical one. If the engineer needs data-level access in order to ensure service to customers, then that can be ok. That's not an *access* problem that needs to be solved. – schroeder Sep 13 '21 at 11:39
  • @schroeder thanks for the details! Yep, I'm interested in strategies, both technical and policy-based. "If the engineer needs data-level access in order to ensure service to customers, then that can be ok" - agree. But how can I ensure that, in case of data leakage, I'll find the person/reason? Suppose support asks for access to the whole history of clients' diseases and PII. Would you try to do just auditing of queries? – VB_ Sep 13 '21 at 11:44
  • @schroeder the motivation for restricting support engineer access is to reduce risk/impact in case of leakage/exposure. I'm actually trying to find a way of not restricting access while still being safe – VB_ Sep 13 '21 at 11:45

1 Answer

In some environments like yours, I've implemented a remote environment (RDP in my case) where the engineer can interact with the system, but cannot download to the local machine.

Yes, it is possible that the engineer can take pictures/screenshots of the data, but this setup limits the scope of the impact and prevents "accidental" mishandling of the data.

If you have a truly malicious engineer, there is little you can do to stop them. You can put controls in to lower the risk of mistakes and raise the threshold for opportunistic abuse.

schroeder