
Data protection laws including GDPR state:

“Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.”

GDPR stipulates that personal data should not be used in non-production systems unless it has been anonymized or pseudonymized.

Generally speaking, a customer would not expect their information to be used in a test environment or to develop new technology solutions, so it is arguable whether we do or do not have a case for legitimate processing of PII in test environments.

I have a requirement: I want to use personally identifiable information (PII) to develop new technology. The data quality is poor, so I need to ingest the PII into an AWS dev environment, clean it in a dev/test system, and send it to a production environment once I have proved that the data cleansing works. Obfuscating the data is not an option, because the whole point is to transform the poor-quality data into good data.

I will encrypt the relevant AWS services using KMS, and data access will be limited to a small group of developers. The data will be deleted at the end of the dev/test period. All AWS services will be tightly controlled via security groups and IAM policies. This seems like an easier option than anonymization or pseudonymization, which are difficult and cumbersome.
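To make these controls concrete, here is a minimal sketch of what the access and encryption rules might look like if the data landed in an S3 bucket. S3 is an assumption (I have not settled on which services will be used), and the bucket name and role ARNs are hypothetical placeholders. It only builds the policy document; it does not call AWS.

```python
import json


def build_bucket_policy(bucket_name, allowed_role_arns):
    """Build an S3 bucket policy document (as a dict) for a temporary PII bucket.

    Assumptions for illustration only: the data is stored in S3, and the
    small developer group is represented by a list of IAM role ARNs.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject uploads that do not request SSE-KMS encryption.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                # Reject every principal that is not one of the listed roles.
                "Sid": "DenyAllExceptDevGroup",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": list(allowed_role_arns)}
                },
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, not real resources.
    policy = build_bucket_policy(
        "example-pii-cleansing-dev",
        ["arn:aws:iam::111122223333:role/pii-cleansing-dev"],
    )
    print(json.dumps(policy, indent=2))
```

A blanket deny like the second statement also locks out administrators who are not on the list, so it would need care before being applied to a real bucket.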

Does this seem like a good approach? How have others secured live (PII) data in non-prod environments?

Architect
  • Unless you have the users' consent, you can't use their personal data to develop a "new technology", because I'm pretty sure that won't be considered a "legitimate interest" according to the GDPR. But why can't you just generate your test data? Generate your test data with a model based on your real data. The generated test data should contain no real PII, just example data: John Doe, Jane Davis, Jack Brown, X Æ B-13, etc. – reed Jul 02 '20 at 13:58
  • Why can't you anonymize production data for test? What's the functional difference between "Jane Smith" and "rggsrgrgsddr"? – schroeder Jul 02 '20 at 14:25
  • **You need to speak to a consultant in the field about this to get a definitive answer.** It seems to me like you need a system temporarily to verify a process for which you need PII data. This does not constitute a direct equivalence to a permanent non-production system where code and data are tested throughout the life of the live product. – Pedro Jul 02 '20 at 14:41
  • @Pedro - that is correct, it will be a temporary store to prove the solution works. This temporary storage is a bit of a grey area when it comes to GDPR. Various laws and regulations ask us to keep the data up to date. However, when we try to do this, other laws and regulations prevent us from doing so, as in this case. – Architect Jul 02 '20 at 15:41
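
The suggestion in reed's comment, generating test data instead of copying real records, could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the name lists, and the three defect types (stray whitespace, wrong letter case, inconsistent date format) are hypothetical, since the question does not say what the actual quality problems are. In practice the defects would be modeled on those observed in the real data.

```python
import random

# Placeholder values only; none of these are real people.
FIRST_NAMES = ["John", "Jane", "Jack"]
LAST_NAMES = ["Doe", "Davis", "Brown"]


def make_clean_record(rng):
    """Return one well-formed synthetic record (hypothetical schema)."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    year = rng.randint(2015, 2020)
    month = rng.randint(1, 12)
    day = rng.randint(1, 28)
    return {
        "name": f"{first} {last}",
        "email": f"{first}.{last}@example.com".lower(),
        "signup_date": f"{year}-{month:02d}-{day:02d}",
    }


def degrade(record, rng):
    """Return a copy of the record with one injected data-quality defect."""
    dirty = dict(record)
    defect = rng.choice(["whitespace", "case", "date_format"])
    if defect == "whitespace":
        dirty["name"] = f"  {dirty['name']} "
    elif defect == "case":
        dirty["email"] = dirty["email"].upper()
    else:
        # Rewrite YYYY-MM-DD as DD/MM/YYYY.
        year, month, day = dirty["signup_date"].split("-")
        dirty["signup_date"] = f"{day}/{month}/{year}"
    return dirty


def generate_test_data(count, seed=0):
    """Return (dirty, expected_clean) pairs; the same seed gives the same data."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        clean = make_clean_record(rng)
        pairs.append((degrade(clean, rng), clean))
    return pairs


if __name__ == "__main__":
    for dirty, clean in generate_test_data(3):
        print(dirty, "->", clean)
```

Because each dirty record is paired with its expected clean form, the cleansing logic can be checked automatically before it ever touches production data.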

0 Answers