5

I must give certain employees access to a report which contains email addresses. I would like to redact or partially mask these email addresses, but I am having trouble finding official guidance on how to properly mask email addresses so that they cannot be de-anonymized.

From searching online, I can see others are using masked addresses in the following formats (assuming the email address to be masked is firstname.lastname@provider.com.au):

f***e@provider.com.au

f***e@p***r.com.au

f***e@p***r.***

I'm concerned that none of the information I'm finding is official, and I have no idea if this will stand up to an audit.

Can anyone point me to official guidance on how to partially mask email addresses so that they are no longer considered to be PII by the GDPR, please?

MeMyselfI
  • 3
    Can't you just replace them with bogus emails? That's how we do it – Artog Jul 18 '19 at 06:04
  • 3
    I agree with @Artog, change them to something like customer@example.com. The only time I've seen addresses masked like that was on some websites during password recovery when only the username was filled in ("Your password reset link was sent to c****r@example.com") – Quantoss Jul 18 '19 at 06:21
  • 1
    What's your reason to include the emails at all? If your report would work with `f***e@p***r.***`, then it will work with no email at all. – schroeder Jul 18 '19 at 15:36
  • The report is of emails sent to externals, who may call these employees to say they didn't receive their emails. The employees need to be able to roughly verify that the emails were sent to the correct address. (E.g., sometimes the externals change the email address they want the emails sent to, and it turns out we sent the email to the old address.) – MeMyselfI Jul 19 '19 at 02:14

2 Answers

4

There is no official guidance because this is not a requirement that GDPR enforces. GDPR does not prescribe specific security measures; it only makes recommendations about what you should consider. Since you are considering something like this, you should in theory carry out a Data Protection Impact Assessment (DPIA) to identify what levels of risk are associated with your situation, and then decide how to proceed.

Generally, turning username@provider.com into u******e@provider.com is perfectly fine for most cases, since the domain name does not identify a person. However, if you specifically intend to protect domain names as well (whether you need to is, again, something the DPIA can determine), you could extend the masking to cover the domain too: u******e@p******r.com.
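As a rough illustration of the two formats above, here is a minimal Python sketch. The function names `mask_part` and `mask_email` and the `mask_domain` flag are made up for this example; they are not from any library, and the handling of very short parts is an assumption.

```python
def mask_part(part: str) -> str:
    """Keep the first and last character and star out everything between."""
    if len(part) <= 2:
        # Assumption: a part this short is starred out completely,
        # because keeping both ends would reveal all of it.
        return "*" * len(part)
    return part[0] + "*" * (len(part) - 2) + part[-1]


def mask_email(address: str, mask_domain: bool = False) -> str:
    """Mask the local part, and optionally the first domain label."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    masked_local = mask_part(local)
    if not mask_domain:
        return f"{masked_local}@{domain}"
    # Mask only the first domain label and keep the rest (e.g. ".com").
    first_label, dot, rest = domain.partition(".")
    return f"{masked_local}@{mask_part(first_label)}{dot}{rest}"


print(mask_email("username@provider.com"))
print(mask_email("username@provider.com", mask_domain=True))
```

Note that this sketch keeps one asterisk per hidden character, so the masked value still reveals the length of each part. Whether that is acceptable is one more thing for the DPIA to decide.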

Overmind
  • 6
    If there is only one person in this system matched by the masked address _u******e@provider.com_, then this pattern **does** identify a person! Also, some people use their own domain for email. If no one else is using this domain, it is PII too! – Josef Jul 18 '19 at 13:07
  • This is an exception that must be determined when you do a DPIA as I mentioned. – Overmind Jul 18 '19 at 13:09
  • 2
    it's not an exception. The question is: "Can you guarantee with high confidence that this data can't be used to identify a single person?" and the answer should better be "Yes". – Josef Jul 18 '19 at 13:12
  • Accepted because of the DPIA suggestion, which I will ask for, which I take as the primary answer to my question. Secondarily, I've currently got the software redacting like this [a***b@p***] which is in line with @mootmoot's suggestion in the interim. – MeMyselfI Jul 19 '19 at 02:18
  • Oops, the asterisks were taken as bold, I meant [a * * * b@p * * *] – MeMyselfI Jul 19 '19 at 04:13
3

GDPR is more restrictive than the US definition of PII: data that is not PII in the US sense but allows any inference about a person's identity still falls under GDPR.

I doubt the given masking examples will withstand a GDPR audit. Replace the email address with an obvious placeholder (e.g. redacted@redacted.invalid); that is what everyone is doing.
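A minimal sketch of such a replacement in Python, assuming the report is plain text. The regular expression is a deliberately simple approximation of an email address rather than a full RFC 5322 parser, and `redact_emails` is a hypothetical helper name.

```python
import re

# Simplified pattern (an assumption): local part, "@", then dotted domain labels.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PLACEHOLDER = "redacted@redacted.invalid"


def redact_emails(text: str) -> str:
    """Replace everything that looks like an email address with the placeholder."""
    return EMAIL_PATTERN.sub(PLACEHOLDER, text)


print(redact_emails("Sent to smith@provider.com and a.b@x.org."))
```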

Partial masking offers weak privacy, e.g. s****@provider.com can easily be traced back to smith@provider.com if smith is the only name starting with 's' that uses an @provider.com address.

Even domain masking is not enough, since a lookup table of known domain names can be built to reverse the masking, e.g. p***r.*** maps back to provider.com.
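To make the point concrete, here is a small Python sketch of that kind of reverse matching. It assumes an attacker (or auditor) holds a list of candidate addresses; the addresses, the `mask` format and the function names are all invented for illustration.

```python
from collections import defaultdict


def mask(address: str) -> str:
    """Hypothetical masker producing values like s****@p***r.***."""
    local, _, domain = address.rpartition("@")
    label = domain.split(".")[0]
    return f"{local[0]}****@{label[0]}***{label[-1]}.***"


def uniquely_identified(known_addresses):
    """Return {masked: address} for masks that match exactly one known address."""
    candidates = defaultdict(set)
    for address in known_addresses:
        candidates[mask(address)].add(address)
    return {m: next(iter(a)) for m, a in candidates.items() if len(a) == 1}


# Made-up candidate list, e.g. a staff directory or a leaked address dump.
known = ["smith@provider.com", "sanders@provider.com", "jones@provider.com"]
print(uniquely_identified(known))
```

With this made-up list, the mask for jones is matched by only one candidate and is therefore fully reversed, while smith and sanders share a mask. Running the same check against your own data is one way to see how many masked values still point at a single person.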

The same applies to other attributes such as gender and age. It is not difficult to work out that a record belongs to the 35-year-old Smith when the data stores ****@p*****.*** next to age and gender and there is only one 35-year-old Smith in the database.

mootmoot