
I work at a large-ish tech company with hundreds of websites and some large web applications with a very large number of users.

I am planning to propose a central system for tracking deletion requests, so that when a person asks us to delete their details, they are deleted everywhere.

If a database is restored, any deleted user's record will return, so deletion requests can't be one-off requests that are never recorded. We would need a record of users who have asked for their details to be deleted (unless there is a better idea).

It sounds self-defeating, and likely non-compliant, to store details of users who have asked for their details to be removed.

It also feels like this is likely a previously invented wheel.

Is there a recommended or widely accepted practice for this?

I was thinking maybe a hash of the user's first name, another hash of their surname, and lastly a hash of their email address; development teams could then hash the emails on their own systems and check for matches.

Edit: To aid search queries, I found the standard term for this is 'suppression list' after some further googling.

ZZ9
  • Reach into backups and delete there, too. – schroeder Apr 23 '18 at 11:45
  • @schroeder this might not be a good idea - there has to be some record of deletion, otherwise new sets of data obtained from various sources could theoretically contain the before-deleted information (again). – Tobi Nary Apr 23 '18 at 11:58
  • I assume you have some sort of user id? Can you just store that id of deleted users, but nothing else? I have absolutely no idea if that is a compliant solution, but from a pure technical perspective I think it would do the trick. – Anders Apr 23 '18 at 11:59
  • @SmokeDispenser new sets from other sources is a new scope – schroeder Apr 23 '18 at 12:11
  • @schroeder Backups are explicitly exempt from GDPR, and in our case we have multiple layers of backups (Glacier, Tape, Veeam Snapshots, SAN Replication and physically sending them off site to IronMountain). It's not feasible to edit them, and doing so risks their consistency. – ZZ9 Apr 23 '18 at 12:15
  • @Anders We have multiple systems, some with user IDs, some with just email addresses... we also have forums, and hundreds of other sites that don't necessarily use our single sign on system. – ZZ9 Apr 23 '18 at 12:18
  • Please cite sources that backups are exempt in GDPR. You may have reasonable business practices that means that you are cleared to keep personal data in your backups, but that is not a GDPR statement. You must have a legitimate reason to keep backups other than "it's really hard to delete stuff from them". – schroeder Apr 23 '18 at 12:19
  • @schroeder Backups are actually exempted in two ways: Article 17 of GDPR is only applicable when the data controller has no legal basis for processing personal data. Backups and backup consistency are a legal requirement, and this is widely interpreted as exempting backups from the right of erasure. The second is that archival material such as data backups and microfiche archives are often non-divisible, uneditable records, whereas GDPR states that only reasonable steps must be taken to ensure the data is deleted. It is generally accepted that deleting entire archives for one user is not reasonable. – ZZ9 Apr 23 '18 at 12:40
  • @schroeder "…taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures…" Reasonable is a clearly defined legal term with a specific meaning. – ZZ9 Apr 23 '18 at 12:40
  • @schroeder A quick google shows this: https://thegdprguy.com/right-to-erasure/ – ZZ9 Apr 23 '18 at 12:41
  • @AirCombat but none of this means that "backups are exempt" - you have to show that in your cases that it passes the "reasonable" test. Make sure your DPO is looped in on this. You can't just wave a hand on this one. – schroeder Apr 23 '18 at 12:44
  • Also, while I agree that tapes and physical backups may be exempt, that does not mean all backups are. You should delete where possible and only keep what can't be reasonably deleted. But I would agree that storing an ID of the user to delete again should not violate the GDPR. An internal number can hardly be considered personal information on its own. I would argue hashes are more personal, as they can be easily connected with names. – Peter Harmann Apr 23 '18 at 13:35
  • @schroeder Thanks, we have a meeting coming up anyway so I will. – ZZ9 Apr 24 '18 at 11:31

1 Answer


It is exactly the right thing to have a deletion request management system. In fact, given the importance of this function, the timeframes for response, and the workflows to coordinate, it is almost certainly necessary.

GDPR does not prevent companies from storing personal details of their users/customers/etc for legitimate purposes. Instead it is intended to generate more thoughtful practices around the management lifecycle typically associated with this sensitive data.

For instance, teams in companies typically record interactions with different flavors of users by tracking relevant, and often personal, details in whatever workflow tool the particular team happens to prefer. Maybe this is Salesforce, or JIRA, or email. Whatever.

This careless sprinkling of personal details around team-specific workflow infrastructure is one of the anti-practices a successful GDPR implementation will ideally deprecate.

So a deletion request management system needs to store personal information that is used to match records in other systems. It needs to be sufficient to identify the person requesting the deletion. That data can be retained for the duration of its legitimate need. When that need has expired (the deletion workflow is completed), this system is no different from any other: personal details get deleted from the deletion request management system as well. (The exact policies and timing around this are of course decided by the legal team.)

In terms of practices, the likely common practice with databases is scrubbing. IDs of records with scrubbed fields need to be retained, both for future validation and also to support the scrubbing process being applied following a restore of production data from backup.
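To make the restore case concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table names (`users`, `scrubbed_ids`), the columns, and the `reapply_scrub` helper are all hypothetical, and in a real setup the list of scrubbed IDs would have to live somewhere that is not itself rolled back by the restore. The idea is only that the retained IDs let the scrubbing be replayed after old data comes back.

```python
import sqlite3


def reapply_scrub(conn: sqlite3.Connection) -> int:
    """Blank the personal fields of every record whose ID is on the retained list.

    Returns the number of rows that were updated.
    """
    cur = conn.execute(
        "UPDATE users SET name = NULL, email = NULL "
        "WHERE id IN (SELECT user_id FROM scrubbed_ids)"
    )
    conn.commit()
    return cur.rowcount


# Simulate a database that has just been restored from backup:
# user 2 was scrubbed before, but the restore brought the details back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE TABLE scrubbed_ids (user_id INTEGER PRIMARY KEY)")  # hypothetical retained list
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Alice", "alice@example.com"), (2, "Bob", "bob@example.com")],
)
conn.execute("INSERT INTO scrubbed_ids VALUES (2)")

scrubbed = reapply_scrub(conn)
rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
```

Running something like this as a mandatory step of the restore procedure means the retained list only ever holds internal IDs, not the details themselves.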

When it comes to matching, one nit: my understanding as a non-lawyer is that what is considered to be personal information is not just identifying information like first and last name. Personal attributes that may not be identifying but are nevertheless unique and characterizing of a person are also subject to deletion/scrubbing.

Using hashes to efficiently match fields across systems may help, but with names, there are numerous spelling and other variations. Doing this sort of matching is its own data management practice that companies that have had to consolidate customer databases are likely familiar with.
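As a rough illustration of hash-based matching on email addresses, here is a sketch using only the Python standard library. Everything in it is an assumption for illustration: the function names, the normalization rules (trim, Unicode NFKC, case-fold), and the use of a keyed HMAC rather than a plain hash. The key is there because unkeyed hashes of email addresses can be guessed from lists of known addresses; whether a keyed hash is acceptable from a compliance standpoint is still a question for the legal team.

```python
import hashlib
import hmac
import unicodedata


def normalize_email(email: str) -> str:
    """Reduce trivial variations (whitespace, case, Unicode forms) before hashing."""
    return unicodedata.normalize("NFKC", email).strip().casefold()


def suppression_token(email: str, key: bytes) -> str:
    """Keyed SHA-256 digest of the normalized email, as a hex string."""
    return hmac.new(key, normalize_email(email).encode("utf-8"), hashlib.sha256).hexdigest()


def is_suppressed(email: str, key: bytes, suppressed: set) -> bool:
    """Check a locally held email against the central set of tokens."""
    return suppression_token(email, key) in suppressed


# Hypothetical shared secret; in practice this would come from a secrets store.
KEY = b"example-shared-secret"

# The central list stores only tokens, never the address itself.
suppressed = {suppression_token("Jane.Doe@Example.com", KEY)}

match = is_suppressed("  jane.doe@example.COM ", KEY, suppressed)
no_match = is_suppressed("john.smith@example.com", KEY, suppressed)
```

This only handles exact matches after normalization. It does nothing for the name variations mentioned above, since two different spellings of a name produce unrelated digests.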

Jonah Benton