
The GDPR changes a lot of data protection law, but how will it affect dumped databases of passwords?

At the moment these can be used to work out the most common passwords, and sites can use this knowledge to prevent people choosing overly common passwords.

Will this still be allowed under GDPR, given that password databases can be considered personal data? If not, would anonymising the passwords, so that they are stored without account details, fix the issue, or would the passwords still be treated as personal information?

jrtapsell

2 Answers


GDPR applies specifically to personal data (often referred to as personally identifiable information, or PII): information that could reasonably be used to identify a person, such as a postal address, email address, user name or IP address.

If the password dump contains only passwords, with no usernames, account details or IP addresses, it would likely not be considered PII for the purposes of GDPR.

However, remember that if you hold the plaintext password and someone has used something personal as their password, such as firstname_lastname_dob, that password could itself be construed as personally identifiable.

Standard caveat: This is not legal advice. I am not a lawyer, but I am based in the EU and working on a team ensuring GDPR compliance for a company processing sensitive data.

iainpb

While this is indeed mostly a legal question, here is what pertains to the practicalities of security:

  • Accounts stored together with their passwords are information relating to an identifiable person, so they are definitely personal data under GDPR.

  • You only need to store hashes of passwords to catch matches. Hashing doesn't change whether the data is personal under GDPR; it just turns it into pseudonymous data.

  • The most practical way I see to store passwords for this purpose is a two-column table holding each password's hash and the number of times it has been found. Only passwords with a count greater than 1 should be included, as that ensures they can't uniquely identify anyone.
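As a rough illustration of the table described in the last bullet, here is a minimal Python sketch. The function name `build_common_hash_table`, the choice of SHA-256 and the sample `dump` list are all assumptions made for the example; the answer does not specify any of them.

```python
import hashlib
from collections import Counter


def build_common_hash_table(dumped_passwords, min_count=2):
    """Map SHA-256 hex digest -> occurrence count, keeping only repeated passwords.

    Hypothetical helper: passwords seen fewer than `min_count` times are
    dropped, so no entry corresponds to a single person's unique password.
    """
    counts = Counter(
        hashlib.sha256(pw.encode("utf-8")).hexdigest() for pw in dumped_passwords
    )
    return {digest: n for digest, n in counts.items() if n >= min_count}


# Made-up sample data standing in for a password dump stripped of account details.
dump = ["123456", "password", "123456", "alice_smith_1990", "password", "123456"]
table = build_common_hash_table(dump)
```

Here the unique entry (`alice_smith_1990`) never reaches the table, while the two repeated passwords are kept with their counts.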

Specific implementation:

  1. The server randomly picks salt1 and changes it periodically.
  2. The server hashes each entry in its dictionary of bad password hashes (badphash1) together with salt1: hash(badphash1+salt1) => badphash2.
  3. salt1 is sent server->client.
  4. The user enters the password in their client (browser).
  5. The client hashes the password to get hash1, then computes hash(hash1+salt1) => hash2.
  6. hash2 is sent client->server.
  7. If hash2 is in badphash2, the server rejects the password and asks for a new one (back to step 4).
  8. Otherwise the server creates a user-specific salt2 and sends it server->client.
  9. The client computes hash(password+salt2) => hash3.
  10. hash3 is sent client->server.
  11. salt2 and hash3 are stored on the server.
  12. On login, hash(password+salt2) is checked against hash3.
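The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `h` uses SHA-256 as a stand-in because no hash function is named here, client and server run in one process with no network layer, and every function name is hypothetical. A production system would generally prefer a dedicated password hashing function for the stored hash.

```python
import hashlib
import hmac
import secrets


def h(data: bytes) -> bytes:
    """Stand-in hash (SHA-256); the specific function is an assumption."""
    return hashlib.sha256(data).digest()


# Server: hashes of known-bad passwords (badphash1), e.g. derived from a public dump.
bad_hashes_1 = {h(pw.encode()) for pw in ("123456", "password", "qwerty")}

# Steps 1-2: pick a periodically rotated salt1 and precompute badphash2.
salt1 = secrets.token_bytes(16)
bad_hashes_2 = {h(bad + salt1) for bad in bad_hashes_1}


def client_hash2(password: str, salt1: bytes) -> bytes:
    """Steps 4-5 (client): hash1 = h(password), hash2 = h(hash1 + salt1)."""
    return h(h(password.encode()) + salt1)


def server_is_banned(hash2: bytes) -> bool:
    """Step 7 (server): reject if hash2 matches a salted bad-password hash."""
    return hash2 in bad_hashes_2


def client_hash3(password: str, salt2: bytes) -> bytes:
    """Step 9 (client): hash3 = h(password + salt2)."""
    return h(password.encode() + salt2)


def register(password: str):
    """Steps 3-11: return (salt2, hash3) to store, or None if the password is banned."""
    if server_is_banned(client_hash2(password, salt1)):
        return None
    salt2 = secrets.token_bytes(16)  # step 8: user-specific salt
    return salt2, client_hash3(password, salt2)


def login(password: str, record) -> bool:
    """Step 12: recompute hash3 and compare in constant time."""
    salt2, hash3 = record
    return hmac.compare_digest(client_hash3(password, salt2), hash3)
```

Calling `register("password")` returns `None` because the password is in the banned set, while an uncommon password yields a `(salt2, hash3)` record that `login` can later verify.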

This way, the salt used for storage (salt2) isn't the same for all passwords; only the duplicate-checking salt (salt1) is shared. This follows best practice for both storage and transmission.

In other words: there's no security reason to store plaintext passwords, not even to check for duplicates. Even that can be done on hashes without compromising anything.

ZOMVID-21
  • Secure hashing of passwords involves salt. This means you can't compare hashes to find matching passwords. – Neil Smithline Feb 25 '18 at 04:08
  • It does, but sometimes you have to compromise. Just have to hash them twice - once without salt to find matches, then with salt for storage. – ZOMVID-21 Feb 25 '18 at 07:06
  • A hashed but unsalted password is little better than an unencrypted password. You should never store passwords as unsalted hashes. – Neil Smithline Feb 25 '18 at 14:45
  • You don't need to. Compare hashes in RAM, clear them, report if the password is OK, if it is, then store the salted hash. – ZOMVID-21 Feb 25 '18 at 15:05
  • (past edit time) Furthermore, salt can be applied to hashes just as it can to passwords. You can salt and hash that initial raw password's hash, then compare it to salted-and-hashed hashes in your database. The table can be precomputed, as long as it's specific to your site, and recomputed with a new salt every once in a while. – ZOMVID-21 Feb 25 '18 at 15:21
  • But two identical passwords won't have the same hash if they have different salts. Using the same salt for all passwords allows a precompute attack. I don't follow how you can follow password best storage practices and compare for duplicates. – Neil Smithline Feb 25 '18 at 16:19
  • I've edited the answer to elaborate on how specifically it can be implemented. In short, you need to use separate salts for duplicate checking and password storage. – ZOMVID-21 Feb 25 '18 at 17:24
  • I'm still not sure I follow. Sorry. But I think we've gotten way off topic. The question is about presumably public password dumps. – Neil Smithline Feb 25 '18 at 18:50
  • Yes. My updated answer addresses a specific method of using public dumps to ban frequently-used passwords in a GDPR environment without risking leakage of user passwords. – ZOMVID-21 Feb 25 '18 at 19:24
  • This has gone off on a slight tangent if I may say so. –  Feb 26 '18 at 16:24