How many rounds should be used to hash card numbers?

Question

We want payments that a customer makes without logging in, using a card that matches one of their saved payment methods, to be associated with that saved payment method. E.g. if they buy a recurring subscription to magazine 1 using credit card 1, then buy another subscription to magazine 2 with credit card 1 again, then when they log in to their account it should show that both magazines were purchased using the same payment method (not that they were purchased using two separate cards that just happen to end in the same last 4 digits).

As they are not logged in during these two checkouts, there's no way for them to pick their existing payment method to use. Internally, we need to realize that this payment method has been used before and "dedup" them.

My solution to this problem is to use Blowfish to hash the card details:

private static String hashSalt(Long userId) {
    final Long rounds = 10
    String userHash = sha1("$userId" + GLOBAL_SALT).substring(0, 16)
    return "\$2a\$$rounds\$$userHash"
}
private static String mergeCardDetails(String number, String cvv, String expirationMonth, String expirationYear) {
    return "card:$number:$cvv:$expirationMonth:$expirationYear"
}

hash = BCrypt.hashpw(mergeCardDetails("4111111111111111", "123", "05", "22"), hashSalt(userId))

Currently the round count is set to 10. However, I realize this is a very low number, as the search space for card numbers is very small.

My question is: how many rounds are appropriate for hashing card numbers today?

I know that in the future the round count will need to be increased, resulting in duplicate payment methods...

You say 'allow payments to be made without login' but also 'using their saved payment methods'. How are you identifying the user to know their saved payment methods? — PwdRsch, Jul 03 '18 at 16:43
You seem to be using userId as the salt, so how are you getting that if they're not logged in? — Neil Smithline, Jul 03 '18 at 16:43
What do you mean that they login "by email"? Do you send them a link? Do they just need to type their email address? Other? — Neil Smithline, Jul 03 '18 at 16:44
They put their email during the checkout process. No verification required. — Chris Smith, Jul 03 '18 at 16:45
@NeilSmithline By my reading, they don't log in by e-mail, e-mail is just used to identify "hey, this is probably person X, who we already know." They're not being allowed to USE saved payment methods, they still have to enter the full information, the goal is just to recognize ex post facto that the payment method used matches a saved method. Do I have it right, @ChrisSmith? — Tin Wizard, Jul 03 '18 at 21:46
@Walt Indeed, that's right! We want to avoid them checking their accounts and seeing two payment methods that _look_ the same, then contacting us wondering why there are duplicates. I realize it is possible for somebody that is not person X to add payment methods to person X's account, but that is acceptable in the business requirements. — Chris Smith, Jul 03 '18 at 21:50
The usability enhancement of seeing two payments listed under the same account seems so minor, and the danger of storing these payments so high, that I would say just don't do it. Instead, change your listing to show each payment separately, and which card was used to pay for it. — Martin Bonner supports Monica, Jul 04 '18 at 07:07
Card companies and payment providers usually specify how systems should store payment details, **if** they even allow it. — molnarm, Jul 04 '18 at 07:41
"Not that they both were purchased using two separate cards that just so happen to end in the same last 4 digits." I think you are worrying about a use-case that won't happen. First a single person has to have "duplicate" cards, then they have to _care_ about what your app has to say about their purchases. Just assume no one uses two cards with the same four last digits; it won't be a problem. — Odalrick, Jul 04 '18 at 11:29
So basically your users can provide an email and a set of card details, buy something, and that item automatically gets logged to the account associated with the provided email? No verification that the person doing that action actually owns the account linked to the provided email? — Pharap, Jul 04 '18 at 11:55
Storing CVV is not PCI compliant. Expect expensive legal problems if you do that. — , Jul 04 '18 at 13:56
@Odalrick: I agree; especially when taking into account *expiry date*. I'd rather store the maker (Visa/MasterCard/...), the last 4 digits and the expiry date, no encryption needed and should always be unique enough for identification. — Matthieu M., Jul 04 '18 at 15:17
Well we don't want anybody to enter a card similar to the saved card and the result is a purchase made with a saved card. The cards have to be the same, not "similar enough". — Chris Smith, Jul 04 '18 at 15:35
@ChrisSmith Then you _have_ to store everything. Hashing algorithms intentionally discard information, that's what hashing means. We are suggesting you use the generally recognized as "good enough" algorithm of the last four digits and that you reevaluate the _actual_ use case you have. It just seems you are designing for something that will never happen; like unintentional sha256 collisions. — Odalrick, Jul 05 '18 at 09:18

ThoriumBR · Answer 1 · 2018-07-04T14:59:23.657

38

Looks like you are hashing card details along with the CVV. That's bad, very bad. Don't do that. Ever. And there's no way to do the wrong thing the right way.

"A man may do a right thing in a wrong way; but he cannot do a wrong thing in a right way. For there is no right way of doing wrong."

There are very cheap hash cracking rigs around the world that can try millions of hashes per second. Breaking your hashes can be done very fast, if you take into account that you have a handful of valid banks or issuers, so the first 6 digits are taken from a table, not the full search space. One database leak and all financial data is available on the underground forums.

Don't compromise your security for a little increase in convenience. It's not hard for the client to type the card again, or log in to PayPal to pay you, but it will be terrible if they receive a mail from you informing them that all their card data leaked and that they have to cancel and reissue their cards, and keep an eye on their accounts to spot any fraudulent purchases.

Don't store payment information. Let the payment gateway process it. If they mishandle card data, it's on them, not on you.

Don't store card data. Never ever store the CVV, hashed or not, encrypted or plaintext, in clear or base64 or whatever other way.

Just a little math for the cracking time.

The first 6 digits are the issuer identifier. There are far fewer than a million of them, so you can look them up in a table, get all the possible ones, and pick the most common: Bank of America, Citibank, HSBC, Chase. You will have around a hundred candidates for the first 6 digits.

For a MasterCard, the next 9 digits are the account number, and the last digit is the verification digit. You can calculate the last one way faster than throwing it at the hash function, so you will have 1,000,000,000 possible account numbers and around 100 possible issuers, for a total of 10¹¹ possible card numbers.
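To illustrate why the verification digit adds nothing to the search space, here is a minimal sketch of the standard Luhn check-digit calculation. The function name is made up for this example, and the card number used is the well-known Visa test number from the question, not real data.

```python
def luhn_check_digit(partial: str) -> int:
    """Return the Luhn check digit for a card number missing its last digit."""
    total = 0
    # Walk the known digits from right to left. The rightmost known digit
    # sits next to the (missing) check digit, so it is the first one doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


# The first 15 digits of the test card 4111111111111111 determine the 16th.
print(luhn_check_digit("411111111111111"))
```

An attacker therefore only needs to enumerate the issuer prefix and the account number; the last digit follows from a few additions.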

With a hashing rig with 8x Nvidia GTX 1080s, they cracked 105 thousand OpenBSD bcrypt hashes per second, with a work factor of 5. I don't know the parameters you used in your bcrypt calculation, but let's assume you did it as well as the OpenBSD developers (and they are quite good at it).

Using these numbers, a single rig can crack this relevant search space in:

100,000,000,000 numbers at 105,000 per second = 950k seconds
950k seconds = 264 hours = around 11 days

An attacker will not crack every single card in your database in 11 days, but will crack all cards from the major issuers in less than 2 weeks. Rent some Ethereum mining rigs from your fellow cryptocurrency miner, and an attacker can crack this all in a day.
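The arithmetic above can be reproduced with a few lines. This is only a sketch under the figures quoted in this answer (about 100 issuers, 10⁹ account numbers each, 105,000 hashes per second at work factor 5). The scaling step assumes bcrypt's cost grows as 2^cost, so each extra unit of work factor doubles the time; it describes one pass over the search space, i.e. for a single salt value.

```python
SEARCH_SPACE = 100 * 10**9   # ~100 issuers x 10^9 account numbers
HASH_RATE = 105_000          # hashes per second quoted for the 8-GPU rig
BENCHMARK_COST = 5           # work factor used in that benchmark


def days_to_exhaust(candidates: int, rate: float) -> float:
    """Days needed to try every candidate once at the given hash rate."""
    return candidates / rate / 86_400


def scale_for_cost(days: float, from_cost: int, to_cost: int) -> float:
    """Rescale a duration, assuming work doubles per unit of cost."""
    return days * 2 ** (to_cost - from_cost)


base_days = days_to_exhaust(SEARCH_SPACE, HASH_RATE)
print(f"work factor 5:  {base_days:.1f} days")

# The question uses 10 rounds, i.e. 2^5 = 32 times more work per hash.
at_cost_10 = scale_for_cost(base_days, BENCHMARK_COST, 10)
print(f"work factor 10: {at_cost_10:.0f} days")
```

Under these assumptions the higher work factor stretches days into months on a single rig, which delays an attacker but does not change the conclusion.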

edited Jul 04 '18 at 14:59

answered Jul 03 '18 at 17:45

ThoriumBR

I'm aware there is a security risk due to the small search space and the speed of hashing, as addressed in my question. My question is asking what the appropriate number of Blowfish rounds would be to mitigate this risk. – Chris Smith Jul 03 '18 at 17:53
13

You cannot mitigate the risk, you can only slow things down a little. It's about having the attackers get all credit card information in a day or in a week. No matter how many rounds you choose, you are only delaying disclosure. You cannot compete against a purpose-built, specialized hash cracking computer. – ThoriumBR Jul 03 '18 at 17:58
3

@ChrisSmith it is a losing battle. You gotta worry about CC security, your kids, the kitten litterbox and your wedding anniversary next week. The bad guys only care about cracking your sweet data. And it is not a matter of **IF**, it is a matter of **when**. Given that a single plastic card has an average validity lifespan of 3 years, even if they crack the data in a timeframe of ***months***, most cracked cards will still be usable. Not a day, nor a week, as thorium stated. – Mindwin Jul 03 '18 at 18:12
3

Your numbers for bcrypt are very off, 5 was never a recommended cost for bcrypt. The [paper](https://www.usenix.org/legacy/event/usenix99/provos/provos.pdf) from 1999 suggests 6 for normal users and 8 for superusers (section 5.1). The recommendations now are usually 10-12 (10 is seeming a bit low these days). For something like this 12 would be an absolute minimum (if it is done at all, which it shouldn't be). – AndrolGenhald Jul 03 '18 at 19:26
1

@ChrisSmith, if you absolutely have to do this, "lots and lots and lots and lots and lots". As many rounds of hashing as you can tolerate. Assuming you're doing offline deduplication and don't need to respond to the customer in a reasonable time period, I'd try to ensure the hashing process takes at least an hour. – Mark Jul 03 '18 at 22:51
2

@Mark A small cluster of GPUs (imagine a rack or two of 4U servers with 16 GPUs each) could still find a good number of card details even with such a high work factor... – forest Jul 04 '18 at 02:35
@Mark If it takes an hour to hash the details, the OP would be bankrupt before the crackers could mine the card data because everyone would just switch to a faster service. As the saying goes: work smarter, not harder. – Pharap Jul 04 '18 at 11:46
1

@Pharap The answer already said how to do it "smarter". To not do it at all. There is no smart way to stop a meteor by building a brick wall, but if you are absolutely adamant on doing it, make it really high and really thick and really "harder". Which is what Mark suggested. The "pointlessness" of his solution should underscore the main point of the answer. Also, he said "assuming you are doing it offline and don't have to respond to the customer", so your point is void. – xDaizu Jul 04 '18 at 14:43
3

While I agree with your answer, I think it is also reasonable to give the OP accurate specifics. Therefore it is worth pointing out that the salting means that each card must be cracked individually. This means that your 11 days is 11 days *per card*, not for all of them - a huge difference. I ran the numbers myself this morning and despite low entropy suspect that you could store cards with minimum risk of brute force via bcrypt with a sufficiently high cost factor. However, I still think it is a bad idea and the best bet is to find a completely different solution. – Conor Mancone Jul 04 '18 at 14:45
The main issue with OP's question is that he is thinking about hash speed **today**, not in 10 years when his code is still in use **and** GPU technology has evolved and crypto rigs are 32 times faster (per Moore's Law). – ThoriumBR Jul 04 '18 at 14:55
1

@ThoriumBR that's true, but it is true for passwords too. It doesn't mean hashes are just useless - simply that you need to be able to continuously adjust your cost factor with time (although that's probably harder with credit cards than passwords since the original has to come back through your system to update the cost factor - that happens on login with passwords but only on purchase with credit cards). – Conor Mancone Jul 04 '18 at 14:57
1

It's way easier to ask users to change their passwords than to ask them to revoke their credit cards and ask their banks to issue new ones. Hashes are not useless, but you must balance convenience and security. OP is trading a very small convenience for a massive security trade-off. – ThoriumBR Jul 04 '18 at 15:24

Conor Mancone · Answer 2 · 2018-07-04T14:56:40.673

I agree with ThoriumBR's answer. I have a few more details and a suggestion:

You are trying to hit a moving target. "How many rounds is enough?" is a question with a changing answer as hardware availability changes. When it comes to password hashing setups many systems therefore have methods to automatically increase the hashing rounds with time. What is a minimum number of rounds? Some math could certainly answer it (by comparing the average entropy of a strong password versus the average entropy of a credit card number and increasing the cost accordingly). However, I suspect that the answer is "a lot of rounds". Moreover, I worry that what you are trying to do is fundamentally flawed.
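As a rough illustration of how systems raise the round count over time, the sketch below reads the cost factor out of a stored bcrypt string and flags it as outdated. The function names and the target cost are hypothetical, and the sample hash is only a format example. With passwords the rehash happens at the next login; with cards it could only happen the next time the full card details pass through the system.

```python
def bcrypt_cost(stored_hash: str) -> int:
    """Extract the cost factor from a bcrypt string such as '$2a$10$...'."""
    # Layout: $<version>$<two-digit cost>$<22-char salt><31-char digest>
    _, version, cost, _ = stored_hash.split("$", 3)
    if not version.startswith("2"):
        raise ValueError("not a bcrypt hash")
    return int(cost)


def needs_rehash(stored_hash: str, target_cost: int) -> bool:
    """True when the stored hash was made with a lower cost than desired."""
    return bcrypt_cost(stored_hash) < target_cost


# Illustrative value only: a bcrypt-formatted string with cost 10.
example = "$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy"
print(bcrypt_cost(example), needs_rehash(example, target_cost=12))
```

Note that a hash can never be upgraded in place: the original input is required, which is exactly what makes this harder for cards than for passwords.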

You tagged this question as "PCI-DSS", so obviously this is a relevant concern. Generally the best way to achieve PCI compliance is to simply never have credit card numbers on your server. The fact that you are hashing full credit card details means that you have those numbers. I don't remember exactly what the rules are for PCI compliance when credit card numbers actually travel through your server, but I do know that it is much more complicated and can even be prohibitively expensive. It's much better to just not do it.

Moreover, the way you are hashing is both not quite the norm and increases the security risk to your customers. Salts should be a random string - no need to tie them to the user id (which seems strange too, since the whole point of this is to match credit cards between anonymous users). Also, including not just the credit card number but the expiration and CVC in the hash is just a terrible idea. To accomplish what you want to accomplish you really just need the credit card number - any more information just lets an attacker get the full credit card details in the event that they obtain and successfully brute force the hash.

This is kind of where the whole "Don't roll your own security" idea comes in. If you'll forgive me, it sounds like you might be a bit outside of your experience level when it comes to securing important customer data, so coming up with your own system for matching customer credit cards seems like a good way to inadvertently leak a lot of credit card numbers and cause a lot of trouble for your business.

To recap you might be best off listening to ThoriumBR's answer: just don't do this. The slight improvement in the UI for your customers isn't worth the increased risk of leaking their credit card numbers. If you do want to do this, find a PCI compliant credit card processor that can translate the credit card number into a unique and secure string for you. You can then safely store and compare it in your database. The immediate example off the top of my head is Stripe.
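A minimal sketch of that idea, assuming the processor returns a stable per-card identifier: the example stores that string and compares it on each new payment. Stripe exposes a `fingerprint` attribute on card objects for this purpose, but check your processor's documentation for what it actually provides. Every name and value below is hypothetical, and no processor API is called.

```python
from typing import Optional


def find_saved_method(saved_methods: list, fingerprint: str) -> Optional[dict]:
    """Return the saved method with this processor identifier, if any."""
    for method in saved_methods:
        if method["fingerprint"] == fingerprint:
            return method
    return None


def record_payment_method(saved_methods: list, fingerprint: str, last4: str) -> dict:
    """Reuse a matching saved method, otherwise store a new one."""
    existing = find_saved_method(saved_methods, fingerprint)
    if existing is not None:
        return existing
    method = {"id": len(saved_methods) + 1, "fingerprint": fingerprint, "last4": last4}
    saved_methods.append(method)
    return method


methods = []
first = record_payment_method(methods, "fp_aaa", "1111")
again = record_payment_method(methods, "fp_aaa", "1111")   # same card
other = record_payment_method(methods, "fp_bbb", "1111")   # different card, same last 4
print(len(methods), first is again, other["id"])
```

The application only ever sees the opaque identifier and the last four digits, so there is nothing in the database worth brute forcing.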

And some numbers

I was also looking at some numbers this morning. I found this article relevant:

https://www.pxdojo.net/2015/08/what-i-learned-from-cracking-4000.html

The person describes using what is (probably) a low-to-mid-level hashing setup to crack passwords from the Ashley Madison data dump. The passwords were hashed with bcrypt at a cost factor of 12. (I believe that the number of rounds is equal to 2^cost_factor, i.e. a cost factor of 12 = 2^12 = 4096 rounds of bcrypt, but absolutely verify that if you actually try this.) With this guy's cracking rig he had a hash rate of 156 hashes per second. ThoriumBR outlines the entropy of a credit card number quite well (about 1e11 possible credit card numbers), which means that cracking one credit card number at a cost factor of 12, with the same rig as in the above link, would take about 20 years. Salting makes sure that each hash has to be brute forced individually.
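The estimate above can be checked with a short calculation. This is a sketch using only the figures quoted in this answer (1e11 candidates, 156 hashes per second at cost factor 12); the function names are made up, and the figure is the time to try every candidate, so the expected time to hit one particular card is about half of it.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def bcrypt_rounds(cost: int) -> int:
    """Number of key-expansion rounds for a given bcrypt cost factor."""
    return 2 ** cost


def years_to_exhaust(candidates: float, hashes_per_second: float) -> float:
    """Years needed to try every candidate once at the given rate."""
    return candidates / hashes_per_second / SECONDS_PER_YEAR


worst_case = years_to_exhaust(1e11, 156)
print(bcrypt_rounds(12))                      # rounds at cost factor 12
print(f"{worst_case:.1f} years worst case")
print(f"{worst_case / 2:.1f} years on average")
```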

Of course, these numbers can change suddenly. The hashing setup I quoted in my example definitely isn't a top of the line one. Also, things can change dramatically if someone comes up with an efficient ASIC for bcrypt (I don't believe there is one yet, but could be wrong).

Perhaps I did not explain my situation well enough. We _do_ tokenize our cards with Stripe (along with a variety of other processors, we whitelabel), we don't want to store those. The problem we are trying to solve is combining duplicate payment methods under the same user account. We don't require the users to "login", but do require them to provide an email. The idea is IF they enter the same card that is already saved to their account, to combine the two ("dedup") under the same payment method. — Chris Smith, Jul 03 '18 at 19:43
The reason I reference `pci-dss` is because PCI says you must use "strong cryptographic hashes" when storing numbers. I'm just wondering "how strong". I suppose it is an "unsolvable problem". It's just a matter of time, the same as with passwords. — Chris Smith, Jul 03 '18 at 19:43
@ChrisSmith - If you're using Stripe to tokenize, then you shouldn't ever have the full card number on your system. If you don't have the full card number, your choices are to hash the *token*, which requires using the same card to generate the same token (at which point why hash...) or to hash the masked card number (which doesn't have enough uniqueness, and again, why hash...). — Bobson, Jul 03 '18 at 21:18
