35

In this interview posted on Krebs on Security, this question was asked and answered:

BK: I’ve heard people say, you know this probably would not have happened if LinkedIn and others had salted the passwords — or added some randomness to each of the passwords, thus forcing attackers to expend more resources to crack the password hashes. Do you agree with that?

Ptacek: That’s actually another misconception, the idea that the problem is that the passwords were unsalted. UNIX passwords, and they’ve been salted forever, since the 70s, and they have been cracked forever. The idea of a salt in your password is a 70s solution. Back in the 90s, when people broke into UNIX servers, they would steal the shadow password file and would crack that. Invariably when you lost the server, you lost the passwords on that server.

Ptacek doesn't really explain why this is the case--he only says that salt has not prevented this type of attack in the past.

My understanding is that salt can only prevent pre-computation of password hashes because of the space needed to store the precomputed hashes. But if you have compromised the system, you will have both the salt and the hash. So the time to dictionary attack the hash does not change significantly (just an extra concatenation of the salt to the dictionary word).

Is my understanding correct?

Andrei Botalov
  • 5,267
  • 10
  • 45
  • 73
pepsi
  • 485
  • 4
  • 7
  • 3
    Of course, it depends on the size of the salt and its visibility. If you use the old Unix salts, there are only 4096 of them and they are stored along with the hashes, so an attacker can still compute rainbow tables and know which one to use. It makes the job harder, but still feasible with rainbow tables and a lot easier than brute force. Now if LinkedIn used 256 bit pseudo-random salts and did not store the salts or seeds in the database, that would have prevented rainbow tables ***and*** recovery/warning tools like leakedin.org – Major Major Jun 11 '12 at 19:56

8 Answers8

37

Krebs follows up on this question, and Ptacek does clarify what he meant:

BK: Okay. So if the weakness isn’t with the strength of the cryptographic algorithm, and not with the lack of salt added to the hashed passwords, what’s the answer?

Ptacek: In LinkedIn’s case, and with many other sites, the problem is they’re using the wrong kind of algorithm. They use a cryptographic hash, when they need to use a password hash.

In the next couple of paragraphs, he also elaborates on the reasons for it. The long and the short of it is that SHA1, with or without salt, is far too fast to be used as a password hash. It is so fast, that when computed using a GPU or something similar, you can brute force 10s of thousands of hashes per second. As is elaborated on later in the interview, LinkedIn should have been using bcrypt, which is an adaptive hash that would have slowed the brute force time down to the order of 10s of hashes per second.

TwentyMiles
  • 869
  • 5
  • 8
  • Ahh, with this statement, Ptacek's answer makes a bit more sense. Remembers me that one should always get the whole picture before judging. – martinstoeckli Jun 11 '12 at 20:01
  • 7
    *tens* of hashes per second on a GPU? If that was really true, systems logging in and creating hundreds of accounts per second (such as linkedin) would need several dozens of servers just to calculate the hash. Not really feasible. Even with blowfish, a GPU can dish out *millions* of hashes per second – Razor Jun 12 '12 at 08:31
  • 8
    @Razor - bcrypt lets you set a "work factor" to decide how much computing power it should take to calculate the hash, so you can decide what tradeoff between speed and security is right. (You can also bump that number up as computers get faster.) That said, if you have enough server power for your actual site to function, hashing passwords will just be a tiny part of that. – Nathan Long Jun 12 '12 at 10:28
  • 3
    Note also that bcrypt **does** use salts - http://stackoverflow.com/questions/6832445/how-can-bcrypt-have-built-in-salts – Nathan Long Jun 12 '12 at 10:35
  • Also note that the auto-salt feature of BCrypt is not true for all implementations; most notably, the PHP implementation requires a salt value as one of the input parameters. – Jacco Jun 12 '12 at 10:40
  • SHA1 is not encryption it is a hashing. This makes me question Ptacek knowlege in the field. – Ramhound Jun 12 '12 at 11:35
  • 2
    Shouldn't you use [scrypt](http://www.tarsnap.com/scrypt.html) instead? – Zolomon Jun 15 '12 at 09:48
  • @Zolomon - Take a look at the SE question I linked to in my answer. It was perhaps a bit hidden, so I've changed the whole containing sentence to be a hyperlink. Essentially, the argument for bcrypt over scrypt is that scrypt is still very new, and since there aren't any known attacks against bcrypt, you should favor bcrypt over scrypt since there has been more analysis and testing performed on it. – TwentyMiles Jun 15 '12 at 15:07
  • 3
    @Ramhound Where does he claim it's encryption? – CodesInChaos Jun 15 '12 at 16:46
  • 1
    I think it's far too easy to say "Shoulda used bcrypt" after the fact. No, _shoulda not have had your DB stolen_. – bobobobo Apr 14 '13 at 19:56
  • @NathanLong "you can also bump that number up as computers get faster" That doesn't make sense, unless you want to ask all users to choose all new passwords (all the hashes will change). – bobobobo Apr 14 '13 at 19:57
  • 1
    @bobobobo - why? If you want to go from X rounds of hashing to X + N rounds, you can just rehash all your existing passwords N times and save them. From then on, when someone supplies a plain text password, you run X + N rounds and compare the result. See http://crypto.stackexchange.com/a/3021 – Nathan Long Apr 14 '13 at 20:05
  • @NathanLong You can't up the work factor of bcrypt/scrypt/pbkdf2 *without* the plaintext, meaning you can only upgrade the hash upon the next login (which should be good enough in practice, at least in most cases). IIRC, this is a desired feature for SHA-3. – Kitsune Apr 15 '13 at 20:49
  • @Kitsune - it appears that you're right and I was wrong. Discussion here: http://security.stackexchange.com/questions/15847/is-it-possible-to-increase-the-cost-of-bcrypt-or-pbkdf2-when-its-already-calcula – Nathan Long Apr 16 '13 at 11:52
  • 1
    @bobobobo you're forgetting the obvious solution: you wait for the user to log in again and then upgrade the password. You have the original because the user gave it to you. You dont need to ask him to change it. – chacham15 Apr 30 '13 at 14:48
  • @chacham15 Though having the user change their password is a good idea in case an attacker eventually gets access to the older, more vulnerable hashes. – JAB Dec 09 '16 at 01:58
28

Let's look at the salt question first, and then at the speed issue:

Salt, dictionary attacks and rainbow tables

A salt massively helps against dictionary attacks in the common case of an attacker getting access to more than one password hash.

Without a salt, an attacker will sort all the hashes. He will hash the first word from the password dictionary, and check whether the calculated hash is in his sorted list of stolen hashes. With a bit of luck, he already got access to multiple accounts.

But with a salt, he has to attack each account separately.

Rainbow tables are just a special case of dictionary attacks, in the sense that they have been done before the attack and are ready to use.

It's important to note: Sticking a constant random string to all passwords will render prebuilt rainbow tables useless. But it is still a bad idea because of the parallelism issue described earlier.

Speed, algorithms for documents vs. for passwords

In addition to not using a salt, LinkedIn used a hash algorithm which is very fast, and can be executed on special hardware for even more extra speed. This special hardware is not exotic but part of common graphic cards.

Robert David Graham posted the details of his cracking work, including performance data:

2 billion per second using the Radeon HD 7970. A 6 [character] password [in] 500 seconds [...] with brute force.

The original use case for hash algorithms was to sign documents. Therefore being fast was a design goal.

Modern hash algorithms for passwords are designed to be relatively slow. A simple way to make them slower is to use the hash function repeatably. ShaCrypt and BCrypt are such algorithms. They pay extra attention to prevent parallel processing and being resistant against pre-image attacks on a single round.

Scrypt takes this one step further: In addition to being slow, it requires lots of memory (e. g. 16 MB for the default configuration). Specialized hardware usually only has access to about 1 KB of fast internal memory. Access to core memory is slow. Therefore building a fast scrypt cracker in hardware gets expensive very quickly.

Hendrik Brummermann
  • 27,118
  • 6
  • 79
  • 121
  • 5
    Using a constant random string as a salt will render prebuilt rainbow tables useless (unless one of the rainbow tables happens to have used that same random string as its salt) but it is much less helpful than using a different salt for each password. Using a constant salt, an attacker can build a hash dictionary using that salt and compromise all accounts using a password in the dictionary in one pass. All accounts using "password" as the password would be discovered in one pass. Varying the salt would force the attacker to break each account separately, making the system much more secure. – Major Major Jun 11 '12 at 20:07
  • 1
    @MajorMajor, that is exactly what I wrote, didn't I? – Hendrik Brummermann Jun 11 '12 at 20:10
  • 1
    @MajorMajor, ah you got it the other way round. For the constant-"salt" case, I propose this: Sort the password database and use that one to lookup the freshly calculated hashes. This is way more effective then building a dedicated rainbow table first, with many entries that are not in the password database anyway. – Hendrik Brummermann Jun 11 '12 at 20:19
  • Hendrik, what I wrote agrees with what you wrote, but I wanted to clarify your comment that using constant salts would render prebuilt tables useless. With tiny salts the tables might have been prebuilt anyway and with huge salts you still crack all accounts with the same password in one go. For example, if you have an account on the system then you can immediately find everyone else who is using the same password as you. I was, as always, concerned that a newbie might misunderstand a remark and end up doing something stupidly insecure. – Major Major Jun 11 '12 at 20:51
17

You are correct, however that doesn't change the fact that it is essential to use a salt. In this case attackers got hold of the hashed passwords, so they could either use a rainbow table or start a brute force or dictionary attack.

  • A rainbow table will get you all the passwords (up to the size and complexity in the table) in a very short space of time.

  • Likewise, a dictionary attack will get you the passwords which are in dictionaries.

  • Brute forcing will get you the short and simple passwords quickly, but the time taken to get the longer ones quickly becomes so prohibitive that users with long passwords are still relatively safe.

So a salt removes that first set of possibilities, forcing the attackers to use dictionary and brute force solutions - making the users safer.

Rory Alsop
  • 61,367
  • 12
  • 115
  • 320
7

What a salt does is it renders rainbow tables useless, which does slow down bruteforce attempts on the password.

Definition of rainbow table:

A rainbow table is a precomputed table for reversing cryptographic hash functions, usually for cracking password hashes.

Without a salt, an attacker could easily use a pre-generated rainbow table containing millions of passwords and their hashed equivalent and compare it against the password.

With a salt, every password requires the attacker to generate an entirely new rainbow table.

It has no impact on dictionary attacks - easy, obvious dictionary based passwords like password will be cracked easily with or without the salt.

A salt SHOULD be used however. Password cracking is all about time/effort. No password/hash is invincible. It is all about forcing the attacker to spend more time than he is willing to spend on your password tables.

Iszi
  • 26,997
  • 18
  • 98
  • 163
  • "_With a salt, every password requires the attacker to generate an entirely new rainbow table._" then why bother with tables at all??? – curiousguy Jun 11 '12 at 15:29
  • As far as John the Ripper(the software i use) is concerned, password cracking goes like this. Takes list of input passwords, hash, compare against known hash. This is practically no different with generating rainbow tables for each salted hash and comparing it. –  Jun 11 '12 at 15:35
  • So if you were using a CorrectHorseBatteryStaple style password, and the table was salted your password would probably be safe, because they'd brute force all the "low hanging fruit" and probably not worry about that last 5% or so of the users that used the most secure passwords. Heck, hackers might be happy with bruteforcing just 20% or so of the passwords. – aslum Jun 11 '12 at 16:31
  • 4
    @TerryChia "_is is practically no different with generating rainbow tables for each salted hash and comparing it._" Hug? This is entirely different. Generating tables required lots of space, and is **slower than directly trying out passwords**. – curiousguy Jun 11 '12 at 17:17
  • @aslum - I wouldn't even say "probably be safe" considering the first thing they do is attempt to string multiple dictionary words together. – Ramhound Jun 12 '12 at 11:41
  • 1
    @Ramhound "_the first thing they do is attempt to string multiple dictionary words together._" which password cracker are you talking about? – curiousguy Jun 12 '12 at 14:18
  • @Ramhound: Well if you just pick words maybe, but really you should have a little something more in there like correct&battery&horse&STAPLE42 ... regardless if they're brute forcing it, they're going to catch the one word passwords first, then the two word, then the three. Which is going to be the large majority of users. Then again maybe the hackers are completionists, or are specifically interested in your passwords and they'll keep going until the heat death of the universe, or they crack your password, whichever comes first. – aslum Jun 12 '12 at 16:24
3

Salts prevent parallelism. Parallelism is generic in the whole space-time continuum; by this, I mean that it can be applied space-wise and time-wise:

  • Space-wise parallelism is when the attacker has several hashes to crack (that's the LinkedIn situation). With unsalted hashes, the attacker can hash one potential password and look the result up in the whole list of hashes he wants to crack.

  • Time-wise parallelism is when the attacker precomputes hashes of common passwords, into a big table (rainbow or not), to apply on hashes that he subsequently obtains.

What Ptacek meant was that:

  • Salts do only half the job; to really slow down the attacker, you need an inherently slow hashing process, like bcrypt.

  • When there are many users, at least some of the passwords will be so weak that they will be cracked, regardless of how much salted bcryptness you do.

So, while salts would have somewhat enhanced the situation for LinkedIn, they would not have saved them, only delayed and somewhat diluted the trouble. Using bcrypt or PBKDF2 would have further improved things, but not to the point that the breach could be totally ignored.

Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
2

If an attacker exploits a system, and the salt is also compromised, the difference is that pre-computed rainbow tables will be useless. (Any particular password could be stored on a number of different ways, depending on salt size). Rainbow tables being a trade-off between time and space, a lot more space would be needed.

Hope it helps. An example is salt usage in the shadow file in UNIX, described clearly here: Why shadow?

liviu
  • 153
  • 2
  • 7
  • "_Rainbow tables being a trade-off between time and space, a lot more space would be needed._" then, why bother with tables at all? – curiousguy Jun 11 '12 at 15:30
1

Ptacek doesn't actually say that a salt would not have prevented LinkedIn passwords from being cracked. I would have to argue that depending on the size of the salt it would have even protected the worst of the passwords.

The one weakness SHA1 is its weakness to a brute force attack. All you have to do is generate X hashes ahead of time in and compare the value of a string's SHA1 hash to your generated list of hashes.

With a salt depending on the size in order to generate the list ahead of time would take a very long time and a very large amount of disk storage. You have to remember that one would have to generate the same list for every possible salt. At this point your only hope is try every single combination and hope you find a match. If you are able to generate a unique salt per user without storing the salt in a database you are even more secure.

In other words...Instead of every single password being cracked...LinkedIn would only be looking at a very small percentage of their users passwords being leaked ( the worst of the worst ).

Update

Where would you store the key?

If done the correct way you would not need to store it. Just generate a salt based of the username, when the account was created, or a combination of the two and you would have a unique salt for the given user.

So unless the source to your website itself was leaked you any hackers wouldn't be able to generate it. You could even make this more secure and generate a new salt when the user changed the account's password.

Ramhound
  • 496
  • 4
  • 9
  • "_If you are able to generate a unique salt per user without storing the salt in a database_" how could you? – curiousguy Jun 11 '12 at 15:28
  • @curiousguy: You could calculate salt from the username, registration date, or some other unchanging profile field, using an algorithm/key that's unknown to the attacker (i.e. not stored in the database). – Jesse McGrew Jun 11 '12 at 17:10
  • @JesseMcGrew Where would you store the key? – curiousguy Jun 11 '12 at 17:15
  • @curiousguy - See my update. – Ramhound Jun 11 '12 at 18:24
  • I still don't get it. I assume that you are storing the username and when the account was created in the database. On the other hand, you could use the client's IP address as the salt. (Scrub your logs and assume the client always has a static IP address.) – emory Jun 12 '12 at 04:45
  • 4
    If you generate the salt from some user related data, you allow the attacker to attack multiple instances of the hash (for example on separate systems) for the cost of 1 attack. If you rely on your source code to stay secret, you enter the realms of security through obscurity. – Jacco Jun 12 '12 at 10:48
  • @Jacco - Clearly using a unique salt does every problem but it certainly stops some serious problems. – Ramhound Jun 12 '12 at 11:44
  • "_The one weakness SHA1 is its weakness to a brute force attack._" It isn't a weakness of SHA-1, it's weakness of the **use** of SHA-1 where resistance to brute force attack is required. – curiousguy Jun 12 '12 at 13:12
  • @emory "_(Scrub your logs and assume the client always has a static IP address.)_" are you serious? – curiousguy Jun 12 '12 at 14:17
  • 1
    @curiousguy In the vast majority of cases I don't see how you can use a salt that you do not store. But if you are not keeping http logs and you operationally assume that your clients will always have the same IP address (neither of which is reasonable) then you could use the client's IP address as a salt. Without making some sort of unreasonable assumptions, I don't see how you can have a salt that is not stored on your system. – emory Jun 12 '12 at 14:25
  • @curiousguy: You'd store the key in your source code or in shared memory. It wouldn't be safe from an attacker who has root access, but it would be safe from one who only has SQL access. That's not "security through obscurity" - all cryptography relies on being able to keep some value secret. – Jesse McGrew Jun 12 '12 at 18:33
  • To clarify, I'm talking about a function along the lines of: salt = hash(combine(username, creation date, secret key)). The algorithm isn't secret, but the key is. – Jesse McGrew Jun 12 '12 at 18:37
  • @JesseMcGrew I understand that the threat you are mostly concerned with is SQL injection, is that correct? I think SQL injection is very easy to avoid using the correct approach (_not_ pasting strings together to built SQL requests). If you have a secret key somewhere you could even encrypt the whole database data, not just passwords (except for indexes other that hashed indexes, of course). Or you could encrypt the database files, or the mass storage device for the database. – curiousguy Jun 13 '12 at 12:11
  • @curiousguy The threat I'm concerned with is any case where an attacker gains access to the database (as apparently happened at LinkedIn), without gaining access to the application source code or shared memory. SQL injection is just one of many ways that could happen. Good point about encrypting the rest of the database contents. – Jesse McGrew Jun 13 '12 at 21:57
  • @JesseMcGrew "_as apparently happened at LinkedIn_" "apparently" indeed: until we know how the database was accessed, we cannot rule out a remote access to the running password checking device (with encryption keys and everything). But it's plausible they only had access to a recent backup copy of the database (IMO backups should always be encrypted). – curiousguy Jun 13 '12 at 22:18
  • @JesseMcGrew - Even if you were to store the unique salt for each of your users and the entire database was stolen. You would be better off then if you used a single salt. Which is the reason I suggest using a unique salt that can be generated based on the user's profile. Even if the ability to generate the salt was compromised, you would still be better off, then if you had use a single salt for all your users. – Ramhound Jun 14 '12 at 11:38
  • @curiousguy - Indeed. Generating salt from the user's profile gives extra protection against SQL injection, backup theft, or breaking into a database server... and as Ramhound points out, even if the application server is also broken into, you're still no worse off than with traditional per-user salt. – Jesse McGrew Jun 15 '12 at 17:14
  • @JesseMcGrew If you have access to the running database or to a backup copy, or if you can do an SQL injection, you have access to all the "user profiles". – curiousguy Jun 15 '12 at 18:32
  • @curiousguy - Correct, but without access to the source code, you don't know the secret key that's hashed together with the profile data to generate the salt. – Jesse McGrew Jun 15 '12 at 20:39
  • 3
    Note that a "secret" component into password hashing (as your server-specific key) is often not called salt, but *pepper* (a salt is not considered secret). – Paŭlo Ebermann Jun 15 '12 at 22:29
  • @JesseMcGrew "_without access to the source code_" which means that you intend to keep the source secret, which implies that the software is never distributed, as source or or as binary (a binary can be reverse engineered). This is a very strong requirement: it means the source code cannot be widely peered reviewed, which is often considered bad especially for security sensitive, subtle code. – curiousguy Jun 17 '12 at 23:39
  • @curiousguy: You can redact the key before sending the code for review, which is easy if you keep it in a separate file. Or you can leave the key out of source code altogether by putting it in shared memory or some OS-level repository. This sort of thing is commonly done by applications that deal with secrets, such as keys for assembly signing in .NET or shared secrets for interacting with third-party systems. – Jesse McGrew Jun 18 '12 at 18:10
  • @JesseMcGrew If the code review can be done without looking at something, then by definition this something is not source code. – curiousguy Jun 18 '12 at 21:14
  • @curiousguy I disagree -- a variable declaration is source code, but the value used to initialize the variable is not necessary for a code review -- but perhaps you'd be happier with the term "resource". – Jesse McGrew Jun 19 '12 at 21:47
  • @JesseMcGrew "_I disagree_" I wrote "by definition", so you disagree with at least one possible definition of "source code": "what is reviewed by source review". Another different, incompatible definition of "source code" is "*.c, *.cc, *.cpp, *.h, *.hh, *.hpp, etc." Anyway... what matters is that we agree on a definition for the sake of discussion, and that we recognise that more than one definition exist. – curiousguy Jun 19 '12 at 23:54
0

A person with the stolen database tables has only 2 fields per user.

password hash                                  salt
c32528b3e9191a8c7471252c6de289939a1e6f01       cGFzc3dvcmQ

In the most basic terms, the way I understand it, all what salting does is make the password more longer.

The original password was 'password'. But I appended to the password before computing the hash the salt cGFzc3dvcmQ to make passwordcGFzc3dvcmQ. What does that do? Well, it just makes it so if you happen to have a rainbow table of common passwords and their hashes,

password                        hash
password                        5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8

where you can see at a glance that SHA('password')=5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8, looking up the original password from the hash is all too simple. We want the cracker to have to do some work for it.

So we go "Dyoh! We either prepended or appended THIS RANDOM STRING RIGHT HERE to the original password! Now your hash tables of common passwords and their hashes are useless"

Which they are. Because we complexified each password by the random string, basically "making the user's passwords better", and completely screwing up the hashes so they are completely uncommon now.

So salting makes it so each hash has to be cracked individually, because to do it, if the cracker has the salts, the cracker has to append/prepend the salt to each common password he tries. 2 different users could use the simple password password, but cracking that will be 2 entirely separate jobs for the cracker, because of the salt. So cracking all the passwords will take longer.

The article here says that because of the compute power available today, however, using salts is trivial to crack.

bobobobo
  • 101
  • 3