14

I had a discussion with a friend today about his password hash comparison. I argued that you can't return false on the first hash mismatch you have and I sent him a link to an article about a Java timing attack that happened in Java 6.

public static boolean isEqual(byte digesta[], byte digestb[]) {
    for (int i = 0; i < digesta.length; i++) {
        if (digesta[i] != digestb[i]) {
            return false;
        }
    }
    return true;
}

And my version, which I think is correct:

public static boolean isEqual(byte digesta[], byte digestb[]) {
    boolean isEquals = true;   
    //this avoid a possible timing attack
    for (int i = 0; i < digesta.length; i++) {
        if (digesta[i] != digestb[i]) {
            isEquals = false;
        }
    }
    return isEquals;
}

Which, for me, indicates there is a possible timing attack that could be done because you return false if there is a mismatch in the hash. He argued that this would not be a risk in the application and that the important thing is that the password is salted and hashed and that will probably will not affect the security of the application.

Am I paranoid about the security of my application?

schroeder
  • 123,438
  • 55
  • 284
  • 319
YShow
  • 141
  • 1
  • 5
  • 10
    Salting and hashing is only important for storage, not for timing attacks. I have a ***huge*** problem with your code defaulting to `True` for the password check. I'm not sure what your code is supposed to do to combat timing attacks ... How many hashes are in `digesta` and `digestb`? – schroeder Oct 01 '20 at 10:26
  • its a single hash on each digest variable, the part about salt and hash is what my friend said sorry if the message wasn't clear, English is not my first language, this function is to compare the the password that the person typed (digesta) and the digestb is the stored password in the database, my code is not entirely correct since i don't have the actual code with me right now – YShow Oct 01 '20 at 10:43
  • 1
    That's ok if the code is approximate, My question is: where is the *timing* difference between the 2 codes? `digesta[i] != digestb[i]` is the only relevant check and it's the same in both samples. – schroeder Oct 01 '20 at 10:45
  • 3
    the first one would return false as soon the hash didnt match meaning it could be faster for a attacker to know the hash didn't match, my version would go through all the array even if its confirmed false and then return the result after checking every single index, https://codahale.com/a-lesson-in-timing-attacks this is the place where im getting this information from, he did this same mistake in his function to check the password and its what started the whole argument with him. – YShow Oct 01 '20 at 10:48
  • 3
    A note on terminology (I'm just repeating what @schroeder said). This wouldn't be an attack, it would be a vulnerability – Conor Mancone Oct 01 '20 at 11:14
  • 3
    Another question would be if Java doesn't optimise you version back into the short-circuiting one anyway. I tend to think it doesn't, but wouldn't bet on it if it really was security-critical. – leftaroundabout Oct 01 '20 at 16:47
  • Thank you Conor, im not really good with the terminology in English yet. – YShow Oct 01 '20 at 21:08

7 Answers7

20

The first algorithm is sensitive to timing attacks, while the second looks better (but I don't know for sure if it's vulnerable or not). However, there is a bug with potential security implications in the second version: What happends if the two strings have different lengths?

Timing attacks are a real security issue that it is reasonable to be worried about, so you are right to bring the issue up. I could partly agree with your friend in that it is more important to use salt and a good hashing algorithm. This however does not mean that timing attacks are not important or should not be taken seriously. They should.

However, in this case, it is not obvious how an attacker could pull off a timing attack. Since the input (the password) is hashed with a salt, the attacker probably can't freely control what any of the compared hashes will be. That means it might not be possible to search your way to a matching hash. But all this depends on how the entire system is built, not just on the string comparison algorithm.

One good way to deal with all of this is to use a good library for the hashing. A well written library should take care of both salting and defense against timing attacks for you, without you having to go through the hassle to write all the code yourself.

Anders
  • 64,406
  • 24
  • 178
  • 215
  • 6
    +1 for suggesting a standard library. – Conor Mancone Oct 01 '20 at 11:05
  • This is the definition of a timing attack. Whether or not it can be practically exploited is another matter.... – Conor Mancone Oct 01 '20 at 11:06
  • @ConorMancone "exploitable" is one thing. I'm challenging the notion that it's an "attack". – schroeder Oct 01 '20 at 11:07
  • 5
    _"Since the input (the password) is hashed with a salt, the attacker probably can't freely control what any of the compared hashes will be."_ - That might have been exactly the point of the OPs friend. – marcelm Oct 01 '20 at 16:42
  • about the different lenghts, this was not the complete code, i usually use java to do authentication with passwords and i use the standard MessageDigest.isEquals for that, his code was made in C# and i don't know of a method equivalent in C#, and yes you're right that the index could fail if they are of different lengths, ill have to check with him about his code again. – YShow Oct 01 '20 at 21:05
  • 4
    The second is *less* vulnerable to timing attacks than the first, but it's still got a branch that's taken or not depending on the outcome of comparing each byte. A timing attack on the first tells you which byte has the first mismatch, while a timing attack on the second tells you how many bytes don't match (a harder thing to exploit). – Mark Oct 01 '20 at 21:07
  • 4
    Also, let's not forget that a compiler/intepreter may optimise the second version to act the same as the first. Which is another reason why standard libraries should be used (the assumption is this has been handled properly by those two do this for a living). – Gregory Currie Oct 02 '20 at 09:55
  • ... maybe the second isn't vulnerable to a timing attack until it runs through an optimizer. Java runs code optimizations for over 20 years, haven't they? i bet the 2nd implementation is vulnerable in practice, if you inspect the java bytecode it compiles to.. – user1067003 Oct 03 '20 at 23:04
10

You're both right, but you've both missed the point :)

You are correct: this is a classic example of a timing weakness and creates a possible side channel attack.

Your coder is correct: given network travel times and other mitigating factors that you naturally put in (brute force detection, up lockout, etc...) it is very unlikely that an exploit is possible.

However, that isn't my main concern. This is a classic example of a timing vulnerability, AKA this is a pretty basic security vulnerability. Therefore I would assume that whoever wrote this code is not as experienced with managing passwords and hashes as they probably think they are. This leads to a very natural question: what else did they unknowingly get wrong?

Without a full code/system review from someone who is an expert in this area, you'll never know the answer to that question. As a result I suggest you take @Ander's advice: switch to a standard library.

Conor Mancone
  • 29,899
  • 13
  • 91
  • 96
  • Yes i agree with you, unfortunately his code is private and its for another company that hes working, im not a expert C# so i don't know how correct the rest of the code is, i just noticed that vulnerability in the code when he showed me the isequals method of him. – YShow Oct 01 '20 at 21:17
  • 1
    is the code correct after running through the compiler's optimization stage? bet the answer is no – user1067003 Oct 03 '20 at 23:03
8

The second piece of code you showed is still susceptible to timing attacks. This is because how many times the if-branch is taken, and in what sequence it is taken, are dependent on the input data.

Here is an algorithm that is resistant to timing attacks on any reasonable CPU:

public static boolean isEqual(byte digesta[], byte digestb[]) {
    if (digesta.length != digestb.length)
        return false;
    int difference = 0;
    for (int i = 0; i < digesta.length; i++)
        difference |= digesta[i] ^ digestb[i];
    return difference == 0;
}
Nayuki
  • 222
  • 2
  • 7
  • Wouldn't the initial length comparison also make it very easy determine the length of the correct hash with a timing attack? – Ivo Merchiers Oct 02 '20 at 06:59
  • 2
    @IvoMerchiers The length of the hash isn't generally a secret. – richardb Oct 02 '20 at 07:56
  • 1
    I was going to point that out, and just saw your answer! I cannot more agree with you that 2nd code still has a vulnerability. – user1532080 Oct 02 '20 at 15:41
  • 2
    @IvoMerchiers I would also expects both hash to be out of reach of the attacker, and of same length... Hashes algorithm usually have a fixed size output. This check is to me an extra security to avoid an exception. – user1532080 Oct 02 '20 at 15:42
  • 1
    I wonder if any compiler would short circuit this if `difference` becomes full of 1 bits. – Vaelus Oct 02 '20 at 15:47
  • 2
    @Vaelus that is actually a very valid concern, and another reason why it is best to just use standard libraries. It doesn't matter how good your code is if an interpreter/compiler is going to make optimizations behind the scenes, and that has happened and has caused vulnerabilities before. – Conor Mancone Oct 02 '20 at 17:03
  • Now your code leaks information about the digest length. That being said, this is unlikely to matter in practice because for a given hash the output size is always fixed, so length differences shouldn't happen in practice. Not that I would ignore this all together, but it isn't obviously a security concern and your solution doesn't obviously fix it either. – Conor Mancone Oct 02 '20 at 17:05
  • 1
    @IvoMerchiers The hash length is not controlled by the attacker as I mentioned my answer. The attacker submits the password and the library calculates the hash and the comparison is made. So they are the same length. – kelalaka Oct 02 '20 at 20:34
  • @ConorMancone I understand your concern, but I can't think of any way to avoid length-dependent timing. For example you could change the loop condition to iterate until `Math.min(digesta.length, digestb.length)`, but you could conduct a timing attack to see when the length of your controlled input exceeds the length of the hidden input. – Nayuki Oct 02 '20 at 23:49
  • @Nayuki All the more reason to stick to standard libraries although, again, I don't think this is actually a concern. However if you *really* wanted to do it right then you would loop for the *maximum* length and put in a simple check to not raise exceptions in cases where the length is different (but still run through the loop). Finally you either set a minimum loop count of your expected hash length, or set a minimum loop set of something much higher (in practice, 100 or more). – Conor Mancone Oct 02 '20 at 23:58
  • @Nayuki Length independent version: Iterate to the length of the stored digest, compare with modulo index (to make sure no buffers are over read): difference |= digesta[i % digesta.length] ^ digestb[i % digestb.length]; then only after the loop check the lengths (note non-short circuit and): if ((digesta.length == digestb.length) & (difference == 0)) return true; else return false; – tylisirn Oct 03 '20 at 06:15
  • If you really want to remove any timing attack, design your system to let's say perform a check in 0.01s on one machine. Add a rate limiter that limits to 100 checks per second. Before comparing, get the current time with high precision (ms or µs). After comparing, get the current time, compare, and for instance, sleep for "1s minus time spent in the comparison". Your response time will consistently be around 1s. – user1532080 Oct 03 '20 at 06:46
  • @tylisirn That's almost right, but I think modern CPUs have variable timing for `/` and `%`; the number of clock cycles for the operation depends on the data values. But they are constant-time for `+`, `-`, `*`, `&`, `|`, `^`, `<<`, `>>`. – Nayuki Oct 03 '20 at 14:51
  • 1
    @Nayuki Ah, interesting. You're right, I was under impression that they were fully pipelined on modern CPUs, but they're not. So this fails to be constant time. I'm not sure this leaks any information though. Assume B is attacker input, then [i % a.length] will always compute same sequence, and in fact that could just be [i] since we just iterate to a.len anyway, so only variable is [i % b.length]. Can we derive from that how many times it was called? I suppose one could, at least theoretically. – tylisirn Oct 03 '20 at 20:25
  • @Nayuki New suggestion then, instead of modulo indexing: `difference |= (i < digestb.length) ? digesta[i] ^ digestb[i] : digesta[i] ^ digesta[i];` Note latter xors A with A. Doing that way because I'm not confident compiler wouldn't optimize digestb[0] into something different from digestb[i]. I hope the compiler wouldn't be smart enough to optimize A ^ A to just 0. I suppose one would have to verify it from the decompiled bytecode if either of those work. – tylisirn Oct 03 '20 at 20:31
4

You are correct. Because you wrote the code to check the hash character by character (why are you doing that?), it would be possible to use timing to work out the correct hash character by character. But that would be no different from trying random passwords. You just know which of your attempts resulted in a close hash. It doesn't inform your next guesses.

Add to that the idea that you should also have numerous other protections against brute-force attacks, and this is not a big threat.

schroeder
  • 123,438
  • 55
  • 284
  • 319
  • I think it would actually assist with a bruteforce attack and being able to brute force the hash one character at a time would substantially reduce your search space. However just simple network travel time inconsistencies would likely be enough to render a potential attack impossible – Conor Mancone Oct 01 '20 at 11:17
  • @ConorMancone I can kind of see that, theoretically, but I have trouble seeing the efficiency gain over directed brute force. – schroeder Oct 01 '20 at 11:19
  • Finding out the target hash value changes the attack from an online attack to an offline attack. Though the transformation is already detectable and blockable through conventional means (fail2ban and similar). – Artjom B. Oct 04 '20 at 19:54
  • @ArtjomB. I doubt the ability to obtain the hash in an more optimised time than random brute force – schroeder Oct 04 '20 at 19:56
  • You're right. The attacker sends passwords to try and if they find the first byte then they have to send passwords to try where the first byte always matches the first byte that was already found (provided the server doesn't use secrets like unknown hashing algorithm or random salt). This gets really computationally intensive when they try to go from 5 to 6 bytes or higher. – Artjom B. Oct 04 '20 at 21:26
2

If the password is stored plainly clearly the first code can cause a remote timing attack. The second one only leaks about the number of if condition satisfied. Can this leak be used for remote timing? It is not clear without a real.

If the second is written the if condition as ( using C)

int strctcmp(const char*digesta, const char*digestb)
{
  int equal = 0;
  for (; *digesta && *digestb; ++digesta, ++digestb)
  {
    equal |= *digesta != *digestb;
  }
  return equal;
}

then it is a true constant time that eliminates the information of the number of equalities. The string length are always equal since the attacker cannot control the length. Coding is not enough and one also need to consider the compiler optimization. Therefore one either check the result or use in assembly. There is a great page about constant timing on BearSSL pages written by the Thomas Pornin.

When the salt is introduced and a password hashing is applied like Scrypt, PBKDf2, Argon2 the remote timing attack becomes impossible for both cases if the salt is not known. Since the attacker can only get the comparison timing of hashed passwords. We expect them to have avalanche criteria, which is a bit change in the input flips each output bit with 50%. So the attacker has no knowledge about the result in order to get information.

If the salt is known, then the attacker can use this knowledge to produce the necessary hash values as vectors to execute a remote attack. Producing the test vectors will require too much time since we cannot control the output of a hash. So in this case it is impossible to good passwords. If the password is weak, then the attacker can try all and compare the timing too. Wait, are you allowing that much password testing, then reconsider your system security.

Conclusion: Your friend is correct, If salted and hashed correctly, and the incorrect trial limit is set then there is no problem.

kelalaka
  • 5,409
  • 4
  • 24
  • 47
  • the closer the attacker is to the real string being compared to, the longer your function will run... your function is timing-attack-vulnerable in the sense that "an attacker controlling the length of 1 of the strings will be able to figure out how long the other string is", hmm – user1067003 Oct 03 '20 at 23:10
0

... maybe the second isn't vulnerable to a timing attack until it runs through an optimizer. Java runs code optimization passes for over 20 years, haven't they? i bet the 2nd implementation is vulnerable in practice, if you inspect the java bytecode it compiles to...

the traditional way to do a constant-time string comparison is:

bool is_string_equal_constant_time(const string &str1, const string &str2){
    if(str1.size()!=str2.size()){
        throw invalid_argument("both strings must be the same length to compare them in constant-time! sha2 hash em or something");
    }
    int result=0;
    for (size_t i = 0; i < str1.size(); ++i) {
        result |= str1[i] ^ str2[i];
    }
    return (result==0);
}

and i'm guessing there's a good reason it's done that way (PS: i don't do java, i don't know how to do it in java, but this is how it's done in C++, it should be easy to port to Java for someone who knows java.)

user1067003
  • 564
  • 4
  • 11
0

The answer to this is the single most important aspect of cryptography: you need a threat model. Your threat model dominates answers like these. The correct answer for how to keep your little sister out of your diary is very different from the answer when dealing with state level actors.

Salting and hashing defend against a different kind of attack than these timing-attack-free algorithms are trying to deal with. Salting and hashing is all about how you protect against an adversary that can get their hands on your data at rest -- i.e. the passwords file on your hard drive. If they can get their hands on that, salting and hashing is an essential part of protecting that data.

But that's only one threat model. There's other threat models that worry about timing attacks, where an adversary finds a way to execeute code on your computer while it is online. This creates totally different attack vectors. For example, you may lock down your data perfectly, so that an adversary could never crack the passwords given that they steal the hard drive, but you permit them to run some small scripts as part of the normal business uses of this server. Suddenly they can use this to create a timing attack.

Or worse, you wisely isolate this server so that it only runs your secure software, because you knew that these timing attacks were not in your threat model. Then, one day, someone decides to save a bunch of money by virtualizing these computers and sticks your secure application on a physical computer that's shared with a less secure app. Thanks to virtualization, you know your data is safe, but an attacker can go after you by using the less secure server and leveraging timing attacks.

So the question becomes "what is your threat model." What sort of adversary are you trying to defend against? Are you just trying to keep some data safe from script kiddies? Or are you worried about a state-level actor expending millions of dollars worth of resources to attack your computer? Sometimes you may have to consider timing attacks. Other times, you may not.

Cort Ammon
  • 9,206
  • 3
  • 25
  • 26