Exploiting timing information is one possible attack against things like password authentication systems.
Conceptually memcmp()
works by comparing two sets of binary data on a byte-by-byte basis (in reality processors can compare multiple bytes at a time, depending on optimizations, but the same principles below will apply). The function starts at the beginning of the data, compares each byte sequentially, and exits as soon as it finds a difference. If no difference is found then the function returns a code indicating that the data matches.
Because the function returns as soon as it finds a difference, an attacker with a sufficiently accurate clock can deduce secret information. They can induce calls to memcmp()
with different inputs and by measuring which inputs take longer they can deduce what a stored secret might be.
Example:
Consider a classical password hashing system. Suppose your password is stored as a secret hash, say for example Ek8fAMjPhBo
. (That hash was generated using the DES scheme provided by the Linux crypt()
function with a salt of na
and a password of secret
. Note that this function is insecure and you should not use it in real systems.)
In a strong password system your hash Ek8fAMjPhBo
is stored, but your password is not stored. When you are asked to authenticate the system will take your password, hash it, and then compare the two hashes to one another. If the resultant hashes match then you are granted access to the system, if the hashes do not match then your password is rejected. This allows the system to check to see whether or not you know your password without actually storing the password itself.
How an attacker can use timing to attack this system:
In order to attack this system an adversary really just needs to figure out what password hashes to the stored hash. Normally the stored hash is kept secret, but the adversary can use timing information to deduce what the stored hash could be. Once the adversary deduces the stored hash it is vulnerable to much faster and off-line rainbow table attacks, as well as circumventing on-line security measures like password retry limits.
The password system above has to compare a candidate hash against the stored hash in order to function properly. Suppose it takes 10 nanoseconds to compare each byte of the candidate hash to the secret stored hash. If no bytes match (one comparison) then memcmp()
will take about 10ns. If one byte matches (two comparisons) then memcmp()
will take about 20ns. Your attacker generates a few passwords and runs them through the system, recording how long each one takes. Suppose the first few hash comparisons take about 10ns each and then return, indicating that none of the bytes of the candidate hash match the stored hash. After a few tries one of the hash comparisons takes 20ns- which indicates that the first byte of the candidate hash matches the stored hash. In the example above this indicates that the attacker has deduced that the first byte of the hash Ek8fAMjPhBo
is E
.
Hashes by design have the property that you can't predict what hash will correspond to what password, so for example this does not tell the attacker that the password starts with s
. However, the attacker could have a large table of pre-computed hashes (a rainbow table) so they can look up other passwords that hash to a string starting with E
. After they try enough hashes they'll eventually get an input that causes memcmp()
to take 30ns, which indicates that the first two bytes match, and they have deduced that the first two bytes of the hash are Ek
. They repeat this process over and over until they deduce all or most of the hash. At that point they either know the password or can brute force it with a traditional rainbow table attack.
This is a little hypothetical, but you can find lots of practical information about timing attacks elsewhere on the net, for example:
https://research.kudelskisecurity.com/2013/12/13/timing-attacks-part-1/