Let's say that we're building a generic timing-safe comparison function for general purpose use. Making it so that it is safe when both strings are equal length is pretty well known. However, what I'm not sure about is how we can make it safe if the strings are different lengths (arbitrary).
One string will be deemed as "known", and the other as "unknown". We'll assume that an attacker only has control of the "unknown" value. Ideally, this function should leak no information about the length of the known string.
A trivial implementation, such as:
// returns 1 is same, 0 otherwise
int timing_safe_compare(char *known, size_t known_len, char *unknown, size_t unknown_len) {
// Safe since all strings **will** be null terminated
size_t mod_len = known_len + 1;
size_t i;
int result = 0;
result = known_len - unknown_len;
for (i = 0; i < unknown_len; i++) {
result |= known[i % mod_len] ^ unknown[i];
}
return result == 0 ? 1 : 0;
}
The problem here though is that there may be cache information leak.
For example, a word size in x64 is 64 bits. So we can fit 8 characters in a single register. If the known value is a string that's 7 characters or less (since we add 1 to the known_len), the comparison never requires another load operation for the known string, even though the unknown string will.
In other words, if the size of the unknown string differs from the known string by one or more word boundaries, the total amount of "work" being done may change.
My first instinct would be to only compare strings of equal sizes, but then length information would be leaked (as execution would follow several different branches).
So, this leaves two questions:
Is this small difference enough to be concerned about?
Can this sort of difference be prevented without leaking information about the known size at all?