22

The password hash used for MySQL passwords prior to version 4.1 (now called OLD_PASSWORD()) seems like a very simple ad-hoc hash, without salts or iteration counts. See e.g an implementation in Python at Django snippets: Old MySQL Password Hash

Has it by cryptanalyzed? Broken?

All I see on the web is brute-force attacks. These are very successful for short-to-medium length passwords as one would expect. Though I also wonder if brute force on this ad-hoc algorithm are slower than attacks on the newer PASSWORD() function which simply uses SHA1 (twice), since I assume there is more widespread hardware acceleration support for SHA1. See more on flaws and attacks on MySQL passwords at Looking for example of well-known app using unsalted hashes

nealmcb
  • 20,544
  • 6
  • 69
  • 116

3 Answers3

33

I am not aware of any published cryptanalysis on MySQL OLD_PASSWORD(), but it is so weak that it is kind of a joke. It could be given as an exercise during a cryptography course.

Update: a cryptanalysis similar to the meet-in-the-middle described below was published in F. Muller and T. Peyrin "Cryptanalysis of T-Function-Based Hash Functions" in International Conference on Information Security and Cryptology - ICISC 2006 in 2006, with a more generic description, and some optimization to find short passwords and keep in RAM.

For instance, here is a C code which "reverts" the internal state:

static int
oldpw_rev(uint32_t *pnr, uint32_t *pnr2, uint32_t add,
        unsigned char *cc, unsigned len)
{
        uint32_t nr, nr2;
        uint32_t c, u, e, y;

        if (len == 0) {
                return 0;
        }

        nr = *pnr;
        nr2 = *pnr2;
        c = cc[len - 1];
        add -= c;

        u = nr2 - nr;
        u = nr2 - ((u << 8) ^ nr);
        u = nr2 - ((u << 8) ^ nr);
        nr2 = nr2 - ((u << 8) ^ nr);
        nr2 &= 0x7FFFFFFF;

        y = nr;
        for (e = 0; e < 64; e ++) {
                uint32_t z, g;

                z = (e + add) * c;
                g = (e ^ z) & 0x3F;
                if (g == (y & 0x3F)) {
                        uint32_t x;

                        x = e;
                        x = y ^ (z + (x << 8));

                        x = y ^ (z + (x << 8));
                        x = y ^ (z + (x << 8));
                        nr = y ^ (z + (x << 8));
                        nr &= 0x7FFFFFFF;
                        if (oldpw_rev(&nr, &nr2, add, cc, len - 1) == 0) {
                                *pnr = nr;
                                *pnr2 = nr2;
                                return 0;
                        }
                }
        }

        return -1;
}

This function, when given the internal state after the len password characters given in the cc[] array (nr and nr2, two 31-bit words, and the add value which is the sum of the password characters), computes a valid solution for nr and nr2 before the insertion of the password characters. This is efficient.

This leads to an easy meet-in-the-middle attack. Consider the sequences of 14 lowercase ASCII letters, such that each letter is followed by its complement (the complement of 'a' is 'z', the complement of 'b' is 'y', and so on...). There are about 8 billions such sequences. Note that the sum of the characters for any of those sequences is always the fixed value 1533. Take N of those sequences; for each of them, compute the corresponding hash with OLD_PASSWORD(), and accumulate the values in a big file: each entry contains the sequence of characters, and the corresponding nr and nr2. Then sort the file by the 62-bit nr/nr2 pair. This file is a big table of: "using this sequence, we get from the initial values to that internal state".

Then takes the N sequences again, and this time use oldpw_rev() (as shown above) for each of them, using the actual attacked hash as starting point for nr and nr2, and 2*1533 == 3066 for add. This will given you N other pairs nr/nr2, each with a corresponding sequence. These values are accumulated in another file, which you again sort by 62-bit nr/nr2 pair. This second file is a big table of: "using this sequence on that internal state, we obtain the hash value which we are currently attacking".

At that point you just have to find two matching pairs, i.e. the same nr/nr2 in the first file and the second file. The corresponding 14-character sequences are then the two halves of a 28-character password which matches the hash output. That's probably not the password which was used in the first place, but this is a password which will hash to the same value and will be accepted by MySQL. Chances are high that you get a matching pair when N reaches 2 billions or so (we are in a space of size 262, so it suffices that N is on the order of sqrt(262)).

This attack has work factor about 237 (accounting for the sorting step), which is vastly smaller than the 262 work factor that could theoretically be achieved with a hash function with a 62-bit output (which is already too low for proper security). Thus, the OLD_PASSWORD() function is cryptographically broken.

(There are probably much better attacks than that.)

nealmcb
  • 20,544
  • 6
  • 69
  • 116
Thomas Pornin
  • 320,799
  • 57
  • 780
  • 949
  • 1
    Fast, and outstanding. And +1 for the idea of assigning this in crypto classes. Many many of these old MySQL hashes are still alive out there (as the exposure of the passwords of mysql.com developers at Oracle showed....) And I should have added my prediction that you would provide a great answer..... Though for most real-world attacks that leverage knowledge of the hashes, it would be more valuable to have the actual password (usable on sites with different hash functions) rather than the 28-character password. – nealmcb Apr 18 '11 at 03:12
  • The method I describe can be optimized to shorter passwords; what you need is to have about 2 billions half-passwords with a fixed sum of characters. Finding "the" password is not necessarily feasible since many passwords match the hash output, so there is no information to choose the "right" one. You can still brute-force possible passwords; one should be able to try out about 2 billions passwords per second on a not-so-recent GPU. – Thomas Pornin Apr 18 '11 at 11:28
  • 2
    I wonder if and where it might make sense to more formally publish this. Perhaps not suitable for a fancy journal, but perhaps someplace that merges practice with theory like USENIX ;login: See also my question, and some interesting links at [Has the StackExchange family led to first mention of any ideas publishable in peer-reviewed journals? - Meta Stack Overflow](http://meta.stackexchange.com/questions/87854/has-the-stackexchange-family-led-to-first-mention-of-any-ideas-publishable-in-pee/87874#87874) – nealmcb Apr 18 '11 at 19:05
9

I polished up my google-fu yesterday and this time managed to find a 2006 paper on this: F. Muller and T. Peyrin "Cryptanalysis of T-Function-Based Hash Functions" in International Conference on Information Security and Cryptology - ICISC 2006. Great work!

But the function it describes is very slightly different than the actual MySQL OLD_PASSWD (aka libmysql/password.c hash_password()) so their collision between doesn't quite work.

The paper says "if we pick the password “MySQL123”. The hash of the password is (nk1 & K = 0x1b03947c, nk2 & K = 0x1d6d177b), with K = 0x3fffffff." and says the hash of RGp0mA23 is the same.

But:

mysql> select OLD_PASSWORD('MySQL123');
+--------------------------+
| OLD_PASSWORD('MySQL123') |
+--------------------------+
| 1b03947a6f1ae59b         |
+--------------------------+

vs 1b03947a77350d71 for RGp0mA23.

So there is still presumably a bit of work to do to address the MySQL OLD_PASSWORD.

Anyway, please stop using the MySQL default password hashes (either old or new) and use technology at least as advanced as "crypt" from Unix in 1978 with salts and iteration counts....

nealmcb
  • 20,544
  • 6
  • 69
  • 116
1

CVE-2003-1480 was issued for MySQL's OLD_PASSWORD hash function. It appears from the advisory the main concern is that its too fast. From the MySQL dev teams perspective I can see how they want to reduce load on the database as much as possible. The use of a double sha1 is an effort of key strengthening. To be honest double sha1 is still very fast and only running a it twice is just to appease the CVE.

rook
  • 46,916
  • 10
  • 92
  • 181
  • double-hashing sha1 also removes sha1's length-extension-attack-vulnerability, not that length-extension is useful for cracking mysql hashes – user1067003 May 31 '21 at 16:05
  • @user1067003 it does not, the inner call would be affected and the outer would produce the same image. – rook Jun 18 '21 at 06:46
  • in the book "Practical Cryptography", Ferguson and Schneier proposed using a double-hash function as a means to defend against “length-extension” attacks with SHA-256, and they named it SHA-256d. Definition: SHA-256d(x) = SHA-256(SHA-256(x)) - if double-hashing prevents length extension attacks on sha256, i'm guessing it does the same to sha1. – user1067003 Jun 18 '21 at 19:21