80

How can I create a password, which when directly hashed (without any salt) with md5 will return a string containing the 8 characters "SALT ME!". The hope is that a naive developer browsing through his user database will see the "hash", realize the insecurity of his application, and eventually make the world a better place for everyone.

Md5 outputs 128 bits, which is 16 bytes. If I had a 16-byte message, getting the original plaintext password would be equivalent to a pre-image, which to my knowledge is practically impossible. However, I'm only looking for 8 specific bytes in my hash.

Is obtaining such a password feasible in day-timeframes on a typical computer? If so, how can I compute such a password?

Joel
  • 1,069
  • 1
  • 8
  • 7
  • 55
    My immediate thought is that if you're using MD5 as your password hashing algorithm, whether a salt is being used (or not) is not your biggest concern. – Xander Apr 22 '14 at 16:35
  • 26
    -1 because the stated purpose of this is not only absurd, but won't even work. Also does not belong to on this site: The question is essentially asking about calculating specific hashes, and is better suited to StackOverflow or SuperUser. – Superbest Apr 22 '14 at 19:24
  • 38
    or [CodeGolf](http://codegolf.stackexchange.com/)... – Mooing Duck Apr 22 '14 at 19:28
  • 15
    Almost no admin goes thorugh the password database and reads all of them. Or the database should have been compromised and your "Salt me" should stand out from the rest ... All given examples below rely on Base64-representation of the hashes, while most MD5-hashes are represented as hexidecimals – BlueCacti Apr 22 '14 at 21:02
  • 8
    Total red herring. [Salting won't save your backside if you don't mitigate against offline brute-force searches.](http://web.archive.org/web/20130407190430/http://chargen.matasano.com/chargen/2007/9/7/enough-with-the-rainbow-tables-what-you-need-to-know-about-s.html) Use scrypt or bcrypt or something similar, no ifs or buts. – C. K. Young Apr 22 '14 at 23:02
  • 5
    What would probably work better is to use the password "SALT&HASH-ME!" The number of times I have clicked the *forgotten password* link and been emailed my password in plain text is astounding. – DarcyThomas Apr 23 '14 at 07:26
  • @MooingDuck CodeGolf probably wont give a useful solution since technically it's only concerned with minimizing keystrokes. If finding the actual solution takes longer than a lifetime, that's fine from a CodeGolf perspective. – Rik Apr 23 '14 at 08:07
  • 3
    What if the salted password yields to HashMe? What is the naive developer going to do then? – Koray Tugay Apr 23 '14 at 11:20
  • 1
    @Rik. Code Golf SE is actually for programming puzzles in general. (See their popular "code-trolling" tag for some non-golf examples.) – TRiG Apr 24 '14 at 03:06
  • 1
    Well, I found the idea entertaining. People do know that there is no point in it anyway and learning a bit about hash algorithms won't do harm, so really, let's just smile at this one. – Janis F Apr 24 '14 at 08:03

3 Answers3

261

The output of MD5 is binary: a sequence of 128 bits, commonly encoded as 16 bytes (technically, 16 octets, but let's use the common convention of bytes being octets).

Humans don't read bits or bytes. They read characters. There are numerous code pages which tell how to encode characters as bytes, and, similarly, to decode bytes into characters. For almost all of them (because of ASCII), the low-value bytes (0 to 31) are "control characters", hence not really representable as characters. So nobody really reads MD5 output directly. If someone is "reading" the hash values, then these values are most probably encoded into characters using one of the few common conventions for that. The two most prevalent conventions are hexadecimal and Base64.

With hexadecimal, there are only digits, and letters 'a' to 'f' (traditionally lowercase for hash values). You won't get "SALT ME!" in an hexadecimal output...

With Base64, encoding uses all 26 unaccentuated latin letters (both lowercase and uppercase), digits, and the '+' and '/' signs. You could thus hope for "SaltMe" or "SALTME". Now that is doable, because each character in Base64 encodes 6 bits, so a 6-letter output corresponds to 36 bits only. Looking for a password which yields either "SaltMe" or "SALTME" will be done in (on average) 235 tries, i.e. within a few minutes or hours with some decently optimized code.

Note, though, that someone who actually spends some time to read Base64-encoded hash values probably has some, let's say, "social issues", and as such might not react the way you hope.

And it is done: When hashing with MD5 then Base64-encoding the result:

  • infjfieq yields: SALTMEnBrODYbFY0c/tf+Q==
  • lakvqagi yields: SaltMe+neeRdUB6h99kOFQ==
Tom Leek
  • 168,808
  • 28
  • 337
  • 475
  • 254
    Nothing relaxes me after a long day at work like having a glass of wine and reading through some base64 encoded hash values. – Abe Miessler Apr 22 '14 at 17:01
  • 20
    I'd have to up vote this, even if you didn't find two inputs that yield `SALTME` as the first 6 characters; doing that merits more than a +1, but alas, that's all I can give. Fantastic answer friend! – Mike Perrenoud Apr 22 '14 at 18:03
  • 2
    for an encore, what about values whose B64ed MD5 salts begin NoMD5, UsePbkdf2, or UseBCrypt? – Dan Is Fiddling By Firelight Apr 22 '14 at 18:26
  • 28
    `NoMD5` is only five letters; we can go to six and get `NoMD5+`, which looks better. And, indeed, `pmcrsihh` yields `NoMD5+pyhpe6Xxa3x93iGQ==`. – Tom Leek Apr 22 '14 at 18:45
  • 5
    `UsePBKDF2` and `UseBcrypt` are 9 letters each, for an average cost of 2^54, which is quite beyond what I can do with my lone quad-core server. With these nifty GPU arrays which claim to do 8 billions MD5 per second, this could be done within a month (more or less). – Tom Leek Apr 22 '14 at 18:47
  • 4
    How did you generate those 2 passwords? – Joel Apr 22 '14 at 19:03
  • 15
    Brutally... I took an MD5 implementation from [here](http://www.saphir2.com/sphlib/), then went through many 8-letter passwords until the proper output was obtained. About 2^23 tries per second and per core on my 3.1 GHz server (I estimate that I could do 2^24 with some SSE2-aware code -- I had done that at one time for SHA-1). – Tom Leek Apr 22 '14 at 19:13
  • 7
    It should be pointed out that in the two (awesome) examples given, the "salt me" is barely visible. If the OP's goal is to catch the eye of somebody reading through a database, the hash would need to look like `////SALT+ME////` or something similarly "stand-out"-esque. By the time that gets implemented, however, the OP's *actual* problem could have been solved... – Ti Strga Apr 22 '14 at 19:32
  • 29
    When I saw `SALTMEnBrODYbFY0c/tf+Q==` The first thing that came to mind was.... Don't Salt Me Bro! LOL – mezoid Apr 22 '14 at 23:35
  • 3
    Confess... you generated the second hash on purpose, trying to characterize the people who spend time on reading them! SaltMe+ **neeRd** UB6h99kOFQ== –  Apr 28 '14 at 11:59
  • I don't get the `You could thus hope for "SaltMe" or "SALTME"` part. Why are they more likely than `saltme` or `saLTMe`? – Aif Apr 09 '17 at 13:38
5

Purely in theory...

There are rainbow table sets with pre-computed hashes. You could search it for a hash value that contains the suitable byte values (EDIT: would require some brute-forcing of pre-generating full hashes and probing making it much harder to do - see bonsaiviking's comment to the answer), if one such exists (e.g. one that contains 0xBADBAD). If one exists, then from the rainbow table itself you can get back the plaintext from which that hash was derived (and if you use that password, then it is, by definition, is insecure as anyone who gets the database can reverse it, and so not advised).

In practice, however....

There are many assumptions in this approach.

  • In reasonable size orgs, developers typically do not have access to production database except for perhaps individual troubleshooting cases.
  • You're assuming that MD5 is stored/displayed in some human-readable format; yet it is just a byte array of 16 bytes, and would really depend on software that developer/IT-person would be using to visualize it. In Visual Studio, for example, debugger would display byte array typically by its number value, e.g. 87,45,34,67,...
  • You're assuming that unsalted MD5 is being used, but could be some other hashing function, for which hash would, obviously, be different and thus fail your requirement.
  • And as others already suggested, you don't know how it is stored as some additional transformations may've been applied to it (like Base64, for example) that would make it look different.

But primarily the biggest assumption is that someone is looking through the database of thousands/tenthousands/millions/tenmillions/etc of users and would spot the odd value out of a bunch of hashes, and even then out of a part of a hash.

LB2
  • 420
  • 2
  • 8
  • 6
    Rainbow tables don't store all the hashes, so you can't just "search through them". They are a specific compression format for finding plaintext for particular *whole* input hashes, so you'd have to keep generating "satisfactory" hashes until you got one with a "real" plaintext. – bonsaiviking Apr 22 '14 at 17:15
  • @bonsaiviking You're right - I didn't even realize that. Makes that a whole lot harder... – LB2 Apr 22 '14 at 17:32
  • "developers typically do not have access to production database except for perhaps individual troubleshooting cases" I think this is the only sane answer to the question. I would think it highly unlikely that obtaining the end-result will be seen and even then it will probably not change someones mind. – Niels Bom Apr 24 '14 at 13:43
  • 3
    "Pass2112127" hashes to "badbad3aad02eeb05be660c5bb30efe5". – tobyink Apr 24 '14 at 22:13
  • 2
    "Pwd9383247" hashes to "bad7ca3eb9307233f790a9d4a4a0bbad" which has "bad" at the beginning and end. – tobyink Apr 24 '14 at 22:23
  • I can't agree with the assertion that reasonable size orgs will have controls in place on the production database. There is surely a correlation between orgs that are improperly storing passwords and orgs that are improperly allowing devs access to the production database. – stannius Sep 22 '15 at 19:09
1

8 bytes is 64 bit, which is beyond bruteforce. I don't know of any preimage attacks on md5, so you're probably out of luck.

Also, some websites escape or restrict your charset as an (ineffective) measure against SQLI

miniBill
  • 335
  • 1
  • 8
  • 2
    Since it is beyond brute force, do you use 64-bit hashes per chance? – Kevin Li Apr 22 '14 at 17:21
  • 35
    It's beyond brute force, except for the answer above in which it was brute forced. – corsiKa Apr 22 '14 at 19:18
  • 10
    @corsiKa The accepted answer brute-forced "SaltMe" or "SALTME", not the requested "SALT ME!" The difference between 6 and 8 characters is extremely significant here. – Mike Scott Apr 22 '14 at 19:38
  • 4
    He was able to do it in under an hour on a single machine, including setting up the app. Adding a character would make it take around 2 days, which is very reasonable. Considering there weren't many optimizations made (answerer estimated he could make it roughly 10x faster with good optimizations) and that someone with the setup to just throw something like that together probably has a couple machines to parallelize it on, it appears a password could be brute forced in a week or two. Certainly feasible for something OP only needs once. – corsiKa Apr 22 '14 at 19:51
  • 10
    Except he bruteforced 6 6-bit chars, 2^36, which is *far* less than even "SALTME" in 8-bit which would be 2^48 – miniBill Apr 23 '14 at 06:30
  • 1
    Now, 2^48 is far less than 2^64, and it is in fact bruteforceable (it would take ~4096 hours on Tom's machine) – miniBill Apr 23 '14 at 17:50
  • 1
    @KevinLi 64-bit would lower the barrier needed for second preimage considerably. Also, birthday paradox. – miniBill Apr 23 '14 at 17:51