At work the hashing algorithm we use for passwords appears to be bespoke. Obviously that's a pretty bad idea, but the management don't seem bothered.

The algorithm always produces 20-character uppercase strings. One particularly worrying aspect of its behaviour is that similar passwords produce similar hashes: Password1 and Password2 produce hashes that differ by only about 5 characters.

The other obvious issue is that it doesn't look like the algorithm is deliberately slow, which I'm told good hashing algorithms (bcrypt, scrypt) should be.

How does one go about evaluating the strength of a hashing algorithm? And what sorts of attacks should we be particularly concerned about with a fast hashing algorithm with poor uniformity?

I have access to the source (although unfortunately I can't post it on a public forum, for obvious reasons).

  • What 'obvious reasons' do you have for not posting the source? If you think it is because the algorithm needs to be kept a secret that is a clear indicator the algorithm is no good. – Jeff Dec 27 '12 at 14:41
  • @Jeff: It's likely not because AnonEmployee thinks it should be kept secret, but certainly his boss will think so. Also, in the places I've worked at, posting full modules of the company's code online is likely to get you into a lot of trouble, no matter whether it's the hashing algorithm or the daily backup scripts... – us2012 Dec 27 '12 at 19:16

4 Answers


With a strong hashing algorithm, changing even one character of the input results in a completely different hash.

A good hashing algo has these characteristics:

  • The hash value is determined by the data being hashed
  • The hash function uses all given data
  • The hash function distributes its inputs uniformly across all possible output values
  • If you have a string and you take a very similar string, you will get a completely different hash result: changing one single bit of input should change each output bit with a probability of 50%. Another requirement is that the hash is, to the highest extent possible, statistically independent of the input, e.g. a high Hamming weight in the input does not produce an abnormal Hamming weight in the output. (-Polynomial)
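The 50% criterion in the last bullet can be checked empirically. Below is a minimal sketch that uses SHA-256 as a stand-in for a well-behaved hash; the function names are made up for illustration, and the in-house algorithm would have to be plugged in where `hashlib.sha256` appears.

```python
import hashlib


def flipped_output_fraction(message: bytes, bit_index: int) -> float:
    """Flip one input bit and return the fraction of output bits that change."""
    mutated = bytearray(message)
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)

    original = int.from_bytes(hashlib.sha256(message).digest(), "big")
    changed = int.from_bytes(hashlib.sha256(bytes(mutated)).digest(), "big")

    # XOR leaves a 1 in every position where the two digests disagree.
    return bin(original ^ changed).count("1") / 256


def mean_avalanche(message: bytes) -> float:
    """Average the flipped fraction over every single-bit change to the input."""
    n_bits = len(message) * 8
    total = sum(flipped_output_fraction(message, i) for i in range(n_bits))
    return total / n_bits


if __name__ == "__main__":
    print(f"mean fraction of output bits flipped: {mean_avalanche(b'Password1'):.3f}")
```

For a sound hash the printed value should sit close to 0.5. A hash where Password1 and Password2 differ in only about 5 of 20 characters would be expected to score far lower.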

Please tell your management not to be Dave, and just use one of the standard hashing algorithms. If you are using the hashes to store passwords, use bcrypt. When hashing a password you also need to worry about salts; bcrypt does all that for you. Plus it's all free.
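To show what "salted and deliberately slow" looks like in practice, here is a hedged sketch. bcrypt is not in the Python standard library, so this swaps in PBKDF2-HMAC-SHA256 from `hashlib` purely for illustration; the function names and the iteration count are assumptions, not a recommendation from this answer.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 600_000) -> tuple:
    """Return (salt, iterations, digest) for storage alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, rounds, stored = hash_password("Password1")
    print(verify_password("Password1", salt, rounds, stored))
    print(verify_password("Password2", salt, rounds, stored))
```

Because the salt is random, hashing the same password twice gives two unrelated digests, and the iteration count is what makes each guess expensive for an attacker.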

Lucas Kauffman
  • We're doing image memes here now? – pdubs Dec 26 '12 at 21:41
  • I told myself to post one of the "oh you" memes every time someone posts a case of homebrew crypto or a homebrew crypto hash. Especially when it's bad. – Lucas Kauffman Dec 26 '12 at 21:45
  • To clarify your last requirement: changing one single bit of input should change all output bits with a probability of 50%. Another requirement is that the hash is, to the highest extent possible, statistically independent of the input, e.g. a high [hamming weight](http://en.wikipedia.org/wiki/Hamming_weight) in the input does not produce an abnormal hamming weight in the output. – Polynomial Dec 26 '12 at 22:16
  • I like how *Dave* has become a meme now. – tylerl Dec 27 '12 at 03:28
  • @Polynomial added it to the answer – Lucas Kauffman Dec 27 '12 at 06:31

Noticing that:

similar passwords produce similar hashes: Password1 and Password2 produce hashes that are different by only about 5 characters

is enough to declare that this "hashing" algorithm is pure junk. The least that an algorithm which purports to be a "hash" should do is look random. Even thoroughly broken algorithms like MD4, or even non-cryptographic hashes like CRC32, offer an output which is "satisfying" with regard to simple statistical analysis.
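The same character-level comparison is easy to run against a standard function. The sketch below uses MD5 only because it is a well-known broken hash that still looks random; the helper names are hypothetical.

```python
import hashlib


def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def md5_hex(text: str) -> str:
    """Hex digest of MD5, used here only as a statistical baseline."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    h1, h2 = md5_hex("Password1"), md5_hex("Password2")
    print(h1)
    print(h2)
    print(f"{differing_positions(h1, h2)} of {len(h1)} hex characters differ")
```

Two unrelated random hex strings disagree in about 15 out of every 16 positions, so a function that looks random should land near that ratio rather than at 5 out of 20.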

The hash function you allude to appears to follow the Hollywood fantasy of "close decryption" (when you have almost the right key, the text is almost readable, just a bit blurred). This is just a revival of the Mastermind board game.

Security is about defeating a dedicated, smart and malevolent adversary. The hash function at your work can be breached by a chimpanzee.

Thomas Pornin

A professional evaluation is very complicated and has to be done by experts in the field over a long time. That being said, I don't think you need an expert to look at this: with a fast hashing algorithm with poor uniformity and small size (20 uppercase characters are approximately 94 bits; even MD5 has 128) you are vulnerable to both brute force attacks and hash collision attacks.
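The 94-bit figure comes from counting how many distinct outputs 20 uppercase letters can encode, assuming every one of the 26^20 strings is actually reachable (an upper bound; the real algorithm may cover fewer). A quick check, with a hypothetical helper name:

```python
import math


def output_bits(length: int, alphabet_size: int) -> float:
    """Upper bound on the information content of a fixed-length string, in bits."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(f"20 uppercase letters: {output_bits(20, 26):.1f} bits")
    print(f"MD5 as 32 hex digits: {output_bits(32, 16):.1f} bits")
```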

Your company should change this system ASAP.

(Actually, depending on how bad that algorithm really is, it may be possible to reverse it so that a quick program can immediately give correct passwords for certain hashes. This is particularly terrible as it would break all passwords at once.)

  • 94 bits is a decent security level, since hash collisions don't apply to password hashing. So the short output size is not a problem by itself. – CodesInChaos Dec 26 '12 at 22:17

A good hashing function must have what's called good "avalanche characteristics": a small change to the message produces a large (and theoretically unpredictable) change in the hash digest. Your hash function has very little avalanche effect.

Avalanche effect is one of the primary ways that hash functions resist "preimage attacks". A preimage attack is essentially "de-hashing" the digest; given a hash digest and the hash function, find a message (usually any will do; sometimes there are message length and byte value limitations) that will produce the target hash. The basic purpose of a hash function is its deterministic but one-way transformation, and so any algorithm that can find a preimage in less than 2^N time, for a hash digest of N bits, should be taken as evidence that the hash is fundamentally broken.

Case in point: the low avalanche effect means that the changes to a hash digest, given a known change to the message, can be tracked and predicted. That allows for a "goal-seeking" approach: given a starting message, make a small change, compute the hash, and see if it's "closer" to the real one (more of the correct bits set). Surgical changes can then be used to mold the working message into a preimage.
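Here is a rough sketch of that goal-seeking idea. The real algorithm is not public, so `toy_hash` below is an invented, deliberately weak stand-in (each output letter depends on very few input characters), and `find_preimage` is a hypothetical hill-climber written against it. It illustrates the attack shape, not the actual break.

```python
import string

ALPHABET = string.ascii_uppercase
DIGEST_LEN = 20


def toy_hash(message: str) -> str:
    """Invented weak hash: 20 uppercase letters, almost no avalanche effect."""
    state = [0] * DIGEST_LEN
    for i, ch in enumerate(message):
        slot = i % DIGEST_LEN
        state[slot] = (state[slot] + ord(ch) * (i + 1)) % 26
    return "".join(ALPHABET[v] for v in state)


def score(digest: str, target: str) -> int:
    """Number of positions where the digest already matches the target."""
    return sum(a == b for a, b in zip(digest, target))


def find_preimage(target: str, candidates: str = string.ascii_letters + string.digits):
    """Greedy search: keep any single-character change that raises the score."""
    for length in range(1, DIGEST_LEN + 1):
        guess = ["a"] * length
        best = score(toy_hash("".join(guess)), target)
        for pos in range(length):
            for ch in candidates:
                trial = guess[:pos] + [ch] + guess[pos + 1:]
                trial_score = score(toy_hash("".join(trial)), target)
                if trial_score > best:
                    guess, best = trial, trial_score
        if best == DIGEST_LEN:
            return "".join(guess)  # any message with the right digest will do
    return None


if __name__ == "__main__":
    target = toy_hash("Password1")
    found = find_preimage(target)
    print(target, found, toy_hash(found) == target)
```

The search needs at most a few tens of thousands of hash evaluations, and the string it returns need not be the original password: any input that produces the stored digest logs the attacker in.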
