Why improvising your own Hash function out of existing hash functions is so bad

Question

I'm afraid I'll have tomatoes thrown at me for asking this old question, but here goes.

After reading that cooking up your own password hash out of existing hashing functions is dangerous over and over again I still don't understand the logic. Here are some examples:

md5(md5(salt) + bcrypt(password))
scrypt(bcrypt(password + salt))
sha1(md5(scrypt(password + md5(salt))))

The typical arguments against these go as follows:

You're not a cryptographer! You've got no idea if these hashes are more secure. Leave it to the experts who know what they're doing. These add no extra security.

Granted they don't improve the function as a hash (i.e. make it harder to reverse or find collisions etc.), but surely surely they don't make it worse as a hash? If they did then hackers would be able to re-hash standardly hashed passwords into these wacky hashes as they see fit and weaken the hash? I don't buy it.

Second argument:

Kerckoffs's principle: A cryptosystem should be secure even if everything about the system is known.

Agreed. This is basically the motivation for not storing your passwords as plaintext in the first place. But if my response to the first criticism stands then these wacky hashes still function as secure hashes, and our system doesn't break Kerckoffs's principle anymore than it would with a standard hash.

Here are two possible (and worthwhile, as far as I can see) advantages to using a "wacky" hash over a normal hash:

Sure, your system should be secure if the attacker has the source code, but it's a very likely possibility that your attacker wont have access to your source code and probably won't be able to guess your wacky hash, making any attempt at a brute force impossible.
(This one is the real motivation behind me asking this question) BCrypt is thought to be secure, hard for the CPU and GPU (great) but can be very fast with specialized hardware. SCrypt is said to be hard to bruteforce on CPUs, GPUs and currently available specialized hardward but is more recent and not trusted by the cryptographic community as much as BCrypt due to the lack of exposure it's had. But doesn't the hash BCrypt(SCrypt(password + salt)) get the best of both worlds?

I appreciate that the passion/anger behind most rants against these home-brewed hashes comes from the average programmer's lack of knowledge of what makes a good hash, and a worry that encouraging this sort of wacky-hashing will inevitably end up with weak and useless hashes getting into production code. But If the wacky hash is carefully constructed out of solid and trusted hashes, are the gains in security not very valuable and real?

Update

I got a bunch of good answers on this, thanks. What I seemed to be overlooking in my assumptions was that, although combining hashes can't make it easier to crack the original password and therefore crack the constituent hashes, the combination of two or more secure hashes can - at least in principle - be weaker than any one of its inner hashes due to the unstudied and complex interactions between them. Meaning it could be possible to find some string that got past the wacky hash without necessarily breaking the hashes that made it up.

If your goal is just to protect against an attacker without access to your source code, then [using a pepper](http://security.stackexchange.com/questions/21263/how-to-apply-a-pepper-correctly-to-bcrypt) (basically a hard-coded secret value that you hash along with the password) would be sufficient, and less error-prone than creating your own hash function. — Brendan Long, Apr 01 '13 at 15:10
why not add a `sleep(15000);` after each hash? sure benefits are unknown but it certainly doesn't make it less secure :) — Sedat Kapanoglu, Apr 02 '13 at 10:55

score 71 · Answer 1 · edited Mar 15 '17 at 15:28

71

The fact that you need to ask this question is the answer itself - you do not know what is wrong with stacking these primitives, and therefore cannot possibly know what benefits or weaknesses there are.

Let's do some analysis on each of the examples you gave:

md5(md5(salt) + bcrypt(password))

I can see a few issues here. The first is that you're MD5'ing the salt. What benefit does this give? None. It adds complexity, and the salt is simply meant to be unique to prevent password collisions and pre-computation (e.g. rainbow table) attacks. Using MD5 here doesn't make any sense, and might actually weaken the scheme since MD5 has known trivial collisions. As such, there is a small possibility that introducing MD5 here might mean that two unique salts produce the same MD5 hash, resulting in an effectively duplicated salt. That's bad.

Next, you use bcrypt on the password. Ok. Well, most bcrypt implementations require a salt internally, so this is already technically invalid. Let's say you know that, and you meant to say bcrypt(md5(salt), password). This part is still falling to the weakness I described above, but it's not too shabby - remove the MD5 and it's a standard use of bcrypt.

Finally, you MD5 the whole thing. Why are you doing this? What is the purpose? What benefit does it bring? As far as I can see, there is no benefit at all. On the detriment side, it adds more complexity. Since most bcrypt implementations use the $2a$rounds$salt$hash notation, you're going to have to write code to parse that so that you can extract the hash part and store the rest separately. You're also going to need an MD5 implementation, which was unnecessary.

So, in terms of code footprint for potential attack vectors, you've gone from a simple bcrypt implementation, to a bcrypt implementation with custom parsing code, and MD5 implementation, and some glue code to stick it all together. For zero benefit, and a potential vulnerability in salt handling.

Next one:

scrypt(bcrypt(password + salt))

This one isn't too bad, but again you need some code to parse out the results of bcrypt into hash and salt / round count separately. In this case I'd guess that there is a slight benefit, because bcrypt and scrypt work in different ways for roughly the same goal, which would make it a little more difficult for an extremely well-funded attacker to build custom ASICs to break your scheme. But is that really necessary? Are you really going to hit a situation where a nation state will devote a few million dollars to just to break your hash? And, if that case ever arises, will it really bother the attacker to have to spend a few extra million to double their chip count?

Another potential issue with combining bcrypt and scrypt like this is that there has been very little study into how the two interact. As such, we don't know if there are any weird cases that can cause problems. As a more obvious example, take the one time pad. We compute c=m^k for some message m and some equally long perfectly random key k, and we get perfect security. So let's do it twice, for even more security! That gives us c=m^k^k... oh, wait, that just gives us m. So because we didn't take the time to properly understand how the internals of the system worked, we ended up with a real security vulnerability. Obviously it's more complicated in the case of KDFs, but the same principle applies.

And finally:

sha1(md5(scrypt(password + md5(salt))))

Again we're running into the MD5'ed salt issue. I'm also intrigued by MD5'ing the SHA1 hash. What possible benefit could that have, if you're already using a slow KDF like scrypt? The few nanoseconds it would take to compute those hashes pales in comparison to the hundreds of milliseconds it would take to compute the scrypt digest of the password. You're adding complexity for an absolutely irrelevant layer of "security", which is always a bad thing. Every line of code that you write is a potential vulnerability.

Now remember that point I made at the start of my answer. If, at any point in this answer, you thought "oh yeah, I didn't think about that", then my point is proven.

You're running into what I would describe as Dave's false maxim:

If I add more crypto things, it will be more secure.

This is a common trait among developers, and I once believed it too. It goes hand-in-hand with denial of other principles, such as Kerckhoff's Principle. Ultimately, you have to realise and accept that obscurity isn't a safety rail; it's a crutch for weak crypto. If your crypto is strong, it needs no crutch.

edited Mar 15 '17 at 15:28

Community

1

answered Apr 01 '13 at 11:32

Polynomial

132,208
43
298
379

Nice answer. I'd not really thought about the first and the last example of stacked hash. They kinda ruined my argument a little for no reason. I was more interested in the bcrypt - scrypt hash and whether that could *possibly* be weaker than either scrypt/bcrypt on their own. – George Powell Apr 01 '13 at 12:10
26

The interaction between the two is undefined, because nobody has studied it. Combining the two *might* provide more security, but it also *might* lead to unusual interactions between the internals of those hashes. A good example of this is the one time pad. Computing `c=m^k` gives you a perfect level of security when `k` is truly random and unknown to the attacker. So let's do it twice, for more security! So we end up with `c=m^k^k`, which results in `m`. Oh, whoops. Obviously the interactions are more complex in hashes, but the principle still stands. – Polynomial Apr 01 '13 at 13:09
4

My point is that since we don't know what potential problems might exist, why not go with the known safe route and just use scrypt with a decent work factor? It's more than secure enough for almost every purpose. – Polynomial Apr 01 '13 at 13:10
2

@Polynomial +1 Great answer. However I wasn't convinced about the bcrypt/scrypt part of your answer until I read your comment about one-time pads. Might consider adding that comment to your answer. – Phil Apr 01 '13 at 14:02
@Phil Done. Hopefully it makes sense. – Polynomial Apr 01 '13 at 15:26
1

"Another potential issue with combining bcrypt and scrypt like this is that there has been very little study into how the two interact. As such, we don't know if there are any weird cases that can cause problems. As a more obvious example, take the one time pad. We compute c=m^k for some message m and some equally long perfectly random key k, and we get perfect security. So let's do it twice, for even more security!" You're doing it wrong. The `bcrypt(scrypt(x))` stuff is ok while `xor(xor(x,k),k)` isn't because an extra `bcrypt` is something an attacker could do, too. – thejh Apr 01 '13 at 20:20
@thejh It was an example of how ignorance of the internals of an operation can lead to security vulnerabilities. It was not meant to be a direct analogy, hence why I said "Obviously it's more complicated in the case of KDFs, but the same principle applies." – Polynomial Apr 01 '13 at 20:24
10

I still don't see why the scrypt(bcrypt(x)) thing could be a problem. If the two interact weirdly, just using bcrypt doesn't make it more secure- an attacker can simply scrypt() what you've bcrypt()'d. Additionally, if someone figures out a more intelligent way to guess the original plaintext for one hashing algorithm, they probably won't simultaneously have a faster way for the other algorithm. Say people legitimately thought ROT13 was a good hashing mechanism. If I used ROT13(bcrypt(x)), wouldn't I still be better off when people 'cracked' ROT13? – root Apr 02 '13 at 09:46
The primary issue is that you have to introduce more code to parse the encoded output of scrypt / bcrypt so that the salt and round count can be stored separately. Plus it doesn't really offer any increased security - scrypt is more than enough for 99.9% of situations. – Polynomial Apr 02 '13 at 14:59
Using scrypt and bcrypt together has the problem that they both take time to run - you'll probably wind up halving the number of rounds on each to compensate for having to run both, which is more likely to be insecure than a "full" scrypt on top of a "full" bcrypt is. – Brilliand Aug 18 '14 at 22:16
4

@Brilliand: Attackers will almost certainly find ways of speeding up both algorithms at least somewhat--probably by different amounts. If it turns out attackers find a 100:1 speedup for one algorithm and 2:1 for the other, the attackers would net about a 4:1 speedup against a system that used each algorithm for half as many rounds as could have used in the absence of the other. Unless one knows which system attackers would have "better luck" with, hedging bets hardly seems unreasonable. – supercat Oct 19 '14 at 22:39
@Polynomial : I posted a simplified answer (kind of a summary). Not sure if my logic is correct, so if you have few minutes, could you check it / edit it please? I believe it can be useful. Thanks in advance! – lepe Apr 13 '16 at 02:11
@root : That will give them something such that scrypt(bcrypt(what_attacker_has)) = scrypt(bcrypt(real_password)) , but not necessarily such that bcrypt(what_attacker_has) = scrypt(real_password) . – Apr 14 '16 at 18:37
@Polynomial, `m^k^k` is not a good analogy. What about `m^k^k2`? I don't see how that would be **less** secure. – Pacerier Sep 10 '16 at 02:30
1

@Ricky, ?? I don't think anyone understood what you are trying to say. – Pacerier Sep 10 '16 at 02:39
1

@Brilliand, Splitting the percentage of hash over different breeds of algorithm is actually an advantage because of **economies of scale**. An attacker that could mass produce bcrypt-optimized breakers might not at the same time be able to mass produce scrypt-optimized breakers. The same logic applies to all hashes of different breeds. – Pacerier Sep 10 '16 at 02:43
1

A bigger problem is that `sha1(md5(m))` for any values of `m` decreases the possible outcomes to 128 bits, despite SHA-1 providing a 160 bit digest. While that's probably not a big deal because 128 is still plenty, you are unnecessarily reducing the digest size. Consider an extreme example, where you have `sha1(crc8(m))`. Despite a SHA-1 digest being 128 bits, there will only be 256 possible outcomes due to the 8 bit CRC in the middle. By creating a bottleneck, you are reducing security by limiting the keysize. – forest Dec 29 '17 at 05:52
https://crypto.stackexchange.com/questions/59583/if-and-why-is-it-bad-to-scrypt-a-bcrypted-password/59588#59588 – gaazkam May 28 '18 at 09:23
What if you just use the combined hash string? – Solomon Ucko Jan 06 '19 at 14:30
1

@SolomonUcko Then it becomes highly vulnerable to partial preimage attacks. – forest Jan 09 '19 at 07:20
I think the most compelling argument against combining/nesting these algorithms yourself, is that if it was a good idea, then the existing algorithms would have already done it, or new algorithms will soon, so just use the best well-established stuff that lots of cryptographers endorse. I don't worry if I should `scrypt(Argon2id(...))`, because I trust that whatever benefit `scrypt` provides, it is probably already included in `Argon2`, in a much more optimal way than a naive wrapping. – mtraceur Jun 08 '22 at 15:51

score 15 · Answer 2 · edited Mar 17 '17 at 13:14

15

Crypto primitives can be stacked safely, and increase security if, and only if, you know the primitives well enough to understand their weaknesses and how those weaknesses interact. If you don't know them, or don't understand the details - well, that's how you get Dave's protocol.

The problem is very few people know them all well enough to judge if a certain combination is safe. Which is why it needs to be something that is published and reviewed, if it's not been reviewed you have no way of knowing if it's as strong as scrypt or if it's closer to CRC32.

So, if you aren't an expert - it's quite possible that you have something weaker than the weakest primitive you've used (see Dave's protocol) and you wouldn't know it. Or at least you wouldn't know it untill it's cracked – finding your users' passwords on Pastebin isn't quite the ideal way to determine the scheme is flawed.

I do agree that some degree of obscurity can help from a defense in depth perspective, but the underlying system must be secure.

Between scrypt, bcrypt and PBDKF2 - at least one of them will be supported on pretty much every platform. These are known and well tested - they offer differing levels of protection, but they are still far more secure than a bizarre stacking of md5 and sha1.

edited Mar 17 '17 at 13:14

Community

1

answered Apr 01 '13 at 02:03

Adam Caudill

1,794
14
18

1

I think that stacking of md5 and sha1 was a bad example, I'd never use that one and it was just for illustration. I'm particulary interested in the bcrypt-scrypt stack. I still don't understand how it could possibly make the hash weaker? As I said, it's a step that could be taken by a hacker if they wanted anyway so if it does weaken the hash then that's a huge problem even if I don't do it for them! – George Powell Apr 01 '13 at 10:37
2

@GeorgePowell : A hacker isn't only interested in recovering original plaintext passwords. He might also be interested in finding collisions. It's possible that you increase the amount of collisions by chaining those functions together. – Christian Dec 20 '13 at 19:44
Could you explain why, if a hacker has access to the database, they are attempting to find collisions? Also, surely the number of collisions created by a hash function is a function of its quality? If hashing a hash results in more collisions I'd suggest the hash function is poor from the get-go. – monster Nov 24 '15 at 10:00
@GeorgePowell one way it could be weaker is that `bcrypt` might already include all parts/techniques from `scrypt` that make it stronger (or vice versa), but optimized to include just those pieces rather than the whole thing, so you'd get more security for the amount of execution time and memory you're willing to give it by just running the already expertly combined one, than by splitting up the execution time across both. – mtraceur Jun 08 '22 at 15:58

score 14 · Answer 3 · answered Apr 01 '13 at 17:58

14

For your specific question of combining scrypt and bcrypt, remember that these functions have a configurable cost, and you want to raise that cost as much as possible, while keeping it tolerable for your specific usage. For instance, if you can use bcrypt with up to X iterations (beyond which it is too expensive for your server and your average number of user connections per second), or scrypt with up to Y iterations, then you cannot use scrypt(bcrypt) with X iterations for bcrypt then Y iterations for scrypt: this will be beyond your CPU budget.

Thus, if you cascade scrypt and bcrypt, then you must use both with less iterations than what you could have done with one alone. You don't "get the best of both worlds" by simply stringing them together. In fact, the best you can hope for is a kind of average between the two. And that comes at the price of more complex code, something which is inherently bad when talking about security (or, for that matter, maintainability).

answered Apr 01 '13 at 17:58

Thomas Pornin

320,799
57
780
949

1

Most algorithms are likely to become faster with time, though not necessarily by equal amounts. Suppose there's a 1% chance that method X will get sped up 100 fold, and a 10% chance it will get sped up two-fold; likewise for method Y. Unless I'm missing something, I would think that running both algorithms at half-strength would mean that a massive speedup would only be possible for the composite if one was possible for *both* constituent algorithms. – supercat Nov 11 '14 at 00:22
Yes, another issue is economies of scale. Mentioned here: https://security.stackexchange.com/questions/33531/why-improvising-your-own-hash-function-out-of-existing-hash-functions-is-so-bad?rq=1#comment254238_33550 – Pacerier Sep 10 '16 at 02:47
@supercat but of course smart cryptographers presumably already thought about that, and deliberately designed each of these algorithms to combine different pieces which are unlikely to be sped up by the same ways. When you combine them, the default assumption is not "I am combining two differently secure techniques", the default assumption should be "the only thing I could change here is inefficiently increase the ratio of uninteresting glue versus the already combined differently secure techniques". – mtraceur Jun 08 '22 at 16:05
@mtraceur: Whether a cryptographic algorithm gets sped up two-fold or 100-fold will depend in significant measure upon the amount of money that gets spent on the effort. If one uses only popular algorithms, then money will likely get spent improving the performance of all of them. If a program spends 20% of its time on an algorithm that is less popular and attracts less investment, and is thus only spend up two-fold, then no amount of investment in the other algorithms could sped it up more than ten-fold. – supercat Jun 08 '22 at 19:18
@supercat yes, but "algorithms" aren't monolithic atoms of functionality which can only either be sped up with targeted investment or not at all - they are ultimately compositions of smaller simpler algorithms/primitives/operations, and much of the speedup lies from investment in cracking lies in optimizing those little pieces. Now I don't know how many little innards are the same between KDF `foo` vs KDF `bar`, but I imagine that the most recommended and strong KDF uses all of the best pieces, or is strongest against speedup despite not using all of them because the pieces it does use are. – mtraceur Jun 09 '22 at 06:10
@supercat in other words, when I look at `less_popular_kdf(Argon2id(...))`, I don't immediately see it as necessarily inevitable that attackers investing in Argon2id speedups would be slowed by `less_popular_kdf` - without looking inside them both, I can't rule out the possibility that `less_popular_kdf` and Argon2id share so much logic that the same investment in speedup covers both. – mtraceur Jun 09 '22 at 06:16
1

@mtraceur: Few things are necessarily inevitable. My point was that a combination of two KDF functions that are run for less time than a single KDF would have been *may* be stronger than a single KDF would have been, in some *plausible* scenarios. Not that such combinations were always beneficial, much less worth the effort to set them up, but that they could in some cases be beneficial. – supercat Jun 09 '22 at 06:59

Stephen Touset · Answer 4 · 2013-04-01T15:39:29.710

9

In addition to Adam's answer, I'd like to also mention that any time you use cryptography, you should have a strong and unavoidable reason to do so. In your examples above, this does not exist.

md5(md5(salt) + bcrypt(password))
scrypt(bcrypt(password + salt))

The bcrypt and scrypt algorithms are already strong enough, and considered effectively unbreakable. What problem are you trying to solve? And why do you believe that combining their results (particularly with md5) will solve it? In the best case scenario, you've likely merely reduced the difficulty of cracking the password to that of the weakest hash, rather than actually improving security. And the worst case scenario is frighteningly undefined.

md5(sha1(md5(md5(password) + sha1(password + salt)) + password))

This solution is even worse. It manually implements a repeated hashing scheme, but without enough rounds to actually impose a significant work factor on attackers.

In a nutshell, the problem is that:

you're throwing around cryptography without actually having a problem that needs solving
you've dramatically increased the likelihood of introducing flaws in your implementation
you've likely decreased security to the weakest of the hashing algorithms, and
you've introduced an unknown worst-case scenario where none used to exist

edited Apr 01 '13 at 15:39

answered Apr 01 '13 at 03:10

Stephen Touset

5,736
1
23
38

The "problem that i'm trying to solve" or the advantages of doing them was addressed at the end of the question. And I still don't understand how a stack of hashes like sha1(md5(bcrypt())) could be weaker than bcrypt! Please explain, if putting md5 ontop of bcrypt made it weaker then why on earth aren't hackers doing it themselves once they get the bcryt-ed passwords? – George Powell Apr 01 '13 at 10:57
@GeorgePowell Your `sha1(md5(bcrypt()))` scheme *makes no sense*. What does it offer that bcrypt alone doesn't? If you're looking for more security in bcrypt, why not just increase the work factor, or switch to scrypt? Abusing other hash primitives, especially mainly-broken ones like MD5, for an undefined and minuscule benefit (which may *actually* be a detriment) isn't smart and can only provide you with more complexity. – Polynomial Apr 01 '13 at 11:06
I agree it makes no sense, you're right. It doesn't make a better hash. But I don't understand how it would make it weaker? Even if you'd fully beaten the md5 and sha1 hashes you'd still be left with all the strengths of bcrypt? Or am I completely wrong here? – George Powell Apr 01 '13 at 11:12
5

The security issue isn't really a cryptographic one. You have to write lines of code to produce this implementation, and every line you write brings potential vulnerabilities. Even subtle things like how Unicode is handled on certain machines can be catastrophic from a security perspective. Keep it simple, keep the implementation clean of needless crud, and you'll be safer. – Polynomial Apr 01 '13 at 12:11
2

@GeorgePowell It doesn't help them. Imagine there exists a hash function that has an output space of one single byte. It is obviously *trivial* for an attacker to brute force our passwords if we use that at any point to compute digests (even if we later pass that to something like `bcrypt`). But the fact that this weak hash exists doesn't allow an attacker to use it against us, because *we're* not using it. – Stephen Touset Apr 01 '13 at 15:28
1

@GeorgePowell Additionally, the "problem" that you're trying to solve doesn't actually exist. It is purely imaginary. If it existed, professional cryptographers would be scrambling like mad to put out a permanent solution. Your apparent belief that 1) a practical problem exists, 2) there is no published solution, and 3) the "solution" is trivial comes along with the implicit assertion that professional cryptographers are stupid and/or lazy. Otherwise they'd already have come up with a fix, right? – Stephen Touset Apr 01 '13 at 15:34

score 9 · Answer 5 · answered Apr 01 '13 at 10:54

If you apply unsafe operations to a secure algorithm, you can definitely break the hashing function. Your new function could even be far worse than the weakest link.

Why don't attackers use this to break secure functions? It doesn't help them. For example, if I overwrite the first 440 bits of a password safely stored using bcrypt with zeroes, I can easily find a matching password by brute force, but that password will only work on my own terrible algorithm. A sane implementation would probably reject it.

Zeroing large chunks of a hash is clearly bad, but even safe operations can be combined into something dangerous. Adding two numbers (modulo n to ensure a constant length) is 'safe'. Generally, no entropy is lost. Still, h(x) + h(x) mod n reduces the quality of the hash h(x) by one bit, as the result is always even. The equally safe operator XOR does even worse, as h(x) XOR h(x) always returns zero.

These pitfalls are fairly obvious, but not all of them are. Keep in mind that, as always, it is trivial to invent a scheme good enough that you won't find any weaknesses yourself, but very difficult to invent one where no one else can.

Good answer, I think I understand the gist: wacky hashes *could* make it easier to find a match by brute force but couldn't make it easier to find the original password that was used. This is why hackers have no use applying the weaker hash themselves. Cracking the stacked hash does not imply cracking the inner-hashes. — George Powell, Apr 01 '13 at 11:20

score 6 · Answer 6 · answered Apr 01 '13 at 07:36

Hash functions are built by cryptographers and destroyed by cryptographers. There are many strong hash functions as well as weak ones still being use today. Programmers should trust the cryptographers and the hash function. If there was ever a vulnerability in the hash function, then you would surely hear about it on the internet or through co-workers and then cryptographers will surely to a deep investigation. With any secure hash algorithm, it's only known weakness may be a bruteforce attack.

Combining hash functions adds almost no extra security and anything you might want to add, is probably already implemented in the function already.

Salting a password is great for reducing the effectiveness against rainbow tables such that a password couldn't be just "looked up." Whether you hash a function twice or change the hash function, it is essentially salting the password. And most functions include an easy method to salt so there is really no need to implement this.

Lets say I want to create my own secure hash because everyone does it. And since I'm not a cryptographer, I will need it "really" secure, because of course, every programmer knows how to make a secure hash instead of using the secure ones already created. So I create my devious hash function, mod10(md5(sha1(bcrypt(password + salt)))).

As you can see from my hash function, it is really secure because I use so many different things in it. Of course in this silly example, it is easy to see there are only going to be 10 different possible outputs. But by simply using a single secure hash function, it would have completely avoided this.

Sure, your system should be secure if the attacker has the source code, but it's a very likely possibility that your attacker wont have access to your source code and probably won't be able to guess your wacky hash, making any attempt at a brute force impossible

So we are assuming an attacker got ahold of the database table that contains the hashes. I would assume it would be very likely that an attacker can get the webpage files aswell. Your systems running these services maybe the same exploit that allowed you database to get taken. A database server is setup so that that public cannot directly access it. On the other hand, your web server containing your code, is on the front lines.

score 5 · Answer 7 · answered Apr 01 '13 at 04:04

Anytime you increase the complexity of an algorithm or even add more lines of code, you increase the points of failure in your application. There may be unintended consequences of combining algorithms. This may lead to certain tells or other signs which can actually weaken the cryptographic strength of the system.

The more libraries you use in your application, the greater the risk to your application in general. If a flaw is found in one implementation that allows a weakness, code execution, etc. then your application is vulnerable. If you happened by luck to select another algorithm that was not attacked, your safe for the time being (of course you could also be on the unlucky side of luck).

Remember KISS: Keep it simple, stupid, or else you may get lost in the complexity.

score 2 · Answer 8 · answered Apr 02 '13 at 00:05

2

I am going to disagree with a whole bunch of people who are smarter than me and more experienced in security than I am. So I am probably wrong.

Improvising your own hash function is a good idea - if you do it right. Follow these 3 simple steps.

Identify a weakness or flaw in the existing hash functions.
Improvise your own hash function that does not have this flaw.
Verify that your improvised hash function has all the strengths of the existing hash functions.

After you complete step 3, you would be a fool not to use your improvised hash function.

answered Apr 02 '13 at 00:05

emory

1,560
11
14

5

The problem is that you *can't* complete step 3. No one in the world is capable of unilaterally determining that a new hash algorithm is strong. This is why, in order to gain acceptance, new candidate algorithms go though massive multi-year refereed peer-reviewed competitions and undergo exceptional amounts of scrutiny. You simply can't know if an algorithm is safe in a vacuum. – Xander Apr 02 '13 at 00:54
1

@Xander, I can not complete any one of the steps. So for me, I don't improvise. While step 3 is probably the most important, step 1 is the most basic. If you can't at least identify a problem with an existing secure hash function, then why are you wasting your time improvising a new hash function. – emory Apr 03 '13 at 01:04

user3083447 · Answer 9 · 2016-12-13T15:03:13.627

When concatenating different hash functions because you are worried the one it is vital that you apply the salt back before applying the next hashing algorithm, then any collision weaknesses in one of the algorithm won't taint the second hash function that you are using:

scrypt(bcrypt(password + salt) + salt)

But I think scrypt is an established technique now, Argon 2i has won the password hashing competition and is believed to be more secure and bcrypt has been around a long time and has proven to be secure against trivial bypasses.

Therefore the following would make more sense, and would combine the strength of argon 2i, but would fall back to bcrypt if a future attacker showed how to trivially break argon 2i, which is currently thought to be hard:

bcrypt(Argon2i(password + salt) + salt)

But you are less likely to make a mistake if you simply do:

scrypt(password + salt) Or bcrypt(password + salt)

Remember, most breaches are due to human error in the code, much better to keep it simple and thoroughly review your code with dynamic, static analysis and human code reviewers, to make sure no sql injection attacks slip through (Remember always parameterize your database queries!)

lepe · Answer 10 · 2019-02-22T02:43:19.537

-2

I have learned a lot with this thread, so I thought in a way to easily understand the possible outcome when combined different methods (of course, it won't fit all cases):

[W=weak, S=strong]:

Formula        | Example
---------------|-----------------
W(W) = W       | md5(sha1(pwd))
W(S) = W       | md5(bcrypt(pwd))
S(W) = W       | bcrypt(md5(pwd)) 
S(S) = S-      | bcrypt(scrypt(pwd)) //their interaction is unknown
W+S = W        | md5(pwd)+brcypt(pwd) 
S+S = W        | bcrypt(pwd)+scrypt(pwd)
W'+S" = S      | md5(pwd1)+brcypt(pwd2) //pwd1 != pwd2
S'+S" = 2S     | bcrypt(pwd1)+scrypt(pwd2) //pwd1 != pwd2
SUBSTR(W) = W  | substr(md5(pwd),n); //Shorter the string, the weaker
SUBSTR(S) = S- | substr(bcrypt(pwd),n); //Shorter the string, the weaker
S+x = S+       | bcrypt(pwd)+x   //x=Variable str. unrelated to pwd
S(S+x) = S-    | bcrypt(scrypt(pwd)+x) //x=Variable str. unrelated to pwd
ROT13(S) = S   | ROT13(bcrypt(pwd))

What I want to achieve here is to show in a simple way that those combinations won't add extra security in most of the cases (so added complexity is not worthy).

edited Feb 22 '19 at 02:43

answered Apr 13 '16 at 02:06

lepe

2,184
2
15
29

It is impossible to say such a general thing as `S(S) = S-`. As Polynomial explained in a comment above, repeating a strong hashing function can in principle result in a completely useless hash. What are these "Formulas" based on? – George Powell Apr 14 '16 at 17:54
@GeorgePowell: Yes, `S(S)` could lead to an unexpected behavior, which could potentially result in a useless hash. That is why is marked: `S-`, which means it will be less strong than `S` alone, so it is not recommended. As we are not specifying the actual algorithms, we can't be 100% sure it will produce useless hashes, but should be avoided due to such risk. These "Formulas" have no scientific grounds in any way and are just my attempt to represent such interactions in an easy way so anyone trying to "run their own" can have an idea of what to expect and avoid their use. – lepe Apr 15 '16 at 00:57
@lepe, `W(S) = W`? Are you sure? See https://security.stackexchange.com/questions/33531/why-improvising-your-own-hash-function-out-of-existing-hash-functions-is-so-bad?rq=1#comment52264_33550 – Pacerier Sep 10 '16 at 02:49
@Pacerier: One point to analyze is "collision probability". If you use MD5 to hash a bcrypt hash, you are increasing the chances of collision and thus, reducing its strength. Also, if the MD5 hash is compromised, brute-forcing the MD5 hash directly could be much faster, removing "S" from the formula (depending on implementation that could be a huge issue). One more point is that if the W algorithm has a security flaw, it can make the W(S) flawed as well. For those reasons, W(S) should never be considered S or S+. It's better to use S alone to prevent adding complexity and unnecessarily risks. – lepe Sep 12 '16 at 01:13
@lepe, **1)** Sure you could bruteforce `W`, but bruteforcing `W` gives you `S(password)`, not the password itself. In other words, after bruteforcing `W` you are **back at square one**. See the link to Root's comment above. **2)** Next, if the `W` algorithm has a security flaw, it **cannot** make `W(S)` flawed. [cont] – Pacerier Sep 18 '16 at 19:05
[cont] Because if `W(S)` is flawed, then `S` would then no longer be called `S` since `S` would then be said to be broken and no longer a **S** trong hash. http://en.wikipedia.org/wiki/Self-refuting_idea . In order words, before we would be hacked, someone would have made world news in breaking `S`. Although it's certainly true that there is a chance the world has crackers with more expertise than the white researchers and cryptanalysts, allowing knowledge of having had broken `S` to circulate in the black market for a longer time before it is publicly known. – Pacerier Sep 18 '16 at 19:06
Good table except ROT13(S) = S. I advise certain people who really do fear the stock ASIC board to do XOR(S, key) where key is stored elsewhere so ... – Joshua Feb 22 '19 at 01:29
@Pacerior: I saw a few cases where W was so bad that bruting W(S) was feasible. Hint: if W has 64 bits entropy, 1<<63 trials will suffice on average. – Joshua Feb 22 '19 at 01:31
@Pacerier: I changed ROT13(S) = S as suggested. – lepe Feb 22 '19 at 02:45

Why improvising your own Hash function out of existing hash functions is so bad

Update

10 Answers10

Linked

Related