2

I read from a lot of sources that sha256 is not secure because it is fast. Most links that I read suggested bcrypt. However, I just want to know that if data is 64 cryptographically random byte, will attacker be able to find original string somehow?

Scenerio

secret string is 64 cryptographically random byte data

User download secret string, we store hashed version of it. He will upload this data to site when he wants to get access to some information. (work process can not be modified, it is stable system we just need to secure it)

System has 1 trillion secret string and all of them should be protected by some hash algorithm.

Is it possible that attacker will get even one of the secret string if he got access to all hashes?

  • 2
    `SHA256` is not secure _for hashing passwords_. For hashing >= 16 bytes of cryptographically random data even `MD5` is still reasonably secure. – AndrolGenhald Feb 08 '18 at 20:57
  • That said, if you have 1 trillion secrets and you don't want an adversary to get even 1, it's _maybe_ possible that a 128 bit output would come close to the possibility of being attacked, but it's still very unlikely. A 256 bit output like `SHA256` would be fine. – AndrolGenhald Feb 08 '18 at 20:59
  • if the attacker is successful (he will earn $1000), how much it will cost for attacker? @AndrolGenhald – Alex Robertson Feb 08 '18 at 21:06
  • @AlexRobertson: what kind of trillion do you mean? American (10^12) or British (10^18)? http://mathcentral.uregina.ca/qq/database/qq.09.02/ryanandaylah1.html – Steffen Ullrich Feb 08 '18 at 21:19
  • @SteffenUllrich British – Alex Robertson Feb 08 '18 at 21:25
  • In this case the chance to guess one of the secrets is about 10^-135. The chance to guess some potential secret which results in the correct hash is about 10^-58. Based on this it is very very very unlikely for the attacker to guess at least once correctly in his life time - and he would not try for $1000 only. – Steffen Ullrich Feb 08 '18 at 21:30
  • (slightly related side-note: not even the British use British trillions any more!) – Polynomial Feb 08 '18 at 21:30
  • @SteffenUllrich Where are you getting those numbers? I can't reproduce them. Also, 10^18?? Storing that many 256 bit hashes would require 27 _exabytes_. 10^12 is a bit on the ridiculous side too, but at least it's possible. – AndrolGenhald Feb 08 '18 at 21:44
  • @AndrolGenhald: a 64 bytes secret to guess is 512 bits and thus 10^512 different secrets which are about 10^153. Thus the chance to guess a single secret from 10^18 is 10^(18-153), i.e. 10^-135. Similar for guessing not the original secret but the one which results in a known sha-256 hash. In this case one can assume that you need to guess about one of 2^256/10^18 values, i.e. about 10^-58. But I agree that the claimed numbers of hashes look ridiculous. – Steffen Ullrich Feb 08 '18 at 22:22
  • @SteffenUllrich Ok, I see it now. I was basing everything on the output size of 256 bits, and I converted to base 2 and rounded before doing the calculation so I got closer to 10^-60. Thanks. – AndrolGenhald Feb 08 '18 at 22:29
  • @SteffenUllrich in this case attacker even couldn't succeed with brute force, right? – Alex Robertson Feb 08 '18 at 22:32
  • @AlexRobertson: to cite myself from a previous comment: *"Based on this it is very very very unlikely for the attacker to guess at least once correctly in his life time - and he would not try for $1000 only."* – Steffen Ullrich Feb 09 '18 at 06:02
  • no, not even with sha1 would a random-input recovery be close to feasible. there's a LOT more combos of random bytes than there are passwords. – dandavis Feb 09 '18 at 07:18

2 Answers2

5

You're confusing cryptographic hashes with password hashes. A cryptographic hash takes an input and gives a random looking output. A password hash takes a password and a salt and very slowly gives a random looking output.

What you are worried about here is preimage attacks. You want to ensure that, knowing a hash y, the attacker cannot find a value x such that h(x) = y. 80 bits of complexity is still generally considered "good" (though not "great", and possibly vulnerable to a well funded attacker such as a nation-state). What this means is that you want a hash where the best preimage attack has a complexity of over 80 bits.

Analysis

Let's start with MD5 since it's something of a poster child for outdated crypto. The best known preimage attack on MD5 is 116.9 bits, well above 80. But you want to protect 1 trillion secrets, which lets the attacker do some optimizations.

Let's assume the attacker is able to use a hash table (and many terabytes of RAM) to check if a hash matches any of the 1 trillion in O(1). With 1 trillion secrets this reduces the complexity to break a single secret by log2(1012) ≈ 40 bits, which brings MD5 down to around 76 bits. Still not terrible, but worrying if you're facing a nation-state or want to keep these secret for many years.

But you're doing the right thing and using SHA256. The best preimage attack I could find on SHA256 was for reduced steps, and still had complexity over 250 bits, so you're looking at over 210 bits of complexity there. In this case an attacker will be running into some issues with physics.

AndrolGenhald
  • 15,436
  • 5
  • 45
  • 50
  • if I use `sha256`, will all of data be secure for many years? what about agaist brute force attack? – Alex Robertson Feb 08 '18 at 21:52
  • I will accept it as an answer, but I just want to be sure that should I use `sha256` with salt and which `rounds` is recommended in this case? – Alex Robertson Feb 08 '18 at 22:20
  • 1
    @AlexRobertson A salt and multiple rounds are entirely unnecessary when your input has more cryptographically random bytes than the output size of the hash. A single round without a salt is just as good. And it should be good for much longer than the rest of your life. – AndrolGenhald Feb 08 '18 at 22:24
  • Note that OP is talking about a British trillion, which is 10^18. Might want to update your answer accordingly. – forest Feb 09 '18 at 01:52
  • @forest I considered it but decided against considering very few organizations are even capable of storing over 27 exabytes of data. – AndrolGenhald Feb 09 '18 at 05:35
1

Specialized password hashing functions like bcrypt are designed to protect secret values that are relatively easy to guess. And that, to put it more technically, means secret values that the following strategy can guess with good practical chances of success:

  1. Guess likelier values ahead of comparatively unlikely ones;
  2. Try a very high volume of guesses at very large speeds.

Specialized password hashing functions take it for granted that #1 is just a fact of life for passwords, so they try to thwart the attacker on point #2 by slowing them down.

But if you're generating byte strings uniformly at random from a strong, cryptographic generator, then #1 is no longer a problem, so you don't need a specialized password hashing function. The only defense you need is to make the random byte strings long enough. 64 bytes is 512 bits, way more than you need, and in fact if you're passing it through SHA-256 that already pegs your security level at 256 bits, so you gain nothing for using more than 32 bytes. But actually, a 16 byte string (128 bits) should be enough; for example, the cipher that my browser is using to connect to this site uses a 128-bit key, which is an absolutely commonplace security level

Luis Casillas
  • 10,181
  • 2
  • 27
  • 42