-1

Let's say I have an input message of "enoughly long" length (I'll got for a 128 bytes long message, but it might be 256 or whatever).

I know the hash sha1(message || secret) where || is byte concatenation.

Since the message is long enough and knowing how sha1 works, I can tell that the message will be split into chunks. Each chunk will be hashed, leading to an "internal state machine", reused to hash next chunk, and only the internal state before hashing the first chunk involving the secret actually matters to the result hash.

So, in theory, this means I could tamper the message into another thing message2, and have the same hash result sha1(message2 || secret) so long I can keep the internal state unchanged.

Is there a way to do so? Is there a way to built a message2 so that sha1(message2 || secret) == sha1(message || secret) when I know message and fully control message2 (including its length)? If there is, how to do so? It's not a length extension attack, but it feels like the same kind of process could be used, but I can't find it.

Xenos
  • 1,331
  • 8
  • 16
  • It's not an exact duplicate, but given the specification in the last paragraph, the suggested dupe gives details of why this doesn't (currently) work. – Matthew Apr 11 '19 at 16:28

1 Answers1

1

SHA1 is vulnerable to both collision attacks and length extension attacks. If you have a message1 and message2 that are multiples of SHA1's 512-bit (64 byte) block size (probably also work if the same length modulo 64 bytes; would not work if you could find a collision with different sized final padding block) that you generated in such a way to be a collision (sha1(message1) == sha1(message2)), then by construction you will have sha1(message1||secret) == sha1(message2||secret).

Using the PDF collisions from the shattered paper:

$ wget https://shattered.io/static/shattered-1.pdf
$ wget https://shattered.io/static/shattered-2.pdf

$ echo "This is a secret!" > secret

$ cat shattered-1.pdf secret > new_collision1

$ cat shattered-2.pdf secret > new_collision2

$ sha1sum *
6168d4eabea18ba0a256e37b5de2f046a272fa81  new_collision1
6168d4eabea18ba0a256e37b5de2f046a272fa81  new_collision2
b56a50ed081661fd5b877e9418d31d4981ddbd87  secret
38762cf7f55934b34d179ae6a4c80cadccbb7f0a  shattered-1.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a  shattered-2.pdf
$ md5sum *
9b6265c6939f916e8bf5e408ce0ba1fc  new_collision1
c889bbafde03df897c31da16b719c07d  new_collision2
4fa38cdd374c8a51a9ae198a89dba4da  secret
ee4aa52b139d925f8d8884402b0a750c  shattered-1.pdf
5bd9d8cabc46041579a311230539b8d1  shattered-2.pdf

E.g., note shattered-1.pdf and shattered-2.pdf have different contents (evidenced by different MD5 hash) but were constructed to have an identical SHA1 hash. Concatenating the same file secret to the end of both files keeps them both as SHA1 collisions.

I should clarify that the SHAttered attack to generate a SHA1 collision needs to modify both message1 and message2. If message1 can't be tampered with (and wasn't specifically chosen with a known collision), you will need a preimage attack not a collision attack. Collision attacks ask to find any match where H(m1) == H(m2) allowing you to alter both m1 and m2 as you search; brute-force collision attacks should take O(sqrt(2N)) work because of the birthday paradox -- sha1's 160-bit hash should take 280 ~ 1024 work for a brute-force collision attack. Pre-image attacks require finding a message m1 such that H(m1) = h for some fixed hash h; brute-force pre-image attacks take O(2N) work, which in sha1's case would be 2160 ~ 1048 work (that is 1024 or a million million million million times harder than a collision attack).

dr jimbob
  • 38,768
  • 8
  • 92
  • 161
  • Alright. I would also like to ask then if `m1` `m2` can be tamperable (so collision attack) but with contraints on them? Like `m1 must start with the following 32 bytes` (or whatever number)? Or with two constraints like `m1 must start with the following 32 bytes, m2 miust start with some other fixed 32 bytes`? – Xenos Apr 12 '19 at 07:30