-1

To my knowledge, a file can have only one hash and a hash corresponds to only that file(keeping aside the 2 file one hash bug). Using that knowledge is it possible to make an algorithm to recreate a file using solely the hash?

Vedant
  • 111
  • 1
  • 4
  • 3
    `2 file one hash bug`? Do you mean hash collisions? – SaAtomic Jul 27 '17 at 05:47
  • Yes, (y can u comment something less than 15 char – Vedant Jul 27 '17 at 06:04
  • 1
    Similar [Finding hash collision](https://security.stackexchange.com/questions/135936/finding-hash-collision) – TheJulyPlot Jul 27 '17 at 07:04
  • I don't think the duplicate question you are referring to is the same. When only one part is lost it could be possible to recover the remaining day with the help of the other data and it's respective hash – Vedant Jul 27 '17 at 07:56
  • 1
    Your question is not clear to me: from *"recreate a file using solely the hash"* in the initial question it becomes *"When only one part is lost it could be possible to recover the remaining day with the help of the other data and it's respective hash"* which are two completely different questions and situations. If your question was about the latter, check [RAID](https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5) bit parity information which is designed exactly for this purpose. – WhiteWinterWolf Jul 27 '17 at 09:39
  • unless the file is smaller than the hash output, no, otherwise you need physically impossible compression – dandavis Jul 27 '17 at 20:46

2 Answers2

2

First of all, hashing is always a one-way function. (Ideally)There's no way to "reverse engineer" the hash function by analyzing the hashed values.

Second, there is no way a hash can carry information(or metadata) about the file.

Hashing algorithms (like MD4 and subsequent SHA algorithms) use 32 bit variables with bitwise Boolean functions such as the logical AND, OR and XOR operators to work through from the input file/text to the output hash.

ramailo sathi
  • 271
  • 1
  • 4
  • 18
1

Theoretically yes, technically extremely difficult even though hashes are mathematically irreversible functions.

Lets take an example. Maximum input size for a SHA1 algorithm is 2^64−1 bits. So with a hashing algorithm you apparently are mapping all possible files under 2^64−1 bits to a fixed length 160 bit value.

So theoretically there can be numerous collisions as we are mapping from a larger set to a smaller set.
Hash Collission

In that case if there is a lookup table with hashes of all possible values of {M} we can look up the hash values in table and can narrow down the possible messages for a hash to a handful of collisions that hash value can have. In the above example, if hash value A is given, we can be certain that the message would be either M1 or M4 or M10. If we know the approximate size of the file, an educated guess can be made to pin point the actual message.

Technically it is quite impossible because of the large number of possible messages.

hax
  • 3,851
  • 1
  • 16
  • 34
  • I can't upvote you answer because I'm new using this account, but thanks! – Vedant Jul 27 '17 at 05:06
  • @VedantKarandikar Glad to be of help. I have edited the answer to make it more clear – hax Jul 27 '17 at 05:10
  • 5
    I find this answer a bit misleading : 1. how do you find that 2^64−1 bits is 2 GB ? 2GB = 2^31 Bytes, and 1 byte is not 33 bit. Then if you take 2^63 possible inputs mapped to 160 bit (not a power of 2, but let say 2^8), you have 2^63/2^8 = 2^55 (3.6e16) possible files for every hash. If you make the assumption that files are below 2GB in size, you'll still get 2^23 collisions (8M). That's still a lot of files to filter out to get the right one ;) – Thierry Jul 27 '17 at 08:14
  • No, a file will hash to a size dependent on the algorithm used. A SHA1 hash of a 5Mb jpeg will still be 160 bits. – ste-fu Jul 27 '17 at 04:54
  • I understand that. But let us say we know 1010 translates to 'a' in hash than can't we replace a with 1010 using a he editor. This was just an example – Vedant Jul 27 '17 at 04:57
  • Hashing algorithms are designed to be one way. If abc hashes to 123, bbc might hash to 456 ie a change of 1 character results in a totally different hash. – ste-fu Jul 27 '17 at 05:02
  • Ok. Thanks.so you mean the same character will change it's hash Everytime? – Vedant Jul 27 '17 at 05:02
  • Yes. Tiny changes in input result in massive differences in output, in a way that is incredibly unpredicatable. – ste-fu Jul 27 '17 at 08:08
  • @Thierry Thanks for pointing it out. My bad. Terrible miscalculation :) – hax Jul 27 '17 at 18:21