0

Note: this question arises purely from a theoretical interest in security research.

Frequently in popular fiction we see white-hat hackers decrypt secret information in a matter of seconds. While this is obviously not likely in the real world, it got me thinking as to how one might actually go about such a process, given that we have the ciphertext but not the key.

The first object to decrypting the unknown data would be finding out what algorithm was used. It is not too far a stretch of the imagination that we might be able to get this information, for example, it is possible to assume certain standard protocols are likely to have been used in particular situations (such as SSL/TLS on an e-commerce transaction). It is even conceivable to suggest that we might also know the length of the key.

The second main object is the amount of time that it takes to brute-force all possible combinations. This is frequently cited as a strong reason for the safety of cryptography (with a suitable key length). However, let us say we have the ciphertext, the algorithm and the key length (but not the key), as well as a highly-powerful computer capable of brute-forcing the entire set of possible keys in a reasonable amount of time.

The question is: seeing as we do not (obviously) know what the plaintext was, how do we know when we have cracked the encryption? It is one thing to enumerate all possible decryptions of the data, but how can we determine which of these candidates in the actual plaintext?

ose
  • 143
  • 5

1 Answers1

1

In theory you don't know.

But in practice you have some idea about what the data is or how it's structured. Perhaps because you know it's, say HTTP traffic, or because you know it's an encrypted .DOCX or .PDF file, or somesuch, or you are aware of some other properties of the plaintext (e.g. perhaps you might know that the plaintext is a binary blob followed by its 512-bit SHA-512 hash).

And even if you don't have any of the above, you can still try and make an educated guess on whether the resulting decrypted blob is correct by a statistical analysis.

Homework assignment: You may want to think of how this question applies to XORing a data stream with a one-time pad, and why that method is secure.

Nik Bougalis
  • 127
  • 5
  • That is possible. However there are plenty of situations where such assumptions are not possible (for example attempting to decode encrypted control systems such as military drones, satellites, etc or even decoding illegal encrypted images (format unknown) for police/forensic purposes). That raises the question as to whether statistical analysis would be of much use given the data is not natural language format. – ose Nov 23 '12 at 11:49
  • As for the one-time pad, there is always the problem of key exchange and of finding a suitable PRNG, although that's for another question... – ose Nov 23 '12 at 11:53
  • Sure, if the data you are attempting to decrypt cannot be distinguished from "random" data (as might be the case when you are encrypting something that's already encrypted) then statistical analysis may not be of much help. As for whether it can help when the file is not in a "natural langauge" format, ask yourself whether you could write a program to detect Windows PE or Linux ELF executable files using purely statistical methods and that question will be answered. – Nik Bougalis Nov 23 '12 at 19:01
  • And the question of the one-time pad was meant to get you thinking about statistical analysis. Given a blob that's n bits longs, composed of one-time-pad XOR-encrypted data, I can derive a key that generates any possible message I want - the original might say "ATTACK AT DAWN" but my decryption might generate "SURRENDER NOW!". This isn't always the case with other ciphers, although it may very well be. But "may" doesn't count. – Nik Bougalis Nov 23 '12 at 19:04
  • Padding can give you a good hint if block ciphers are used. You are a bit screwed with stream – Konrads Nov 23 '12 at 23:30