I have a program that can encrypt text. I tried to work out which encryption it uses but could not figure it out. If the plaintext is P and the encrypted text is C, then encrypting C produces P again (double encryption returns the original message). The encrypted text also consists of Unicode characters, e.g. input "abcd" gives output "žœ›". Is it possible to find the encryption technique?
-
In general no. We expect that the encryption algorithms are close to random. Hashing is not encryption. Do you know the key size? encryption mode? only input and output? – kelalaka Jan 14 '19 at 17:59
-
unfortunately only input and output, 1 char input => 1 char output – Null Jan 14 '19 at 18:03
-
same input outputs same output? – kelalaka Jan 14 '19 at 18:04
-
if you mean "if I put the encrypted output in input place" so yes, I tried to put the encrypted text in input and the result was the original text "without encryption" – Null Jan 14 '19 at 18:09
-
Well, that was not what I was asking, but it does suggest a stream mode. Let `p` be the input: if you encrypt `p` you will get `c`; if you encrypt `p` again, do you get `c` again, without resetting? – kelalaka Jan 14 '19 at 18:13
-
yes exactly, double encryption brings back the original text, that surprised me – Null Jan 14 '19 at 18:15
-
this may be a duplicate of https://security.stackexchange.com/questions/64757/how-to-identify-encryption-algorithm-for-data-coming-in-field-attribute-values-o – David Scholefield Jan 14 '19 at 19:05
-
There are multiple people more qualified than me to answer this, but I do want to chip in that I've used a technique called a "crib", which is a known plaintext string that you can use to speed up decryption. Some code breaking tools support this. Just wanted to throw the word "crib" into the mix – bashCypher Jan 14 '19 at 22:28
-
Unicode is a red herring here. The output of most encryption functions is a binary string; it's whatever program you used to read the ciphertext that interprets that binary string as Unicode. That it's rendered as Unicode characters is a characteristic of the viewer program, not the ciphertext. – Lie Ryan Jan 14 '19 at 23:06
-
Being able to obtain the plaintext by re-encrypting is a characteristic common to XOR-based encryption used with a mode whose keystream doesn't depend on the data, such as CTR. For example, [AES-CTR](https://stackoverflow.com/questions/31049685/aes-ctr-double-encryption-reverses-the-ciphertext-to-plaintext) has this property. XOR-based encryption is essentially done by XOR-ing the output of a pseudo random number generator with the plaintext. – Lie Ryan Jan 15 '19 at 03:01
4 Answers
As @David's answer pointed out, this is, in general, very hard to do, especially since block ciphers like AES are normally used in modes such as CBC that take a random initialization vector (IV) as the starting block. A correct implementation generates a different IV each time, so even the same plaintext results in a different ciphertext.
As an example, let's encrypt Hello World! in CBC mode with:
key: 0000 0000 0000 0000 (not secure, just for illustration)
If I use IV = 0000 0000 0000 0001, I get ciphertext (base64): KcS/oAzkKMi03Nrf65hlww==
If I use IV = 0000 0000 0000 0002, I get ciphertext (base64): 2wTIRW8sSb4XpE9Ue3xm6w==
(The IV is usually included with the encrypted message. I omitted it for ease of comparison)
-
I don't think it is using encryption that strong, especially knowing that double encryption generates the original text – Null Jan 14 '19 at 21:59
-
@Null Can you clarify "double encryption will generate the original text"? Do you mean encrypting the same plaintext twice yields the same ciphertext, or that encrypting the ciphertext again will generate the plaintext? – Allan Pinkerton Jan 14 '19 at 22:02
-
yes, if the plaintext is P, encrypting P generates C as the encrypted message. Now, if we encrypt C, it generates the plaintext P. – Null Jan 14 '19 at 22:16
-
@Null Can you check if it follows the basic patterns of a simple _substitution cipher_ like ROT13? For example, does the ciphertext of abcd contain the ciphertext of abc as a substring? – Allan Pinkerton Jan 14 '19 at 22:22
-
I tried but did not find anything. I think it is something related to Unicode, because the encrypted text is Unicode text, and when I put that Unicode text in as input, the output is the original plaintext – Null Jan 14 '19 at 22:28
-
If clear text "A" encrypts to crypt "C" and "C" encrypts to "A", this sounds like a simple XOR. Assuming key "K": A xor K = C, C xor K = A ; Therefore by substitution for "C": (A xor K) xor K = A because (K xor K) = 0 and (A xor 0) = A. The more interesting result would be finding the key by: A xor C = K . – user10216038 Jan 14 '19 at 23:13
-
If you put in a string of all the same characters (AAAAAAAAAAAAAAAA etc.), the output repeat pattern should show you the key length. – user10216038 Jan 14 '19 at 23:18
What you are describing is an activity called cryptanalysis, which is the study of a cryptographic system.
Given that you've said you have a program that encrypts, I would start there. The first step is called Open Source Intelligence gathering, or OSINT (which is a fancy way of saying "google it".) Search the program's documentation for information about the encryption in use. Google the application name, looking for it to be accompanied by words like "crack" or "hack".
Next, I would analyze the program. Perhaps there are still symbols in the binary image. From a command line run a program like `strings programname` to look at the various text strings in the binary image; if you see strings like `AES128`, `DESede`, `HMAC`, `LFSR`, or even `ROT13`, those might be clues indicating the algorithm in use.
If all you have is the ability to encrypt and decrypt a message, you can try comparing the input and output. Encrypt a string of AAAAAAAB a couple of times, and see if the output is always the same. See if it follows a similar pattern, such as NNNNNNNO. Add another letter; take one away; see what happens to the encrypted data. If the same input message always yields the same output, it might be a simple substitution, or an ECB cipher. If the message always changes completely, it's probably using better crypto than you're going to be able to decipher. If a single letter change results in a completely different garbled message, you may be looking at a block cipher (again, difficult.) If you can ever determine that it's using a block cipher (changes always occur in multiples of 8, 16, or 32 bytes at a time) you'll probably need to disassemble and reverse engineer the actual program to recover the key.
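The black-box checks described above could be scripted roughly as follows. This is only a sketch: `probe` and `toy_encrypt` are hypothetical names, and `toy_encrypt` is a stand-in (a repeating-key XOR with an arbitrary key) for whatever way you have of calling the real program.

```python
from typing import Callable, Dict


def probe(encrypt: Callable[[bytes], bytes]) -> Dict[str, object]:
    """Run a few black-box checks against an unknown encrypt function."""
    base = b"AAAAAAAB"
    changed = b"AAAAAAAC"  # differs from base in the last byte only
    out1, out2 = encrypt(base), encrypt(base)
    out3 = encrypt(changed)
    # Output positions that differ after the one-byte input change
    diff = [i for i, (x, y) in enumerate(zip(out1, out3)) if x != y]
    return {
        "deterministic": out1 == out2,            # same input, same output?
        "same_length": len(out1) == len(base),    # no padding or IV added?
        "changed_positions": diff,                # local change or avalanche?
        "self_inverse": encrypt(out1) == base,    # encrypting twice undoes it?
    }


def toy_encrypt(data: bytes) -> bytes:
    """Hypothetical stand-in for the unknown program: repeating-key XOR."""
    key = b"\xff\xfe\xf8"  # arbitrary illustrative key
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


report = probe(toy_encrypt)
print(report)
```

A single changed position with a deterministic, length-preserving output points toward a per-character scheme; changes spread over a whole 8- or 16-byte group would point toward a block cipher instead.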
-
thanks for this info. Yes, it is simple: the encrypted text is the same every time, just like a substitution, but I could not work it out. I tried encrypting specific characters to find the key or equation etc. but could not; I do not have much background in the security field – Null Jan 15 '19 at 04:50
Not in any easy, guaranteed, way, no!
If you have access to a range of encryption algorithms then you could attempt to build 'reverse lookup tables' (assuming you have the key) and then compare the result of your unknown encryption algorithm to them.
For example, given an encryption algorithm enc(text, key) create a table of: enc('a', key) enc('b', key) enc('c', key) ...
and then test your algorithm against 'enc' by comparing the outputs.
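A minimal sketch of that table comparison, assuming a per-character cipher and a handful of keyless candidate algorithms. The candidate list and `unknown_encrypt` are illustrative assumptions; in practice `unknown_encrypt` would wrap the real program rather than the stand-in used here.

```python
import codecs
import string
from typing import Callable, Dict

ALPHABET = string.ascii_lowercase


def build_table(enc: Callable[[str], str]) -> Dict[str, str]:
    """Map every letter of the alphabet to its encrypted form."""
    return {ch: enc(ch) for ch in ALPHABET}


# Candidate algorithms to compare against (illustrative choices only)
candidates: Dict[str, Callable[[str], str]] = {
    "rot13": lambda ch: codecs.encode(ch, "rot_13"),
    "atbash": lambda ch: chr(ord("a") + ord("z") - ord(ch)),
    "xor_0x20": lambda ch: chr(ord(ch) ^ 0x20),
}


def unknown_encrypt(ch: str) -> str:
    """Hypothetical stand-in for the black box (secretly Atbash here)."""
    return chr(219 - ord(ch))


observed = build_table(unknown_encrypt)
matches = [name for name, enc in candidates.items()
           if build_table(enc) == observed]
print(matches)
```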
You could even try this with guessing the key if you don't know it but that's a lot of guessing and you'd have to be very lucky!
However, as @kelalaka points out, this will also only work easily with an encryption algorithm that produces the same output for the same input without some kind of 'reset'.
Other than this I can't think of a way of doing this.
Specific to this encryption, but not a general answer:
Based upon your edited additional information, it's likely a simple XOR as I suggested earlier in a comment. XOR has two interesting characteristics: It reverses itself, and XORing same values produces zero.
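Both characteristics can be checked in a few lines. This is a sketch with an arbitrary two-byte key chosen purely for illustration; `xor_bytes` is a hypothetical helper, not anything taken from the program in question.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = b"\x5a\xc3"          # arbitrary illustrative key
plaintext = b"abcd"

ciphertext = xor_bytes(plaintext, key)
twice = xor_bytes(ciphertext, key)             # (P xor K) xor K = P
recovered = xor_bytes(plaintext, ciphertext)   # P xor C = K, repeated

print(twice == plaintext)
print(recovered.hex())
```

The last line is why a single known plaintext/ciphertext pair is enough to expose the key of a plain XOR scheme.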
I suspect the reason it looks like Unicode is that the KEY has a high bit set. Standard English ASCII is actually 7 bits, with the 8th bit used for extended character sets. The XOR with the key produces a high-bit character in Extended-ASCII.
Since you can enter your own text, enter a long string of either "space" or "@" if for whatever reason spaces don't work. If it's nothing but an XOR, it should output a repeating string where the repeat count is the key length.
The reason to use "space" or "@" is that both of these have a single bit set, hex 20 or hex 40 respectively. It makes it easier to compute the key manually if you only have to deal with a single bit change.
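The repeat-count test could be automated along these lines. It is a sketch under stated assumptions: `smallest_period` is a hypothetical helper, and the five-byte key below is made up so there is something to detect; with the real program you would paste in its output for a long run of "@" instead.

```python
def smallest_period(data: bytes) -> int:
    """Return the shortest p such that data[i] == data[i % p] for all i."""
    n = len(data)
    for period in range(1, n + 1):
        if all(data[i] == data[i % period] for i in range(n)):
            return period
    return n


# Hypothetical stand-in for the unknown program: repeating-key XOR
secret_key = b"\xff\xfe\xf8\x81\x93"  # made-up key, for illustration only
probe_input = b"@" * 40               # 0x40 has a single bit set
output = bytes(b ^ secret_key[i % len(secret_key)]
               for i, b in enumerate(probe_input))

key_length = smallest_period(output)
# XOR one period of output with 0x40 to read the key bytes back
recovered_key = bytes(o ^ 0x40 for o in output[:key_length])
print(key_length, recovered_key.hex())
```

If the key itself happens to repeat internally, this reports the shorter period, which is equivalent for decryption purposes.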
In your example of input "abcd", output "žœ›", assuming Extended-ASCII, the hex values are:
61 62 63 64 outputs 9E 9C 9B
The output you show covers only 3 characters, "abc", because "œ" is a single character in Extended-ASCII called "Latin small ligature oe". XORing those 3 characters of input with the output to find the first 3 bytes of the key yields hex: FF FE F8
Given that key, entering a string of "@" (hex 40) should yield: "¿¾¸" in extended ascii.
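The arithmetic above can be reproduced as follows. One assumption is made explicit here: "Extended-ASCII" is taken to mean the Windows-1252 code page (`cp1252` in Python), which is where ž, œ and › sit at 9E, 9C and 9B.

```python
PLAINTEXT = "abcd"
CIPHERTEXT = "\u017e\u0153\u203a"  # the three characters "žœ›" from the question

# Assumption: the program's bytes are being displayed as Windows-1252
p = PLAINTEXT.encode("cp1252")
c = CIPHERTEXT.encode("cp1252")

# zip stops at the shorter input, so only "abc" is paired with the output
key_prefix = bytes(x ^ y for x, y in zip(p, c))
print(key_prefix.hex(" "))

# Predict what three "@" characters (hex 40) would encrypt to
predicted_bytes = bytes(0x40 ^ k for k in key_prefix)
predicted = predicted_bytes.decode("cp1252")
print(predicted_bytes.hex(" "))
```

If the real program's output for "@@@" matches `predicted`, that supports the plain XOR hypothesis for at least the first three key bytes.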