A firm has 10 million files, all ransomware encrypted, but the firm has all of those 10 million files backed up, and almost all of them have not changed. Would comparing all of those files against their unencrypted backups in addition to the other cracking algorithms help discover the key?
-
Yes, but it will take forever to brute-force it because the size of the key is usually (hopefully) very big – JOW Mar 24 '16 at 13:17
-
Depending on the ransomware, some only encrypt enough of the file to render it unusable - it might be possible to combine parts from the encrypted file and parts from the backup version to result in a more up-to-date version. Might also suggest that the backup strategy isn't quite right though... – Matthew Mar 24 '16 at 13:21
-
If almost all of them are unchanged, is there even any value in trying to discover the key? Nuke all systems from orbit and restore the backed up files and just re-write the changes, it will save a lot of time and headache. – sethmlarson Mar 24 '16 at 13:24
-
@JOW -- agreed except for the "because". With a known plaintext attack, the key size isn't *necessarily* related to the cracking time. Consider a 1Mb key, but used with a lame encryption *algorithm*: "break the text into 1Mb chunks, and XOR each one against the key." The key length really doesn't help (and not knowing the key length only hurts a little). But in practice, the ransomware is probably using a good algorithm as well as a decent-sized key, so the bottom line is as you said. – TextGeek Mar 24 '16 at 15:03
-
In the history of Ransomware (not long, but unfortunately rather rich) the only exploits to defeat it have been based on direct key exposure, see TeslaCrypt and Cryptear / variants. No ransomware has ever been seen in the wild with a cryptographic system (key) so weak that a KPA was within reach of even a robust supercomputer. Is it time to start merging KPA/ransomware threads as dupes? – Jeff Meden Mar 24 '16 at 15:19
-
@Oasiscircle: maybe he only has a backup of some of the files, but not all. – Quora Feans Mar 24 '16 at 16:12
-
@Oasiscircle I suppose "almost all of them have not changed." means that 9,999,999 files have not changed since the last backup but the one file that did change since the last backup contains crucial business information. – emory Mar 24 '16 at 16:33
-
@TextGeek yeah, I just wanted to keep it short, thanks for pointing it out. – JOW Mar 24 '16 at 17:41
4 Answers
What you are suggesting is a Known Plaintext Attack, and yes, if the encryption algorithm is bad enough, it could be used to discover the key or keys used to encrypt the data, depending on the cipher used. I say keys because some ransomware uses an individual key per file, so cracking one key would only give you the key to that file.
In practice this is unlikely to be useful. Unless the ransomware's encryption scheme has some sort of flaw (weak cipher, poor pseudo-random data source, small key, etc.) or you have access to massive computing resources for decryption, your great-grandchildren might just live to see one of the files cracked.
-
known plaintext attacks only work against *really* bad ciphers like FEAL. – SEJPM Mar 24 '16 at 13:33
-
Specifically, known plaintext attacks cannot work against asymmetric cyphers. By definition. – John Dvorak Mar 24 '16 at 20:56
-
@JanDvorak: That's kind of a "no true Scotsman" argument right there. Neither symmetric nor asymmetric ciphers *should* be vulnerable to known plaintext attacks. Whether a particular cipher is vulnerable to the attack is a different question. – Dietrich Epp Mar 25 '16 at 19:59
Would comparing all of those files against their unencrypted backups in addition to the other cracking algorithms help discover the key?
Addendum: usual malware operation
Once activated, the malware will attempt to contact its command-and-control network (the CCC) and either use a compiled-in public key (for which the CCC holds the private key) or generate, as securely as possible, a public/private key pair, send the private key to the CCC, and delete its local copy.
Now the malware has a public key. It then either generates a single crypto key, or one crypto key for every file it attacks, and encrypts the crypto key with the public key. This way it can use a symmetric algorithm, very fast, to encrypt the file, while keeping the decryption key protected by a safer, albeit much slower, asymmetric algorithm.
The malware now creates an encrypted copy of the target file, then does its best to destroy all copies of the original (e.g. shadow copies). Finally, it renames the encrypted copy to the name of the original plus some extension.
Without ransoming back the private key and excluding errors on the programmers' part (e.g. they did not securely delete the private key locally, and it can be recovered), there is no chance of getting the symmetric key.
It is still possible to try and force the symmetric part of the encryption, knowing what the decrypted text looks like and the structure of the encrypted file (something like [32 BYTES SIGNATURE][4K OF ASYM-ENCRYPTED KEY][SYM-ENCRYPTED DATA]). This is where the KPA might come in. But this requires the symmetric encryption to be KPA vulnerable.
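To make that layout concrete, here is a minimal sketch that splits an encrypted file into the three regions. The field sizes are assumptions taken from the illustrative layout above ("4K" is read as 4096 bytes), and `split_container` and `EncryptedContainer` are hypothetical names; real ransomware families each use their own format.

```python
from typing import NamedTuple

# Assumed sizes, copied from the illustrative layout in the text.
SIGNATURE_LEN = 32        # [32 BYTES SIGNATURE]
WRAPPED_KEY_LEN = 4096    # [4K OF ASYM-ENCRYPTED KEY], read here as 4096 bytes


class EncryptedContainer(NamedTuple):
    signature: bytes     # marker identifying the malware's file format
    wrapped_key: bytes   # symmetric key, encrypted with the public key
    ciphertext: bytes    # file data under the symmetric cipher


def split_container(blob: bytes) -> EncryptedContainer:
    """Split an encrypted file into the three assumed regions."""
    header_len = SIGNATURE_LEN + WRAPPED_KEY_LEN
    if len(blob) < header_len:
        raise ValueError("file is shorter than the assumed header")
    return EncryptedContainer(
        signature=blob[:SIGNATURE_LEN],
        wrapped_key=blob[SIGNATURE_LEN:header_len],
        ciphertext=blob[header_len:],
    )
```

Only the `ciphertext` region lines up with the backup copy; the wrapped key is protected by the asymmetric algorithm and has no known plaintext to compare against.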
Theory
Yes... and no. What you propose would be a "known plaintext attack" (KPA).
But even if all the files were encrypted with the same key (which is not at all a given), the time required for a successful attack against a strong, properly implemented algorithm - as most recent malware employs - is astronomical. You would in practice be running a brute-force decryption, using the known plaintext only to confirm that a candidate key is correct. Whether that key is the same for all files is something you'll only learn at the very end: until you break the encryption of the first file, you'll never know whether you gained access to 100% of your files, or only to 0.00001% (one file in ten million).
So you could get the key from the comparison, but you would not be deriving the key directly from it (this can only be done if the algorithm or its implementation has flaws). As @Kevin observed, such algorithms are said to be resistant to known plaintext attack.
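As a rough back-of-the-envelope sketch of "astronomical": the rate of 10^12 key trials per second below is a made-up, generous assumption, and `years_to_exhaust` is just a helper name for this illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key of the given size at a fixed rate."""
    return (2 ** key_bits) / guesses_per_second / SECONDS_PER_YEAR


# Hypothetical attacker speed: 10^12 key trials per second.
RATE = 1e12
for bits in (56, 128, 256):
    print(f"{bits}-bit key: {years_to_exhaust(bits, RATE):.3g} years")
```

At that assumed rate a 128-bit keyspace takes on the order of 10^19 years to exhaust, and having ten million known plaintexts does not shrink the keyspace at all.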
Practice
If you have a backup, restore everything from it. Files that are still encrypted and for which no sufficiently recent copy is available can be decrypted using the ransomware's own tools (i.e., surrendering and paying up), or you can try one of several tools that exploit known flaws in some ransomware implementations: flaws that expose the key, leave the original data in some recoverable form, or allow shortcuts in key brute-forcing.
Keep in mind that some ransomware authors are also behind some of these so-called "tools". At the very least, they should be purchased using a capped, traceable credit card with limited balance (e.g. a prepaid).
Further considerations
I imagine you've already concluded that a single user account capable of accessing all those ten million files, in the hands of someone not knowledgeable enough to realize something suspicious is happening - you don't encrypt 10 M files with a snap of the fingers - is a Very Bad Thing.
To date, this kind of malware has few attack vectors, and they're almost all based on exploiting unnecessary user privileges. Removing or restricting these privileges will effectively defang most malware of this kind, and it might be a good idea to review local and group security policies in large organizations, and also to place some limits and checks on BYOD policies. I've heard it rumored that some malware variants will go stealth and wait before encrypting, depending on how many files and network shares they are able to "see". I could not verify this, but I think the idea is not a difficult one to come up with. Even if it's not true now, and even though it would require a different approach to the attack (going resident instead of running straight away), it might well become reality in the future.
Traffic analysis - just policing bandwidth to the various workstations - should have raised the alarm that something was afoot and even pinpointed the culprit, even if this would have come too late for a lot of the files.
Further, if most files have not changed, then more frequent incremental backups are in order. If only 1% of the files have changed, it means that with the same resources you can run incremental backups with a frequency two orders of magnitude greater (actually it's not so straightforward, but still).
-
More to the point, if 1% (to go with your order-of-magnitude figure) of 10M files has changed since the most recent backup, that means that 100k files have changed since they were last backed up. That is a *lot* of files. Even if only 1% of *those* are ones that you'd actually keep and refer to later, that's a thousand important files that would have changes lost if something like this happened in real life. – user Mar 24 '16 at 16:37
-
Most of these systems use (some degree of) public key cryptography, which as a design requirement must be resistant to [*chosen*-plaintext attacks](https://en.wikipedia.org/wiki/Chosen-plaintext_attack). That resistance is a stronger condition than resistance to known-plaintext attacks; any system resistant to CPA is by definition resistant to KPA. – Kevin Mar 26 '16 at 01:45
Supposing that the ransomware is doing a "good" job, all files will be encrypted with something like AES-GCM, which is (at the time of writing) not known to be vulnerable to a known plaintext attack.
In that case the backed-up files will not help much, except to verify the correctness of a brute-forced key.
Also, if the ransomware is doing a "good" job, the key will differ on a per-machine basis, so brute-forcing the key will not help much to retrieve other data encrypted on another machine by the same ransomware.
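To illustrate the "verify a brute-forced key" role of the backups, here is a toy sketch. It deliberately does not use AES-GCM: the cipher is a made-up SHA-256 keystream with a 16-bit key, chosen only so that the search finishes instantly, and all function names are hypothetical.

```python
import hashlib
from typing import Optional


def toy_keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks. Not a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the toy keystream (the same call also decrypts)."""
    return bytes(a ^ b for a, b in zip(data, toy_keystream(key, len(data))))


def search_key(known_plain: bytes, cipher: bytes,
               key_bits: int = 16) -> Optional[bytes]:
    """Try every key; the backup copy only tells us when a guess is right."""
    key_len = (key_bits + 7) // 8
    for guess in range(2 ** key_bits):
        key = guess.to_bytes(key_len, "big")
        # Decrypt the first 16 bytes and compare them with the backup.
        if toy_encrypt(key, cipher[:16]) == known_plain[:16]:
            return key
    return None
```

The known plaintext is used purely as a yes/no check on each candidate. The loop still has to walk the keyspace, which is feasible for 16 bits and hopeless for the 128 or 256 bits a real cipher would use.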
In reality, no, it does not. Encryption has long since passed the point where it is trivial to reverse by methods like known-plaintext attacks. It's not just a matter of XOR-ing the biggest file you can find against the encrypted version to get the keystream; that's not how things work in encryption these days. Ransomware uses standard high-strength encryption algorithms such as RSA that aren't subject to such simple solutions.
It is actually quite interesting to see how they have changed over the last few years.
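For contrast, this is roughly what the naive XOR recovery mentioned above would look like, and it only works against a deliberately weak scheme. The sketch assumes a hypothetical "cipher" that XORs every file with the same repeating key; nothing here applies to ransomware using a modern algorithm.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """Hypothetical weak cipher: XOR the data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_keystream(backup_plain: bytes, encrypted: bytes) -> bytes:
    """Known plaintext XOR ciphertext yields the keystream directly."""
    return xor_bytes(backup_plain, encrypted)
```

With such a scheme, the keystream recovered from one backed-up file decrypts any other file of equal or shorter length via `xor_bytes(other_ciphertext, keystream)`. A properly used modern cipher gives no such shortcut, which is the point being made here.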
What you can do is use a sample of the original files to identify the files that need to be recovered from backups, and consequently how much you can't recover that is lost for good. Each encryption virus variant has its own signature. Some of them change the filenames in specific ways, some have headers that you can read to identify which files are ruined, etc.
And of course the primary use of your backups is to allow you to recover. Don't rely on shadow copies; most ransomware these days deletes them as part of the process. It really helps to have multiple backup points through the day that you can use as references.
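A minimal sketch of using the backup to work out which files need restoring, assuming the backup tree mirrors the live tree path for path. `files_to_restore` is a hypothetical helper; it flags every backed-up file whose live copy is missing (for example because it was renamed with a new extension) or no longer matches.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_restore(live_root: Path, backup_root: Path) -> List[Path]:
    """Return backup-relative paths whose live copy is missing or differs."""
    damaged = []
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        rel = backup_file.relative_to(backup_root)
        live_file = live_root / rel
        if (not live_file.is_file()
                or sha256_of(live_file) != sha256_of(backup_file)):
            damaged.append(rel)
    return damaged
```

Note that a file legitimately edited since the last backup is flagged too, so the result is a list of candidates to check against the variant's signature (renamed extension, file header), not a verdict.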