
Filling the disk with random data prior to encrypting it will supposedly make it harder for the attacker to perform any cryptanalysis. Most sources seem to state this is because it will be harder for the attacker to determine what data is actually encrypted (and which is just random garbage).

However, is this strictly necessary? For large disks, filling the entire disk with random data can take a prohibitively long time. If the data could be decrypted by some attack anyway, how much is this extra hurdle really worth? What is the real concern, and in what attack scenario is this prevention technique actually useful?

Has any encrypted data ever been decrypted because the owner failed to fill the disk with random data prior to encrypting it? Or is this practice just an overly paranoid extra measure that in reality provides no real additional security?

If specifics are needed for an answer, assume the system is GNU/Linux with LUKS (AES-256), with the encrypted data on an ordinary HDD. Two partitions: an unencrypted /boot partition used only for booting, and an encrypted root partition.
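For reference, the pre-fill step in question is typically something like `dd if=/dev/urandom of=/dev/sdX` (or writing zeros through a temporary plain-mode mapping) before running `cryptsetup luksFormat`. A minimal Python sketch of the same idea, run here against an ordinary scratch file standing in for the block device (the file name and size are made up for illustration; on a real system the target would be the device node and would require root):

```python
import os

def fill_with_random(path, size, chunk=1024 * 1024):
    """Overwrite `size` bytes of `path` with CSPRNG output, chunk by chunk."""
    with open(path, "wb") as dev:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            dev.write(os.urandom(n))  # os.urandom draws from the kernel CSPRNG
            remaining -= n

# Demonstrate on a small scratch file rather than a real device.
fill_with_random("scratch.img", 4 * 1024 * 1024)
print(os.path.getsize("scratch.img"))  # 4194304
```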

Attack scenario: The attacker obtains the computer with the power turned off. Assuming no cold boot attack or evil maid attack is possible.

ioctlvoid

2 Answers


There are a few reasons why this is necessary. First, as you stated, it makes cryptanalysis of the data difficult because the attacker cannot identify the boundary between ciphertext and background noise. This could be defeated by capturing two snapshots of the volume and identifying the locations that change, so it's hardly a concrete security measure in this sense. This issue actually brings us to a more important one: file systems don't spread their data evenly around the disk, and often leave remnant data hanging around.
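The snapshot attack mentioned above is easy to demonstrate. A toy sketch (the sector size and which sectors get rewritten are arbitrary choices for illustration) that diffs two images of a random-filled disk and recovers exactly which sectors changed:

```python
import os

SECTOR = 512

def changed_sectors(snap_a, snap_b):
    """Return indices of sectors that differ between two disk snapshots."""
    assert len(snap_a) == len(snap_b)
    return [i // SECTOR for i in range(0, len(snap_a), SECTOR)
            if snap_a[i:i + SECTOR] != snap_b[i:i + SECTOR]]

disk = bytearray(os.urandom(64 * SECTOR))        # random-filled encrypted disk
before = bytes(disk)
disk[10 * SECTOR:11 * SECTOR] = os.urandom(SECTOR)  # the FS rewrites a block;
disk[37 * SECTOR:38 * SECTOR] = os.urandom(SECTOR)  # the ciphertext changes there
after = bytes(disk)

print(changed_sectors(before, after))  # [10, 37]
```

Even though every sector looks random in isolation, the attacker with two timepoints learns precisely where live data sits.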

This is where the background randomness is important. If an attacker can identify old blocks of ciphertext, e.g. from a file whose data was recently updated, he now has access to two versions of ciphertext using the same key (and likely the same IV, too) for two different plaintexts. This can lead to certain attack scenarios becoming more feasible, e.g. differential cryptanalysis. The background randomness makes identifying latent blocks of ciphertext exceedingly difficult.
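To illustrate why two ciphertexts under the same key and IV are dangerous, here is a deliberately toy sketch using a hash-based keystream (not AES-XTS or anything LUKS actually uses): XORing the two ciphertexts cancels the keystream and reveals the XOR of the two plaintexts, i.e. exactly where and how the data changed.

```python
import hashlib

def keystream(key, iv, length):
    """Toy counter-mode keystream for illustration only -- not a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

key, iv = b"K" * 16, b"IV" * 4
old = b"balance: 00001000"   # old version of a file's block
new = b"balance: 99999999"   # updated version, same location
c_old = xor(old, keystream(key, iv, len(old)))
c_new = xor(new, keystream(key, iv, len(new)))

# The keystream cancels out: the attacker learns old XOR new without the key.
print(xor(c_old, c_new) == xor(old, new))  # True
```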

Another case where background random data is mandatory is when deniability is required, such as TrueCrypt's hidden volumes feature. If an attacker can see that the volume spans over 10GB, but the volume only shows as 4GB when mounted, he can tell that there is a 6GB hidden volume too. By making the entire disk's data completely random, the ciphertext becomes indistinguishable from that background data, making identification of the hidden volume difficult if not impossible.
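A crude sketch of how an attacker could measure a volume's extent on a zeroed disk, and why a random-filled disk defeats the measurement (the block size and distinct-byte threshold are arbitrary choices for illustration):

```python
import os

BLOCK = 4096

def looks_random(block, threshold=200):
    """Crude test: a ciphertext-like block uses many distinct byte values."""
    return len(set(block)) > threshold

def visible_extent(disk):
    """Number of blocks an attacker would flag as 'encrypted data'."""
    return sum(looks_random(disk[i:i + BLOCK])
               for i in range(0, len(disk), BLOCK))

volume = os.urandom(4 * BLOCK)               # 16 KiB of ciphertext

zero_disk = volume + bytes(12 * BLOCK)       # volume on a zero-filled disk
rand_disk = volume + os.urandom(12 * BLOCK)  # volume on a pre-filled disk

print(visible_extent(zero_disk))  # 4  -> the volume's true size leaks
print(visible_extent(rand_disk))  # 16 -> indistinguishable from background
```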

Polynomial
  • Interesting, thank you for the input. In the case of AES, is differential cryptanalysis in this context (with typical file systems) a realistic security concern though, or is it simply something that could in the future become an issue? – ioctlvoid Jan 07 '13 at 10:08
  • It depends, really. The recent attacks such as CRIME on AES-CBC might make it practical, but it really depends on the mode of operation. – Polynomial Jan 07 '13 at 10:12
  • CRIME is about compression and not related to CBC; that's hardly relevant here. You must think about BEAST, which is an exploit of a weakness of CBC-based encryption when the IV is predictable by an attacker who can do a chosen-plaintext attack. In all generality, disk encryption is _hard_ when you want to protect against active attackers _and_ you want to keep reasonable performance. – Thomas Pornin Jan 07 '13 at 12:17
  • @ThomasPornin Whoops, yeah, I did mean BEAST. Thanks for the correction. – Polynomial Jan 07 '13 at 12:35
  • I added it in my answer, but it's worth mentioning that this only applies for fully encrypted volumes (otherwise allocation information makes it rather useless). It's also worth pointing out that file size leakage as well as gaps in allocation could reveal useful information about the content as well. – AJ Henderson Jan 07 '13 at 14:36
  • @Polynomial: The question was about full disk, block-level encryption, so I'm confused about your reference to the file system. The attacker shouldn't be able to see any files changing, just blocks. Also, you're assuming a persistent threat that keeps snooping on the encrypted disk over two or more timepoints. The original question was about a laptop being stolen, so a single timepoint. Assuming block-level encryption and a single timepoint, i don't think any of your concerns are valid. If I am mistaken, it would be great to get a more detailed explanation. Thanks! – taltman Aug 17 '16 at 05:24
  • @taltman I don't know where you saw me mention the file system. You're right that a single-point attacker wouldn't see snapshots, but the part about deniability was an aside point to further explain the need for a fully random disk. – Polynomial Aug 17 '16 at 10:52
  • @Polynomial: To quote: "This issue actually brings us onto a more important one - file systems don't evenly spread their data around on the disk, and often leaves remnant data hanging around." Since all of the file system manipulations are happening on top of the block-level full disk encryption, I don't understand why this point is relevant. An explanation would be great, thanks! – taltman Aug 17 '16 at 15:22
  • @taltman Yes, the point is that blocks early on would be modified by the filesystem, and the blocks past the "hidden" section could only be modified if the hidden section was mounted. – Polynomial Aug 17 '16 at 15:27
  • @Polynomial: If by "hidden" you mean providing deniability of a hidden section/partition of the full disk encryption, then this doesn't fit with your answer, as you only address deniability in the third paragraph, whereas you discuss the filesystem in the first. If your entire answer is focused on addressing deniability, then the answer isn't a good fit for the original question, which didn't require deniability. If I am misunderstanding something, a detailed explanation would be great, thanks! – taltman Aug 18 '16 at 03:46
  • @taltman Everything from paragraph two is essentially about deniability. – Polynomial Aug 18 '16 at 11:19

A couple of things worth mentioning here. First, this only makes a difference for full-disk encryption. If the file allocation table (or other index) is not encrypted, it is trivial for an attacker to detect where the boundaries of the files are.

The second issue is static analysis of a highly sensitive drive. Some information would leak: a) how much data is on the drive, and b) the allocation pattern, which might give away something about the level of use and/or file sizes.

Third is the situation that Polynomial mentioned where remnants of files would be left on the drive and could be useful for differential analysis or simply leak information about what has changed recently.

Practically speaking, is this information useful? For the first two, probably not in most cases, but the differential analysis is a more practical threat that could result in data recovery. Really, though, anything that reduces entropy weakens the encryption, so having a random background is more secure overall.

AJ Henderson