7

I have an encrypted file (AES symmetric encryption). For backup purposes and to save disk space, can I compress the file (losslessly) without worrying about messing up the decryption? If so, can you recommend some good compression programs for this purpose?

BlueGene

10 Answers

18

You can compress it, but it is unlikely to save much disk space. By its nature, encryption rarely leaves a file compressible by much.

Try it for yourself to see whether there are any file size savings.

One data point:

-rw-r----- 1 gene    gene    2428671 2009-06-02 12:39 test.log
-rw-r----- 1 gene    gene     134524 2009-06-02 12:39 test.log.bz2
-rw-r----- 1 gene    gene     217162 2009-06-02 12:38 test.log.gz
-rw-r--r-- 1 gene    gene     263229 2009-06-02 12:47 test-AES.gpg
-rw-r--r-- 1 gene    gene     264833 2009-06-02 12:42 test-AES.gpg.bz2
-rw-r--r-- 1 gene    gene     263302 2009-06-02 12:41 test-AES.gpg.gz
-rw-r--r-- 1 gene    gene     134609 2009-06-02 12:43 test-bz2-AES.gpg
-rw-r--r-- 1 gene    gene     217246 2009-06-02 12:43 test-gz-AES.gpg

test.log is the original, and test.log.bz2 and test.log.gz are simply compressed with bzip2 and gzip, respectively.

If I encrypt it (gpg --symmetric --cipher-algo AES --output test-AES.gpg test.log), the encrypted file (test-AES.gpg) is slightly larger than the compressed versions. Compressing the encrypted file actually adds a little size (test-AES.gpg.bz2 and test-AES.gpg.gz).

Compressing first then encrypting does show some savings (test-bz2-AES.gpg and test-gz-AES.gpg), especially with bzip2.

Of course, your experience may differ given different encryption software and/or different compression software.

You should consider whether the file size savings you get from gpg alone (which, by default, compresses before it encrypts) are enough, or whether compressing and then encrypting is worth the extra step in the process.
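If you want to repeat this kind of measurement without gpg, here is a minimal Python sketch. It is not the commands used above: the log-like sample is made up, and random bytes from os.urandom are only a stand-in for ciphertext (an assumption, not real AES output).

```python
import bz2
import os
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Return the input size and its size after zlib (gzip-style) and bz2 compression."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
    }


# Hypothetical log-like sample: repetitive text, so it should compress well.
log_like = b"".join(
    b"2009-06-02 12:39:%02d INFO request %d served\n" % (i % 60, i)
    for i in range(20000)
)

# Random bytes used as a stand-in for ciphertext (an assumption, not real AES output).
ciphertext_like = os.urandom(len(log_like))

print("log-like:       ", compressed_sizes(log_like))
print("ciphertext-like:", compressed_sizes(ciphertext_like))
```

The log-like input should shrink a lot, while the random input should stay at about its original size or grow by a few bytes of container overhead.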

Gene Gotimer
  • Really? never thought of that before – Chopper3 Jun 02 '09 at 16:09
  • Of course, if you encrypted the file yourself and know how it can be decrypted, decrypting it and storing it compressed together with a program to encrypt it again will save much more space. – schnaader Jun 02 '09 at 16:22
  • "Of course, your experience may differ given different encryption software and/or different compression software." Actually, no. Any encryption worth the name will produce ciphertext that is practically incompressible (see other answers for the reasons). – sleske May 10 '10 at 00:38
11

Not if the encryption is any good. Compression deals with recognizing patterns in data and creating a "shorthand" that refers to those patterns for later extraction.

If your encryption is good, the file looks like random noise, and that's not going to compress very much due to the absence of patterns. Of course you can put it into an archive file (.zip, .gz, etc.), but you aren't likely to make it much smaller.
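One rough way to see the "absence of patterns" is to measure the byte-level Shannon entropy of the data. The sketch below is only an illustration: the sample text is made up, random bytes stand in for good ciphertext, and byte frequencies are a crude measure that ignores longer repeated patterns.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


english = b"the quick brown fox jumps over the lazy dog " * 2000
noise = os.urandom(len(english))  # stand-in for good ciphertext (assumption)

print(f"text:  {byte_entropy(english):.2f} bits/byte")
print(f"noise: {byte_entropy(noise):.2f} bits/byte")
```

Data close to 8 bits per byte leaves a general-purpose compressor almost nothing to work with, which is what you would expect from well-encrypted output.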

Brad Beyenhof
8

Lossless compression programs don't modify the actual data in any way - if they did, they would be useless. (Lossy sound and image compression is an exception, as human eyes and ears don't notice such small changes, while a computer can choke on a single flipped bit.) So yes, you can compress encrypted files.

But since encrypted data is very similar to random data, it doesn't compress very well - so if you can, compress before encrypting. Otherwise the "compression" will be fairly useless.

For a compression program, the Unix world prefers tar along with gzip/bzip2 (usually invoked from within tar, as in tar czf foo.tar.gz foo), while Windows users prefer ZIP, RAR or 7z.
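If you would rather script this in a platform-neutral way, Python's standard tarfile module can produce the same kind of archive as tar czf. This is just a sketch; the helper names make_tar_gz and list_tar_gz are made up for illustration.

```python
import tarfile
from pathlib import Path


def make_tar_gz(source: Path, archive_path: Path) -> Path:
    """Pack `source` (a file or directory) into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps only the final path component inside the archive.
        archive.add(source, arcname=source.name)
    return archive_path


def list_tar_gz(archive_path: Path) -> list:
    """Return the member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

For example, make_tar_gz(Path("foo"), Path("foo.tar.gz")) is roughly what tar czf foo.tar.gz foo does.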

user1686
  • Compression before encryption can significantly weaken the encryption. It's best to use a tool designed to do both in order to avoid the common pitfalls. – JamesRyan Aug 17 '09 at 16:05
  • Or it can strengthen it - known plaintext attacks are much harder. – user1686 Aug 18 '09 at 19:17
  • @EK: I've never heard of this. Any serious encryption algorithm should be secure for *any* source data, no matter its nature. Do you have any references for your claim? – sleske May 10 '10 at 00:33
4

Using any compression program (7z, zip, gzip, bzip2) is lossless and does not affect your ability to decrypt the data.

However, due to the nature of the encrypted data, you will probably not gain much from it.

The proper thing to do is compress it before the encryption step. Existing utilities such as gpg do this; compressing before encrypting is the default behaviour:

michael:~> dd if=/dev/zero of=testfile bs=1048576 count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00300552 s, 349 MB/s
michael:~> gpg --symmetric --cipher-algo aes --batch --passphrase cheesestring testfile
michael:~> ls -al testfile testfile.gpg
-rw-r--r-- 1 michael users 1048576 2009-06-02 12:42 testfile
-rw-r--r-- 1 michael users    1123 2009-06-02 12:43 testfile.gpg

MikeyB
3

An encrypted file will lose the statistical properties that make compression work, so compressing the encrypted file will save little if any space. You should compress the file first (while it still behaves in a way that compresses well) before encrypting the compressed file. Aside from that, the compression will not affect the original content of the file when you come to uncompress it.
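To see why the order matters, here is a small Python sketch comparing the two pipelines on 1 MiB of zeros. Big caveat: Python's standard library has no AES, so toy_stream_cipher below is a made-up stand-in (an XOR keystream derived from SHA-256). It is for illustration only and must not be used to protect real data; the key is a placeholder too.

```python
import hashlib
import zlib


def toy_stream_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.

    Illustration only: this is NOT AES and must not be used to protect real data.
    Applying it twice with the same key restores the input.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, keystream))


plaintext = bytes(1024 * 1024)  # 1 MiB of zeros, highly compressible
key = b"example passphrase"     # placeholder key for the sketch

encrypt_then_compress = zlib.compress(toy_stream_cipher(plaintext, key), 9)
compress_then_encrypt = toy_stream_cipher(zlib.compress(plaintext, 9), key)

print("encrypt, then compress:", len(encrypt_then_compress))
print("compress, then encrypt:", len(compress_then_encrypt))

# Undo the second pipeline: decrypt, then decompress.
recovered = zlib.decompress(toy_stream_cipher(compress_then_encrypt, key))
assert recovered == plaintext
```

Compressing first should leave only a tiny output, while compressing the already scrambled data should give back roughly the full megabyte.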

2

A file that can be compressed after encryption was by definition not encrypted. Perhaps it was "scrambled" or "obfuscated". Encrypted data is indistinguishable from random data.

Encryption software that does not first compress a file before doing the encryption is committing an act of negligence.

You can run an encrypted file through a lossless compression algorithm without destroying the data. This is the guarantee of compression - that whatever data you give it as input will be recovered as output from decompression. By definition, a lossless compression algorithm will return any data to you if you compress and uncompress.
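If you want to convince yourself of the round-trip guarantee, a quick check along these lines works. It is a sketch only: random bytes stand in for the encrypted file, and the three codecs are simply the ones that ship with Python.

```python
import bz2
import hashlib
import lzma
import os
import zlib

# Random bytes stand in for an encrypted file (an assumption for illustration).
encrypted = os.urandom(256 * 1024)
digest_before = hashlib.sha256(encrypted).hexdigest()

codecs = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

for name, (compress, decompress) in codecs.items():
    packed = compress(encrypted)
    restored = decompress(packed)
    # The restored bytes must match the original exactly, so decryption is unaffected.
    assert hashlib.sha256(restored).hexdigest() == digest_before
    print(f"{name}: {len(encrypted)} -> {len(packed)} bytes, round trip OK")
```

Comparing a checksum taken before compression with one taken after decompression is also a sensible habit for real backups.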

carlito
2

Usually in these situations you compress first, then encrypt, as you get better compression ratios that way.

Evan
0

Yes, it should not cause any issues. As far as the compression program is concerned, it's just data. However, if the archive gets damaged it would be hard to recover the data, so you might want to create PAR2 recovery files after you create the archive.

Kyle Brandt
0

I think on balance the amount of space you would save would not be worth the potential problems it would cause.

Of course this will depend on what operating system you are using, whether your files are local or on a network, what sort of backup you're doing, what you're using for encryption and what kind of files you're working with.

The main problem would be access speed: you will have to first uncompress and then decrypt, and whether the files are large or small, that's going to add processes. You will also be compounding the risk of failure by adding processes.

Finally, remember that your decryption software will want to decrypt an uncompressed file, so you could end up with a compressed and an uncompressed version existing at the same time, which would take twice the disk space at that moment.

-3

For the people that say you should compress before encryption, the reason why that is less secure is because of "known plaintext attacks". If someone knows that you compressed your data with gzip before encrypting, that means that they know the first handful of bytes of your plaintext already, since it will be the gzip header. From here they have a bit more of a foothold for cracking your encrypted data.

As always, there is no such thing as perfect security, and compressing first may be perfectly fine for most uses, but just FYI, compressing before encrypting does make it less secure.
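The "known first bytes" part is easy to check. The sketch below only shows that gzip output always starts with the same three bytes regardless of content (the samples are made up); it says nothing about whether that actually helps an attacker against a modern cipher.

```python
import gzip

GZIP_PREFIX = b"\x1f\x8b\x08"  # magic bytes 0x1f 0x8b, then 0x08 for the deflate method

samples = [b"first secret document", b"a completely different file" * 50, b""]

for sample in samples:
    header = gzip.compress(sample)[:3]
    print(header.hex(), header == GZIP_PREFIX)
```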

For folks who like this sort of stuff, I'm working on a few articles that talk about the basics of cryptography (aimed at programmers and other technical folk):

http://blog.demofox.org/category/cryptography/

  • Good crypto is resilient against known plaintext attack. If you're using AES with any sane block chaining method you're safe. – Hubert Kario Sep 16 '12 at 13:58