15

I do this to encrypt a single file:

openssl aes-256-cbc -a -salt -in file.txt -out file.enc

and then type in some regular plaintext password.

I do not understand how -salt enhances the security of this. The reason is that the salt is stored right there in the beginning of the file like this:

Salted__<eight salt bytes>

The salt being available to the cracker in such an obvious manner, what is the purpose of it? I don't see how it would make a dictionary attack harder ... especially given the fact that, as far as I know, openssl only uses one iteration to generate the IV from the password/salt - correct me if I'm wrong.

Thomas Pornin
SecurityClown

2 Answers

25

The point of the salt is to prevent precomputation attacks, such as rainbow tables. Without a salt, anyone could just generate a huge dictionary of hashes and their associated plaintexts, and immediately crack any known hash. With the salt, such a dictionary is useless, since it's infeasible to generate such a dictionary for all possible salts.
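To make the precomputation argument concrete, here is a minimal sketch in Python. It is not what OpenSSL does internally: the `unsalted_key` and `salted_key` helpers, the three-word list and the plain SHA-256 are all assumptions chosen for illustration. It only shows that a table built once from a dictionary answers every unsalted lookup, and misses as soon as a random salt is mixed in.

```python
import hashlib
import os


def unsalted_key(password: bytes) -> bytes:
    # Hypothetical derivation without salt: same password, same output, always.
    return hashlib.sha256(password).digest()


def salted_key(password: bytes, salt: bytes) -> bytes:
    # Hypothetical derivation with salt: the output depends on both inputs.
    return hashlib.sha256(password + salt).digest()


# The attacker builds this table once and reuses it against every target.
wordlist = [b"letmein", b"password", b"hunter2"]
table = {unsalted_key(word): word for word in wordlist}

victim_password = b"hunter2"

# Unsalted: a single dictionary lookup recovers the password.
recovered_unsalted = table.get(unsalted_key(victim_password))

# Salted: the precomputed table has no entry for this (password, salt) pair.
salt = os.urandom(8)
recovered_salted = table.get(salted_key(victim_password, salt))

print(recovered_unsalted, recovered_salted)
```

The attacker can still run the dictionary against one particular file after reading its salt, but that work has to be redone for every file instead of being paid for once.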

I did a pretty in-depth writeup that covers this problem over at "How to store salt?", which is worth a read.

Polynomial
  • The salt also makes it impractical to find locations where the same value is encrypted twice. Without salt, once you retrieve one password from a hash, you can easily search for any other occurrences of that same password. – Henning Klevjer Oct 26 '12 at 13:21
  • This, however, doesn't explain why the salt is a parameter! The `salt` parameter allows the salt to be badly chosen, which weakens it. – Kaz Nov 16 '16 at 21:06
  • @Kaz See Thomas' answer below for a more detailed explanation. – Polynomial Nov 16 '16 at 23:22
11

OpenSSL uses the salt in combination with the password to generate two values: the IV, and the actual encryption key.

  • The encryption key must be derived from the password and whatever data is present in the file header (because we want to be able to decrypt the file with knowledge of the password only). However, we do not want to get the exact same key every time we use the same password, because otherwise attackers could try to break N files for less cost than N times the cost of breaking one. An example of cost sharing is precomputed tables (tables of password-to-key mappings), rainbow tables being a special case of precomputed tables.

  • The IV must be as uniformly random as possible; there again, two distinct files should use distinct IVs, even if encrypted with the same password. One possible design would have been to add the IV in the file header (after all, the IV is not meant to be secret -- otherwise we would call it a key, not an IV). The OpenSSL developers preferred to derive the IV from the password, just like the key (i.e. they produce from the password a long sequence, which they split in two, one part being the encryption key, the other being the IV). This is valid, as long as some distinct element is added in the mix, because two distinct files must have distinct IVs. The salt does that, too.
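The derivation described in the two points above can be sketched in Python. This is my reading of the single-iteration scheme behind OpenSSL's `EVP_BytesToKey`, not a copy of its code; the names `parse_header` and `evp_bytes_to_key` are made up for this example. The digest is an assumption too: older releases defaulted to MD5 and 1.1.0 and later to SHA-256, so pass `digest="md5"` for old files. Sizes are those of aes-256-cbc (32-byte key, 16-byte IV), and the input is assumed to be the raw file, i.e. after base64 decoding when `-a` was used.

```python
import hashlib

MAGIC = b"Salted__"


def parse_header(blob: bytes):
    # Raw layout: 8-byte magic, 8-byte salt, then the ciphertext.
    if blob[:8] != MAGIC:
        raise ValueError("missing Salted__ header")
    return blob[8:16], blob[16:]


def evp_bytes_to_key(password: bytes, salt: bytes,
                     key_len: int = 32, iv_len: int = 16,
                     digest: str = "sha256"):
    # D_1 = H(password || salt), D_i = H(D_(i-1) || password || salt).
    # Blocks are concatenated until there is enough output for key and IV.
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = hashlib.new(digest, block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# Same password, two different salts: both key and IV change.
key_a, iv_a = evp_bytes_to_key(b"secret", b"\x01" * 8)
key_b, iv_b = evp_bytes_to_key(b"secret", b"\x02" * 8)
print(key_a != key_b, iv_a != iv_b)
```

Each output block costs a single hash invocation, which is why a dictionary attack on one file is cheap even though the salt rules out sharing work between files.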

Note that the encryption method used by OpenSSL is not standard; it is "what OpenSSL does" but is not documented anywhere except in the OpenSSL source code. As an encryption file format, it is a bit lousy. In particular:

  • There seems to be no way to configure how many iterations should be used in the password-to-key derivation. This is an issue because configurable slowness is important to cope with the main issue of passwords, which is that they are, after all, passwords, i.e. things with not too high entropy in the best case.

  • The encryption system does not include a MAC so alterations are not necessarily detected (they are not detected at all if the last two blocks of the padded file are unmodified).
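For contrast, here is a rough sketch of what addressing both points could look like, using only the Python standard library. It is not a file format proposal: `derive_keys`, `mac_tag` and `verify` are hypothetical names, the iteration count is an arbitrary value, and random bytes stand in for real AES-CBC output, since the standard library has no AES. The key derivation uses PBKDF2 with a configurable iteration count, and an HMAC over the header and ciphertext (encrypt-then-MAC) makes any alteration detectable. Newer OpenSSL releases (1.1.1 and later) did add `-pbkdf2` and `-iter` options to `enc`, though still no MAC.

```python
import hashlib
import hmac
import os


def derive_keys(password: bytes, salt: bytes, iterations: int):
    # PBKDF2-HMAC-SHA256; 64 bytes of output are split into two independent keys.
    material = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=64)
    return material[:32], material[32:]  # (encryption key, MAC key)


def mac_tag(mac_key: bytes, header: bytes, ciphertext: bytes) -> bytes:
    # Encrypt-then-MAC: the tag covers the header and the ciphertext.
    return hmac.new(mac_key, header + ciphertext, hashlib.sha256).digest()


def verify(mac_key: bytes, header: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # Constant-time comparison avoids leaking how many tag bytes matched.
    return hmac.compare_digest(mac_tag(mac_key, header, ciphertext), tag)


salt = os.urandom(16)
enc_key, mac_key = derive_keys(b"correct horse", salt, iterations=200_000)

ciphertext = os.urandom(48)  # stand-in for the AES-CBC output
tag = mac_tag(mac_key, salt, ciphertext)

# Flip one bit in the first block: verification now fails.
tampered = bytes([ciphertext[0] ^ 1]) + ciphertext[1:]
print(verify(mac_key, salt, ciphertext, tag), verify(mac_key, salt, tampered, tag))
```

Raising `iterations` multiplies the cost of every password guess by the same factor for the attacker and for the legitimate user, which is the "configurable slowness" mentioned above.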

Therefore you are encouraged to use, if possible, a better tool for password-based encryption of files, for instance GnuPG.

StackzOfZtuff
Thomas Pornin
  • The point of openssl command line functions is to just run the encryption algorithm function from the command line so one can build full applications. It's just another method of access to the library for things like shell scripts; it's not designed to be a complete system for encrypting files securely. – ewanm89 Oct 26 '12 at 20:06
  • ewanm89: If the ONLY point of command line functions was to access the library, iterations would have long been made available. It's been part of the library for a VERY long time. – anthony Feb 07 '19 at 06:26