Password-based encryption is inherently vulnerable to dictionary attacks. Indeed, when you have the encrypted file, you can always try to decrypt it (or parts of it) with a given password and see if the result makes sense. Thus, an attacker who could get his hands on the encrypted file (and we assume that this is possible, since otherwise there would be no point in encrypting) could "try passwords" on his own machines until a match is found. Note that the attacker does not need a 100% reliable test for the "result makes sense" part: if the attacker's test is sufficient to reject 99.9999% of wrong passwords, then he has still reduced his list of passwords to double-check by a factor of one million.
That a tool such as ccrypt provides a test for rejecting wrong passwords does not change this fact. Sure, the attacker can then use the test as well, but it won't give him much of an additional advantage. The attacker could already work comfortably without that test, because real-life data has a lot of structure (most file formats have very recognizable headers) and if the attacker is interested in breaking the password, then the data must be valuable, hence "real-life".
There are two methods to make password-based encryption stronger; they are cumulative, and you should be using both. The first method is to use a strong password, which means a password with a lot of randomness in it (as usual, see this question for discussions on that subject). The second method is to convert the password into an encryption key using a good, properly configured password hashing function; the "proper configuration" there means that a salt will be used to prevent parallel attacks (when there are several encrypted files with distinct passwords, and the attacker would like to break one or several of them), and that the password hashing function is made deliberately expensive through many iterations to make dictionary attacks harder.
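As a rough illustration of the second method, here is a minimal sketch using PBKDF2 from Python's standard library. This is a swapped-in example of a salted, iterated derivation, not what ccrypt does. The function name `derive_key`, the 16-byte salt, and the iteration count are assumptions chosen for illustration; the count should be tuned to the hardware at hand.

```python
import hashlib
import os
from typing import Optional, Tuple

SALT_LEN = 16         # assumed salt size in bytes
KEY_LEN = 32          # 256-bit encryption key
ITERATIONS = 600_000  # assumed cost; tune it so one derivation is noticeably slow


def derive_key(password: str, salt: Optional[bytes] = None,
               iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Derive a key with salted, iterated PBKDF2-HMAC-SHA256.

    Returns (key, salt). The salt is not secret and is stored next to
    the ciphertext so that decryption can repeat the derivation.
    """
    if salt is None:
        salt = os.urandom(SALT_LEN)  # fresh random salt for each encrypted file
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                              salt, iterations, dklen=KEY_LEN)
    return key, salt
```

Because each file gets its own salt, the same password yields unrelated keys for different files, so the attacker cannot amortize one dictionary run over several targets; the iteration count multiplies the cost of every single guess.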
On the latter point, ccrypt appears to be defective. It hashes the password with what the FAQ describes as:
Ccrypt uses a hashing algorithm based on AES (i.e., on the Rijndael block cipher). It is not one of the standard hashing algorithms such as SHA1. Thus, for your application, you should be fine. From the viewpoint of security, it does not matter much which hashing algorithm is used, as long as it is collision-free.
which does not bode well. First, it is, by the author's own account, a custom homemade function, which is never a good sign. Moreover, the author talks about the function needing to be "collision-free", which is completely irrelevant for password-based key derivation, and thus indicates a relatively poor grasp of the requirements of such things; this is not a good sign either, especially when combined with a homemade function.
Looking at the source code (that's the nice thing with open source: you can see for yourself what the author did not see fit to document), the custom hash appears to work in the following way:
- Set K and H to both be sequences of 32 bytes of value 0x00.
- For each password character c (a single byte; we are using an ASCII-compatible encoding such as UTF-8 here), do:
  - XOR every byte of K with c.
  - Replace H with the encryption of H using Rijndael with key K and block size 256 bits.
- The final value of H is the "hashed password".
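The steps above can be sketched in Python as follows. This is only a structural sketch of the description given here, not ccrypt's actual code. Python's standard library has no Rijndael with 256-bit blocks, so the block cipher is passed in as a parameter, and `toy_block_encrypt` is a hypothetical stand-in (built on SHA-256, not a real cipher) that exists only so the sketch runs.

```python
import hashlib
from typing import Callable

BLOCK_LEN = 32  # 256-bit key and 256-bit block

BlockCipher = Callable[[bytes, bytes], bytes]  # (key, block) -> encrypted block


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in only: NOT Rijndael and not invertible; it just maps 32 bytes to 32 bytes."""
    return hashlib.sha256(key + block).digest()


def ccrypt_style_hash(password: bytes,
                      encrypt_block: BlockCipher = toy_block_encrypt) -> bytes:
    """Follow the described structure: no salt, one block encryption per password byte."""
    k = bytes(BLOCK_LEN)  # K starts as 32 bytes of 0x00
    h = bytes(BLOCK_LEN)  # H starts as 32 bytes of 0x00
    for c in password:
        k = bytes(b ^ c for b in k)  # XOR every byte of K with c
        h = encrypt_block(k, h)      # replace H with E_K(H)
    return h                         # the final H is the "hashed password"
```

Since K starts as all zeros and every byte receives the same XOR, the 32 bytes of K are always equal to each other: the key only ever takes one of 256 values.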
This calls for the following comments:
The FAQ says "AES" but it is not AES. AES is a family of three functions, which is a subset of the family of functions known as Rijndael. AES always uses 128-bit blocks; here, Rijndael is used with 256-bit blocks, so it is not AES stricto sensu.
As a hash function, it sucks. The process basically boils down to repeatedly encrypting a single block, each time using one of 256 possible keys (the 32 bytes of each encryption key are always identical to each other). Since this is symmetric encryption, symmetric decryption also works, which allows a meet-in-the-middle attack. Basically, the output size is 256 bits, but the preimage resistance is at best 2^128, not 2^256; while this is not an immediate concern for password hashing, it still demonstrates the fundamental flaw with custom schemes.
As a password hashing function, it stinks. Not only is it unsalted, but it is also extremely fast. An attacker with a basic PC will use the AES-NI instructions to compute a Rijndael encryption in a matter of a few dozen clock cycles (the AES-NI instructions are meant to support AES specifically, but they can still be used to efficiently support 256-bit blocks; see section 2.1 of this paper). Moreover, since the password bytes are processed one by one, a lot of the computation cost can be shared between successive password trials: from hashed password 'foo', the hashes for 'fooA', 'fooB', 'fooC'... can be computed with a single Rijndael-256 invocation for each.
It can be estimated that a cheap, off-the-shelf PC would be able to hash at least 100 million potential passwords per second. Very few human-chosen passwords will resist such an onslaught for more than a few minutes at best. As far as password hashing is concerned, this is atrocious.
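The cost sharing between successive trials can be illustrated with a small sketch. It assumes the byte-by-byte structure described in this answer; the helper names (`absorb`, `state_after`, `hashes_for_extensions`) are hypothetical, and `toy_block_encrypt` is a SHA-256-based stand-in for Rijndael-256 used only so the code runs.

```python
import hashlib
from typing import Callable, Dict, Tuple

State = Tuple[bytes, bytes]                    # (K, H) after some password prefix
BlockCipher = Callable[[bytes, bytes], bytes]  # (key, block) -> encrypted block


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in only: NOT Rijndael; it just maps 32 bytes to 32 bytes."""
    return hashlib.sha256(key + block).digest()


def absorb(state: State, c: int, encrypt_block: BlockCipher) -> State:
    """Process one password byte: update K by XOR, then encrypt H once."""
    k, h = state
    k = bytes(b ^ c for b in k)
    return k, encrypt_block(k, h)


def state_after(prefix: bytes,
                encrypt_block: BlockCipher = toy_block_encrypt) -> State:
    """Hash state after a whole prefix: one cipher call per byte."""
    state: State = (bytes(32), bytes(32))
    for c in prefix:
        state = absorb(state, c, encrypt_block)
    return state


def hashes_for_extensions(prefix: bytes, last_bytes: bytes,
                          encrypt_block: BlockCipher = toy_block_encrypt
                          ) -> Dict[bytes, bytes]:
    """Hash prefix+c for every candidate c, paying for the prefix only once."""
    base = state_after(prefix, encrypt_block)
    return {prefix + bytes([c]): absorb(base, c, encrypt_block)[1]
            for c in last_bytes}
```

With prefix `b"foo"` and the three candidates `b"ABC"`, this performs 3 + 3 = 6 cipher calls, whereas hashing the three four-byte passwords from scratch would take 12.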
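To put that rate in perspective, here is a back-of-the-envelope calculation. The rate is the estimate quoted above; the candidate list sizes are assumptions picked purely for illustration.

```python
RATE = 100_000_000  # guesses per second, the estimate quoted above


def seconds_to_exhaust(candidates: int, rate: int = RATE) -> float:
    """Worst-case time to try every candidate at the given guessing rate."""
    return candidates / rate


# Assumed search spaces, for illustration only.
spaces = {
    "dictionary of one billion candidates": 10**9,
    "all strings of 8 lowercase letters": 26**8,
}

for label, size in spaces.items():
    print(f"{label}: {seconds_to_exhaust(size):,.0f} seconds")
```

Under these assumptions, the billion-entry dictionary is exhausted in 10 seconds, and the full space of 8 lowercase letters falls in about 35 minutes.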
Summary: ccrypt is unduly weak against dictionary attacks, but not because it has a test for password correctness. It is weak because its password hashing procedure is a homemade construction that accumulates flaws making dictionary attacks extremely efficient. With a properly configured password hashing function (e.g. bcrypt), it could literally be made a million times stronger (the iteration count would be adjusted so that a PC could hash only 100 passwords per second, instead of 100 million; it would still be highly usable because 1/100th of a second is not a large latency in human terms).