3

I have the following process for encrypting and decrypting data in a Python script using the PyCrypto module:

Encryption - Server A

  1. AES 256 shared key is generated
  2. Associated IV is generated
  3. Data is encrypted using AES 256 shared key and associated IV using CBC mode and stored into the db
  4. RSA 4096 public key is used to encrypt the AES 256 key and associated IV which are also stored in the db

Decryption - Server B

  1. Encrypted AES 256 shared key and associated IV from db are decrypted using RSA 4096 private key
  2. Data from db is now decrypted using the decrypted AES 256 shared key and associated IV

Does the above process ensure the security of data against an attack model where the attacker has managed to gain access to the database?

Imran Azad
  • Why store the IV in the database at all? – user Nov 02 '11 at 13:59
  • @MichaelKjörling My understanding is that in order to decrypt AES encryption you need to generate the cipher for decryption using the secret key and the IV (I'm using a unique IV and a secret key for different data sets for tighter security). – Imran Azad Nov 02 '11 at 15:13
  • Sounds to me like you are confused about the function of some basic crypto primitives (as someone else pointed out, the IV only provides initial data entropy). You may very well want to drop the home-grown scheme entirely and use something tried-and-true instead - it is vastly more likely to be secure. – user Nov 03 '11 at 10:28
  • @MichaelKjörling Justice states "The CBC IV is not a secret, so long as you never reuse an IV." I am encrypting multiple data sets at any given time, therefore I cannot use the same IV. Do you disagree with not reusing an IV? – Imran Azad Nov 03 '11 at 23:39
  • I am arguing that (in addition to what others have pointed out) you don't need to save the IV separately along with the ciphertext. If anything, I would imagine that doing so completely unnecessarily exposes you to known-plaintext attacks in your scenario. – user Nov 04 '11 at 13:41
  • @MichaelKjörling Ah I see, thanks for the clarification. – Imran Azad Nov 04 '11 at 15:30

3 Answers

6

My main feedback: You don't provide enough technical detail to allow a complete critique of your proposal, but you have provided enough that I can see you are making several common mistakes. Here are the main ones I can see so far:

  • Mistake #1: inventing your own encryption format. Usually, designing your own format for storing encrypted data is not a good idea; you are likely to get something wrong. It is better to use a standard format, like GPG or the OpenPGP Message Format.

  • Mistake #2: failure to include message integrity protection. Encrypting data without also authenticating it opens you up to subtle but serious attacks. This is highly counter-intuitive, and a very common mistake. It is tempting to think, gee, I want to keep this secret, so if I encrypt it with a good encryption algorithm, I'll be fine. But nope, you won't be fine. You also need message authentication, to defend against chosen-ciphertext attacks. And you need to apply it with a proper mode (e.g., authenticated encryption, or Encrypt-then-MAC) and with proper key management (independent keys for authentication and encryption, or appropriate use of key separation).

To avoid these problems, follow the advice at the links I gave above.
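As a rough illustration of the "key separation" mentioned under Mistake #2, the sketch below derives two independent-looking keys from one master key using HKDF (RFC 5869) built from the standard library's `hmac` module. The function name `hkdf_sha256` and the `info` labels are made up for this example, and it is a sketch under those assumptions rather than a reviewed implementation; a maintained library's HKDF would be preferable in practice.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(master_key, info, length=32, salt=b""):
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it for `info`."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    # Extract step: an empty salt is replaced by hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, master_key, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks, each bound to `info` and a counter byte.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Hypothetical labels: any two distinct, fixed strings would do.
master = secrets.token_bytes(32)
enc_key = hkdf_sha256(master, b"example encryption key")
mac_key = hkdf_sha256(master, b"example authentication key")
```

Because the two labels differ, the encryption key and the MAC key are derived independently, so only the single master key needs to be wrapped with RSA.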

Other miscellaneous feedback:

  • There may well be other problems; you haven't provided us enough information to identify them all. Here are some examples of potential problems:

    • For instance, you don't describe how the IV is generated. In past systems, poor IV generation has occasionally led to security problems. (The IV needs to be generated using a crypto-strength pseudorandom number generator.)

    • You don't describe how the AES key is encrypted with RSA. (You need to use a proper padding scheme, e.g., OAEP as specified in PKCS #1 v2.)

  • The key lengths you have selected are overkill.

  • Keep in mind that, when modern cryptography is properly implemented and used, it is almost never the weakest link in the system.

    Instead, attackers usually defeat crypto not by breaking the crypto algorithms, but by bypassing the crypto and attacking some other aspect of the system -- maybe applying social engineering to the humans, maybe finding a security hole in the code and compromising an endpoint, maybe exploiting errors in the key management, or any of a number of other ways of attacking a system.

D.W.
3

There is no "secure enough" unless you define an attack model, which is the list of powers that you consider your predicted attacker to have.

That said, as a basic comment, there is no integrity check here, so active attackers will have fun with your data. Also, you do not say anything about the encryption mode (ECB, CBC, CTR... ?) and the associated IV management, so it is possible that you messed it up (no hard feelings, it is easy to mess up, difficult to get right). The key sizes are overkill, so you are either a governmental administration with more CPU cycles than you know what to do with, or you are somewhat paranoid, or both. Moreover, you are inventing your own crypto, and that's bad, because it is much easier to mess it up than it is to actually notice that you messed it up.

Tom Leek
  • I would disagree on the too much encryption point. 4096 bits of public key encryption encrypting a 256 bit shared key is relatively common practice. Without knowing the value of the data and the reason for it to be kept private, this cannot be assessed. However, I totally agree on your first and last points. Imran, you should always try to use a security framework that already exists and is maintained if possible. – Bernie White Nov 01 '11 at 21:10
  • @TomLeek Firstly, thanks for the great response, really appreciate it. I've updated the question that answers some of your questions. It's a case of both :-) The data that is being encrypted/decrypted is quite small so CPU cycles aren't really an issue. Also could you please elaborate on 'integrity check' How would the absence of integrity checks weaken security? – Imran Azad Nov 01 '11 at 21:40
  • @BernieWhite Thanks for the response. The data is patient data and therefore needs to be kept highly private. I'm using the PyCrypto module. – Imran Azad Nov 01 '11 at 21:45
  • @BernieWhite: well, it is common practice -- but government administrations and slightly paranoid people are many. 2048-bit RSA and 128-bit AES are already way farther than what can realistically be broken by even the richest of existing governments (where "way farther" means "by a factor of more than one freakin' billion" so there's kind of a security margin here). I thus feel entitled to use the term "overkill". 4096-bit RSA is not "less breakable" than 2048-bit RSA since the latter is already a "can't break it" algorithm. – Tom Leek Nov 01 '11 at 22:00
  • @Imran: absence of integrity check means that an attacker can modify the data without you noticing. With CBC, if you flip one bit of the encrypted text, the decrypted text will have one block replaced by mangled junk, and one bit flipped in the next block, in ways which the attacker can predict quite accurately. Depending on your data, this can give a lot of power to the attacker. Also, the attacker might be able to swap records if you have several such messages in your database. To have encryption _and_ integrity, look up [EAX](http://en.wikipedia.org/wiki/EAX_mode). – Tom Leek Nov 01 '11 at 22:04
  • @TomLeek Thanks. Just to clarify, would I be correct in saying that integrity checks serve to ensure the authenticity of the data rather than to protect the encryption itself from being broken? – Imran Azad Nov 01 '11 at 22:25
  • @Imran: yes, that's it. But in practice, weaknesses in your system will never be in the algorithms themselves anyway; once you said AES you've said it all. Concentrating on the key length usually means missing the point: 128 bits, 256 bits... that's not what really matters. Having a properly random IV, now _that's_ important. – Tom Leek Nov 01 '11 at 22:28
  • @TomLeek Thank you! I love clarity, have a lovely evening! :-) – Imran Azad Nov 01 '11 at 22:30
  • @Imran: No, that is not correct. Integrity checks are needed both for integrity and for confidentiality. (This is a subtle point that most non-cryptographers do not appreciate. Without integrity checks, there are subtle chosen-ciphertext attacks that breach confidentiality.) – D.W. Nov 03 '11 at 05:09
  • @D.W. Ah I see, thanks. Just to clarify when you use the word "integrity" do you mean ensuring the authenticity of the data and when you use the word "confidentiality" do you mean ensuring that the data is not decrypted? – Imran Azad Nov 04 '11 at 11:38
  • @Imran, yup, that sounds right. Roughly, confidentiality = keeping the message secret from the attacker; integrity = preventing the attacker from modifying the message. – D.W. Nov 04 '11 at 15:40
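The CBC tampering described in the comments above can be reproduced with a small experiment. The sketch below is only an illustration: since the Python standard library has no AES, a toy 4-round Feistel permutation built from HMAC-SHA256 stands in for the block cipher (an assumption made for this example, and not something to use for real data), and every function name and the sample record are hypothetical. The CBC chaining itself is the standard one, so the malleability behaves the same way as with AES.

```python
import hashlib
import hmac
import secrets

BLOCK = 16          # block size in bytes, same as AES
HALF = BLOCK // 2


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    # Feistel round function: truncated HMAC of the round number and one half.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:HALF]


def toy_encrypt_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    if len(plaintext) % BLOCK:
        raise ValueError("this sketch has no padding; use whole blocks")
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out


key = secrets.token_bytes(32)
iv = secrets.token_bytes(BLOCK)
plaintext = b"patient=ID000017" + b"dose_mg=00000010"   # two 16-byte blocks
ciphertext = bytearray(cbc_encrypt(key, iv, plaintext))

# Without knowing the key, flip bits in byte 8 of ciphertext block 0.
# After decryption the same bits flip in byte 8 of plaintext block 1.
ciphertext[8] ^= ord("0") ^ ord("9")
tampered = cbc_decrypt(key, iv, bytes(ciphertext))
```

After the modification, the second block of `tampered` reads `dose_mg=90000010`, while the first block decrypts to unpredictable junk. An integrity check over the IV and ciphertext would reject the record before decryption is even attempted.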
3
  1. Only the AES key is secret. The CBC IV is not a secret, so long as you never reuse an IV. Every time you encrypt a message with CBC, you may prefix the ciphertext with the IV and store that as the ciphertext. When you decrypt the message, simply remember that the first block is the IV.

  2. You don't include any integrity checking. You can generate a signature in various ways - but you must remember never to sign and encrypt using the same RSA keypair. One way you can generate a signature is to cryptorandomly generate a second key for HMAC, obtain a digest of the ciphertext with HMAC-SHA512 and the generated key, and then encrypt and store the generated HMAC key alongside the generated AES key. If you follow the practice of concatenating the IV with the ciphertext, you should apply HMAC-SHA512 to the concatenated IV+ciphertext, not just the original ciphertext.

  3. You did not specify this one way or the other, but the AES key, the CBC IV, and the HMAC key must all be generated by a cryptographically secure pseudo-random number generator (cryptorandom PRNG), unless you happen to have a true RNG handy.

  4. There exist general standard formats for serializing encrypted messages. You may elect to use them in place of storing the raw bytes of the ciphertexts and digests. Standards include the OpenPGP Message Format and the Cryptographic Message Syntax.

Note that #1 is not strictly a security issue, but in practice people are often confused about which parts of the system must be secret and which parts may be public, and what the correct context for these rules is, and such confusion often leads to errors. The purpose of the IV is to provide a high degree of entropy to the encryption of the first block of the plaintext, not to be some sort of second key.
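Points 1 to 3 can be sketched with the Python standard library alone. The example below treats the AES-CBC ciphertext as opaque bytes (the encryption step itself is left to whichever crypto library is in use) and shows only the key/IV generation, the IV prefix, and the HMAC-SHA512 tag over IV+ciphertext. The function names and the record layout `IV || ciphertext || tag` are assumptions made for illustration, not a standard format.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 16  # AES block size in bytes, which is also the CBC IV length


def generate_secrets():
    """Draw the AES-256 key, CBC IV and HMAC key from the OS CSPRNG."""
    aes_key = secrets.token_bytes(32)
    iv = secrets.token_bytes(BLOCK_SIZE)
    hmac_key = secrets.token_bytes(64)
    return aes_key, iv, hmac_key


def seal(hmac_key, iv, ciphertext):
    """Return IV || ciphertext || tag, with the tag covering IV || ciphertext."""
    body = iv + ciphertext
    tag = hmac.new(hmac_key, body, hashlib.sha512).digest()
    return body + tag


def open_sealed(hmac_key, record):
    """Verify the tag first, then split the record into (iv, ciphertext)."""
    tag_len = hashlib.sha512().digest_size
    if len(record) < BLOCK_SIZE + tag_len:
        raise ValueError("record too short")
    body, tag = record[:-tag_len], record[-tag_len:]
    expected = hmac.new(hmac_key, body, hashlib.sha512).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return body[:BLOCK_SIZE], body[BLOCK_SIZE:]
```

On the decrypting side, `open_sealed` would run before any AES decryption, so a modified IV or ciphertext is rejected without ever being fed to the cipher. Both the AES key and the HMAC key would then be wrapped with the RSA public key and stored, as described in point 2.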

yfeldblum