I'm rolling my own personal, threetags.com-style 'encrypted data in the cloud' webapp (I didn't like threetags' UI or its lack of a non-browser client).

However, I have absolutely no experience with security and encryption, and cobbled together this scheme after reading up on threetags' security, Wikipedia articles, and documentation for the .NET cryptography library.

Before I get to my scheme, here's a quick overview of how I understand threetags' security scheme: They encrypt and decrypt data on the client using a key derived from the plaintext user credentials, and send only encrypted data to the server. For authentication, they use an unrelated hash of the plaintext user credentials. Since, theoretically, the credentials stored on the server and used for retrieving data cannot be used to actually decrypt the data, nobody, not even the server owners and client programmers, can decrypt the data unless they get the original plaintext credentials from the user.

Assuming they have the right idea, and such a setup does guarantee safety of the data (as long as the user's password isn't compromised by a keylogger, the user doesn't save the password in plaintext form somewhere, etc.), I want to know if my implementation does the job.

The procedure I use for sending encrypted data is:

  1. User enters a plaintext username, password and data into the client.
  2. User registers or logs into the server by sending SHA-256 hashes of the plaintext username and password. Each hash is salted* with a 5-byte salt.
  3. Client compresses the data with gzip, then encrypts it using AES-256; the 256-bit (256/8 bytes) key is derived from the plaintext password using PBKDF2 with an 8-byte salt* and exactly 11368 rounds. The IV is automatically generated by the library class, and I append it unencrypted to the encrypted data for later retrieval.
  4. Encrypted data is encoded as a Base 64 string and sent to the server over HTTP, because who needs HTTPS if the data's already encrypted as can be?
  5. Client is free to retrieve the Base 64 data, decode it, decrypt it, decompress it, and have it back in its original plaintext form - so long as they can provide the plaintext password for the decryption key.

*For an N-byte salt, I take the unsalted SHA-512 hash of the plaintext I want to salt (or the plaintext I want to feed to the algorithm that takes a salt), and pick a deterministic subset of N bytes from it (say, for a 3 byte salt, the 1st, 3rd and 5th bytes of the unsalted hash - is simpler better for this?) - that sequence of bytes is my salt.
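
As an illustration only, the footnote's salt derivation and the key derivation from step 3 could be sketched in Python as follows. The byte positions, the helper names, and the use of HMAC-SHA1 inside PBKDF2 (chosen here to mirror what .NET Framework 4's `Rfc2898DeriveBytes` uses) are assumptions, not details taken from the actual client.

```python
import hashlib

# Hypothetical choice of positions: the 1st, 3rd, 5th, ... bytes (8 in total).
EVEN_POSITIONS = tuple(range(0, 16, 2))


def deterministic_salt(plaintext: bytes, positions=EVEN_POSITIONS) -> bytes:
    """Pick fixed byte positions from the unsalted SHA-512 hash of the input."""
    digest = hashlib.sha512(plaintext).digest()
    return bytes(digest[i] for i in positions)


def derive_key(password: bytes, rounds: int = 11368) -> bytes:
    """PBKDF2 with the password-derived 8-byte salt, producing a 256-bit key."""
    salt = deterministic_salt(password)
    return hashlib.pbkdf2_hmac("sha1", password, salt, rounds, dklen=32)
```

Note that `deterministic_salt` depends only on its input, so the same password always yields the same salt and therefore the same key, for every user and every upload.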

If it matters, I'm using .NET Framework 4, System.Security.Cryptography classes. I programmatically specified the number of rounds only for their PBKDF2 implementation, since that's the only case where I saw an obvious property/method to set it with. I suppose that level of detail is a thing for Stack Overflow though.

In any case, I'm more concerned about absolute, conceptual security than speedy computing, but I'll gladly take any suggestions for improved efficiency (if anything I do is redundant), as well.

cervellous
  • You will get several answers saying 'Don't roll your own'. I prefer to try to answer the question in the answer, so I'm putting this in the comment. *Don't roll your own.* Security experts with decades of experience make mistakes in security schemes. Given that you have no experience with security and encryption *please* **don't roll your own**. RSA, a company with decades of experience, that created the RSA algorithm, with a significant amount of resources dedicated to keeping their network secure, was compromised. They discovered the compromise. Would you? – this.josh Jul 23 '11 at 05:30
  • @cervellous, welcome to the site! I'm glad you decided to come ask here - as the comments and the answers so far have explained, there is a lot to learn, and asking is the best first step! I hope you stick around, browse the site, and I'm sure you'll find many more interesting questions (and answers) that will help improve the security of your system. – AviD Jul 24 '11 at 08:14
  • "I'm rolling my own, ... I have absolutely no experience with security and encryption" Is all the information we needed to say with high probability that the answer to your question is "no". D.W. has done a great job of explaining what at least six of the flaws are (which wouldn't have been possible without the extra info). Don't feel bad - at least you asked! – Martin Bonner supports Monica Mar 02 '16 at 16:45

1 Answer

"I have absolutely no experience with security and encryption": Yikes. It is probably not the best idea to be designing new cryptographic schemes, given this sentence, if security is important for your application. Cryptosystem design is a tricky subject. Folks who try to invent their own schemes without knowledge of the field often make non-obvious mistakes. If I can use an analogy, I wouldn't try to perform surgery on myself; I'd go to a qualified surgeon. If security is important, you should do the same.

On your particular scheme: It definitely has some positive aspects, but it also has some significant shortcomings that I spotted (with no guarantees that this list is exhaustive):

  • Rolls its own crypto. As a very general comment, the number-one lesson from the history of cryptography is that rolling your own crypto is highly error-prone. This is particularly a risk factor if you haven't studied the area deeply. Therefore, you are skating on thin ice, and you should expect that your scheme will likely have security problems that you didn't anticipate.

  • Leaks the password. User passwords tend to have little entropy. Therefore, it is critical to reduce or eliminate opportunities for offline dictionary search against the password. Using PBKDF2 is a good idea, but you then shoot yourself in the foot by sending a (plain) hash of the password to the server. That enables rapid offline dictionary search. And there is no need for it. You shouldn't send a password hash to the server. For instance, it would suffice to send only the username to the server (and not anything derived from the password).

  • Insufficient iterations of PBKDF2 to protect the password. In addition, I suspect that 11368 iterations of PBKDF2 is not enough to prevent dictionary search attacks. Best practice is to select a number that is large enough to make the operation take, say, 1 second. See, e.g., this question on IT Security.

  • Insecure salting. The scheme you use for generating the salt is totally broken. You are using a deterministic function of the password as the salt; this totally defeats the purpose of a salt. The salt should be a crypto-strength random value, say 64 bits in length or more.

  • Doesn't authenticate data. You encrypt the data, but you don't authenticate it. Without authentication, attackers can tamper with the data, and you won't be able to detect it. Also, for subtle reasons, encryption without authentication is insecure.

    Fix: After encrypting, you should compute a MAC on the ciphertext using a strong message authentication code (e.g., AES-CMAC) and append the MAC. Before decrypting, you should check the MAC. Don't use the same key for encryption and for the MAC; when you derive keys from the password, you should derive two independent keys (256 bits).

  • Doesn't use HTTPS. I'd recommend using HTTPS. Encryption and authentication only defend against some threats. For instance, they don't ensure that you get the latest version of the data. This means that, in your scheme, if a client is connected to the Internet over an open wireless connection (for instance), a man-in-the-middle attacker might mount replay attacks that replay old versions of the data. I suggest that you connect to the server by HTTPS.

  • May be susceptible to traffic analysis. Encryption does not conceal the length of the data being encrypted. Also, when compressing data, the length of the compressed data can potentially reveal some information about the data itself being compressed (not just its length), which may pose additional risks. These risks might be acceptable or unavoidable, but you'll need to analyze them in the context of your particular application and the types of data typically being stored.
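
To make the salting and authentication fixes above concrete, here is a minimal Python sketch, under several assumptions: the salt is 16 random bytes, the iteration count is a placeholder to be tuned, the function names are hypothetical, and HMAC-SHA256 stands in for AES-CMAC only because it is available in the standard library. The AES step itself is omitted; `ciphertext` is treated as opaque bytes that already include the IV.

```python
import hashlib
import hmac
import secrets

SALT_LEN = 16            # assumption: 128-bit random salt
TAG_LEN = 32             # HMAC-SHA256 output size
PBKDF2_ROUNDS = 200_000  # assumption: tune until derivation takes about 1 second


def derive_keys(password: bytes, salt: bytes, rounds: int = PBKDF2_ROUNDS):
    """Derive 64 bytes and split them into an encryption key and a MAC key."""
    material = hashlib.pbkdf2_hmac("sha512", password, salt, rounds, dklen=64)
    return material[:32], material[32:]


def seal(mac_key: bytes, salt: bytes, ciphertext: bytes) -> bytes:
    """Encrypt-then-MAC framing: salt + ciphertext + tag over both."""
    tag = hmac.new(mac_key, salt + ciphertext, hashlib.sha256).digest()
    return salt + ciphertext + tag


def open_sealed(password: bytes, blob: bytes, rounds: int = PBKDF2_ROUNDS) -> bytes:
    """Check the MAC before anything is decrypted; return the ciphertext."""
    if len(blob) < SALT_LEN + TAG_LEN:
        raise ValueError("blob too short")
    salt, ciphertext, tag = blob[:SALT_LEN], blob[SALT_LEN:-TAG_LEN], blob[-TAG_LEN:]
    _enc_key, mac_key = derive_keys(password, salt, rounds)
    expected = hmac.new(mac_key, salt + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("MAC check failed: data modified or wrong password")
    return ciphertext


def new_salt() -> bytes:
    """Fresh crypto-strength random salt, stored alongside the data."""
    return secrets.token_bytes(SALT_LEN)
```

The salt is not secret, so it travels with the blob; what matters is that it is random and different for each user or upload. Deriving both keys from a single PBKDF2-HMAC-SHA512 call keeps the cost to one derivation while still giving the encryption and the MAC separate keys.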

Even if you fix all of these problems, there is no guarantee that the result will be adequately secure. Generally speaking, if an initial review of a design turns up 4 problems, you should remain very suspicious even after fixing the ones you know of: how do you know there isn't a 5th problem that you just missed?

That said, there are also a number of positive elements in your design. For instance, you are using standard well-vetted algorithms, which is good. Overall, I think you're on a reasonable path -- there are just some non-obvious issues you probably haven't anticipated.

D.W.
  • Whoops. I was under the impression that "roll your own" referred mostly to low-level algorithms, but I suppose I should have given that a little bit more thought. Thanks for the detailed answer, though. If nothing else, I learned a bit. – cervellous Jul 23 '11 at 13:14
  • @cervellous, yeah, that is indeed what most people mean (primarily) when they say "don't roll your own". But in this case, where others have already thought about how to store data encrypted by a password, it's often better to use a well-vetted high-level scheme (like, say, GPG's file encryption format) than to assemble one yourself with a collection of low-level algorithms. – D.W. Jul 23 '11 at 18:42