Broadcast encryption protocol for file sharing service (a-la Dropbox)

Question

I'm building a prototype of a (rather simple) Dropbox-type file sharing service, where users can upload & share file(s) with other user(s).

But here's a feature request:

I want file(s) to be stored on the server in encrypted form, and
I do not want to store any keys on the server. (i.e. I don't want to give the service's maintainer the ability to access user's data.)
I don't want to force users to use 2nd (decryption?) password(s) for shared files.

Is there any cryptographic protocol that fits this task?

Here is what I've read so far:

Pattern to allow multiple persons to decrypt a document, without sharing the encryption key?

but access right revocation seems pretty complex in this case.

Broadcast/multicast encryption, maybe? I just don't know where to dig.

Go back and read again the answer to which you link. The keys stored on the server do not give the service's maintainer the ability to access user data because those keys are themselves encrypted by the users' public keys. Access revocation is as simple as deleting a user ID/key pair for a particular file. — Bob Brown, Dec 25 '14 at 17:38
@BobBrown My bad - I didn't pay attention to details. Any way you can post your comment as an answer? I'll accept it. — Dmitriy Khudorozhkov, Dec 25 '14 at 21:28

Bob Brown · Accepted Answer · 2014-12-25T21:44:11.027

First let me say that this is essentially a restatement of the answer linked in your question.

I suggest that you use symmetric key encryption to guard files and asymmetric (public) key encryption to guard the symmetric encryption keys. You could use an existing PGP/GPG key pair, or, to make this transparent, generate a key pair explicitly for your application. If you do the latter, please use tried and tested code to generate the keys.

You stated a requirement that no keys be stored on the server because "I don't want to give the service's maintainer the ability to access user's data." As long as the keys stored on the server are encrypted, the service's maintainer has no access to the data, so I've interpreted "no keys on the server" as "no keys which would allow access to the data."

The process works like this: When a file is uploaded, it must be encrypted (by an app on the sender's client) using a symmetric algorithm like AES and a randomly generated key. The key must be generated by a cryptographically secure random number generator (CSPRNG).
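
As a minimal sketch of the key-generation step, assuming a Python client: the standard-library `secrets` module draws from the operating system's CSPRNG. The 32-byte length (AES-256) and the function name are assumptions for illustration; the protocol itself only calls for "AES and a randomly generated key."

```python
import secrets

# Assumption: AES-256, hence a 32-byte key. The answer only says "AES".
FILE_KEY_BYTES = 32


def generate_file_key() -> bytes:
    """Return a fresh per-file symmetric key from the OS CSPRNG."""
    return secrets.token_bytes(FILE_KEY_BYTES)
```

A new key would be generated for every uploaded file, so that sharing or revoking one file says nothing about any other.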

The encrypted file is uploaded to the server. The symmetric key, encrypted with the sender's public key, is uploaded as file metadata, associated with the sender's ID. If a file is uploaded by a user with the ID alice@example.com, the metadata would look like:

alice@example.com | encrypted (with K_A) symmetric key

At this point, only Alice can read the file because only Alice has the private key that will decrypt the file's encrypted symmetric key. (The application must discard the unencrypted copy of the symmetric key as soon as the metadata have been updated.)

If Alice wants to allow bill@elsewhere.com to retrieve the file, she adds to the metadata a line like this:

bill@elsewhere.com | encrypted (with K_B) symmetric key

Alice does that by retrieving her own metadata, decrypting the symmetric key with her private key, and re-encrypting with Bill's public key.
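
The metadata handling above might be sketched like this. It is only an outline under stated assumptions: `wrap` and `unwrap` are hypothetical callables standing in for public-key encryption and private-key decryption from a vetted library, and the function names are invented for illustration. The `revoke` helper follows the comment thread's point that revocation amounts to deleting a user ID/key pair.

```python
from typing import Callable, Dict

# Hypothetical primitives supplied by the client application:
#   wrap(public_key, file_key)   -> file key encrypted for that key's owner
#   unwrap(private_key, wrapped) -> the plaintext file key
Wrap = Callable[[bytes, bytes], bytes]
Unwrap = Callable[[bytes, bytes], bytes]

# One metadata table per file: user ID -> symmetric key wrapped for that user.
Metadata = Dict[str, bytes]


def new_metadata(owner_id: str, owner_public_key: bytes,
                 file_key: bytes, wrap: Wrap) -> Metadata:
    """Create the metadata with a single row for the uploader."""
    return {owner_id: wrap(owner_public_key, file_key)}


def share(metadata: Metadata, sharer_id: str, sharer_private_key: bytes,
          recipient_id: str, recipient_public_key: bytes,
          wrap: Wrap, unwrap: Unwrap) -> None:
    """Runs on the sharer's client: unwrap own row, re-wrap for the recipient."""
    file_key = unwrap(sharer_private_key, metadata[sharer_id])
    metadata[recipient_id] = wrap(recipient_public_key, file_key)


def revoke(metadata: Metadata, user_id: str) -> None:
    """Drop a user's row so the service no longer offers them the key."""
    metadata.pop(user_id, None)
```

The server only ever sees the wrapped values, so it can store and delete rows without being able to read the file key.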

Assuming the metadata file itself is unprotected, Bill can pass access to Charlie by performing the same kind of operation. That is not only OK, it is good. Bill can obviously download the file and give a copy to whomever he pleases. If access is passed through the file's metadata, then there's a record of what happened. (Bill can still pass the file to others, but is no longer forced to take that route.)

Because of your requirement that only one password be used, you'll need to use the password that unlocks the private key as a login password. One way to do that might be to store on the server an authenticator for each user, encrypted with the user's public key. Only a user with the corresponding private key can decrypt it and return the plain-text authenticator to the server.
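
One possible shape for that login check, as a sketch only: it assumes the server keeps a SHA-256 digest of the authenticator next to the encrypted copy so it can verify the reply without storing the plain text. The digest, the record layout, and the `wrap` callable (a stand-in for public-key encryption from a vetted library) are assumptions, not part of the scheme described above.

```python
import hashlib
import hmac
import secrets
from typing import Callable, Dict

# Hypothetical primitive: wrap(public_key, data) -> data encrypted for the key's owner.
Wrap = Callable[[bytes, bytes], bytes]


def enroll(user_public_key: bytes, wrap: Wrap) -> Dict[str, bytes]:
    """Create the per-user login record stored on the server."""
    authenticator = secrets.token_bytes(32)  # random, high-entropy secret
    return {
        # Sent to the client at login; only the private key can open it.
        "challenge": wrap(user_public_key, authenticator),
        # Assumption: keep a digest so the plain text need not be stored.
        "digest": hashlib.sha256(authenticator).digest(),
    }


def verify(record: Dict[str, bytes], response: bytes) -> bool:
    """Check the plain-text authenticator returned by the client."""
    candidate = hashlib.sha256(response).digest()
    return hmac.compare_digest(candidate, record["digest"])
```

The client would decrypt `challenge` with the private key unlocked by the user's single password and send the result back for `verify`.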

Everything up to the login involves end-to-end encryption, and so everything passing "over the wire" is encrypted. For the login authentication, the client must return the plain text authenticator, and that means a TLS connection is needed.

That's a good, standard answer, and the first thing that came to my mind. I'd be careful of storing the meta-data though. Traffic analysis is often times just as useful to people as the actual data being analysed. — Steve Sether, Dec 26 '14 at 21:46
Y'know what? If people connect to the service, even with TLS, the traffic is visible, metadata or not. One *might* be able to foil traffic analysis with something like Tor, but not if the adversary is the government. — Bob Brown, Dec 26 '14 at 23:38
I'm actually referring to the meta-data about who shares what with whom. An adversary doing traffic analysis on who connects to your server is going to be hard to eliminate, but I wouldn't count out stopping the NSA with Tor. As we know from the Snowden documents, they aren't as all-powerful as many believe. — Steve Sether, Dec 27 '14 at 00:08
The metadata are not accessible through traffic analysis, and note the requirement for TLS because single-password. Sharing patterns cannot be determined without compromising the server. — Bob Brown, Dec 27 '14 at 03:31
Hi Bob, I'm talking about the server maintainer having access to the metadata. Perhaps I'm using the word "traffic analysis" more broadly than you're accustomed to, but I think who's sharing with whom is largely the same concept as who's talking with whom. The OP has stated he doesn't want the server maintainer to have access to the data. Why should they have access to the meta-data? — Steve Sether, Dec 27 '14 at 07:21
@SteveSether: The server operator can see the metadata, but doesn't need to. The operator can see who logs on, who uploads files, and who downloads files. That gives much more fine-grained information about who's sharing with whom than the metadata. *Ergo,* worrying about the metadata isn't helpful. — Bob Brown, Dec 27 '14 at 22:24
There's a difference between having the ability to record information and actually recording it. Recorded information is available for discovery in lawsuits, national security letters, subpoenas, and break-ins. So recording metadata is really completely different from having the potential to log uploads and downloads. If you record the metadata, it can be analyzed by any of the above. If you don't, it can't. One example of this is Apple no longer having the ability to decrypt iPhone devices when government entities request it. — Steve Sether, Dec 29 '14 at 00:49

score 0 · Answer 2 · answered Dec 25 '14 at 17:34

If you want to have a single, non-duplicated copy at-rest, and you do not want the storage-provider to be able to decipher the at-rest copy, and the sender can add and remove recipients after initial upload time, then you limit the options. If one or more of the constraints are negotiable, it becomes easier.

1. Sender enciphers with a master (symmetric) key, then enciphers a copy of the master key with each recipient's public key and uploads them. Revocation requires either the sender or the storage provider to persist the symmetric master key.
1a. This means the recipients can learn the master key, since that is what the recipient's private key would unlock.
2. Sender enciphers a separate copy of the message for each recipient with a unique per-recipient public or shared key. This means persisting many copies.
3. Sender enciphers with the storage provider's public key; the storage provider deciphers it, persists it under its own protection, and makes a copy available to each recipient, enciphered with that recipient's public key.
4. A secret-sharing scheme like Shamir's Secret Sharing with a minimum quorum size of 1 could be used (I have never heard of this used in the real world, only in crypto discussions).
