2

I'm interested in building a service that stores client-side-encrypted text documents.

Would it be possible to implement a search in which the user enters a search string (also encrypted on the client side) and the server performs the search over the encrypted documents without learning either their contents or the contents of the search string?

I don't know very much about encryption and I just thought I'd see if I could somehow get a very naive example working. However, using the 'aes-256-ctr' algorithm, I get the following results for my little encryption program:

$ node encrypt.js 123 'hello my friend'
90cbf635540412a202eb46dada1fcf

$ node encrypt.js 123 'hello'
90cbf63554
$ node encrypt.js 123 ' my '
d8c3e379
$ node encrypt.js 123 'friend'
9edcf33c5540

What kind of encryption algorithms should I be looking at to perform a text search on an encrypted document with an encrypted search string without being able to decipher any of it?

Thanks a lot for your help!

Macks
    Something is wrong with your examples. Encryption of 'hello' should not match the beginning of the bigger encryption. This allows for an easy attack. Perhaps you need to specify a random initialization vector? – Neil Smithline Nov 29 '15 at 16:43
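To illustrate the comment above: the prefix match is what one would expect from CTR mode when the same key and IV are reused, because every message is XORed with the same keystream. The sketch below is only an assumption about how encrypt.js behaves (the original script is not shown). The scrypt key derivation, the salt, and the `encryptCtr` helper are hypothetical names chosen for the demo.

```javascript
const crypto = require('node:crypto');

// Assumption: derive a 32-byte AES key from the password '123'.
// The fixed salt is for demonstration only.
const key = crypto.scryptSync('123', 'demo-salt', 32);

// Hypothetical helper: AES-256-CTR encryption, hex-encoded output.
function encryptCtr(plaintext, iv) {
  const cipher = crypto.createCipheriv('aes-256-ctr', key, iv);
  return Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]).toString('hex');
}

// Reusing one IV means reusing one keystream, so a message that is a
// prefix of another message also has a ciphertext that is a prefix.
const fixedIv = Buffer.alloc(16, 0);
const whole = encryptCtr('hello my friend', fixedIv);
const prefix = encryptCtr('hello', fixedIv);
console.log(whole.startsWith(prefix)); // true

// With a fresh random IV per message the keystreams differ, so the
// prefix relation disappears (and so does naive substring matching).
const wholeRandom = encryptCtr('hello my friend', crypto.randomBytes(16));
const prefixRandom = encryptCtr('hello', crypto.randomBytes(16));
console.log(wholeRandom.startsWith(prefixRandom)); // almost certainly false
```

Note that only a search term at the very start of the document lines up with the keystream, which is why the ciphertexts of ' my ' and 'friend' do not appear inside the longer ciphertext.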

5 Answers

2

The only usable solution I am aware of that has not been found to have serious code vulnerabilities is ZeroDB (www.zerodb.io), which allows running queries over encrypted databases.

The ZeroDB developers provide instructions for testing it in Python in the Documentation section of their website.

1

Use Fully Homomorphic Encryption as described here: https://security.stackexchange.com/a/20218/13111

It allows arbitrary computations on encrypted data. The inputs stay encrypted throughout and the result comes out encrypted as well, so the computations can be performed by untrusted systems without ever exposing the real data.
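As a rough illustration of computing on ciphertexts, here is a toy sketch of the Paillier scheme. Please note the assumptions: Paillier is only additively homomorphic, not fully homomorphic, so it is a swapped-in, simpler technique; the primes are tiny and completely insecure; and all function names are hypothetical. It only shows that a server can combine two ciphertexts without being able to read them.

```javascript
const crypto = require('node:crypto');

// Toy parameters (assumption: tiny primes, insecure, demo only).
const p = 1009n;
const q = 1013n;
const n = p * q;
const n2 = n * n;
const g = n + 1n;

const gcd = (a, b) => (b === 0n ? a : gcd(b, a % b));
const lambda = ((p - 1n) * (q - 1n)) / gcd(p - 1n, q - 1n); // lcm(p-1, q-1)

// Square-and-multiply modular exponentiation.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Modular inverse via the extended Euclidean algorithm.
function modInv(a, m) {
  let [oldR, r] = [a % m, m];
  let [oldS, s] = [1n, 0n];
  while (r !== 0n) {
    const quotient = oldR / r;
    [oldR, r] = [r, oldR - quotient * r];
    [oldS, s] = [s, oldS - quotient * s];
  }
  return ((oldS % m) + m) % m;
}

const L = (x) => (x - 1n) / n;
const mu = modInv(L(modPow(g, lambda, n2)), n);

// Client side: encrypt with fresh randomness r coprime to n.
function encrypt(m) {
  let r;
  do {
    r = BigInt(crypto.randomInt(2, Number(n)));
  } while (gcd(r, n) !== 1n);
  return (modPow(g, m, n2) * modPow(r, n, n2)) % n2;
}

// Client side: decrypt with the private values lambda and mu.
function decrypt(c) {
  return (L(modPow(c, lambda, n2)) * mu) % n;
}

// Server side: multiplying ciphertexts adds the hidden plaintexts.
const addEncrypted = (c1, c2) => (c1 * c2) % n2;

const sum = decrypt(addEncrypted(encrypt(20n), encrypt(22n)));
console.log(sum); // 42n
```

A search over encrypted text needs far more than addition, which is where fully homomorphic schemes (and their cost) come in.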

Note that homomorphic encryption is still a fairly nascent technology, so you are at risk of being sold snake oil. Be careful when choosing libraries, services, or packaged solutions.

Alain O'Dea
1

Take a look at the Mylar (https://css.csail.mit.edu/mylar/) database. It uses somewhat homomorphic encryption that allows you to run encrypted searches on encrypted data.

The tradeoff is in speed, so it's most useful for small data units like messages.

Geir Emblemsvag
0

Do not overlook the fact that encrypting each word separately would be useless. A language has a limited number of words, so the hash of every word can be precomputed, and the stored values can then be reversed very easily by lookup. If the search is performed on your own server, program it so that it does not keep logs of any information.
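A minimal sketch of the dictionary attack described above, assuming each word is stored as an unkeyed SHA-256 hash. The word list and helper names are hypothetical. The second half contrasts this with an HMAC under a key that only the client holds: the server can no longer build the lookup table, although equal words still yield equal tokens, so word frequencies remain visible.

```javascript
const crypto = require('node:crypto');

const sha256 = (word) =>
  crypto.createHash('sha256').update(word, 'utf8').digest('hex');

// What the server would store if every word were hashed separately.
const storedTokens = 'hello my friend'.split(' ').map(sha256);

// Attacker: hash a dictionary (tiny hypothetical word list here)
// and build a reverse lookup from hash to word.
const dictionary = ['friend', 'goodbye', 'hello', 'my', 'world'];
const reverse = new Map(dictionary.map((w) => [sha256(w), w]));

const recovered = storedTokens.map((t) => reverse.get(t) ?? '?');
console.log(recovered.join(' ')); // hello my friend

// Keyed variant: tokens depend on a secret the server never sees,
// so the precomputed table above no longer matches anything.
const clientKey = crypto.randomBytes(32);
const hmacToken = (word) =>
  crypto.createHmac('sha256', clientKey).update(word, 'utf8').digest('hex');

const keyedTokens = 'hello my friend'.split(' ').map(hmacToken);
const keyedRecovered = keyedTokens.map((t) => reverse.get(t) ?? '?');
console.log(keyedRecovered.join(' ')); // ? ? ?

// Still deterministic: the same word always maps to the same token.
console.log(hmacToken('hello') === keyedTokens[0]); // true
```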

Sanidhay
0

No.

When you encrypt a text, the ciphertext has its own specific hash, and the result of the encryption depends on the whole text.

If you encrypt or hash just a single word, the result is totally different, and you cannot search for it in the encrypted data.

Without decryption it is not possible. A search makes no sense when the data is encrypted.

Take a look at the encryption results for ' my ' and 'friend' in your example. You cannot match them against the encrypted document.

Encryption is meant to protect data. If you can search in it, it would not be secure.

Daniel Ruf