32

As described in Perfecting the Art of Sensible Nonsense, a major breakthrough in cryptography research was published in the summer of 2013:

In principle it seems to allow for what most computer scientists had assumed was impossible: obfuscating computer code so that the secrets within it are preserved even from attackers who can run the code fully under their own control. This kind of obfuscation is enormously powerful, and can be used, for example, to create new forms of public key encryption. It is related to breakthroughs in functional encryption and deniable encryption.

It is described this way in the article:

The ... obfuscator works by transforming a computer program into what Sahai calls a “multilinear jigsaw puzzle.” Each piece of the program gets obfuscated by mixing in random elements that are carefully chosen so that if you run the garbled program in the intended way, the randomness cancels out and the pieces fit together to compute the correct output. But if you try to do anything else with the program, the randomness makes each individual puzzle piece look meaningless.

But it is also described as "... far from ready for commercial applications. The technique turns short, simple programs into giant, unwieldy albatrosses."

A reason for the unwieldiness is the need to use Fully Homomorphic Encryption (FHE), which itself remains unwieldy as described in related questions here.

Note that this has also been discussed on other Stack Exchange sites.

Crypto is probably a more suitable forum for general discussion, but many discussions of possible practical aspects (or the absence thereof) may be best done here.

Given the current state of the art, it seems that practical applications don't exist. But that statement is hard to accept without an example. So here is a question that should be relevant here: can someone provide a concrete example of the sort of useful task that indistinguishability obfuscation or functional encryption might some day be used for, and describe just how unwieldy it is at this point (size, performance, etc)?

nealmcb

2 Answers

31

The article actually describes two constructions, the second one using the first one as a building tool. The first construction provides indistinguishability obfuscation while the second one is functional encryption.

Indistinguishability obfuscation is a rather esoteric property, and not what non-academics think about when they hear "obfuscation"; the terminology is a misnomer. The property means this: if some "processing" can be encoded as a circuit (roughly speaking, a piece of code which can be unrolled, with no infinite loop), and there are several possible distinct circuits which yield the same results, then indistinguishability obfuscation allows publication of an "obfuscated circuit" such that anybody can run it and obtain the result, but outsiders cannot tell which of the possible circuits was used as the basis internally.
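As a concrete (non-cryptographic) illustration of the setup, here are two structurally different circuits computing the same function; the names are hypothetical. IO promises that, given an obfuscation of one of them, an outsider cannot tell which one it came from:

```python
# Two structurally different "circuits" that compute the same function
# (majority of three bits). Indistinguishability obfuscation promises that
# the obfuscation of one is indistinguishable from the obfuscation of the
# other -- it does NOT promise that the code's inner workings are hidden
# in the everyday sense of "obfuscation".

def circuit_1(a: int, b: int, c: int) -> int:
    # Majority written with AND/OR gates.
    return (a & b) | (b & c) | (a & c)

def circuit_2(a: int, b: int, c: int) -> int:
    # Same majority function, written with AND/XOR gates instead.
    return (a & b) ^ (b & c) ^ (a & c)

# The two circuits are extensionally equal on every input...
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert circuit_1(a, b, c) == circuit_2(a, b, c)
# ...so they qualify as "several possible distinct circuits which yield
# the same results" in the definition above.
```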

What good IO does, by itself, is unclear. The authors, in section 1.7 of their article, nevertheless present an example, which is rather far-fetched:

Software developers will often want to release a demo or restricted use version of their software that limits the features that are available in a full version. In some cases a commercial software developer will do this to demonstrate their product; in other cases the developer will want to make multiple tiers of a product with different price points. In other domains, the software might be given to a partner that is only partially trusted and the developer only wants to release the features needed for the task.

Ideally, a developer could create a downgraded version of software simply by starting with the full version and then turning off certain features at the interface level -- requiring minimal additional effort. However, if this is all that is done, it could be easy for an attacker to bypass these controls and gain access to the full version or the code behind it. The other alternative is for a software development team to carefully excise all unused functionality from the core of the software. Removing functionality can become a very time consuming task that could itself lead to the introduction of software bugs. In addition, in many applications it might be unclear what can and cannot remain for a restricted use version.

One immediate solution is for a developer to restrict the use at the interface level and then release an obfuscated version of the program. For this application indistinguishability obfuscation suffices, since by definition a version restricted in the interface is indistinguishable from an obfuscated program with equivalent behavior that has its smarts removed at the start.

The use case is dubious, at best. However, IO has a more useful (theoretical) functionality, which the authors develop afterwards: it enables them to build functional encryption.

Functional encryption is about providing a computable circuit (obfuscated with IO) which receives as input encrypted versions of some value x, and returns F(x) for some function F, without revealing anything else about x. The authors show how they can do that for any function F which can be encoded as a circuit, and the resulting obfuscated circuit is "polynomially-sized" with regards to the original unobfuscated circuit implementing F.

Now this "polynomially-sized" expression tells us that we are in the abstract world of mathematics, and not in the practical world. It relates to asymptotic behaviour when circuit size grows towards infinity. It does not tell much about how much the construction would cost in any practical, finite situation; only that if God plays with Universe-sized computers, then He will find the construction to be "tolerably efficient" provided that He first creates a large enough Universe to accommodate the bulk of the involved computers -- with no a priori measure of how large that Universe has to be for the theoretical result to apply. Empirically, we find that most mundane algorithms that we manipulate and that offer polynomial complexity are "somewhat fast", but that's mostly because these algorithms are very simple things; there is no proof or even suggestion that the construction described in the article would be as "mundane".

Functional encryption, if it can be made to work within reasonable CPU and RAM budgets, can be useful for security in a number of situations. For instance, imagine an FPS game opposing players on networked computers. For efficient 3D rendering, the local machine of each player must know the terrain map and the current location of all objects, including other players, so that it may decide whether, from the point of view of the local player, any other player is visible (and must be drawn on the screen) or hidden (e.g. sneakily poised behind a wall, ready to lob a hand grenade) -- but cheaters would find it very convenient to know these locations in real-time. Each player is assumed to have complete control of his machine (he has physical access). With functional encryption, the player's machine may compute the rendering (that's the F function) based on the locations as sent by the server, without obtaining the locations themselves.

Unfortunately, for practical applications, we are far from having a workable solution. What must be understood is that in all these constructions, each circuit gate must map to an instance of Gentry's fully homomorphic encryption scheme, and at every clock cycle of the obfuscated circuit, all gates must be processed, regardless of whether they are "active" or not (this is a big part of why the obfuscation theoretically works: it does not reveal the active gates, because it always makes all of them active). This article gives performance results: on a rather big PC, we are up for minutes of computation. That's for each gate in the obfuscated circuit, and for each clock cycle.

There are millions or even probably billions of gates in the envisioned circuit, given the setup of functional encryption: the "obfuscated circuit" must include asymmetric decryption and validation of zero-knowledge proofs. So now we are talking about using all the computers currently running on Earth, all collaborating on the computation, and they might make non-negligible progress towards running one instance of functional encryption within a few centuries.

These are just estimates that I make from what I saw in the articles, but they should give you the order of magnitude. Namely, that the albatross metaphor falls very short of the reality; you'd rather have to imagine a full army of spaceships bent on galactic domination. If it flies at all, despite bureaucratic heaviness, it will have tremendous inertia and be quite expensive.

Erwin Mayer
Tom Leek
    You had me at "minutes of computation", then I read "for each gate and at every clock cycle" :( – Jus12 Oct 09 '16 at 07:55
5

Just adding a simpler explanation of Fully Homomorphic Encryption before proceeding to Functional Encryption and Indistinguishability Obfuscation, as it is easier to explain in that order. Taken mainly from http://crypto.stanford.edu/craig/easy-fhe.pdf, a paper by Craig Gentry (one of the pioneers in this field), in which he gives a very easy to follow Alice example with some math.

Fully Homomorphic Encryption

Excluding the math, Fully Homomorphic Encryption is something like this:

Let's say there is some Unencrypted Data - UD. There is an encryption of the Unencrypted Data called Encrypted Data - ED. We perform some analysis/manipulation (let's call the analysis a function F) on UD and get some unencrypted output - UO.

The question posed/answered by Fully Homomorphic Encryption is: can one run the analysis/manipulation function F on the encrypted data ED and get a corresponding encrypted output - EO - SUCH THAT if one decrypts EO, one gets the unencrypted output defined above - UO?

A simple diagram (process 1 should give the same output as process 2):

  1. Unencrypted Data UD -> passed through some analysis function F -> gives Unencrypted Output UO.

  2. a) Unencrypted Data UD -> encryption algorithm -> Encrypted Data ED [This step runs in a trusted environment]

    b) Transfer ED anywhere. -> Perform analysis function F on ED -> Get EO (Encrypted Output). [Can be done anywhere, including a non-trusted environment, say the cloud]

    c) Get EO. -> decryption algorithm -> Get UO (Unencrypted Output) [In a trusted environment]

A working scheme at the bit level just needs to support ADD, SUBTRACT and MULTIPLY as explained in the above paper. And if it works at the bit level, it can be generally extended to all computation.
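As a toy illustration of that ADD/MULTIPLY homomorphic property, here is a sketch in the spirit of the integer-based schemes descended from Gentry's work. This is not the actual scheme from the paper, it is utterly insecure with these tiny parameters, and it only survives a few operations before the noise grows too large:

```python
import random

# Toy "somewhat homomorphic" encryption over the integers (DGHV-style).
# INSECURE, for illustration only: real FHE needs far larger parameters
# plus Gentry's bootstrapping trick to handle deep circuits.

p = 1000003                            # secret key: an odd integer

def encrypt(bit: int) -> int:
    q = random.randint(1, 1000)        # random multiple of the key
    r = random.randint(0, 10)          # small random noise
    return q * p + 2 * r + bit

def decrypt(c: int) -> int:
    return (c % p) % 2                 # correct while the noise stays below p

# Homomorphic property: arithmetic on ciphertexts mirrors logic on bits.
for a in (0, 1):
    for b in (0, 1):
        ca, cb = encrypt(a), encrypt(b)
        assert decrypt(ca + cb) == a ^ b   # ADD      -> XOR of the bits
        assert decrypt(ca * cb) == a & b   # MULTIPLY -> AND of the bits
```

Each homomorphic operation grows the hidden noise term; once the noise exceeds p, decryption fails, which is exactly the problem bootstrapping was invented to solve.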

As an extension, given general recursion, one might even encrypt the analysis function F, such that even what analysis is being performed is not known to untrusted environments.

Answer to the comment below, since I don't have enough reputation to comment!

Your question did strike me while writing the above answer. Possibly the above example is a bit too simple and the concept a bit abstract.

Generally speaking there is a one-to-one correspondence between plain text and encrypted text to ensure decryption.

So let's say one wanted to search the data for the text "blue". Continuing with the example above, let's say the text "blue" is encrypted to "xChd". So the function searching on encrypted data will search for "xChd" instead of "blue". Algorithmically speaking the functions are the same, i.e. "search for some text"; individual instances of the function will search for a different phrase, depending on the context. But again, this is too much of a high-level view.
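A minimal sketch of this idea, assuming a hypothetical deterministic token scheme (an HMAC truncation, chosen purely for illustration). Equality search works precisely because the same word always maps to the same token -- the same property that enables the dictionary attacks mentioned further down:

```python
import hashlib
import hmac

# Hypothetical sketch: a *deterministic* token scheme makes equality search
# possible on encrypted records, mirroring the "blue" -> "xChd" idea above.
# Determinism leaks equality patterns, hence dictionary attacks.

key = b"secret search key"

def token(word: str) -> str:
    # Same word + same key -> same token, always.
    return hmac.new(key, word.encode(), hashlib.sha256).hexdigest()[:8]

# The server stores only tokens, never the plaintext words...
stored = [token(w) for w in ["red", "blue", "green"]]

# ...and the client searches by sending the token for "blue".
assert token("blue") in stored
assert token("purple") not in stored
```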

At the bit level one only has ADD and MULTIPLY, as described in the paper: addition modulo 2 is XOR (and subtraction modulo 2 is the same operation, so SUBTRACT is XOR as well), while multiplication modulo 2 is AND. XOR and AND together can express the three fundamental bit-level operations - AND, OR and NOT (for example, NOT x = x XOR 1).

Typical symmetric encryption in today's world (a stream cipher) involves generating a unique but random-looking stream of bits - the keystream - which is XORed with the plain text to give the encrypted text. In decryption, the encrypted text is XORed with the same keystream to give back the plain text.
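That keystream idea can be sketched in a few lines (illustration only; a real stream cipher derives the keystream from a key and nonce rather than storing random bytes):

```python
import secrets

# ciphertext = plaintext XOR keystream; XORing again with the same
# keystream recovers the plaintext, since x ^ k ^ k == x.
plaintext = b"attack at dawn"
keystream = secrets.token_bytes(len(plaintext))   # one random byte per plaintext byte

ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))
recovered  = bytes(c ^ k for c, k in zip(ciphertext, keystream))

assert recovered == plaintext
```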

So what the author (Craig) and/or others are trying to achieve at the bit level is:

  1. [AND,OR,NOT or any combination of these fundamental operations] on unencrypted input(s) give some unencrypted output(s).

  2. Is it possible to develop an encryption scheme SUCH THAT [AND,OR,NOT or any combination of these fundamental operations] on encrypted input(s) give some encrypted output(s) that can be decrypted to give output identical to the unencrypted output(s).
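Since addition mod 2 is XOR and multiplication mod 2 is AND, those two operations suffice to build NOT and OR, and hence any combination of the fundamental operations; a quick sanity check on plain bits:

```python
# XOR and AND are enough to express all three fundamental operations.

def NOT(a: int) -> int:
    return a ^ 1                   # NOT a  =  a XOR 1

def OR(a: int, b: int) -> int:
    return (a ^ b) ^ (a & b)       # a OR b =  a XOR b XOR (a AND b)

for a in (0, 1):
    for b in (0, 1):
        assert NOT(a) == (1 - a)
        assert OR(a, b) == (a | b)
```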

In the literature, Fully Homomorphic Encryption is described as follows, with a so-called Evaluate function.

Assume the following:

a. f is a function reduced to a boolean circuit C (i.e. consists of only gates and various inputs and outputs)

b. Key pair - pk is the public key, sk is the secret key

c. UI and UO are unencrypted inputs and outputs respectively

d. EI and EO are encrypted inputs and outputs respectively.

Then FHE (Fully Homomorphic Encryption) is described as

Evaluate(pk,C,EI) gives us EO such that Decrypt(sk, EO) is identical to running circuit C on UI.

So to answer your question: in FHE the function f or the circuit C is the same, because the logic or algorithm has to be the same. The Evaluate function requires the public key so that it can translate any unencrypted inputs/intermediates in the circuit C into encrypted form. For example, if one wanted to search for "blue", the function f may have "blue" embedded somewhere in the code, but the encrypted data would need to be searched for "xChd". Again, this is a very high-level example and doesn't fully correspond to the bit level, because at the bit level the Evaluate function is required to be independent of the circuit C. Further, at the bit level functions have a single output - the basic gates AND, OR and NOT are all single-output - and the multi-input multi-output functions that we talk about in algorithms and high-level language programs are all built from basic gates. Here the high-level analogy fails, and the public key is required by the Evaluate function. Fitting in with the asymmetric public/private key model, FHE can be summarized as: I send you my public key, a function, and data encrypted under that public key; you can compute the function on my encrypted data using my public key and send me back the encrypted output, which only I can decrypt with my secret key.

Functional Encryption

Moving on to Functional Encryption. FE (Functional Encryption) is not defined very simply or clearly; in fact there is an entire research paper dedicated to formalizing a definition of FE - http://eprint.iacr.org/2010/543.pdf

FE sounds like a bit of magic - a bit like when someone is introduced to public key/private key encryption.

In FE, again, there is a Master Secret Key - msk, also referred to as the master key. There is a Function Key - fk, often referred to as skf, secret function key, secret token or secret key (not to be confused with the Master Secret Key). The msk/fk pair is like a private key/public key pair.

Let's call the unencrypted input x. Let there be any function F (or circuit of gates). Let there be an encryption of x using the msk, known as E(x) [Encrypted x]. Then, if we have the function key fk, we can compute F(x) using only E(x). Note that the output F(x) is not encrypted.

To state it mathematically: F(x) = Decrypt(fk, E(x)), where E(x) = Encrypt(x, msk), for a key pair (msk, fk).

While it may look very counter-intuitive, the catch is that the size or length of E(x) depends on the size of the function F, and typically does not scale linearly. Intuitively, one can think of it as the algorithm or logic of the function F being mapped onto the decryption algorithm. Also note that the output of Decrypt(fk, E(x)) is not encrypted. Hence we have achieved encryption of the algorithm! I can give any third party some encrypted input and a function key, and the third party can get the result of running the function on the unencrypted data. For example, a spam filter can decide whether a mail is spam or not, without decrypting the email and also without knowing what my algorithm for deciding spam is.
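To make the interface concrete, here is a deliberately insecure sketch of the API shape only (all names are hypothetical). The function key below simply bundles the master key with F, which a genuine FE scheme must never do -- the point is just to show where msk, fk, Encrypt and Decrypt sit relative to each other:

```python
import secrets

# API-shape sketch of functional encryption, with NO security:
#   F(x) = fe_decrypt(fk, fe_encrypt(msk, x))

def fe_keygen() -> int:
    return secrets.randbits(32)          # msk: master secret key

def fe_encrypt(msk: int, x: int) -> int:
    return x ^ msk                       # toy "ciphertext" (one-time pad)

def issue_function_key(msk: int, F):
    # INSECURE: a real fk authorizes computing F without revealing msk.
    return (msk, F)

def fe_decrypt(fk, ct: int):
    msk, F = fk
    return F(ct ^ msk)                   # output is F(x), unencrypted

# Example: a scoring function a mail server is allowed to evaluate.
msk = fe_keygen()
is_even = lambda x: x % 2 == 0           # stand-in for a real "is spam?" test
fk = issue_function_key(msk, is_even)

ct = fe_encrypt(msk, 42)
assert fe_decrypt(fk, ct) == True        # the holder of fk learns F(42), not 42
```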

In the FHE example above a dictionary attack can be launched easily, because there is a one-to-one correspondence between the unencrypted data and the encrypted data (there are encryption schemes which do not always generate the same output for the same input - in fact almost all widely used encryption schemes, SSL, SSH etc., do not always generate the same output for the same input, but we will not go into that here); commonly, words can be arranged in order of occurrence and easily guessed. But with FE nothing can be guessed! Both the data and the algorithm are secure!

The main advantage of FE is that one can generate different function keys for different functions, and we can give different persons access to different functions by distributing function keys appropriately. Everyone need not know everything [of course, if they collude, then it is a problem]. A subclass of FE is ABE (Attribute-Based Encryption). ABE was formalized first and later generalized to FE. In ABE different people get different levels of access to information. The encrypted information is publicly available; however, the information that each person can decrypt depends on the key that they hold. So in a full product launch one can issue keys such that engineering can see only the drawings, sales can see only artistic renditions of the final product and a description of the features it offers, and management can see everything. See a contrived example here - http://www.cs.uiuc.edu/class/fa07/cs498mmp/presentations/john.pdf The biggest advantage of ABE is that it makes key distribution and management easier compared to the current public key infrastructure.

A very nice slightly mathematical introduction to both FHE and FE is here http://outsourcedbits.org/2013/10/06/how-to-search-on-encrypted-data-part-1/ and here http://outsourcedbits.org/2012/06/26/applying-fully-homomorphic-encryption-part-1/

The relation between FE and FHE is explored here https://www.usukita.org/sites/default/files/func_enc.pdf

However, the problem with FE was that it was implementable only for a few functions; it could not be implemented for any general or arbitrary function. That is where Indistinguishability Obfuscation (IO) comes in. (The author above is incorrect about the definition of IO.)

Indistinguishability Obfuscation

According to the original paper (http://www.wisdom.weizmann.ac.il/~oded/PS/obf4.pdf): let there be two implementations, F1 and F2, of the same functionality F (as boolean circuits). IO is achieved if, given the obfuscations of F1 and F2, it is impossible (there is only a negligible probability of success) to distinguish which obfuscation is the result of obfuscating F1 and which is the result of obfuscating F2.

As mentioned above, the only use of IO so far is as a stepping stone in achieving full general FE for any function. Full FE was first achieved here - https://eprint.iacr.org/2013/451.pdf There is no such thing as "use IO to obfuscate a program".

Concluding thoughts

So far all these developments do not seem to amount to much in terms of practical use, and to some extent even conceptually, especially in terms of protecting code and algorithms (as opposed to data, which is currently protected quite well; breaches of data happen mainly via the code used to compute on it). Leaving aside the huge computing resources required to implement these, there are still various security leaks in the above protocols that would make it difficult to thwart a powerful adversary. See http://outsourcedbits.org/2013/10/30/how-to-search-on-encrypted-data-part-3/ and http://windowsontheory.org/2014/01/14/progress-and-challenges-in-code-obfuscation-part-ii/ The cutting edge of security in protecting code is probably something described here - http://outsourcedbits.org/2013/12/20/how-to-search-on-encrypted-data-part-4-oblivious-rams/

Mavin
    Is F really the same when acting on both encrypted and unencrypted data? Or is it something like F-prime that is applied to the encrypted data so that it all works? – nealmcb Oct 07 '14 at 14:42