8

I'm trying to learn a little bit about armoring applications against reverse engineering. In one article I read that initializing variables to NULL, 0 or -1 is as secure (against RE) as using common passwords in applications. In short, it said we should use an RNG to produce random data and "store" it in newly declared variables as their initial value. Another approach would be to use the XOR operator. Unfortunately, I forgot to bookmark the article and did not have time to read it in full.

The question is: is it true that using random data and/or the XOR operator when initializing variables adds to the application's defense against reverse engineering?

EDIT: It seems my question wasn't clear enough. What I'm asking is which of these is better (or whether there is no difference at all):

  • int variable = 0;

or

  • int variable = random(seed);

Let's assume, for the sake of the question, that random(seed) is truly random.

About the XOR operator: as I said, I only partially read the article. I saw two paragraphs, one named something like "Using random number generator" and the other "Using XOR to initialize variables". How XOR is used is unclear to me, and I only mention it here to draw attention to it in case someone knows more about it.

StupidOne

3 Answers

10

In many programming languages, initialization of local variables is forced, or the engine will flatly refuse to read uninitialized data. Even in languages where you can read uninitialized variables and thus get a copy of whatever remained in RAM at that location, you cannot count on it being "random"; it will tend to contain the same value every time, depending on what the application did previously. Indeed, in C, local variables live on the stack and/or are cached in CPU registers, and these are resources which are constantly reused throughout the application.

If the article you read recommends not initializing local variables, then reading them, and expects this to yield "randomness", then this article shall be burnt at the stake, for many reasons:

  • Reading uninitialized data is a clear breach of specification; in the C programming language standard, this is called "undefined behaviour" and may yield problematic results, including an application crash, or, even worse, silent memory corruption.

  • This kind of randomness will not be any good, by any notion of "randomness" worth speaking of.

  • Reproducible behaviour is good. By trying to obtain non-reproducible values from local variables, the article just promotes making the code impossible to debug, which can only be described as a bloody stupid thing to do. In particular for security. This will almost guarantee the presence of long-standing, hard-to-detect security holes.

If you want randomness, use what the OS provides, e.g. /dev/urandom. It is incomparably better, from every point of view, than irrational home-made rituals meant to propitiate the gods of randomness.
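
As a minimal sketch of "use what the OS provides", assuming a Unix-like system where /dev/urandom exists: the helper name `fill_random` is hypothetical (it is not a standard function), and error handling is reduced to a single return code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fill buf with len bytes read from the OS random source.
 * Returns 0 on success, -1 on failure.
 * fill_random is a hypothetical helper name, not a standard function. */
int fill_random(unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;              /* no such device on this system */

    size_t got = fread(buf, 1, len, f);
    fclose(f);

    /* A short read is treated as failure rather than silently
     * leaving part of the buffer unfilled. */
    return got == len ? 0 : -1;
}
```

A caller would declare something like `unsigned char key[16];` and refuse to continue if `fill_random(key, sizeof key)` returns anything other than 0.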


Edit: As @TerryChia points out, your question might be about forcing a random initialization of variables (local or not) from a PRNG, instead of leaving the default value there (if there is a default value, of course; in many programming languages, local variables have no default value at all). What you call "XOR" in that context is unclear.

In that case, either you do not read the said variables before storing a meaningful value in them, in which case what the variables initially contained is completely irrelevant: it does not impact the behaviour of your code. Or you do read the variables and then get these more-or-less random values from them, making the application code not reproducible and leading to the problems explained above; namely, that the code will be hard to debug, increasing the probability and density of actual vulnerabilities.
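
To illustrate the first case, here is a small sketch with two hypothetical functions that differ only in their initializer. `rand()` stands in for the `random(seed)` of the question; it is not a good randomness source, which does not matter here because the value is never read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Variant 1: conventional zero initializer. */
int length_zero_init(const char *s)
{
    assert(s != NULL);
    int len = 0;                /* initial value, never read */
    len = (int)strlen(s);       /* meaningful value stored before any read */
    return len;
}

/* Variant 2: "random" initializer, as in the question. */
int length_random_init(const char *s)
{
    assert(s != NULL);
    int len = rand();           /* initial value, never read either */
    len = (int)strlen(s);       /* overwritten exactly as above */
    return len;
}
```

Both functions return the same result for every input, so the initializer changes nothing about the observable behaviour. If anything, the second variant leaves an extra call to the RNG in the binary.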

Thomas Pornin
  • I *think* you and @tylerl may have misunderstood what the OP is asking. I'm reading it as "Should I initialize my variables with values derived from an RNG" instead of "should I use uninitialized variables to obtain randomness". –  Aug 06 '13 at 00:59
  • Indeed. I have added two paragraphs to cover that case also. – Thomas Pornin Aug 06 '13 at 12:28
  • @ThomasPornin - It seems my Q wasn't clear enough. The answer which I'm looking for is actually your second paragraph. – StupidOne Aug 06 '13 at 16:26
4

You seem to be mixing two security objectives: protection of the code against reverse engineering, and protection of the application against attacks. These objectives are often contradictory: protections against reverse engineering increase the complexity of the code and therefore make it more likely that the code is vulnerable to attacks because it behaves in an unexpected manner in certain circumstances.

Case in point: if you initialize variables to random values, it becomes harder for someone reverse engineering the code to tell that a block of memory is uninitialized. On the other hand, tracing calls to the RNG will show where variables are being initialized. So there is no benefit in initializing variables to random values.

When it comes to security, initializing variables to random values is bad. Zeroes are a lot safer. If it turns out that the variable is used without being initialized, a zero is often less harmful: you can double-check if the value isn't supposed to be 0, and if it's a pointer, it will lead to a clean crash the first time it's dereferenced. On the contrary, a random value would lead to unpredictable behavior, a potential security hole.
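
A small sketch of why a zero or NULL initializer fails more gracefully. The `struct user` type and the functions `find_user` and `is_admin` are made up for illustration. Because `found` starts as NULL, the "no match" path is detectable and the caller can fail closed; with a random initial pointer the same path would return garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type for the example. */
struct user {
    const char *name;
    int is_admin;
};

/* Returns a pointer to the matching user, or NULL if there is none. */
const struct user *find_user(const struct user *users, size_t n,
                             const char *name)
{
    assert(users != NULL && name != NULL);

    const struct user *found = NULL;    /* safe, reproducible default */
    for (size_t i = 0; i < n; i++) {
        if (strcmp(users[i].name, name) == 0) {
            found = &users[i];
            break;
        }
    }
    return found;                       /* still NULL when nothing matched */
}

/* Fails closed: an unknown user is never treated as an administrator. */
int is_admin(const struct user *users, size_t n, const char *name)
{
    const struct user *u = find_user(users, n, name);
    if (u == NULL)
        return 0;
    return u->is_admin;
}
```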

Protecting against reverse engineering is so costly (because debugging is harder than programming, and that goes double for obfuscation) that it is extremely rare for it to be worthwhile. If you have determined that you need to expend effort towards obfuscation, you probably underestimate the difficulty. If you choose to obfuscate anyway, initializing variables to random values isn't a very useful technique, because it's easy to see under static or dynamic analysis that the random values aren't used anyway. Obfuscation would require that you use the random values to perform computations, but that the eventual result does not depend on the value.
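
The last sentence can be sketched as follows. This is only a guess at the kind of masking such an article might have meant, not its actual method; `masked_add` and `xor_roundtrip` are hypothetical names. The random value `r` takes part in the computation, yet the result is the same for every `r`.

```c
#include <stdint.h>

/* Adds a and b while routing the computation through a mask r.
 * Unsigned arithmetic wraps modulo 2^32, so the mask cancels exactly. */
uint32_t masked_add(uint32_t a, uint32_t b, uint32_t r)
{
    uint32_t hidden = a + r;    /* a is only held in masked form from here */
    hidden += b;                /* compute on the masked value */
    return hidden - r;          /* remove the mask: (a + r + b) - r == a + b */
}

/* XOR variant: a value is stored masked and recovered by XORing with
 * the same mask again, since (v ^ r) ^ r == v for any r. */
uint32_t xor_roundtrip(uint32_t value, uint32_t r)
{
    uint32_t stored = value ^ r;    /* what would sit in memory */
    return stored ^ r;              /* what the program actually uses */
}
```

Even this only slows an analyst down: the mask has to live somewhere, and a debugger sees the unmasked result the moment it is used.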

From a security standpoint, initializing variables to NULL, 0 or another safe, reproducible value is good practice.

Gilles 'SO- stop being evil'
2

Initializing variables provides consistent behavior. You should use an RNG that is sufficiently random on its own. XORing your random output with whatever happens to be at that memory location realistically shouldn't provide any benefit.

Relying on such a solution to increase the randomness of a value is not typically recommended.

But be careful you don't break things by fixing them.

tylerl