46

Is it worth obfuscating a Java web app's source code so that the web host cannot misuse the code or even steal your business? If so, how should we go about it? How should we obfuscate?

We are a new startup launching a product in the market. How can we protect the source code of our product/web application?

Lee Quarella
Rajat Gupta
  • This is what fully homomorphic encryption is for. – SLaks Aug 05 '13 at 18:03
  • 13
    if security is such an important matter, hosting your own server is an option or not ? –  Aug 05 '13 at 12:36
  • 12
    It *may* (BIG may) make sense to assume the web host is malicious, but a malicious host also poses a number of other problems. If you're serious about not trusting the host, you have to address all those as well, and at that point it gets rather impractical. –  Aug 05 '13 at 12:38
  • yeah..thats true.. so, is it often a matter of trust ? Do all others, take no step in the fear that this might happen with them? What do others generally do to prevent this ? – Rajat Gupta Aug 05 '13 at 12:56
  • 4
    If you allow access to the binary code to somebody, you cannot be sure it won't be reverse-engineered. It's simple as that. – Deer Hunter Aug 05 '13 at 13:51
  • 2
    If you can steal your entire "product" through the web, then your startup will be short lived anyway. The value of the startup is proving a good idea, getting user buy-in, and establishing the people behind it all as experts at a very specific task. Most people who would acquire you could write their own "Application X", but the non-obvious parts, the people, and the idea are what they buy. –  Aug 05 '13 at 15:27
  • 5
    If you distrust the hosting provider that much, perhaps hosting the site yourself would be a better alternative. – Andy Aug 05 '13 at 18:50
  • This sounds very much like "I can't tell you my idea but it's awesome - will you write an app for me?" The idea that a host would look at your website and go "let's steal that amazing idea & the code - close down this hosting business and fight a lawsuit" is laughable. – Ryan McDonough Aug 06 '13 at 10:45

15 Answers

85

A malicious hosting provider can do a lot more than simply steal your code. They can modify it to introduce backdoors, they can steal your clients' data, and ruin your whole business. Trust must exist between you and the host.

About the source code. If the attacker is trying to gain access to your source code, they will gain access to your source code, obfuscated or not, compiled or interpreted.

There is some value in obfuscating your code, in that you'll probably make it just a liiiiittle bit more difficult for the occasional opportunistic attacker to obtain. But if your host is out to get you, they'll get you.

The solution? The law. Sign a contract with them and agree on some form of NDA.

Adi
  • 1
    it's an agreement btw @Rubens, not argument – Lewis Aug 06 '13 at 08:34
  • In general, if word getting out that they stole your stuff would cost your hosting provider more than they could gain from it, you can trust them fairly easily. E.g., Amazon Web Services would be done very quickly if someone proved that they stole a customer's source code, so I trust they don't do that. – freeall Aug 06 '13 at 11:07
  • 4
    The only thing I would disagree on is "_there is a value in obfuscating your code_". – m-smith Aug 07 '13 at 13:03
38

Well, this calls for three comments:

  • You cannot protect secrets with code obfuscation. Not really. Code obfuscation somewhat works against unmotivated attackers, but it is not strong. If there is commercial value in breaking through it, then it will happen.

  • If you don't trust your hosting service then look for another hosting service. If the secrecy of your code is important and worth more than a few hundred bucks, then you should use your own hardware: you rent some isolated bay space, with locks and guards, and you run your own machine in it.

  • Your code exists as compiled (byte)code on the servers, but also as source code in your development systems, and in the heads of your developers. It cannot be that secret. As they say, one million dollars is always enough to unravel secrets, if only by bribing one of the individuals who has been made privy to them.

    (In this last case, that might be your plan: you might wish to see some bigger competitor simply buy you out.)

Protection against reverse-engineering and pilfering of your intellectual property is normally ensured through legal means, not technical.

Tom Leek
  • "legal means".. do I need to do any agreement with web host before hosting or in case he makes wrong use of my code ? – Rajat Gupta Aug 05 '13 at 14:33
  • 9
    "Legal means" include copyright, NDA, patents... all the weaponry of lawyers. Ultimately this is a matter of relative cost: attackers will choose what is cheapest, be it buying your start-up or doing some spying and reverse engineering, with the risk of legal retaliation. Make your IP case strong, and the trial will become potentially way too expensive, prompting your enemies to choose the legal way, also known as "dropping a lot of dollars on your lap" (hopefully). – Tom Leek Aug 05 '13 at 14:36
29

No, it isn’t worth it. Nobody wants to steal your code. A thousand million SaaS products have been launched by individuals and companies using third-party hosting of some description or another, and roughly none of them have found themselves to be competing against themselves after having the code for their products stolen by their hosts.

So, should you obfuscate your code? Sure, if it makes you feel better. No harm done. Are you protecting your IP against a valid threat? No, not really. It’s kind of the web developer’s version of a tin-foil hat.

Whatever you do, don't waste a lot of time and energy thinking about it. Make a decision to obfuscate or not, and then move on to worrying about mitigating real threats.

TRiG
Xander
  • 6
    "Sure, if it makes you feel better. No harm done." I'm not even sure that there is really no harm done. At the least there is an extra compile step; at worst it puts a whole lot of extra burden on your developers. What if they have to remotely debug a problem? What if there are remote error messages: will they contain sensitive data? Are the developers able to work with non-obfuscated code locally? – Dorus Aug 05 '13 at 15:01
  • 1
    @Dorus So yes, it adds complexity, which is bad. I personally wouldn't choose it. However, if you really can't sleep at night because you're worried about sticky fingers bandits making off with your valuable things, I'd argue that it's a worthwhile trade-off. Having worked with (and reverse engineered) quite a bit of obfuscated code, I have indeed seen it cause some of the issues you point out (and introduce bugs, in a couple of cases) but I've never seen it add a huge burden. To sum up, not ideal, but not the end of the world either. – Xander Aug 05 '13 at 15:07
  • @Xander: it can be enough to get some competitive *disadvantage*, but the point is that you won't ever see it. It's mostly a matter of spending time on something useless instead of working on the next site improvement, or slower reaction time to bad user experience when a bug has to be fixed. – kriss Aug 05 '13 at 16:03
  • @kriss Agreed, certainly, since I don't obfuscate when I have a choice. However, in my experience, configuring and dealing with any impacts of obfuscation tends to require only an insignificant amount of time and effort. It does of course require time and effort which is wasted, but you're not wasting that much. Any feature you could build in its stead would be trivial. – Xander Aug 05 '13 at 16:13
  • @Xander - basically one of those check-off-the-list items on audits that everybody in security knows to be an exercise in futility, but we include it so the uneducated management knows we've filled the box, whether it does anything or not. – Fiasco Labs Aug 05 '13 at 17:10
  • @FiascoLabs Exactly. That is unfortunately too often the case. – Xander Aug 05 '13 at 17:11
18

Ultimately you're the only one who can make that risk assessment. If you'll sleep better obfuscating your code, go for it.

Personally I wouldn't bother. Reputable web hosting services are in the web hosting business not the source code theft business. If they steal your code they'd still have to install it, sell the service, find customers, and fight off the lawsuit you'd file against them. It's too much effort for them with too much risk for too little reward.

Dan Pichelman
11

Obfuscation is ineffective against a determined attacker; it only makes their task slightly more difficult. If you have a particular reason to distrust your hosting provider, get another.

If you just want to be safe, get a non-disclosure agreement and other legal assurances that allow you to go after your host if they abuse things.

If you still don't trust them even with those legal assurances, get dedicated servers that you control and encrypt, such that the hosting provider cannot access the data unless they hack into the running server.

If you are worried about someone having physical access to your servers, then set up your own data center or physically lock them in an enclosure in a colocation facility with monitoring of the physical property.

Simply using an NDA should be sufficient with any reputable hosting provider though.

AJ Henderson
9

I wouldn't bother.

Two reasons:

Runtime-interpreted languages really cannot be fully protected that way. To completely obfuscate the code you would have to obfuscate it from the runtime as well, and then there would be no way to execute it. Obfuscation just makes the attacker's task slightly more annoying. It can also make debugging and deployment more time-consuming for your own staff.

Very few applications really contain the kind of secret sauce code people really would go to that much trouble to steal. It is a lot easier just to get a feature list and some contractors and say "make something like this" than it is to copy function by function with most common applications.

Bill
5

When you don't trust your hoster to not steal your data, you should look for a more trustworthy hoster or host yourself.

You are not just entrusting your program code to them, you also entrust all your data and the data of all of your users. When you assume that your hoster is malicious enough to steal your programming, they are also malicious enough to steal your user-data and sell it to the highest bidder. Data you likely vowed to protect under a privacy policy.

When you come to the conclusion that you don't trust any hoster but hosting yourself is too expensive, you have the option of buying your own physical server and letting someone else host it, provided you (1) fully encrypt its filesystem so they can't clone the hard drives during maintenance and (2) make sure all network communication is encrypted so they can't sniff the traffic.

Philipp
  • Hi! I've a concern. I'm not a programmer, but use MATLAB / MSVS for mathematical (academic) DSP algorithm development.These programs have connection to internet, and hence they are able to read & transfer **all users** plain source codes. Do you think that Microsoft or Mathworks would read them? What about using *AI* based automated systems to extract and analyse source codes from their users, and make some use of them? Math-dsp research is interwoven with short matlab files. If they r read, so will be the math papers. I don't want to turn-off wifi for security. So any helps ? – Fat32 Dec 27 '20 at 17:49
  • *cont*, what about SE sites? When you open a browser with *stackoverflow.se* on it, can it read data from existing applications such as matlab / msvs / python / gcc / netbeans ? Can it access their private data ? I have doubts that either companies (mathwork-microsoft) or web sites like stackexchange do SPY on the connected user computers... There's no way to prove that they read your code. – Fat32 Dec 27 '20 at 17:58
  • My concern is more about stopping the spying companies, than securing my code. Turn off the internet, encrypt your files and bingo ! But how can we do that in this cloud based, wi-fi connected iOt world? – Fat32 Dec 27 '20 at 18:02
  • @Fat32 Running stuff on other people's computers (what marketing people like to call "The Cloud") just isn't advisable in cases where you have reasons to be concerned about the privacy of that stuff. Your questions regarding matlab and what a website can learn about you and your applications when you visit it are unrelated to this question. Please ask them separately, or even better use the search function to find the many questions we already have about this topic. – Philipp Dec 27 '20 at 18:11
  • Thanks! I'm not using cloud, but the market is very determined to force everyone to use it, by eventually disabling non-cloud based services one by one. I've made a 10-min search on google, and it seems to me that this issue is successfully ignored. My concern is about big companies: Google, facebook, Microsoft, Apple, Mathworks, Intel, Qualcom, AMD, Nvidia secretly accessing information on user computers. And how to stop that. Looking at some of the answers, it's not possible to get to the point. ok. Thanks anyway. – Fat32 Dec 27 '20 at 18:56
5

You should take precautions to protect yourself, but not for the reasons or from the threat that you're imagining.

First of all, if you can't trust your hosting provider, get a new hosting provider. No more needs to be said about that.

Second, even though you think that the intellectual property in the web application you've built is valuable, chances are you're the only one. Would you steal your competitor's website? Probably not; it wouldn't be worth the trouble. As a rule, people aren't interested in stealing your site. Typically the cost of customizing someone else's site to fit your own needs approximates or exceeds the cost of building it yourself.

Finally, you should be worried about attackers retrieving your code and using that to attack you. Saved passwords, user databases, programming errors and vulnerabilities -- your code probably presents a juicy target for malicious attackers. This is particularly true if your code is poorly-written or your service is popular.

Obfuscating your code is not a solution, and wouldn't help anyway. But good security practices, including following the principle of least privilege, should help you. Segregate out your access so compromising one system or component doesn't give your attacker the whole app in a neat little bundle. The more independent and disconnected the elements of your business, the less an attacker can snag in a single go.
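As one small illustration of keeping saved passwords out of the code itself, here is a minimal Java sketch that reads a credential from the process environment at startup instead of hard-coding it. The class name, the `APP_DB_PASSWORD` variable and the `requireSecret` helper are all made up for this example. Note that this only helps against someone who obtains the code or the repository; anyone who controls the running server can still read the environment.

```java
import java.util.Map;

public class SecretsFromEnvironment {

    // Hypothetical variable name, chosen for this example only.
    public static final String DB_PASSWORD_VAR = "APP_DB_PASSWORD";

    // Returns the named setting, or fails fast if it is missing or empty,
    // so the application never silently falls back to a built-in default.
    public static String requireSecret(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            String password = requireSecret(System.getenv(), DB_PASSWORD_VAR);
            // Never log the secret itself; the length is enough for a smoke test.
            System.out.println("Loaded a database password of length " + password.length());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a `Map` keeps the lookup easy to test, and the same idea extends to giving each component only the credentials it needs.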

tylerl
5

I have many web applications hosted online. Some of the code is valuable, yes. However, I did not obfuscate any of it, since even if it gets stolen, no one else can maintain it. They will eventually pull their hair out. I tried it with someone who wanted my software badly. He could not install it, understand it, or get anything out of it, so how would he be able to find customers and sell it, as mentioned above?

The value is in the data and in the support you give and keeping the customer happy.

All my desktop applications could be decompiled one way or another. I think VB6 was the only one that could not be decompiled. No one could maintain them because I use tricky coding and tricky variable names.

4

There is no security merit in attempting to obfuscate any client-side code. A determined enough attacker will bypass any obfuscation methods you throw at them.

If the code is really important, keep it on the business side. Consider any client-side code public and available to anyone and everyone.

4

The world is upside down: the real value is usually not in the application code. It's in the customer data that the code is used to gather/modify. In many web applications the code is secured and the data sits in plaintext, often without even simple protection. This is one of the issues that PCI DSS, HIPAA, and other data security standards are meant to address.
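To make "not sitting in plaintext" a bit more concrete, here is a minimal sketch of encrypting a single sensitive field with AES-GCM from the standard Java crypto API before it is stored. The class and method names are invented for this example, and it assumes the key is kept somewhere other than next to the data (key management is deliberately left out). It protects copied disks and database dumps; it does not protect against a host that can read the key from the running server.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

public class FieldEncryption {

    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh AES key; in a real system the key would be loaded
    // from a key store that is kept apart from the data (assumption).
    public static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the value and returns IV || ciphertext || tag as one byte array.
    public static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a new random IV for every value
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] stored = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, stored, 0, iv.length);
            System.arraycopy(ciphertext, 0, stored, iv.length, ciphertext.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(); fails if the stored bytes were tampered with.
    public static String decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] stored = encrypt(key, "customer@example.com");
        System.out.println("Stored " + stored.length + " bytes");
        System.out.println("Decrypted: " + decrypt(key, stored));
    }
}
```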

Assuming your hosting provider is malicious, their getting access to customer data will destroy your company far more quickly than their getting your code.

Besides the issue of security through obscurity, code obfuscation can also introduce security issues and will need to be vetted to the same or a higher level as your regular code base.

Hendrik Brummermann
4

No.

Clearly: you're planning to invest time in a so-called security through obscurity technology.

It's not a good idea!

To protect your idea against third-party licensing, you could publish it under a public license such as the GNU GPL, Creative Commons, or others.

  • I was going to raise this point too, and put up a [_Schlock Mercenary_](http://www.schlockmercenary.com) cartoon about it in my answer. Unfortunately, I can't find the strip about it. – CyberSkull Aug 07 '13 at 07:33
3

Obfuscation never hides or changes what your code does; it just transforms it into some other form that can still be processed.

Example: if you obfuscate your code, the class name MyHomePage may be changed to M4Page. Similarly, a method addWidgets() will be renamed to something else. Only the names change, not the functionality, so if I want to call the method from your obfuscated code, I can easily do so...
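Here is a rough Java sketch of that point. The nested class `M4` is a hand-written stand-in for what a renaming obfuscator might produce from something like MyHomePage.addWidgets(); it is not the output of any real tool, and all names are invented. Plain reflection is enough to list the renamed method and call it.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class RenamingDemo {

    // Hypothetical "obfuscated" class: the names are meaningless,
    // but the behaviour (add a widget, return the count) is untouched.
    public static class M4 {
        private final List<String> c = new ArrayList<>();

        public int a(String b) {
            c.add(b);
            return c.size();
        }
    }

    // Lists the declared method signatures of a class, as anyone holding
    // the class file could do.
    public static List<String> describe(Class<?> type) {
        List<String> signatures = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            StringBuilder params = new StringBuilder();
            for (Class<?> p : m.getParameterTypes()) {
                if (params.length() > 0) {
                    params.append(", ");
                }
                params.append(p.getSimpleName());
            }
            signatures.add(m.getReturnType().getSimpleName()
                    + " " + m.getName() + "(" + params + ")");
        }
        return signatures;
    }

    // Calls a one-argument public method by its (obfuscated) name.
    public static Object callByName(Object target, String name, Object arg) {
        try {
            Method m = target.getClass().getMethod(name, arg.getClass());
            return m.invoke(target, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        M4 page = new M4();
        System.out.println(describe(M4.class));
        System.out.println(callByName(page, "a", "clock"));
    }
}
```

The signature (one String in, an int out) survives the renaming, and that is usually enough of a hint to start working out what the method is for.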

3

You can't obfuscate your code or data enough to make them secure. If you were able to, your code and data would be unusable to even you.

Security via obscurity only works so long as the secret of your obfuscation remains intact. Secrets never stay secret. This is the basis for most digital rights management systems, and to date all of them have been cracked.

On a more technical note, if you are using an interpreted language (Perl, PHP) or a bytecode-interpreted language (Java), consider using the included native compile tools for them. Many such high level languages include a tool to output a C/C++ version of your scripts or compile natively for your hardware. Having a natively compiled version removes the need to keep your source tree on your remote servers and provides a significant performance boost as well.

CyberSkull
  • Most of those tools actually provide a somewhat stripped-down VM with the embedded bytecode/tree of your script, not a fully native binary. Thus they can be easily decompiled to a much more readable form than plain assembly, so it is not really a good solution for obfuscating anything. For the same reason they don't actually provide a speed boost anywhere except at startup either. – Oleg V. Volkov Sep 23 '19 at 12:17
0

Security by obscurity is not bad if it is not the only thing you rely on. Using obfuscation alone to protect your code will not work, but using obfuscation together with other measures such as licensing is not bad.

John The Ripper