85

I recently read our ursine overlord's canonical answer to the question How do certification authorities store their private root keys?

I then just had to ask myself:
How do large companies (e.g. Microsoft, Apple, ...) protect their valuable source code?

In particular, I was asking myself how they protect their source code against theft, against malicious modification by external attackers, and against malicious modification by insiders.

The first sub-question was already (somewhat) answered in CodeExpress' answer to How to prevent private data being disclosed outside of Organization.

The reasoning behind these questions is simple:

  • If the source code were stolen, a) the company might be (at least partially) hindered from selling it, and b) the product would be at risk from attackers searching the source code for vulnerabilities. Just imagine what would happen if the Windows or iOS source code were stolen.
  • If the code were modified by malicious external attackers, secret backdoors could be added, which can be catastrophic. This is what happened to Juniper recently, where the coordinates of the second DUAL_EC_DRBG point were replaced in their source.
  • If the code were modified by an internal attacker (e.g. an Apple iOS engineer?), that person could make a lot of money by selling such backdoors, and could put the product at severe risk if the modified version ships.

Please don't come up with "law" and "contracts". While these are effective measures against theft and modification, they certainly don't work as well as technical defenses, and they won't stop aggressive attackers (e.g. other governments' agencies).

MWB
SEJPM
  • To prevent stealing, there are no removable-media slots in the workstations, and employees are not allowed to carry media, camera phones, etc. into work. Authorization is required for permissions on modules that are not related or relevant to one's work; separation of duties. To prevent backdoors, they need a source code analysis program built into their secure SDLC. Against malicious external attackers, they should undergo penetration testing before release. – Krishna Pandey Dec 27 '15 at 20:50
  • To be fair, in the U.S., most of Europe, Japan, etc., "law and contracts" (well, contracts are enforced or ignored under "law"; but a petty semantic quibble...) are often effective tools, both for punishing those who compromise source code and for restraining what advantage parties who gain access to that code can take from it. (Not *always* effective, sure.) The big problems come much more from actors in the areas of the world of cyber-lawlessness, or at least areas of the world where authorities don't give two whits about protecting the claimed legal rights of Western companies. – mostlyinformed Dec 28 '15 at 02:05
  • IMHO, that's two questions: protect source code from being stolen, and from being tampered with. They have different threat models, and should be asked separately. – sleske Dec 28 '15 at 08:01
  • FWIW, the core of iOS source code is freely available to anyone: http://opensource.apple.com. Also, Windows source code is available to anyone willing to sign an agreement (not sure if one needs to pay anything): https://www.microsoft.com/en-us/sharedsource/. So really big companies like Apple and Microsoft protect their IP using THE LAW (not the answer you wanted, but it's the truth). – slebetman Dec 28 '15 at 10:21
  • Also note that unlike Microsoft, where the agreement you must sign prevents you from selling your own version of Windows, Apple cannot prevent others from building and selling OSes based on the OS X kernel (because it's open source). One such OS is Darwin: http://www.puredarwin.org/ – slebetman Dec 28 '15 at 10:23
  • If closed source is the only thing that protects you from vulnerabilities, then I have bad news for you. – Oleg V. Volkov Dec 28 '15 at 12:41
  • In a large company, it's unlikely that every developer would need to recompile every product that the company has built. So if developers are only given access to the source code they need to complete their projects, no one developer can leak all of the company's source code. Except perhaps the keeper of the keys, but you can avoid that by using different repositories, with admin access for each project held by different managers. – joeytwiddle Dec 28 '15 at 15:26
  • One amusing thing we do is sign up for services that regularly index and search Github and the like for our own company name and internal URLs. You'd be surprised how often proprietary Java packages with hardcoded internal URLs, usernames and passwords get uploaded to external code repositories. – Ivan Dec 28 '15 at 18:32
  • As a related aside, large companies do have software stolen sometimes. Here is the result: [Ex-IBM employee from China arrested in U.S. for code theft](http://www.reuters.com/article/us-ibm-crime-china-idUSKBN0TR2X820151208) – Xander Dec 28 '15 at 18:52
  • @KrishnaPandey, let's just say that if the Fortune 50 I work for had some of those kinds of rules in place (re: no carrying media to/from company premises -- which implies a hard line against telecommuting), they wouldn't have acquired the startup I came in through. Such measures have real-world costs, and those costs can outweigh the benefits. – Charles Duffy Dec 28 '15 at 23:10
  • What makes you think that keeping software source closed increases security, or that having it be open decreases it? E.g., here's the source to the OS X kernel: https://opensource.apple.com/source/xnu/xnu-3248.20.55/. Does that automatically make OS X less secure? The truth is quite the opposite; more eyes on the code means more people reporting bugs. Another good example is Atlassian (disclaimer: I work for them), who give the source of their products to any customer who asks (and let them modify it as long as they don't redistribute their modifications). – Sam Whited Dec 29 '15 at 03:07
  • @CharlesDuffy These scenarios are already covered where InfoSec policies are in place: risks arising from acquisitions, takeovers, or third-party vendors, or even your notebook getting lost at the airport. Security is only as strong as your weakest link. – Krishna Pandey Dec 29 '15 at 03:48
  • @KrishnaPandey, what's the point to repeating truisms at someone? – Charles Duffy Dec 29 '15 at 04:30
  • "Just imagine what would happen if the Windows" AFAIR, Windows source code has leaked before, somewhere around NT or 2000. – el.pescado - нет войне Dec 29 '15 at 08:56
  • @KrishnaPandey You could still take all the source code, zip it, and then e-mail it. You don't need USB slots for that. – BlueWizard Jan 03 '16 at 06:09
  • @JonasDralle That would be a lame way to steal it, leaving a trail everywhere. :) – Krishna Pandey Jan 04 '16 at 07:16
  • @KrishnaPandey If you wanted to, you could encrypt it and automate the PC to send it to yourself in the middle of the night. Then you just need to delete the scheduled task. But please don't send it to yourself at your work e-mail. – BlueWizard Jan 04 '16 at 08:28

6 Answers

74

First off, I want to say that just because a company is big doesn't mean their security will be any better.

That said, having done security work in a large number of Fortune 500 companies, including many name brands most people are familiar with, I'd say that currently 60-70% of them don't do as much as you'd think they should. Some even give hundreds of third-party companies around the world full access to pull from their codebase, though not necessarily write access to it.

A few use multiple private GitHub repositories for separate projects, with two-factor authentication enabled, tight control over whom they grant access to, and a process for quickly revoking access when anyone leaves.
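
For illustration, here is a minimal Python sketch (using the requests library) of how such a two-factor policy can be audited through GitHub's documented organization-members endpoint. The 2fa_disabled filter requires organization-owner permissions; the organization name and token variable below are placeholders, and pagination is omitted:

    #!/usr/bin/env python3
    """Sketch: list organization members who have not enabled two-factor auth."""
    import os

    import requests

    ORG = "example-corp"  # placeholder organization name

    resp = requests.get(
        f"https://api.github.com/orgs/{ORG}/members",
        params={"filter": "2fa_disabled"},  # documented GitHub members filter
        headers={"Authorization": f"token {os.environ['GITHUB_TOKEN']}"},
    )
    resp.raise_for_status()
    for member in resp.json():
        print("2FA disabled:", member["login"])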

A few others are very serious about protecting things, so they do everything in house and use what to many other companies would look like excessive levels of security control and employee monitoring. These companies use solutions like Data Loss Prevention (DLP) tools to watch for code exfiltration, internal VPN access to heavily hardened environments just for development with a ton of traditional security controls and monitoring, and, in some cases, full-packet capture of all traffic in the environment where the code is stored. But as of 2015 this situation is still very rare.

Something that may be of interest, and which has always seemed unusual to me, is that the financial industry, especially banks, has far worse security than one would think, while the pharmaceutical industry is much better than other industries, including many defense contractors. Some industries are absolutely horrible about security. I mention this because there are other dynamics at play: it's not just big companies versus small ones; a large part of it has to do with organizational culture.

To answer your question, I'm going to point out that it's the business as a whole making these decisions and not the security teams. If the security teams were in charge of everything, or even knew about all the projects going on, things probably wouldn't look anything like they do today.

That said, you should keep in mind that most large businesses are publicly traded and for a number of reasons tend to be much more concerned with short-term profits, meeting quarterly numbers, and competing for marketshare against their other large competitors than about security risks, even if the risks could effectively destroy their business. So keep that in mind when reading the following answers.

  • If source code were stolen:

    1. Most wouldn't care and it would have almost no impact on their brand or sales. Keep in mind that the code itself is in many cases not what stores the value of a company's offering. If someone else got a copy of Windows 10 source, they couldn't suddenly create a company selling a Windows 10 clone OS and be able to support it. The code itself is only part of the solution sold.

    2. Would the product be at greater risk because of this? Yes absolutely.

  • External Modification: Yes, but this is harder to do, and easier to catch. That said, since most companies are not seriously monitoring this it's a very real possibility that this has happened to many large companies, especially if back-door access to their software is of significant value to other nation-states. This probably happens a lot more often than people realize.

  • Internal Attacker: Depending on how smart the attacker is, this may never even be noticed, or it could be made to look like an inconspicuous programming mistake. Outside of background checks and behavior monitoring, there is not much that can prevent this, but hopefully source-code analysis tools would catch it and force the team to correct it. This is a particularly tough attack to defend against, and it is the reason a few companies don't outsource work to other countries and do comprehensive background checks on their developers. Static source-code analysis tools are getting better, but there will always be a gap between what they can detect and what can be done.

In a nutshell, the holes will always come out before the fixes, so dealing with most security issues becomes something of a race against time. Security tools help give you time-tradeoffs but you'll never have "perfect" security and getting close to that can get very expensive in terms of time (slowing developers down or requiring a lot more man-hours somewhere else).

Again, just because a company is big doesn't mean it has good security. I've seen some small companies with much better security than their larger competitors, and I think this will increasingly be the case, since smaller companies that want to take security more seriously don't have to make massive organizational changes, whereas larger companies are stuck with the way they've been doing things because of the transition cost.

More importantly, I think it's easier for a new company (of any size, but especially a smaller one) to have security heavily integrated into its core culture than for an older company to change its current/legacy culture. There may even be opportunities now to take market share away from a less secure product by creating a very secure version of it. Likewise, I think your question is important for a totally different reason: security is still in its infancy, so we need better solutions in areas like code management, where there is a lot of room for improvement.

Michael
Trey Blalock
  • To add to the point that most people wouldn't care if source code was stolen: generally speaking, the real value of the company is the data in their databases, not the source code. Chances are most of that source code is boring wheel-reinventions that you can find anywhere. – David says Reinstate Monica Dec 28 '15 at 13:39
  • As a sidenote, a technology giant requires you to carry your devices (PC, Phone) with you all the time. – ave Dec 29 '15 at 08:45
  • It might be important to point out that "data loss", in this context, refers to exfiltration or disclosure. – forest Dec 20 '18 at 08:10
32

Disclaimer: I work for a very big company that does a good job in this area, but my answer is my own personal opinion and is not indicative of my employer's position or policies.

First of all, how to protect code from being leaked:

  • Network Security: This is the obvious one. If Chinese hackers get credentials to your internal systems, they'll go for your source code (if for no other reason than that the source code will tell them where to go next). So basic computer security has to be a "given".
  • Access Control: Does your receptionist need access to your code repository? Probably not. Limit your exposure.
  • Be selective in hiring and maintain a healthy work environment: DLP measures like scanning outbound email are nifty in theory, but if your engineers are smart enough to be of any use to you at all, they're smart enough to circumvent your DLP measures. Your employees shouldn't have a reason to leak your source code; if they do, you've done something horribly, horribly wrong.
  • Monitor your network: This is an extension of the "network security" point, but with a Data Loss Prevention emphasis. If you see a sudden spike in DNS traffic, that may be your source code being exfiltrated by an attacker. OK, now ask yourself whether you would even know if there were a sudden spike in DNS traffic from your network. Probably not. (A toy detection sketch follows this list.)
  • Treat mobile devices differently: Phones and laptops get lost really often. They also get stolen really often. You should never store sensitive information (including source code, customer data, and trade secrets) on mobile devices. Seriously. Never. That doesn't mean you can't use mobile devices to access and edit source code, but if a laptop goes missing, you should be able to remotely revoke any access that laptop has to sensitive data. Typically that means code and documents are edited "in the cloud" (see c9.io, koding.com, Google Docs, etc.) with proper authentication and all that. This can be done with or without trusting a third party, depending on how much work you want to put into it. If your solution doesn't support two-factor authentication, pick another solution; you want to reduce your exposure with this measure, not increase it.
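
As a toy illustration of the DNS point above, here is a short Python sketch that counts queries per minute in a BIND-style query log and flags outliers. The log path, timestamp format, and spike threshold are all assumptions you would tune for your own resolver:

    #!/usr/bin/env python3
    """Toy DNS tripwire: flag minutes whose query volume dwarfs the median."""
    import re
    import sys
    from collections import Counter

    LOG_PATH = "/var/log/named/queries.log"  # assumed BIND query-log location
    SPIKE_FACTOR = 5                         # flag minutes > 5x the median rate

    # BIND query-log lines start with e.g. "28-Dec-2015 18:32:10.123 ..."
    MINUTE = re.compile(r"^(\d{2}-\w{3}-\d{4} \d{2}:\d{2})")

    per_minute = Counter()
    with open(LOG_PATH) as log:
        for line in log:
            match = MINUTE.match(line)
            if match:
                per_minute[match.group(1)] += 1

    if per_minute:
        median = sorted(per_minute.values())[len(per_minute) // 2]
        for minute, count in sorted(per_minute.items()):
            if count > SPIKE_FACTOR * max(median, 1):
                print(f"ALERT {minute}: {count} DNS queries (median {median})",
                      file=sys.stderr)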

Second, how to prevent malicious code modification; there really is only one answer to this question: change control.

For every character of code in your repository, you must know exactly who added (or deleted) it, and when. This is so easy to do with today's technology that it's almost more difficult not to have change tracking in place. If you use Git or Mercurial or any modestly usable source control system, you get change tracking for free, and you should rely on it heavily.

But to up the trustworthiness a bit, I would add that every change to your repository must be signed off by at least one other person besides the author submitting the change. Tools like Gerrit can make this simple. Many certification regimes require code reviews anyway, so enforcing those reviews at check-in time means that malicious actors can't act alone to get bad code into your repo; it also helps prevent poorly written code from being committed and helps ensure that at least two people understand each change submitted.
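
To make the sign-off and identity points concrete, here is a sketch of a server-side git pre-receive hook that rejects unsigned commits and commits whose committer doesn't match the authenticated pusher. The PUSHER_EMAIL environment variable is hypothetical; how the pusher's identity reaches the hook depends on your SSH wrapper or hosting platform:

    #!/usr/bin/env python3
    """Sketch of a git pre-receive hook enforcing signed, attributable commits."""
    import os
    import subprocess
    import sys

    def git(*args):
        return subprocess.check_output(("git",) + args, text=True).strip()

    pusher = os.environ.get("PUSHER_EMAIL", "")  # hypothetical: set by an SSH wrapper

    for line in sys.stdin:
        old, new, ref = line.split()
        if set(new) == {"0"}:
            continue  # ref deletion, nothing to verify
        # Commits introduced by this push (all of them for a brand-new ref).
        rev_range = new if set(old) == {"0"} else f"{old}..{new}"
        for sha in git("rev-list", rev_range).splitlines():
            # %G? prints "G" for a good GPG signature; %ce is the committer email.
            status, email = git("log", "-1", "--format=%G?%n%ce", sha).splitlines()
            if status != "G":
                sys.exit(f"rejected {sha[:10]} on {ref}: commit is not GPG-signed")
            if pusher and email != pusher:
                sys.exit(f"rejected {sha[:10]} on {ref}: committer {email} "
                         f"does not match pusher {pusher}")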

tylerl
  • Yup. This basically describes my work environment when I was at Nokia – slebetman Dec 28 '15 at 10:33
  • WRT "For every character of code ... you must know exactly who added... that code, and when. This is so easy .. Git or Mercurial": Git, hg, and others do keep track of code authorship, but unless you use something like GPG-signed commits (most don't), it is easy for hackers to bypass. – emory Dec 28 '15 at 13:44
  • @emory, even without signed commits, someone can't change history without modifying current hashes, and (for places with no-rebasing-allowed workflows, which anyone on this scale should have in place) that gets noticed. – Charles Duffy Dec 28 '15 at 23:05
  • @CharlesDuffy Why would the hacker need to change history? Just push a new commit with some fresh security flaws and attribute it to a trusted team member. – emory Dec 28 '15 at 23:52
  • @emory, where I live, new commits get strict scrutiny where old ones don't. – Charles Duffy Dec 29 '15 at 00:39
  • @emory, ...btw, in the git world it's common practice to ensure that the SSH key being used for uploading a changeset aligns with the Committed-By identity in the header. Thus, you'd need to either compromise the SCM or steal credentials for that trusted staff member. – Charles Duffy Dec 29 '15 at 00:51
  • @emory if your organization has any sense at all, committing code without credentials is impossible, and the credentials used are recorded as the committer. If you don't have that very basic level of security, then you're not even trying. – tylerl Dec 29 '15 at 03:46
  • @tylerl My experience has been using SSH keys to push code to the organizational repository (you don't need credentials to commit) and the organizational repository being on a VPN. The organization can be reasonably confident that only organization members are pushing code, but organization members can impersonate each other. GPG-signed commits, or restricting pushes to SSH keys where the key matches the Committed-By identifier, would allow the organization to know exactly who committed which code, but I don't think these measures are common. I could be wrong about that. – emory Dec 29 '15 at 12:03
  • @CharlesDuffy Ensuring that the SSH key being used for uploading a changeset aligns with the Committed-By identity makes sense, but I have never heard of it in practice, so I cannot agree with it being common practice. Maybe it should be. – emory Dec 29 '15 at 12:08
  • Why do you single out *Chinese* hackers? That looks a little weird. – Kobi Dec 29 '15 at 14:24
  • @emory, if Github Enterprise doesn't have that support for enforcing that policy out-of-the-box, my memory is failing me badly. – Charles Duffy Dec 29 '15 at 14:33
  • @Kobi Industrial espionage tends to come from China. If you're hacked by Syrian or Brazilian attackers, it's generally for another reason. – tylerl Dec 29 '15 at 23:08
  • +1 for the part about employees not having a reason to steal. – BlueWizard Jan 03 '16 at 06:27
3

There will be measures in place to prevent the accidental insertion of problematic code, aka bugs. Some of these will also help against the deliberate insertion of problematic code.

  • When a developer wants to commit code to the repository, another developer has to examine this merge request. Perhaps the second developer will be required to explain to the first developer what the new code does. That means going over every line.
  • If the code looks confusing, it might be rejected as bad style and not maintainable.
  • Code has automated unit and integration tests. When there is no test for a certain line of code, people wonder. So there would have to be a test that the backdoor works, or some sort of obfuscation.
  • When a new version of the software is built, developers and quality assurance check which commits are part of the build and why. Each commit has to have a documented purpose (a small audit sketch follows this list).
  • Software is built and deployed using automated scripts. That is not just for security but also to simplify the workload.
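
As a sketch of that build-time audit, the script below lists every commit between two tags and flags any commit without a Reviewed-by: trailer or a ticket reference. The trailer and the JIRA-style ticket convention are assumptions rather than universal practice, and the %(trailers:...) pretty-format requires a reasonably recent git:

    #!/usr/bin/env python3
    """Sketch: audit a build's commits for review sign-off and a ticket id."""
    import re
    import subprocess
    import sys

    def git(*args):
        return subprocess.check_output(("git",) + args, text=True).strip()

    prev_tag, build_tag = sys.argv[1], sys.argv[2]  # e.g. v1.4 v1.5

    for sha in git("rev-list", f"{prev_tag}..{build_tag}").splitlines():
        subject = git("log", "-1", "--format=%s", sha)
        reviewer = git("log", "-1",
                       "--format=%(trailers:key=Reviewed-by,valueonly)", sha)
        if not reviewer:
            print(f"UNREVIEWED {sha[:10]}: {subject}")
        if not re.search(r"\b[A-Z]+-\d+\b", subject):  # assumed JIRA-style id
            print(f"NO TICKET  {sha[:10]}: {subject}")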

Of course such measures rely on the honesty and good will of all participants. Someone with admin access to the build server or repository could wreak a lot of havoc. On the other hand, ordinary programmers don't need this kind of access.

o.m.
  • Sysadmins shouldn't have this kind of access either. If programmers need to re-check the work of other programmers, why shouldn't sysadmins re-check what the other admins do? – BlueWizard Jan 03 '16 at 06:25
2

In my (large) company, our laptops all use encrypted hard drives. We use a tool called Digital Guardian that monitors all file transfers/uploads and blocks the USB ports for writing files. Anyone who needs an exception has to use a hardware-encrypted USB drive (again, to prevent access to the files in case the drive is lost or stolen). Such exceptions are temporary, and one needs to justify the files being written (which are logged). Digital Guardian prevents the sending of most kinds of attachments to external email addresses. Any internal file exchange is done using web-based folders with access control and an audit trail of who accessed what; this means that sharing files "for business needs" is pretty seamless, and everything else is hard.

The system is not foolproof, and it does have an impact on bandwidth, but the auditing tools alone have turned up multiple instances of employees (often employees who were about to leave the company) attempting to take source code, design documents, etc. At least we are stopping some of these attempts.

And in case anyone thought that https uploads would be "invisible": all external web traffic is handled by a proxy-in-the-cloud that uses a MITM certificate to inspect https traffic. It's no doubt possible to circumvent these measures - there are a lot of clever employees - but sometimes it is enough to make your target harder than the other guys'.

Floris
  • Oh... do your workstations have those handy DVD drives? I accidentally opened the case when no one was looking, hot-swapped a second hard drive into the case, copied over my files, disconnected the drive, and I'm good to go :-) – Tschallacka Dec 29 '15 at 10:39
  • @MichaelDibbets No, writing to any unrecognized hard drive doesn't work. And if you were able to spoof a "known" disk, the writing of a bunch of files to another drive would be flagged, and you would not get past security at the exit without explaining why you needed to do that and where the disk is now. Again, something you could try to circumvent... but the question was "how do large companies..." not "what is a foolproof way to...". This is how one large company does it; as such, it answers the question. – Floris Dec 29 '15 at 14:08
  • Note also that this is really just more detail reflecting what paragraph 4 of the accepted answer states: "a few others are very serious... DLP..." – Floris Dec 30 '15 at 12:20
1

This is a tricky question, and because I've never worked in that industry, my idea might be theoretical or very impractical.

What if employees low in the hierarchy only got small parts of the source code? The explorer.exe team only needs access to the source code of explorer.exe. This would make stealing the source code from the inside harder, because you would need to be in a higher position to gain insight into larger parts of the source code.

Sure, you would need good documentation for all the parts the team might need but cannot see. Debugging would also become trickier, because you would have to combine precompiled components with the freshly compiled code from the explorer.exe team. There might be a build server that holds all the source code and can compile full versions of the product even for those who have only edited a small piece (because they need to test their changes). This server would also know who has access to which parts of the code.

I don't know whether this is a practical approach or whether it actually happens in the industry. It's just a concept that I think would make sense for preventing source code theft.
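
For what it's worth, here is a minimal sketch of what such a compartment check could look like on that build server, with a purely invented team-to-path mapping:

    #!/usr/bin/env python3
    """Sketch: verify a team's commit only touches paths in its compartment."""
    import subprocess
    import sys

    # Hypothetical mapping of teams to the path prefixes they may modify.
    OWNERS = {
        "explorer-team": ("shell/explorer/",),
        "kernel-team": ("base/ntos/",),
    }

    def touched_paths(rev):
        out = subprocess.check_output(
            ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", rev],
            text=True)
        return out.split()

    team, rev = sys.argv[1], sys.argv[2]
    # An unknown team owns nothing, so all of its changes get flagged.
    prefixes = OWNERS.get(team, ())
    outside = [p for p in touched_paths(rev) if not p.startswith(prefixes)]
    if outside:
        sys.exit(f"{team} touched paths outside its compartment: {outside}")
    print(f"{rev} stays within {team}'s compartment")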

BlueWizard
-1

The market value of software is a direct function of:

  • The value of the business cases it supports (the RoI, that is)
  • Market value of competitive products
  • The fact that it would continue to get enhanced and supported by the company
  • Brand of the company. Marketing and sales effort around the product
  • Number of existing satisfied customers, developers and community members
  • Features, LoC, number of patents, and the completeness and quality of the software

So the source code is just part of the game. Not the whole of it.

Source code itself is protected by

  • Legal contracts / T&Cs signed by employees and contractors
  • Patents
  • Decomposing the product into multiple black-box components, so that no single person or team has access to all of them
  • Using software-level and physical security mechanisms to protect access to the code
Santanu Dey
  • This does not answer the question. The OP explicitly states that "law" and "contracts" are not valid answers; the whole point of the question is to expand on the "software level and physical level security mechanisms". – schroeder Dec 29 '15 at 02:45