Keeping valuable algorithms secret

Question

Given a securities trading algorithm that is very effective and therefore very valuable, how do you keep it secret and protect it from theft or copying? Which techniques or architectures are best?

The algorithm will of course need some input data and output some data. It will also need maintenance and development.

Preferably it would be connected to the network, but then there are all kinds of threats and possibilities of security breaches.

People are working on different parts of the algorithm either as software developers, testers, admins or as users. People in themselves pose security risks in that they may copy the parts that they have access to.

Here are some of my thoughts up to now:

Some organizations have an internal network that is not connected to the external network except through humans and USB sticks. The Stuxnet worm shows that the USB stick approach is not enough to keep secrets secret. Humans have very limited bandwidth and tend to introduce a lot of transmission errors.

Some organizations, instead of a direct network connection, use an RS-232 serial link to the outside world that is simple to analyze and understand and has very limited bandwidth. If too much data is transmitted, they cut the connection until they have analyzed what was sent and why.
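To make the "cut the connection if too much data is sent" idea concrete, here is a minimal sketch of such a guard. Everything in it (the `BandwidthGuard` name, the byte budget, the window length) is made up for illustration; a real setup would sit in front of the actual serial port and would also log the traffic for review.

```python
import time


class BandwidthGuard:
    """Hypothetical guard: trips when more than max_bytes are sent per window."""

    def __init__(self, max_bytes, window_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        self.clock = clock
        self.events = []  # (timestamp, byte_count) pairs still inside the window
        self.tripped = False

    def allow(self, payload):
        """Return True if payload may be sent; once tripped, refuse until reset()."""
        if self.tripped:
            return False
        now = self.clock()
        cutoff = now - self.window_seconds
        # Forget transmissions that have aged out of the window.
        self.events = [(t, n) for (t, n) in self.events if t > cutoff]
        used = sum(n for _, n in self.events)
        if used + len(payload) > self.max_bytes:
            # Stay cut off until a human has reviewed what was being sent.
            self.tripped = True
            return False
        self.events.append((now, len(payload)))
        return True

    def reset(self):
        """Meant to be called by an operator after the traffic has been analyzed."""
        self.events.clear()
        self.tripped = False
```

The important design choice is that the guard does not recover on its own: once the budget is exceeded, the link stays down until someone calls `reset()`, which mirrors the "cut until analyzed" policy described above.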

Separating the valuable part from the standard, straightforward parts of the algorithm reduces the target area.

Running on OpenBSD may help avoid several types of problems.

Reverse proxy
Firewall
Up-to-date antivirus
Honeypots to detect intrusions
Storing everything on TrueCrypt volumes
Avoiding forgiving languages such as PHP, C, C++ and preferring languages that do additional checking such as C# or Java.

I think Warren Buffett has the ultimate security for his algorithm in that he keeps it in his head, and does not use a computer or a mobile phone.

I would tell you how I keep my valuable algorithms secret, but that would just give you ideas about how to access them... — , May 05 '12 at 00:29
"I think Warren Buffett has the ultimate security for his algorithm in that he keeps it in his head, and does not use a computer or a mobile phone." sounds like he just does old fashioned insider trading. — dr jimbob, May 06 '12 at 21:23

score 4 · Answer 1 · edited Mar 17 '17 at 13:14

There are two parts to this question:

Don't let people see the algorithm, implementation, etc. This is standard IT security, and I'll discuss below some possible defenses.
Don't let people learn anything from their interaction with the algorithm. This part may be a little less obvious, but if you allow people you don't trust to feed inputs to the algorithm and observe the outputs, then (depending upon the specifics of the situation) they might be able to learn something of your secret sauce. You'll have to determine whether that is indeed the case, and how to defend against it (e.g., by limiting access to inputs or outputs to the algorithm).
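As a rough illustration of limiting what someone can learn through inputs and outputs, the sketch below wraps the algorithm in a gate that caps the number of queries per client and returns only a coarse signal instead of the raw score. The names `QueryGate`, `client_id` and the buy/sell/hold mapping are assumptions for the example, not part of any real system, and whether this is enough depends entirely on the algorithm.

```python
from collections import defaultdict


class QueryGate:
    """Hypothetical wrapper: per-client query cap plus coarsened output."""

    def __init__(self, model, max_queries_per_client):
        self.model = model  # callable returning a numeric score (assumed)
        self.max_queries = max_queries_per_client
        self.counts = defaultdict(int)

    def query(self, client_id, features):
        # Refuse further probing once the client has used up its budget.
        if self.counts[client_id] >= self.max_queries:
            raise PermissionError(f"query budget exhausted for {client_id}")
        self.counts[client_id] += 1
        score = self.model(features)
        # Expose only the sign of the score, never the score itself.
        if score > 0:
            return "buy"
        if score < 0:
            return "sell"
        return "hold"
```

For example, `QueryGate(my_model, 100).query("desk-7", features)` would hand back only `"buy"`, `"sell"` or `"hold"`, and would raise `PermissionError` on the 101st call from the same client.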

As far as how to protect the algorithm and its implementation, there is a range of possible defenses. Each has its own tradeoffs, and you'll need to evaluate them in the context of their business impact as well as the degree of risk you are facing.

Isolation. Run the code on its own server. Store the source code on a totally separate system. Choose a limited group of developers who are given access to it. Keep it separate from all of your usual systems.
Hardening. Configure the server that runs this code in a highly secure way. Don't run anything else on the server. Lock down the configuration. Set up firewalls and strict access control. If you search on this site, you'll find a great deal of information on server hardening.
Network separation. Run the server on a local internal network only, which is completely disconnected from the Internet (no machine on that local network is connected to the Internet or, transitively, to anything that is connected to the Internet). I recommend using strict red/black separation, as it helps ensure separation.
Airgapping. Alternatively, disconnect the server from every network entirely, and use only sneakernet. However, instead of carrying data on USB sticks, I would recommend using write-once media (e.g., CD-R, DVD-R, etc.), as that reduces the risk of viruses travelling in the wrong direction. Turn off autorun and similar features. Set up software on the critical server that checks the contents of the disc before using it (e.g., checking that it is of the expected type, has the expected format, and so forth).
Physical security. Restrict access to the servers on which this algorithm and code runs and lives, to a limited set of trusted staff.
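The media check mentioned under airgapping could look something like the following sketch. It assumes, purely for illustration, that incoming discs should contain only CSV files with a fixed header; `EXPECTED_HEADER`, `MAX_FILE_BYTES` and `validate_media` are hypothetical names, and a real check would validate every row, not just the header.

```python
import csv
from pathlib import Path

# Hypothetical expectations about what an incoming disc may contain.
EXPECTED_HEADER = ["timestamp", "symbol", "price", "volume"]
MAX_FILE_BYTES = 50 * 1024 * 1024


def validate_media(root):
    """Return a list of problems found under root; an empty list means all files passed."""
    problems = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink():
            problems.append(f"{path}: symlinks are not allowed")
            continue
        if path.is_dir():
            continue
        if path.suffix.lower() != ".csv":
            problems.append(f"{path}: unexpected file type")
            continue
        if path.stat().st_size > MAX_FILE_BYTES:
            problems.append(f"{path}: file too large")
            continue
        try:
            # Only the header line is inspected in this sketch.
            with path.open(newline="", encoding="ascii") as handle:
                header = next(csv.reader(handle), None)
        except UnicodeDecodeError:
            problems.append(f"{path}: not plain ASCII text")
            continue
        if header != EXPECTED_HEADER:
            problems.append(f"{path}: unexpected header")
    return problems
```

The idea is to reject the whole disc if `validate_media` returns anything at all, rather than importing the files that happened to pass.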

See also "Contract requires me to work with sensitive IP, severe penalties for loss of data to 3rd parties. Some protection pointers please" for some additional techniques that may be of interest (even though the situation is a bit different).

score 0 · Answer 2 · answered May 06 '12 at 21:15

Have an internal network disconnected from all others with USB connections etc. shut off, & only allow data to travel in using finalised CDs / DVDs? Depends how much data you need to get off of it as well.

Nothing's going to be 100% foolproof ("The only secure computer is in a locked room, disconnected from the internet, turned off & unplugged" etc. etc...), but this might at least be a step in the right direction.
