Data loss protection is a major concern in every industry. The software engineering process has multiple points of potential data loss, because parties other than the client and the software development team are involved: external testing agencies, other software vendors, consulting agencies, etc. From the requirements analysis document to the source code and beyond, all software artifacts contain information that the client may consider sensitive.

There have been considerable efforts to protect application-generated data using techniques such as k-anonymity and l-diversity. A quick summary is here.
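For readers unfamiliar with the term: a table is k-anonymous when every combination of quasi-identifier values appears in at least k rows. A minimal sketch of that check follows; the records, the column names (`zip`, `age`, `diagnosis`) and the function name `is_k_anonymous` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in counts.values())


# Hypothetical records whose quasi-identifiers have already been generalized.
records = [
    {"zip": "130**", "age": "20-29", "diagnosis": "flu"},
    {"zip": "130**", "age": "20-29", "diagnosis": "asthma"},
    {"zip": "148**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "148**", "age": "30-39", "diagnosis": "diabetes"},
]

# Each (zip, age) group contains two rows, so the table is 2-anonymous on those columns.
print(is_k_anonymous(records, ["zip", "age"], k=2))
# Adding diagnosis makes every combination unique, so the check fails.
print(is_k_anonymous(records, ["zip", "age", "diagnosis"], k=2))
```

This kind of check relies on the data having named columns, which is exactly what free-form documents and source code lack.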

But what options, best practices, or tools are available for protecting sensitive information that sits in software artifacts (e.g. analysis documents, source code, documentation) in a rather unstructured format? (Apart from NDAs and good faith, of course...)

Tathagata
  • It would help folks if you were to edit your post to link to info about those fancy techniques, e.g. [k-anonymity](http://privacy.cs.cmu.edu/dataprivacy/projects/kanonymity/kanonymity2.html) – nealmcb Jan 21 '11 at 20:55

1 Answer

I recently blogged about how I deal with my clients' source code. Essentially: version control for auditing, encryption for, well, encryption. Compiled artifacts are stored on the same encrypted filesystems. If they have to go onto other devices, they are encrypted there where encryption is available, and deleted as soon as they are no longer in use.

A similar story is true for documents: I work on threat models for some clients, and they're managed in pretty much the way described above.

I guess what I'm saying is that there isn't really anything intrinsic to software engineering tools that stops data leakage. A good SCM installation can limit and track who accesses confidential data, but once it's checked out you're on your own. Some DLP software can be configured to treat source code files as confidential, which may be worth checking out. But if you're working with consultants or subcontractors, then you have a contract with them, and you get whatever level of detection/enforcement is mutually acceptable under the terms of that contract.
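As a rough illustration of what "treating source code files as confidential" could look like at its simplest, here is a sketch that flags lines mentioning terms a client considers sensitive. It is not how any particular DLP product works; the function name `find_sensitive_lines`, the term list, and the sample text are all hypothetical.

```python
import re


def find_sensitive_lines(text, terms):
    """Return (line_number, line) pairs for lines mentioning any term, ignoring case."""
    if not terms:
        return []
    # Escape each term so it is matched literally rather than as a regex.
    pattern = re.compile("|".join(re.escape(term) for term in terms), re.IGNORECASE)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical source file contents and client-specific watch list.
sample = """\
# Billing module for AcmeBank
def charge(account):
    return gateway.post(account)
API_KEY = "not-a-real-key"
"""

hits = find_sensitive_lines(sample, ["acmebank", "API_KEY"])
for number, line in hits:
    print(f"line {number}: {line}")
```

A check like this could run as a pre-commit hook or before an artifact is handed to a third party, but it only catches terms someone thought to list, which is why the contract still matters.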

  • +1 for the contracts piece. A lot of what I deal with on a daily basis comes down to contracts and SLAs - you do what you can at a technical level, but some things need to be a promise backed by punitive costs, and for those scenarios a contract is key! – Rory Alsop Jan 21 '11 at 01:01
  • @Rory as I come at contracts from the other end, I usually aim to limit my liability to the value of the contract. I don't think I'd take one on that had punitive charges. –  Jan 21 '11 at 01:08
  • I'm on the same end as you, but I've seen some interesting contracts. Sometimes the value is high, but it's a difficult decision to weigh against the risk... :-) – Rory Alsop Jan 21 '11 at 01:48
  • It sounds like he is looking for something more powerful, e.g. see [k-anonymity](http://privacy.cs.cmu.edu/dataprivacy/projects/kanonymity/kanonymity2.html), or [k-anonymity and l-diversity | History of an Idea: Missing Data](http://missingdata.wordpress.com/2007/08/23/k-anonymity-and-l-diversity/), which it doesn't seem like you've covered here. Not that I know if it can be covered... – nealmcb Jan 21 '11 at 20:57
  • @nealmcb - you are right. Added your link in the main question. But unlike what the paper addresses, my question is aimed at data in non-relational form: say, documents that hold business-critical information for the client, which the security expert and business analyst would like to redact from the requirements document while the developer/tester frowns on such redaction. Are there standardized procedures or tools that satisfy both parties by balancing security and understandability? – Tathagata Jan 21 '11 at 22:47