One solution could be to use git's smudge and clean filters.
This solution has some serious downsides though, which we will come to later.
You'll need to write two scripts that act as filters, i.e. read from standard input and write to standard output. From the documentation:
A filter driver consists of a clean command and a smudge command, either of which can be left unspecified. Upon checkout, when the smudge command is specified, the command is fed the blob object from its standard input, and its standard output is used to update the worktree file. Similarly, the clean command is used to convert the contents of worktree file upon checkin.
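To make the filter contract concrete, here is a minimal sketch using a toy pair of filters. The function names and the rot13 transformation are assumptions for illustration only: rot13 is not encryption, it merely shows that a clean/smudge pair reads standard input, writes standard output, and round-trips the content.

```shell
#!/bin/sh
# Toy filter pair: rot13 via tr. This is NOT encryption; it only
# illustrates the stdin -> stdout contract of clean and smudge.
clean()  { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }   # worktree form -> stored form
smudge() { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }   # rot13 is its own inverse

stored=$(printf 'hello\n' | clean)             # what would end up in the object store
restored=$(printf '%s\n' "$stored" | smudge)   # what a checkout would write back

echo "stored:   $stored"
echo "restored: $restored"
```

Note that clean gives the same output for the same input every time. A real encryption filter should have that property too, otherwise unchanged files can show up as modified.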
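For example, when using openssl we can write the files fooenc.sh:
#!/bin/sh
# -nosalt keeps the output deterministic, so an unchanged file always encrypts to the same blob
openssl enc -bf -nosalt -pass pass:1KjeHD8d6YUI80bIIEAQ9iYr@njqLw3T
and foodec.sh:
#!/bin/sh
# -d decrypts; the other options must match those in fooenc.sh
openssl enc -bf -nosalt -d -pass pass:1KjeHD8d6YUI80bIIEAQ9iYr@njqLw3T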
Note that these scripts should be kept outside of the repository and that they should be kept secret since they contain the key! Otherwise they're convenient because they don't ask for a passphrase every time they are called.
A somewhat more secure alternative might be to use GPG.
In the .git/config file in your repository you should specify these filters:
[filter "crypt"]
clean = fooenc.sh
smudge = foodec.sh
This is not a typo! See the documentation excerpt above. This setup encrypts data on checkin, and decrypts on checkout.
Then in the repository's .git/info/attributes file, you specify that this filter is used for all files:
* filter=crypt
As long as the filtering scripts are available, the working directory will contain readable files. But the git objects will be encrypted.
Note that this precludes actually using the files on any machine that doesn't have the necessary scripts. So bitbucket would only work as a storage.
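To check that the working directory stays readable while the stored objects are filtered, something like the following sketch can be used. It builds a throwaway repository and, as an assumption for the demo, uses a toy rot13 filter in place of fooenc.sh and foodec.sh so that it runs without openssl; the file name note.txt is hypothetical.

```shell
#!/bin/sh
set -e
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Toy rot13 filters stand in for fooenc.sh / foodec.sh (NOT encryption).
git config filter.crypt.clean  "tr 'A-Za-z' 'N-ZA-Mn-za-m'"
git config filter.crypt.smudge "tr 'A-Za-z' 'N-ZA-Mn-za-m'"
mkdir -p .git/info
echo '* filter=crypt' > .git/info/attributes

echo 'hello' > note.txt
git add note.txt

attr=$(git check-attr filter -- note.txt)   # which filter applies to the file
worktree=$(cat note.txt)                    # readable working copy
stored=$(git cat-file -p :note.txt)         # raw blob in the index, no smudge applied

echo "$attr"
echo "worktree: $worktree"
echo "stored:   $stored"
```

With the real openssl scripts configured instead, the stored blob would be binary ciphertext rather than rot13 text, so piping it through a hex dump is more practical than printing it.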
Now for the downside: this solution also makes tools like git diff, and everything that depends on it, useless, since git's objects are now encrypted blobs.
Edit:
There are utilities like git-crypt or git-encrypt to help you with encrypting your repo's contents.
And there is a solution to the diff problem: use a special filter for diffs, i.e. textconv with an additional script that decrypts the blobs before they are diffed.
I'm going to say this is not possible. Git relies on the content of the files in your repository being the same in order to do what it does (everything is based on hashes of the files). If you encrypt the file, you change them, and break the system. If you don't trust BitBucket, don't host your code there. – heavyd – 2013-11-15T20:17:02.133
@heavyd: Obviously if TheDude encrypts all his files, he will need to refresh them all in BitBucket – once. As long as he uses the same encryption tool (algorithm) with the same key, why would it be a problem after the initial hurdle? An unchanged file, encrypted with the same key, is still unchanged. – Scott – 2013-11-15T21:14:05.847
@Scott, if BitBucket were just plain cloud storage like Dropbox then there would be no problem, but BitBucket is a source code repository which is built on top of SCM systems like Git and Mercurial. Being based on Git is what makes this difficult. Git won't work well with the encrypted data. – heavyd – 2013-11-15T21:27:38.693