124

I tend to use Git for deploying production code to the web server. That usually means that a master Git repository is hosted somewhere accessible over SSH, and the production server serves a clone of that repository, with access to .git/ and .gitignore restricted. When I need to update it, I simply pull from the master repository into the server's repository. This has several advantages:

  1. If anything ever goes wrong, it's extremely easy to roll back to an earlier revision - as simple as checking it out.
  2. If any of the source code files are modified, checking is as easy as git status, and if the server's repository has been modified, it will become obvious the next time I try to pull.
  3. It means there exists one more copy of the source code, in case bad stuff happens.
  4. Updating and rolling back is easy and very fast.
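To make advantage 2 concrete, here is a throwaway sandbox (all paths, file names, and contents are invented for illustration) showing how git flags an unexpected edit to a deployed file:

```shell
# Sandbox demonstration: git immediately shows unexpected edits
# to deployed files. All paths here are temporary stand-ins.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q
git config user.email dev@example.com
git config user.name dev
echo '<?php echo "ok";' > index.php
git add . && git commit -qm 'deploy'

echo 'evil code' >> index.php   # simulate tampering on the server
git status --short              # prints " M index.php"
```

Any file modified after deployment shows up as ` M`, and `git diff` shows exactly what changed.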

This might have a few problems though:

  • If for whatever reason the web server decides it should serve the .git/ directory, all the source code, past and present, becomes readable to everyone. Historically, some (large) companies have made that mistake. I'm using an .htaccess file to restrict access, so I don't think there's any danger at the moment. Perhaps an integration test making sure nobody can read the .git/ folder is in order?

  • Anyone who accidentally gains read access to the folder also gains access to every past revision of the source code that used to exist. But that shouldn't be much worse than having access to the present version. After all, those revisions are obsolete by definition.
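For reference, the kind of .htaccess rule the question alludes to might look like this (Apache 2.4 syntax; a sketch, not the asker's actual configuration):

```apache
# Return 404 for any URL containing /.git (covers .git/ and .gitignore)
RedirectMatch 404 /\.git
```

Answering 404 rather than 403 has the side benefit of not confirming that the repository exists at all.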

All that said, I believe that using Git for deploying code to production is reasonably safe, and a whole lot easier than rsync, ftp, or just copying it over. What do you think?

Septagram
  • Wouldn't standard practice have your git repository exist one level up from your htdoc folder so accidentally serving it out wouldn't happen? – Fiasco Labs Nov 14 '13 at 08:13
  • That's a good suggestion, and I'm doing just that for a latest personal project. But on another project my clients repository is organized differently (with htdoc being at the root of repository), so the question still applies. – Septagram Nov 14 '13 at 08:20
  • How do you ensure changes made after you deploy to test won't get deployed to production? – Graham Hill Nov 14 '13 at 10:05
  • So far I didn't have a chance to work in a team with a dedicated testing server and QA (probably bad). I test thoroughly on the dev machine and only deploy new features in specific iterations. Then if there's a need for bugfixes before the next iteration, they are done on a separate branch. E.g., version 0.6 is released, work starts on 0.7. A bug is found in 0.6, the fix is applied in branch 0.6 and merged into the master branch. Production then pulls branch 0.6. And until 0.7 is released, all fixes go first into branch 0.6. Sometimes (not always) I create a separate branch for production version ahead – Septagram Nov 14 '13 at 10:38
  • In a traditional enterprise environment this practice would be treated with scorn. Still, as you say, there are few technical problems with it. In fact, the more general approach of deploying quickly and often is gaining traction. It even has its own buzz word: Continuous Deployment – paj28 Nov 14 '13 at 12:47
  • Is this *really* a security question? – tylerl Nov 15 '14 at 00:03
  • You could always use a package manager suite like Gradle, Maven, etc. to deploy your files to the HTDOCS folder so you specifically ignore .git, etc. Update your Git on the server in a separate folder, then run your PM to deploy it to the proper location. It's why they exist. – Shane Andrie Sep 16 '16 at 19:42
  • I'm surprised that the security issue you're concerned about is the .git folder, while I would actually be more afraid that such a process means your git credentials are stored on the web server and so, if a breach exists in your website, then anyone can access the Git repository (let's hope these credentials are read-only tho) – Xenos Aug 25 '18 at 18:50

7 Answers

80

I would go as far as considering using git for deployment very good practice.

The two problems you listed have very little to do with using git for deployment itself. Substitute .git/ for a config file containing database passwords and you have the same problem. If I have read access to your web root, I have read access to whatever is contained in it. This is a server hardening issue that you have to discuss with your system administrator.

git offers some very attractive advantages when it comes to security.

  1. You can enforce a system for deploying to production. I would even configure a post-receive hook to automatically deploy to production whenever a commit to master is made. I am assuming, of course, a workflow similar to git flow.

  2. git makes it extremely easy to rollback the code deployed on production to a previous version if a security issue is raised. This can be helpful in blocking access to mission-critical security flaws that you need time to fix properly.

  3. You can enforce a system where your developers have to sign the commits they make. This can help in tracing who deployed what to production if a deliberate security flaw is found.
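The post-receive hook from point 1 can be sketched end-to-end in a throwaway sandbox (all paths and the site content below are invented stand-ins; a real server would use persistent locations like /var/git and /var/www):

```shell
# Sandbox: deploy-on-push via a post-receive hook on a bare repo.
set -e
BASE=$(mktemp -d)
BARE="$BASE/site.git"; WWW="$BASE/www"; DEV="$BASE/dev"
mkdir -p "$WWW"
git init -q --bare "$BARE"

# The hook: check out into the web root only when master is updated
cat > "$BARE/hooks/post-receive" <<HOOK
#!/bin/sh
while read oldrev newrev ref; do
  if [ "\$ref" = "refs/heads/master" ]; then
    git --git-dir="$BARE" --work-tree="$WWW" checkout -f master
  fi
done
HOOK
chmod +x "$BARE/hooks/post-receive"

# Simulate a developer pushing a release
git init -q "$DEV" && cd "$DEV"
git config user.email dev@example.com
git config user.name dev
echo '<h1>hello</h1>' > index.html
git add . && git commit -qm 'initial release'
git branch -M master
git push -q "$BARE" master

cat "$WWW/index.html"   # the pushed content is now in the web root
```

Pushes to any other branch leave the web root untouched, which gives you the "everything in production went through source control" property.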

  • Nice answer, I am really looking forward to using git in my production servers! However, I'm reluctant to use automatic push as you (and several others I read) suggested. Automatic push can change your production server when you don't want it to, and you could accidentally leave your server offline. Maybe I'm missing something and overcomplicating matters, but I prefer the paranoid approach of ssh'ing into the production server, doing a `git fetch origin master`, then `git diff master origin/master`. Only then would I do `git merge origin/master --ff-only`. Do you have any thoughts about this matter? – pedromanoel Jun 11 '14 at 13:07
  • @pedromanoel I subscribe to the school of thought where anything you push to master should be production ready, so that shouldn't really be a concern. What you are suggesting works as well. – Jun 17 '14 at 11:51
  • Our master branch commits deploy to the Dev Web server. We have a QA, Staging and Production branch that map to their respective environments. When we want to release to production we simply have to merge in the tested code into the production branch. This has worked well for us. We use KUDU for this. Not sure what other tools are available. – RayLoveless Jun 19 '15 at 20:44
  • @TerryChia Can you suggest any tutorials / tools for setting up your post-receive hooks for automatic deployments? We use Kudu but I've found it tricky to set up for non-Azure sites (we have our own IIS servers on prem). – RayLoveless Jun 19 '15 at 21:02
27

There's nothing wrong with deploying from a git repo; in fact, it's a pretty common practice and, as you say, a lot less prone to errors than copying files over ftp or rsync.

Given the information you've provided I'd note the following points:

  • Don't just pull in the latest master. Production should be deployed from a release tag. Use git flow or similar to get a little more process around the deployment of the code and the creation of tags. Because tags are an immutable reference to your code at a given time, they're more stable than pointing to a master branch that could be updated by an errant commit.

  • As for serving the .git directory, this shouldn't be a big issue. Just redirect anything prefixed with .git to a 404 in .htaccess.

  • Your git authentication should be ssh key based so no repo passwords need to be stored on the server.
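The tag-based deployment from the first point can be sketched in a disposable sandbox (the repository, the tag name v1.2.0, and the file contents are all made up for illustration):

```shell
# Sandbox: production pins to an immutable release tag,
# not to the moving tip of master.
set -e
REPO=$(mktemp -d)
cd "$REPO"
git init -q
git config user.email dev@example.com
git config user.name dev
echo 'stable' > app.txt
git add . && git commit -qm 'release'
git tag -a v1.2.0 -m 'release 1.2.0'

echo 'unreviewed change' > app.txt
git commit -qam 'work in progress'   # an errant commit lands on master

# On the production checkout, deploy the tag, not the branch:
git checkout -q v1.2.0
cat app.txt        # prints "stable", not the unreviewed change
```

On a real server this would be `git fetch --tags && git checkout v1.2.0` in the deployed clone.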

Hurray for git deployment workflows!

Matt Surabian
21

I'm going to disagree with popular opinion here. Git is for version control, it is not for deployment/CI.

The methods people are advocating here are fine for small projects, but generally speaking, do not scale very well.

...which is not to say that you should not continue to do what you're doing. Just keep in mind, as your career progresses, that the projects you're working on will likely outgrow a purely git-based deployment workflow.

The main shift in paradigm is to stop thinking about deploying branches, and start thinking about deploying build results, and injecting environment dependencies to decouple your infrastructure from your branching strategy.

For example, take the paradigm above about 'deploying files' and rollback. If you were deploying files, you could keep multiple versions of the application on the production server, and if you need to roll back, you can point your web server at an older version by re-symlinking the web root. Two shell commands, microseconds of downtime, less room for error than using git.
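The symlink-swap rollback described here can be sketched as follows (the $APP directory stands in for a real location such as /srv/app, and the build names are invented):

```shell
# Sandbox: rollback by re-pointing the web root symlink.
set -e
APP=$(mktemp -d)                 # stand-in for e.g. /srv/app
mkdir -p "$APP/releases/build-41" "$APP/releases/build-42"
ln -s "$APP/releases/build-42" "$APP/current"   # currently serving build-42

# Rollback: atomically repoint the web root at the previous build,
# then reload the web server (e.g. `systemctl reload nginx`).
ln -sfn "$APP/releases/build-41" "$APP/current"
readlink "$APP/current"          # now points at build-41
```

The web server's document root is configured once, at $APP/current, and never touched again; `ln -sfn` replaces the link in a single step.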

Another example - you have the standard dev branch > staging branch > master workflow, with a server for each. You have stuff ready to go in dev, but some other stuff on staging failed QA. So you need to do some ugly operations - strip the bad commits from staging, redeploy staging, and also strip or fix the bad commits in development, and hope that development and staging don't end up out of sync.

Instead, what if you cut a release branch off master, merged the stuff that is ready to go into that release branch and built the result, spun up a testing server on AWS and deployed the result to it, ran your QA and functional tests against that temporary staging server, and then pushed that same result to the production cluster and deployed it? Then decommission the temporary staging box you spun up, merge the release into master, and tag master with a release number.

Obviously those are extreme examples. Usually it's not that clean, even if we want it to be - you probably need to run some build processes in the target environment because you have database changes that need to happen. And if you've got non-idempotent database mods between versions, well, rollback isn't going to be that easy no matter what method you use, because you'll need to roll back the db (generally speaking, try to deploy idempotent database changes, then remove the obsolete database bits in a later release, after you're sure they're out of application flow).

Anyway, I guess what I'm getting at, is that my answer to 'Is using git for deployment bad practice?' is, generally speaking, 'Yes', in the same way that using training wheels is bad for bike riding. It's an improvement over busting your ass on the concrete repeatedly, but you hopefully outgrow it eventually.

siliconrockstar
  • I would also agree with this. I don't necessarily think it's bad, but there are some additional items of concern. 1. It puts *everything* in production. 2. `git checkout` is not transactional, which means that during deployment your system can be in an inconsistent state while individual files are written. As @siliconrockstar noted, git is an SCM system, not a deployment system. They solve two different problems. – Kevin Schroeder Nov 28 '18 at 14:40
  • @siliconrockstar do you have any further reading or resources on this? I've used this process before, but I'd like to see some best practice guides. Example, using Gitlab, would your CI pipeline bundle up the _entire_ application into a zip and rsync that into production? Dependencies and all (like vendor, node_modules, etc) or would they resolve on the server (I assume the former, as you then have an _entire_ snapshot of the application in one directory, and the prod server isn't doing any "work" except a symlink swap) – Chris Feb 17 '19 at 01:28
  • @Chris I wish I had some 'authoritative' resources for you, but the actual details vary drastically depending on project and stack. I work mostly with Magento these days and even their 'official' deployment flow using Magento Cloud does a few weird things that are probably not best practice (coupling branches and environments comes immediately to mind). In Magentoland, regarding dependencies, current Magento uses composer and npm, and IMO it's generally safe to let those tools do their thing on the deploy target(s), instead of copying all of vendor/ every deployment. – siliconrockstar Sep 04 '19 at 14:44
  • Cool cool, thanks @siliconrockstar! – Chris Sep 05 '19 at 01:57
10

You can use the --separate-git-dir=<git dir> argument when calling git clone. This places a plain-text .git file (a "gitlink" that points Git at the real repository - not a symbolic link as far as your OS is concerned) in the working tree, and you can specify <git dir> to be somewhere outside of the document root.
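A quick sandbox demonstration of what --separate-git-dir leaves in the working tree (all paths are temporary stand-ins):

```shell
# Sandbox: clone with the repository data kept outside the web root.
set -e
BASE=$(mktemp -d)
git init -q "$BASE/origin"
cd "$BASE/origin"
git config user.email dev@example.com
git config user.name dev
echo 'hello' > index.html
git add . && git commit -qm 'init'

# "$BASE/www" is the served tree; "$BASE/meta.git" holds the history:
git clone -q --separate-git-dir="$BASE/meta.git" "$BASE/origin" "$BASE/www"
cat "$BASE/www/.git"    # a one-line plain file ("gitdir: ..."), not a directory
```

Even if the web server leaked that one-line file, it would expose only a filesystem path, not the repository contents.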

Brendon
5

An alternative approach that I have taken before is something similar to Terry Chia's comment regarding post-receive hooks.

Git has a number of hooks that can be used to perform all sorts of tasks before/during/after multiple different actions.

Create a bare repository anywhere other than your web folder. The bare repository can then be used as a remote to push to, and a post-receive hook can be triggered to checkout the new code into a specified directory.

This approach has some advantages: The deployment location acts as a "one-way" street where code must go through source control to end up in production. You can deploy different branches/versions to different locations (all from the same bare repo). It also provides a nice, easy way to roll back using a standard git checkout. However, a caveat/consequence of this is that code changes on the server won't show up in git (bare repositories don't have working directories), so you'd have to manually run git --git-dir=/path/to/bare/repo --work-tree=/path/to/code status to see any changes (but you shouldn't be changing production code anyway, right?)

russdot
  • I don't think you need that introductory paragraph. It only tempts people to flag your post for deletion. Better to work on a good answer. – techraf Mar 29 '16 at 15:08
  • As a reader when I see "meant to be a comment" type answers I normally discount it as not useful and skip reading it. You have some useful content in this post. I would build up on that. Wash, rinse, and repeat until you have enough reputation to leave comments. – Bacon Brad Mar 29 '16 at 20:23
  • Thanks for the feedback! I removed the first sentence and expanded my answer. It ended up being too long for a comment anyway. – russdot Mar 30 '16 at 00:33
  • But why even create a repository when you can just push an archive of files? Like you can do the same thing with less overhead with some basic shell scripting. Also that was a rhetorical question, this is a good answer. – siliconrockstar Mar 26 '21 at 18:34
3

I agree with Terry Chia that it is a very good practice. It ensures:

  • that the revision in place is the right one,
  • that all the needed files are present and the version is complete,
  • and that the deploy procedure is fast and easy.

But, I have to add that there are caveats I feel like sharing.

Usually, into a git repository you put:

  • code,
  • docs,
  • unit testing,
  • deployment scripts,
  • integration tools,
  • .deb archives,
  • SQL patches,
  • or stuff like that, and maybe a bunch of other stuff

Well, in production you just want the code!

Docs, unit tests, tools, and .deb archives can take up a lot of disk space and have no business being in production.

With Subversion (before version 1.7) you could check out just the src/ directory and keep all the advantages. But with Subversion ≥ 1.7 and with Git, you cannot do that.

An alternative would be to use git submodules to create this kind of architecture:

project repo
  |- doc
  |- src  -> submodule to the project-src repo
  \- tests

But then your project/tests and project-src code would not be synchronized, and that would really suck.

The solution would then be to use "sparse checkout"; see: https://stackoverflow.com/questions/600079/how-do-i-clone-a-subdirectory-only-of-a-git-repository
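The sparse-checkout recipe from the linked answer can be sketched in a sandbox (the repository layout and file names are invented; this uses the older core.sparseCheckout mechanism, which works on most Git versions):

```shell
# Sandbox: a "production" clone that materialises only src/.
set -e
BASE=$(mktemp -d)
git init -q "$BASE/project"
cd "$BASE/project"
git config user.email dev@example.com
git config user.name dev
mkdir src doc
echo 'code' > src/app.php
echo 'manual' > doc/manual.txt
git add . && git commit -qm 'init'
git branch -M master

# Deployment clone: fetch everything, check out only src/
mkdir "$BASE/deploy" && cd "$BASE/deploy"
git init -q
git remote add origin "$BASE/project"
git config core.sparseCheckout true
echo 'src/' > .git/info/sparse-checkout
git pull -q origin master
ls        # only src/ appears; doc/ is never written to disk
```

Note that the full history is still fetched into .git; sparse checkout only limits what lands in the working tree.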

So you can use git, but please be careful of these drawbacks.

Thibault
  • I'd say introducing submodules/special checkouts/other weird stuff should be examined very carefully before making decisions. Usually your "other stuff" is not so big that it would be an issue. Much better when a developer can just clone the whole repo and instantly start working on it without doing magical dances to configure everything. P.S. And of course you shouldn't store binary blobs in your repo at all. – The Godfather Aug 18 '18 at 23:47
1

Here is the script I use to make git push to prod.

https://gist.github.com/Zamicol/f160d05dd22c9eb25031

It assumes that anything pushed to the master branch is ready for production. All other pushed branches are ignored. You can modify this hook script to suit your needs.

As far as your concerns, I place my git directory under /var/git and my production files somewhere else (like /var/www) and restrict access to these directories.

You could also have your git repo on a separate server from your production server and then use something like scp and ssh in the script above to move your files from the git server to the production server. This would allow you to still use git to push to prod while keeping your repository separate from your production server.

Zamicol