30

What's the best practice for deploying new code on a live (e-commerce) site?

Currently I stop Apache for roughly 10 seconds while I rename public_html to public_html_old and public_html_new to public_html, and then start Apache again. This causes a short period of downtime.

The same question applies when using Git to pull the new repo into the live directory. Can I pull the repo while the site is active? And what if I need to copy a DB as well?

While compressing the live site with tar (for backup purposes) I noticed that changes occurred in the media directory. That tells me that files keep changing periodically. Can these changes interfere with a deployment if Apache is not stopped?

T. Zengerink
nicoX

8 Answers

32

The fastest and easiest approach is to use versioned directories such as

/var/www/version/01
/var/www/version/02

and use a symbolic link to the current version as your html_root:

/var/www/html -> /var/www/version/02

This technique integrates perfectly with a revision control system (svn, git, mercurial, ...): you can check out branches and tags, change the symbolic link, and reload Apache. Downtime is minimal with this technique, and it allows very easy rollback.

It also integrates well with more complex deployment systems such as RPM packages, or with configuration management infrastructure (chef, puppet, etc.).
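
As a rough sketch of the switch and rollback described above: the helper below points the html link at a chosen release and refuses to switch to a release directory that does not exist. The function name activate_release and its two arguments are hypothetical, and the layout (version/NN under a base directory) is the one assumed in this answer. Replacing the link with ln -sfn is quick but is not guaranteed to be atomic.

```shell
# Hypothetical helper: point the "html" symlink at a given release.
#   $1 = base directory (e.g. /var/www), $2 = release name (e.g. 02)
# The body runs in a subshell, so its variables do not leak to the caller.
activate_release() (
    base=$1
    release=$2
    target="$base/version/$release"

    # Refuse to switch to a release that was never deployed.
    [ -d "$target" ] || { echo "no such release: $target" >&2; exit 1; }

    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file, not a directory
    ln -sfn "$target" "$base/html"
)

# Deploy:   activate_release /var/www 02
# Rollback: activate_release /var/www 01
```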

CloudWeavers
  • The simplest solutions are always the best... :-) Of course, don't forget to mention that some Apache config options such as FollowSymLinks may be needed. – peterh Aug 13 '14 at 07:50
  • Take special care in what @PeterHorvath said. Apache can get very grumpy when working with symlinked DocumentRoots. Be sure to test carefully! – mhutter Aug 14 '14 at 09:56
  • @mhutter Thanks :-) What is really problematic is that enabling FollowSymLinks on Apache can cause security problems... – peterh Aug 15 '14 at 12:17
  • Updating a symlink isn't an atomic operation. Even using something like `ln -snf` to clobber the original symlink, the underlying operation is an `unlink` and `symlink`. There's a chance of users getting a 404 during the update. This is no better than just renaming the original directory out of the way and renaming a new one into place (assuming you aren't crossing filesystems). See the answer above with a check mark next to it, which addresses this concern. – GargantuChet Aug 25 '14 at 00:19
14

Renaming the directories without shutting Apache down should work as well. That will shorten the window significantly. mv public_html public_html_old && mv public_html_new public_html should finish in a fraction of a second.

There are a couple of drawbacks. This approach will return a 404 for any request that manages to arrive during the window. And if you run the above command without a public_html_new directory in place, the second mv will fail and leave you with a site that returns 404 on every request.
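
A minimal sketch of guarding against that second failure mode, assuming the directory names used in the question. The helper name swap_docroot and its single argument (the parent directory) are hypothetical. It checks that the new tree exists, and that no stale backup is in the way, before touching the live directory.

```shell
# Hypothetical helper: swap public_html_new into place inside directory $1.
# The body runs in a subshell, so the cd does not affect the caller.
swap_docroot() (
    cd "$1" || exit 1

    # Abort before touching the live site if the new tree is missing.
    [ -d public_html_new ] || { echo "public_html_new is missing" >&2; exit 1; }

    # If a backup from an earlier deploy is still a directory, the first mv
    # would move the live tree inside it instead of renaming it.
    [ ! -e public_html_old ] || { echo "public_html_old already exists" >&2; exit 1; }

    mv public_html public_html_old && mv public_html_new public_html
)
```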

Doing it atomically with directories is not supported, but you can do it with symlinks. Instead of having a directory named public_html, have a directory named public_html.version-number and a symlink called public_html pointing to that directory. Now you can create a directory called public_html.new-version-number and a new symlink called public_html.new pointing to it.

Then you can rename public_html.new to public_html to switch atomically. Note that mv is "too intelligent" to perform that rename, but it can be done using os.rename from Python or anything else that calls the rename system call without trying to be smart.
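
The switch can also be sketched in shell, assuming GNU coreutils, whose mv accepts -T (--no-target-directory) to treat the destination as a normal file rather than as a directory to move into. The helper name switch_symlink is hypothetical, and the sketch assumes the link and its temporary sibling live in the same directory.

```shell
# Hypothetical helper: repoint symlink $1 at directory $2 with one rename.
# Assumes GNU mv (for -T). Runs in a subshell so variables stay local.
switch_symlink() (
    link=$1
    target=$2

    [ -d "$target" ] || { echo "no such directory: $target" >&2; exit 1; }

    # Build the new link under a temporary name next to the live one.
    # -n keeps ln from descending into a leftover "$link.new" symlink.
    ln -sfn "$target" "$link.new" || exit 1

    # With -T, mv renames the temporary link over the live link instead
    # of moving it into the directory the live link points to.
    mv -T "$link.new" "$link"
)

# Example: switch_symlink /path/to/public_html /path/to/public_html.new-version-number
```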

What to do with the database depends on which database you are using and what you are using it for. You need to provide a lot more detail about the database before we can give you a good answer to that part of your question.

kasperd
  • On my Debian system, `mv` has a `-T` option that keeps it from following the symlink. This will let you atomically rename `public_html.new` over `public_html`, assuming both are soft links. – GargantuChet Aug 25 '14 at 00:21
13

Using a load balancer is a good idea. If the site is important enough to worry about a few seconds of downtime, it's important enough to worry about fault tolerance.

That aside, if this is on a UNIX system you can put Apache on hold during the rename (or symlink update, etc.):

killall -STOP httpd  # Pause all httpd processes
mv public_html public_html_orig
mv public_html_new public_html
killall -CONT httpd  # Resume all httpd processes

This will keep Apache from accepting new requests during the rename. If you prefer symlinks or some other approach, the same idea can be used:

killall -STOP httpd  # Pause all httpd processes
rm /var/www/html
ln -s /var/www/version/03 /var/www/html
killall -CONT httpd  # Resume all httpd processes

Note that any pending connections or packets will queue up in the OS. For an extremely busy site, consider tuning ListenBacklog if appropriate for your httpd worker type, and check your OS settings related to TCP listen backlog.

You could also change DocumentRoot in httpd.conf and do a graceful restart (apachectl graceful). The drawback here is the increased risk of errors, since you'd have to update any Directory configuration as well.
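
If a deployment script aborts between the two signals, Apache stays paused. A minimal sketch of a wrapper that always sends CONT when it exits, assuming the process IDs are passed in by the caller (for example from pgrep -x httpd; the process may be named differently on some distributions). The helper name run_paused is hypothetical.

```shell
# Hypothetical helper: pause the given PIDs, run a command, then resume.
#   $1 = whitespace-separated list of PIDs, remaining args = command to run
# Runs in a subshell so the EXIT trap fires when the helper finishes.
run_paused() (
    pids=$1
    shift

    [ -n "$pids" ] || { echo "no PIDs given" >&2; exit 1; }

    # Resume on every exit path, including a failed command.
    # $pids is left unquoted on purpose so it splits into separate PIDs.
    trap 'kill -CONT $pids' EXIT

    kill -STOP $pids || exit 1
    "$@"
)

# Example:
#   run_paused "$(pgrep -x httpd)" \
#       sh -c 'mv public_html public_html_orig && mv public_html_new public_html'
```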

GargantuChet
  • Will the site still be running while Apache is paused? – nicoX Aug 15 '14 at 09:00
  • It stops giving CPU time to Apache. If you try to access the site in a browser while Apache is paused, the browser will wait to connect until Apache is resumed (or until the browser times out, if Apache is paused longer than the timeout period). If someone is in the process of downloading a file, Apache will stop sending data while it is paused, again because it's not getting any CPU time. Again, this will only cause problems if Apache is stopped for so long that the transfer times out. – GargantuChet Aug 15 '14 at 15:00
  • Put another way, the site will be unresponsive while Apache is paused, but pending operations will finish when it is resumed. Users will not get "connection refused", and downloads will not break, but operations will only continue when Apache is resumed. This will ensure that existing transactions can finish, but new requests will only be handled after your new content is moved into place. – GargantuChet Aug 15 '14 at 15:04
  • Please note that on any high-traffic website, this could very easily kill your Apache service. 200 rq/s will very easily trash your connection pool as soon as you 'unlock' your Apache process after the move (if the move takes a while) – CloudWeavers Aug 18 '14 at 18:05
  • On a high-traffic site, there will be plenty of in-flight requests to finish when Apache is resumed. This will stagger the processing of new requests. It's also a good argument for making sure your Apache settings (max number of threads / servers / clients) are reasonable, and tuning the TCP backlog accordingly. Although I'm confused about what you mean by "killing" the service. Apache is very tunable. – GargantuChet Aug 19 '14 at 02:06
  • My point was that the question stipulates that he wants to remove the downtime. This just artificially hides it from the user and actually lengthens it, due to backlog processing. Your downtime, using this method, is longer than it would have been with his previous procedure. To be able to withstand this technique you need to be able to hold a very high 'free capacity' pool in your production system (i.e. able to deal with 5x the current traffic). This isn't cost effective. There are better ways to scale if you need burstable capacity. – CloudWeavers Aug 21 '14 at 12:53
  • What would you suggest instead? Let users get a 404 while the content is renamed (or the symlink is removed / recreated, etc.)? Shut the server down, which means waiting for pending transfers to wrap up while not refusing new requests? Wrapping the rename / symlink update in the STOP / CONT signals prevents either of these from happening. – GargantuChet Aug 21 '14 at 14:44
  • You don't need to restart the server, just reload it. The delays are created by the 'mv' command. Look at my suggestion with 28+ votes below: you just need a symbolic link (you can even put it one level lower, i.e. /var/www/(current)/html, current being the symlink). – CloudWeavers Aug 24 '14 at 02:22
  • So you would let the users get a 404 between the time the original symlink is removed and the time the new symlink is created. Got it. If you look at my accepted answer above you'll note that this is an avoidable risk. – GargantuChet Aug 24 '14 at 23:24
  • After I proceeded with `killall -CONT httpd` the Magento site generated some errors (MySQL etc.) when I reloaded the page. After the next reload the site worked fine again. During the error I thought that something broke while httpd was paused. – nicoX Sep 03 '14 at 10:04
  • @GargantuChet, could you please expand your answer and add description for tuning the TCP backlog etc. – nicoX Sep 03 '14 at 15:14
  • Roughly how much time elapsed between the `-STOP` and the `-CONT`? The backlog would only come into play if you got some sort of "connection refused" error from the browser. – GargantuChet Sep 03 '14 at 19:27
  • @GargantuChet, less than 20 seconds. Yes, I got some sort of "connection refused" and MySQL generated errors. Don't know what it was related to, if it had something to do with the site being cached, or that after `-CONT`, `httpd` didn't run the site fast enough when I reloaded the web-site. – nicoX Sep 11 '14 at 11:44
11

Symlinks and mv are your friends. However, if you really need to avoid end users getting an error page while you deploy a new version, you should have a reverse proxy or a load balancer in front of at least 2 backend servers (Apache in your case).

During the deploy, you just need to stop one backend at a time, deploy the new code, restart it and then iterate on the remaining backends.

The end users will always be directed to good backends by the proxy.

Giovanni Toraldo
  • I was just working on this answer when I saw you already posted it. Balancer + 2 servers makes the process invisible and easier to recover from a bad upgrade... – Bart Silverstrim Aug 12 '14 at 19:37
9

If you apply changes to a production system regularly, I would set up a structured deployment life cycle. A good tool for this is Capistrano (http://capistranorb.com/), an open source solution for deploying software to one or more servers across several platforms and configurations.

For Magento there is even a plugin: https://github.com/augustash/capistrano-ash/wiki/Magento-Example

For a single server and almost seamless transitions, I recommend using symlinks.

Skiaddict
4

The way I do it is to commit my changes from my local dev environment to an online Git repository such as GitHub. My production environment runs off a checkout of that remote repository, so all I need to do is ssh to the server and run git pull to bring down the latest changes. No need to stop your webserver.

If you have files in your project whose settings and/or content differ from your local version (such as configuration files and media uploads) you can use environment variables and/or add these files/directories to a .gitignore file to prevent syncing with the repository.

harryg
3

My first idea is:

# deploy into public_html_new, and then:
rsync -vaH --delete public_html_new/ public_html/

A good solution is to use rsync. It transfers only the files that actually changed. Beware: the trailing slashes on the paths are important here.

Normally Apache doesn't need a restart; this is not the Java world. It checks every PHP file for changes on request, and rereads (and re-tokenizes) it automatically when it has changed.

git pull is similarly efficient, although it is a little harder to script. Of course, it also enables a wide range of merging and change-detection possibilities.

This solution works seamlessly only if there are no really major changes. With big changes in the deployment, some risk cannot be ruled out, because there is a non-negligible time interval during which the code is partially changed and partially not.

If there are big changes, my suggestion is your initial solution (two renames).


Here is a somewhat hardcore, but 100% atomic solution:

(1) mount the filesystem that holds your Magento installation a second time, at an alternate mount point:

mount /dev/sdXY /mnt/tmp

(2) bind-mount public_html_new over public_html:

mount --bind /path/to/public_html_new /path/to/public_html

From this point on, Apache will see your new deployment. Any chance of a 404 is impossible.

(3) do the synchronization with rsync, but on the alternate mount point:

rsync -vaH --delete /mnt/tmp/path/to/public_html_new/ /mnt/tmp/path/to/public_html/

(4) remove the bind mount

umount /path/to/public_html
peterh
  • Will the command delete public_html and deploy public_html_new into it? – nicoX Aug 12 '14 at 12:29
  • @nicoX No, it will copy _only_ the changes. – peterh Aug 12 '14 at 12:31
  • @nicoX It goes through _both_ directory structures, and if it finds a difference (new file, modified file, deleted file), it modifies the second directory to match the first as needed. The result is the same as if you had deleted public_html and then moved public_html_new into its place, _but_ without any possibility of a temporary 404 problem. – peterh Aug 12 '14 at 12:45
  • No, this is not a good idea. Depending on the changes, you might have a short period of time where the code in `public_html` is in an inconsistent state, and you don't want to take this chance. – Sven Aug 12 '14 at 12:48
  • @SvW You are right, my idea is only okay if there are only minor changes. I extended my answer accordingly. – peterh Aug 12 '14 at 12:51
2

Moving or replacing the public_html folder can be done with simple mv or ln -s commands (or equivalent) while your HTTP server keeps running. You can script this to significantly reduce the downtime, but if you automate the process, carefully check the return codes of the commands in the script.
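
A minimal sketch of what checking the return codes could look like, assuming the directory names from the question. The helper name deploy_with_rollback and its argument (the parent directory) are hypothetical. If the second rename fails, the previous tree is moved back so the site is not left without a document root.

```shell
# Hypothetical helper: rename the new tree into place inside directory $1,
# restoring the previous tree if the second step fails.
# The body runs in a subshell, so the cd does not affect the caller.
deploy_with_rollback() (
    cd "$1" || exit 1

    # Step 1: move the live tree out of the way; stop if that fails.
    mv public_html public_html_old || exit 1

    # Step 2: move the new tree into place; undo step 1 on failure.
    if ! mv public_html_new public_html; then
        mv public_html_old public_html
        exit 1
    fi
)
```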

That said, if you want to achieve no downtime, your application must also support it. Most applications use a database for persistence. Having version N of your application working against version N+1 of your data model (or the reverse) may break things if the development team has not planned for it.

From experience, maintaining such a consistency through upgrades is not a given for most applications. A proper shutdown, despite the downtime, is a good way to avoid consistency issues.

Uriel