24

I'm about to deploy ~25 servers running Debian. The machines will have different roles: web servers, Java appservers, proxies, MySQL boxes. The environment will probably not grow much in the future, maybe 2-5 more servers in the next 2 years.

I'll probably use fai for system installation, but I'm unsure whether it's worth also adding centralized configuration management with cfengine or puppet at such a small scale.

Does configuration management make sense for an environment this size?

HopelessN00b
pQd

7 Answers

29

I would recommend using a mixture of Debian pre-seeding, where you give the installer a text file that answers all the questions it would ask, and Puppet.

The reason for using preseeding rather than FAI is that you don't have to set up an image first and keep it up to date. You end up with an install very similar to what you would get by installing each machine by hand. When a new release comes out, you only have to update a config file with the changes rather than rebuild an image.

A configuration management tool is particularly useful where you have several servers performing the same role and you want them to be identical, e.g. a webserver cluster. However, it is also useful for configuring the base install of all servers. You're going to want to install particular packages on all your servers, like ntpd and an MTA. You're going to want to change a config file on all your servers. An additional benefit is that you can keep your manifests in something like subversion and so keep a record of what changed on a server, who changed it and why. Configuration management can also be a life saver when a server fails and you need to rebuild it quickly. Install the OS (using FAI or preseeding), install puppet and away it goes, built back exactly as it was before. Obviously you'll still need to keep backups of data.
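To make the "base install plus role" idea concrete, here is a minimal sketch in Python. It is not Puppet syntax and not how Puppet is implemented; it only illustrates the declarative approach of describing the desired state and letting a tool work out what is missing. The package names and the `BASE_PACKAGES`, `ROLE_PACKAGES` and `plan` names are assumptions made up for the illustration.

```python
# Hypothetical desired state: packages every server gets, plus per-role extras.
BASE_PACKAGES = {"ntp", "postfix"}
ROLE_PACKAGES = {
    "web": {"apache2"},
    "db": {"mysql-server"},
}


def plan(role, installed):
    """Return the sorted list of packages still missing for a server of `role`.

    `installed` is the set of package names currently on the machine.
    An empty result means the machine already matches the desired state.
    """
    desired = BASE_PACKAGES | ROLE_PACKAGES.get(role, set())
    return sorted(desired - set(installed))


if __name__ == "__main__":
    # A freshly installed web server that only has ntp so far.
    print(plan("web", {"ntp"}))
```

Running the same plan against a machine that is already converged returns an empty list, which is what makes it safe to run repeatedly on every server.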

Configuration management takes discipline, since every change has to go through it, and there is an upfront cost to setting things up, but once you have a working setup you won't regret it.

Puppet is the more modern of the two tools you've mentioned. I really recommend it to anyone. The configuration is written in a declarative language, and it is easy to build up higher level constructs. There is also a very large community around it, and there are always people willing to help on the mailing list or the IRC channel.

David Pashley
  • thanks for hint about pre-seeding. i'm taking a look at the docs about it right now. – pQd Jun 28 '09 at 10:50
  • FAI is old skool; I definitely wouldn't recommend it. Preseeding+Puppet ftw. – womble Jun 28 '09 at 11:12
  • We use FAI and cfengine, we have around 1000 machines and it works very well. It's worth noting that you can ssh into the machine as it builds itself, which can make writing the micro scripts a lot easier. – James Jul 08 '09 at 22:35
  • Good advice, we use a similar approach and it works well. However, I wouldn't dismiss FAI. FAI doesn't use images for installation (SystemImager does that). You have to set up a minimal nfs root directory that is used for running the FAI installer. The installation process is automated with configuration files and executing various user defined hooks. The advantage over preseeding is that the concept of FAI classes makes it easy to handle multiple servers (and even workstations) having different roles. – JooMing Nov 27 '10 at 15:38
  • FAI old school? NO! FAI is rock solid and has more than 10 years of experience. Have a look at the long list of FAI users at http://fai-project.org/reports – Thomas Lange Nov 01 '10 at 20:28
10

I'd recommend CFengine for any environment which is more than 2-3 boxes and where you have some concept of 'templates' or servers performing specific roles.

Why? Simply put, it reduces mistakes. You have a tool which will ensure file/directory permissions are correct everywhere in the environment, and when you come to roll out more servers, the tool applies the same configuration every time without the slips a human makes.

Contrast that with even a skilled System Administrator rolling out a web server at the end of a twelve hour shift when things have already gone wrong.... Are they likely to remember that nasty little configuration file which needs to go in /etc/random/location/foo/bar, without which the application will silently fail to do something rather important, like bill customers? :)

Tools like CFengine are also a great way to perform environment-wide security updates. Dropping a Nagios configuration (NRPE) onto all boxes is also a doddle. Whether you're dealing with five boxes or five hundred boxes you will save time with CFengine.

It is probably worth noting that my environment is a little larger. However, I've also deployed CFengine in smaller environments than the one you describe, hence the recommendation!

Your next question will probably be CFengine vs Puppet. That's a more difficult decision. I've always gone with CFengine because, in the early days, Puppet was somewhat immature, particularly around error logging. These days I'm really not sure; have a play and see. My specific issues with Puppet were SSL certificate related. I still painfully recall spending 3 hours diagnosing server <-> client connectivity issues in irc.freenode.net/#puppet, with some hefty RTFM and RTFS, only to find an error that was not being logged. Luke said, "Ah that's really difficult to fix" and never did. :(

nixgeek
  • good point. problem is in my case things are going to be highly specialized, number of templates [ because of redundancy ] will be probably around n/2 [ where n is total number of servers ]. – pQd Jun 28 '09 at 10:38
  • 1
    That's no bad thing, most of my WWW clusters are n+2 if not n/2 and you can be pretty flexible with CFengine in deploying nodes behind your load balancers like HAproxy. It's perfectly viable to manage IPVS and keepalive stuff too :-) Even with n/2 redundancy requirements I'd wager you have a lot of identical or similar configuration files in your environment? Remember that with CFengine you do have the 'editfiles' tool for doing things like a "templated" config file containing something like __IP__ and then (at runtime) a find and replace with right info. ;) – nixgeek Jun 28 '09 at 10:42
  • @astinus thanks for your comments. i'm also a bit scared of getting my production down by screw-up in central configuration. what do you think about disabling automated polling of config and logging on each of machines and forcing it to update and manually checking if all things are fine? [ yes, i will have nagios / custom monitoring in place as well... but still ]. – pQd Jun 28 '09 at 10:47
  • 1
    I think that confidence in your configuration management techniques comes with time, but in the interim, just disable the automated polling of boxes and use 'cssh' to login to each class of boxes to run 'cfagent -qv' (or whatever!) when you want to push updates. If you want a top tip for confidence boosting, deploy a virtual machine as a 'staging' environment and ensure all changes go through that first. Pretty easy if you keep your CFengine or Puppet configuration in Subversion, just use branches and tags. – nixgeek Jun 28 '09 at 11:06
  • I will also recommend using SLACK for ridiculously simplifying systems (re)installation, configuration management. It's available here: http://www.sundell.net/~alan/projects/slack/ – HK_ Jun 28 '09 at 13:43
5

In addition to cfengine and puppet, there's also chef. I would strongly suggest using one of these tools, as things will always grow in unexpected directions. This helps you manage things in a centralized location.

The important thing to recognize is that chances are you won't get everything but if you can at least get 90% there, it's a start. Besides, it's fun and will make your life easier in the long run. Lastly, it's a good skill to have going forward.

Jauder Ho
  • chef is a recent entry into the configuration management scene. It's designed to be configured by writing ruby to do what you want, as opposed to puppet's custom declarative language. Time will tell which method works out well. I currently sit in the puppet camp. – David Pashley Jun 29 '09 at 13:08
3

I've been using cfengine for 5 years to install Debian (from woody up to lenny today). With etch I built a custom debian-installer. Thanks to preseeding, only a single question comes up: "what's the hostname?". After this, cfengine configures the whole server (dns+dhcp with dnssec, samba, ntpd, default (Samba) users and passwords, ssh, openvpn, apache vHosts, backup with rsnapshot on LVM, custom webmin modules etc).

Even when I install just one server I use cfengine-scripts from my toolbox like this:

control:

  Repository  = ( $(CFREPO) )
  IfElapsed = ( 0 )
  Syslog = ( on )
  actionsequence = ( editfiles shellcommands )
  CPTYPE = ( sum )

editfiles:
  { /etc/sysctl.conf
    # don't spam on tty:
    BeginGroupIfNoSuchLine "kernel.printk.*=.*2 4 1 7"
      DeleteLinesMatching "^kernel.printk.*=.*"
      Append "kernel/printk=2 4 1 7"
    EndGroup
    # disable ECN (Explicit Congestion Notification)
    BeginGroupIfNoSuchLine "net.ipv4.tcp_ecn.*=.*0"
      DeleteLinesMatching "^net.ipv4.tcp_ecn.*=.*"
      Append "net/ipv4/tcp_ecn=0"
    EndGroup
    BeginGroupIfNoSuchLine "net.ipv4.ip_forward.*=.*1"
      DeleteLinesMatching "^net.ipv4.ip_forward.*=.*"
      Append "net/ipv4/ip_forward=1"
    EndGroup
    DefineClasses "configchange_sysctl"
  }

shellcommands:
  configchange_sysctl::
    "/sbin/sysctl -p /etc/sysctl.conf"

# vim: set ts=2:

I like cfengine, because the cf2-scripts are somewhat human readable.
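For readers who don't know cfengine 2 syntax, the logic of each BeginGroupIfNoSuchLine / DeleteLinesMatching / Append group above can be sketched in Python roughly as follows. This is only an illustration of the idempotent "edit only if needed" pattern, not cfengine's implementation: the function names are made up, and the matching is stricter (anchored) than the unanchored regexes in the script.

```python
import re


def ensure_setting(lines, key, value):
    """Return (new_lines, changed) so that a `key=value` line is present.

    If the desired line already exists, nothing is touched. Otherwise every
    stale line for that key is dropped and the desired line is appended.
    """
    # Accept '.' or '/' between key components, as the script above mixes both.
    key_pattern = re.escape(key).replace(r"\.", "[./]")
    wanted = re.compile(rf"^{key_pattern}\s*=\s*{re.escape(value)}\s*$")
    stale = re.compile(rf"^{key_pattern}\s*=")

    if any(wanted.match(line) for line in lines):
        return list(lines), False
    kept = [line for line in lines if not stale.match(line)]
    kept.append(f"{key}={value}")
    return kept, True


def converge(lines, settings):
    """Apply every setting; report whether anything changed at all.

    The `changed` flag plays the role of the configchange_sysctl class:
    only when it is True would you go on to run `sysctl -p`.
    """
    changed_any = False
    for key, value in settings.items():
        lines, changed = ensure_setting(lines, key, value)
        changed_any = changed_any or changed
    return lines, changed_any


if __name__ == "__main__":
    current = ["kernel.printk = 4 4 1 7", "net.ipv4.ip_forward=1"]
    desired = {"kernel.printk": "2 4 1 7", "net.ipv4.ip_forward": "1"}
    print(converge(current, desired))
```

The point of the pattern is that a second run over the already-corrected file changes nothing, so the reload command is not triggered again.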

So it's definitely worth working with tools for automatic configuration management.

/thorsten

ThorstenS
2

It's got to be worth it even for a small site. It's all about consistency as you grow. And you know that your site is going to grow. Best to start while you're still small. Cfengine is awesome, especially version 3, which can handle all the package managers across the field; it's really lightweight and secure, and it "just works". Puppet just didn't deliver what it claimed. Haven't tried Chef.

The advantage of cfengine over the others is that it's ultra lightweight but actually has more capabilities. Its security model is like ssh's, rather than the web certificates used by puppet. When I told my boss about cfengine he thought it was science fiction :) If you're looking for something futuristic, try reading some of Mark Burgess's research papers. Cool stuff.

SAnnukka
1

I agree with everyone here. You should start to learn and set up a working infrastructure while you are not too large, because then you are prepared when you grow.

Depending on what you want to run, I would go for FAI, cfengine and pre-seeding for Debian/Ubuntu. FAI can work with many different tools, so it is a good start for any Debian-like distribution. With the class-controlled configuration of FAI (and cfengine), you can easily divide your installations into small modules and then select which ones to use for each of your machines. This way it is useful even if you have many different machines. It is actually more useful then, as these scripts document your installation. And when you install a new machine, you will not forget anything.

Yes, you SHOULD have some machines to test on before you deploy your changes to a live installation. But with configuration scripts like this, you will not forget any step in the live installation.
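The class idea can be sketched in a few lines of Python. This is not FAI or cfengine syntax; the class names, host names and module names below are invented for the illustration, and the only point is that each machine is described by a short list of classes while the modules are written once.

```python
# Hypothetical classes, each pulling in a few small configuration modules.
CLASS_MODULES = {
    "BASE": ["ntp", "mta", "ssh"],
    "WEB": ["apache"],
    "DB": ["mysql"],
}

# Hypothetical hosts, each assigned an ordered list of classes.
HOST_CLASSES = {
    "web01": ["BASE", "WEB"],
    "db01": ["BASE", "DB"],
}


def modules_for(host):
    """Return the modules to apply to `host`, in class order, without repeats.

    Hosts that are not listed fall back to the BASE class only.
    """
    selected = []
    for cls in HOST_CLASSES.get(host, ["BASE"]):
        for module in CLASS_MODULES[cls]:
            if module not in selected:
                selected.append(module)
    return selected


if __name__ == "__main__":
    for host in ("web01", "db01", "newbox"):
        print(host, modules_for(host))
```

Adding a new role then means adding one class and assigning it to the relevant hosts, rather than repeating the full list of steps per machine.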

Anders
1

The number one tool I wish I had when running a small site is 'push-button' builds. It makes patching, updates, and rebuilds easier, which can address a myriad of other problems in the future.

No ssh properly installed on all boxes? No curl/wget/vim either? What about other in-house tools you'd like to have on each box?

Having central management of your servers is one of the first tools you should have working to make future efforts much easier.

ericslaw