I'm trying to build a system that will run short-lived jobs (CI and test builds) of software components. My requirements make it mandatory that each job lives on a private host. I'm taking that definition to include paravirtualisation options as well, as it seems like that will save me a lot of headache.
I'm working on a Mac, so pretty much every technology is out: libvirt, QEMU, etc. just won't work for me. I am, however, planning on deploying to Debian, so anything that runs on Debian is back on the table, provided I can script the provisioning of the host machine as well as its guest domains.
My intended setup was that I can use something to bootstrap a Debian installer, and that something should mean that upon booting, the machine is automatically provisioned (Chef, Puppet, Babushka, I don't mind, really). Part of that provisioning should build a template rootfs that can be used for booting a container. The container itself also needs to be provisioned, so that when it comes up, it knows what work it has to do, can do the work, and then exit.
In short, here's the workflow I need:
- Boot a machine (virtual or otherwise) and have it ready to do work.
- The work should be performed by a script installed by chef/puppet/babushka/etc
- When work comes in, a virtual machine should be started to do the work.
- The VM should do the work, exit and release its resources to the parent/host machine. (It's important that this scales to at least hundreds of guest VMs on reasonable hardware.)
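To make that workflow concrete, here is a minimal sketch in Python of the dispatch loop I have in mind. It is not a working system: `guest_cmd` is a placeholder for whatever actually starts a guest, and the names `run_guest`, `dispatch` and `MAX_GUESTS` are made up for illustration. In the demo the "guest" is just a child Python process that prints a line and exits.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

MAX_GUESTS = 4  # assumption: cap on guests running at the same time


def run_guest(job_id, guest_cmd):
    """Start one guest for one job, block until it exits, report the result."""
    # guest_cmd is a placeholder for the real "start a guest" command;
    # the job id is appended as its last argument.
    proc = subprocess.run(guest_cmd + [job_id], capture_output=True, text=True)
    return job_id, proc.returncode, proc.stdout.strip()


def dispatch(job_ids, guest_cmd, max_guests=MAX_GUESTS):
    """Run every job in its own guest process, at most max_guests at a time."""
    with ThreadPoolExecutor(max_workers=max_guests) as pool:
        # pool.map keeps results in the same order as job_ids
        return list(pool.map(lambda j: run_guest(j, guest_cmd), job_ids))


if __name__ == "__main__":
    # Stand-in guest: a child process that "does the work" and exits.
    fake_guest = [sys.executable, "-c", "import sys; print('built', sys.argv[1])"]
    for job_id, code, output in dispatch(["job-1", "job-2", "job-3"], fake_guest):
        print(job_id, code, output)
```

Once a guest process exits, its worker slot is free for the next job, which is the "release its resources" step above. Whether this approach holds up at hundreds of guests depends entirely on what the real guest command costs.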
I've come to a point where I've tried the following, and abandoned them for the reasons inlined below:
For the host machine:
- Pre-seed Debian micro ISO images with Instalinux (LinuxCOE backed) (Bad: Didn't work at all ("No kernel modules found") because the Instalinux images are out of sync with the FTP repositories; apparently this solution is notoriously fragile. It also doesn't allow much scope for post-install work such as dropping known SSH keys, host keys, etc. onto the machine. It seems like fire and forget: in the end I'd have a running machine, but no access to it.)
- Pre-seed Debian netinst ISO (Bad: Same problems as above, except that the install typically completes, as there's no kernel disparity between the ISO and the FTP repository. Still limited scope for post-install. Good: Absolutely reliable and repeatable, easy to throw at any VM technology stack on Mac or at a bare metal machine, and would work anywhere; however, I can't post-install it enough.)
- Various methods of building a rootfs and compiling it into a bootable hard disk image (Bad: What little I could get working was fragile as hell, would be difficult to install onto a real machine, and is a complex build process. Good: If I could get it working, this would seem to provide the most scope for pre-configuring the machine to a given specification with SSH keys, host keys, hostname, software installed from Git and whatever else, but then the question would be how to package it for distribution, or how to script its recreation.)
I'm honestly not sure what technology people are expected to use to bring up a VM from nothing to a running, working and useful system. It seems like three steps to me: a) operating system, b) system configuration (users, etc.) and then c) filesystem changes.
For the guest (virtual) machines:
- Lots of things; mostly I think the answer here is a read-only rootfs created with debootstrap, and a special partition on the LXC container which contains the work to be done for this specific instance (a job manifest). Insert all the usual caveats about building the OS, booting, creating users, checking out software from Git, and doing work.
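As a rough sketch of that guest idea in Python: one function builds the debootstrap invocation for the template rootfs, and another writes a per-job manifest into a directory that could be exposed to the container. The manifest fields, the file name `manifest.json`, the paths and the example repository URL are all assumptions of mine for illustration; only the `debootstrap [options] SUITE TARGET MIRROR` argument order and the `--variant=minbase` option come from debootstrap itself.

```python
import json
import tempfile
from pathlib import Path


def debootstrap_command(target, suite="stable",
                        mirror="http://deb.debian.org/debian"):
    """Return the argv that would build a minimal Debian rootfs at target.

    The command is only constructed here, not executed; running it
    requires root on the Debian host.
    """
    return ["debootstrap", "--variant=minbase", suite, str(target), mirror]


def write_manifest(job_dir, repo, ref, command):
    """Write the per-job manifest (hypothetical format) the guest would read."""
    job_dir = Path(job_dir)
    job_dir.mkdir(parents=True, exist_ok=True)
    manifest = {"repo": repo, "ref": ref, "command": command}
    path = job_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path


if __name__ == "__main__":
    # Hypothetical location for the shared, read-only template rootfs.
    print(" ".join(debootstrap_command("/var/lib/ci/template-rootfs")))
    with tempfile.TemporaryDirectory() as tmp:
        path = write_manifest(Path(tmp) / "job-42",
                              "git://example.invalid/app.git",
                              "master", ["make", "test"])
        print(path.read_text())
```

The point of the split is that the rootfs is built once and shared read-only, while the only thing that changes per guest is the small manifest directory.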
I'm genuinely not sure what tools to reach for. It seems like the problem should be well solved, but I just can't find where to really get started.
Most people seem to suggest that for the host machine I should pick a virtualisation technology, boot a machine to a working state, and then snapshot it (libvirt seems the logical favorite for this), using the snapshot to bring up any subsequent installations for testing or in production.
For the guest machines, LXC seems to provide the easiest option, except that backgrounding a container and connecting to it later over the console is broken in all present kernels, and the newest version of LXC available to stable Debian is more than 18 months old and lacks a lot of features which are widely useful.
Typically I'm an application developer, and I don't often work with server-level technology (and I'm certain that SF will flag this question as "too subjective"), but I'm genuinely uncertain which tools to reach for.
Final word is that I know of one similarly stacked project (travis-ci.org) that is using Vagrant boxes for this. That seems like a rather blunt instrument: big, slow, Ruby-oriented tools designed for small-scale desktop provisioning of testing VMs, being used for critical service infrastructure. But I also know some of those guys, and they're smarter than I am, so maybe they just gave up.
Any help appreciated.