How to maintain 40 copies of the same computer installation?

0

I maintain 40 fixed-installation terminals (touch screen only, no mouse or keyboard connected). All 40 PCs run a slimmed-down version of Ubuntu with Chromium, as their only purpose is to display a web application in fullscreen mode.

Now, here's the problem: a few days ago we had a power failure and all terminals shut down. When I restarted them, not a single touchscreen worked, and I have no clue why. Reinstalling the system from a CloneZilla flash drive solved the issue, but it was a lot of work: connect a keyboard and the flash drive, change the BIOS settings to boot from the flash drive, and reinstall, for all 40 terminals.

How can this be done in a better way? My dream scenario would be to deploy one change to all terminals at once (say, adding a small Node.js server to every terminal, or configuring SSH access). I looked into running a PXE server, but that apparently takes a lot of time (an 800 MB image x 40 takes quite a while to download).

Does anyone have a better way to maintain this setup?

chr_lt_ney

Posted 2018-03-02T17:41:43.723

Reputation: 1

Answers

4

Well, PXE doesn't have to be a case where you download the image to each server as they boot. In fact, the more traditional use for it (at least when dealing with UNIX systems) was to provide diskless boot for systems which had their root filesystem on NFS (or these days possibly some other network filesystem). I'm not quite sure how well that might work for you (it trades the time issues of PXE for a single point of failure in the NFS server), but it might be worth looking at. You can also do similar things with iSCSI or NBD, though those are a bit more complicated to set up.

You might also look into chain-loading, similar to what SystemRescueCD does. When netbooting, it only needs to load syslinux, the kernel, and the initial RAM disk over TFTP, and can then load the actual system image over another protocol such as HTTP. TFTP is a horribly inefficient protocol (each block must be acknowledged separately before the next one can be sent, and the default block size is very small), so doing this can significantly speed up the process. Where I work, the network is set up to netboot SystemRescueCD and load the system image over HTTP instead of TFTP, which cut the boot time from almost 15 minutes to about 3 on the systems I tested when I set it up.
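To get a rough feel for why the lock-step acknowledgement hurts, here is a back-of-the-envelope sketch in Python. The 512-byte default block size comes from the TFTP specification (RFC 1350) and the 800 MB image size is from the question; the 0.5 ms round-trip time and the 1 Gbit/s link are assumptions for illustration only, not measurements, and the function names are made up for this example.

```python
import math

TFTP_DEFAULT_BLOCK = 512  # bytes; default TFTP block size per RFC 1350


def tftp_lockstep_seconds(size_bytes, rtt_seconds, block_size=TFTP_DEFAULT_BLOCK):
    """Rough lower bound for a lock-step transfer.

    Each block must be acknowledged before the next one is sent, so the
    transfer costs at least one network round trip per block.
    """
    blocks = math.ceil(size_bytes / block_size)
    return blocks * rtt_seconds


def streaming_seconds(size_bytes, link_bits_per_second):
    """Rough lower bound for a streamed transfer limited only by link speed."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    image = 800 * 1024 * 1024  # 800 MB image, as in the question
    rtt = 0.0005               # ASSUMPTION: 0.5 ms round trip on the LAN
    link = 1_000_000_000       # ASSUMPTION: 1 Gbit/s link

    print(f"TFTP, 512-byte blocks: {tftp_lockstep_seconds(image, rtt):.0f} s")
    print(f"Streamed at link speed: {streaming_seconds(image, link):.1f} s")
```

Both figures are idealised bounds (no packet loss, no disk or server limits), so treat them as an illustration of the per-block round-trip cost rather than a prediction for any real network.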

Given that you're running something based on Ubuntu, you might look at using a combination of MAAS and Juju, as that's the standard stack for doing this type of thing with Ubuntu.

Beyond all that though, if you can safely assume that mass outages like the one you saw are rare (and therefore you aren't likely to need to reinstall all 40 systems at once again), you might just look at an automated management tool. It wouldn't help with installing systems, but it would greatly simplify deploying configuration or package changes to them. I'm particularly fond of Ansible for this type of thing, largely because of how dead simple it is to set up (you literally just need passwordless SSH login and a single specific Python package installed on the systems you intend to manage) and because it uses a stateful, (mostly) declarative language for tasks that is really easy to learn. Puppet, Chef, and Salt are the other three popular options for this type of thing, but I've never had any personal experience with them beyond cursory evaluation, so I can't really give any advice on which one might be best for your usage.
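To illustrate the transport such a tool builds on, here is a minimal Python sketch that fans one command out to every terminal over passwordless SSH. This is not Ansible and not how Ansible is implemented (Ansible adds inventories, modules, and idempotent tasks on top); it only shows the "one change, 40 machines" idea. The host names `terminal01` to `terminal40` and all function names are hypothetical, and key-based SSH login is assumed to already work.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def terminal_hosts(count=40, prefix="terminal"):
    """HYPOTHETICAL naming scheme: terminal01 .. terminal40."""
    return [f"{prefix}{i:02d}" for i in range(1, count + 1)]


def ssh_command(host, remote_command):
    """Build the ssh invocation for one host.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    which is what you want with no keyboard attached to anything.
    """
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
            host, remote_command]


def run_on_host(host, remote_command, runner=subprocess.run):
    """Run one command on one host; return (host, exit code, stdout)."""
    result = runner(ssh_command(host, remote_command),
                    capture_output=True, text=True)
    return host, result.returncode, result.stdout.strip()


def run_everywhere(hosts, remote_command, runner=subprocess.run, workers=10):
    """Run the same command on all hosts, a few at a time, in host order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda host: run_on_host(host, remote_command, runner), hosts))


if __name__ == "__main__":
    # Dry run: only print what would be executed on the first terminal.
    print(" ".join(ssh_command(terminal_hosts()[0], "uptime")))
```

Calling `run_everywhere(terminal_hosts(), "uptime")` would then return one `(host, exit code, output)` tuple per terminal, so failures on individual machines are easy to spot. The `runner` parameter exists only so the function can be exercised without a real network.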

Austin Hemmelgarn

Posted 2018-03-02T17:41:43.723

Reputation: 4 345

Thanks a lot, that was a very detailed answer! The HTTP idea caught my attention and I have to ask: so you basically boot into SystemRescueCD over TFTP, and from there on, how would/could I continue with HTTP? My TFTP server is Windows, the clients are Ubuntu (I could easily go with diskless options as well). – chr_lt_ney – 2018-03-04T09:03:15.840

@chr_lt_ney The best advice I can give is to check the official documentation for SystemRescueCD on this, located here: http://www.system-rescue-cd.org/manual/PXE_network_booting/. I've never tried setting this up with a Windows TFTP server, so beyond that, there's probably not much advice I can give. – Austin Hemmelgarn – 2018-03-05T15:33:40.760

0

I also recommend the diskless PXE boot, but as an alternative, you could simply back up one system and then restore that backup to all 40 in this situation. That would be faster than reinstalling 40 times.

psusi

Posted 2018-03-02T17:41:43.723

Reputation: 7 195