19

I'm changing the way that our DHCP/DNS stuff works at work. Currently we've got 3 DNS servers, and a DHCP box. All of them are VMs.

There's a circular dependency where stuff booting requires NFS, which requires DNS. So when we reboot stuff, things might come back subtly broken until the DNS is up, and we restart some services.

What I want to do is have a few low power servers, probably dual core Atoms or similar, running from SSDs, so that they boot damn fast. I want to make the whole thing boot as near to instantaneously as possible.

Ideally I'd like to use Ubuntu 11.10, or Debian 6 as the OS. I'm not interested in Gentoo or compiling my own kernel. This needs to be reasonably supportable by myself.

Other than SSD drives, what other optimization steps can I take to improve boot speed?

ewwhite
Tom O'Connor
  • 2
    Is there an actual question here? – ceejayoz Nov 16 '11 at 16:00
  • 14
    @ceejayoz `Other than SSDing drives, what other optimisation steps can I take` looks like a question to me. – MDMarra Nov 16 '11 at 16:03
  • 1
    That's not a question, that's a request for a book-length response. – ceejayoz Nov 16 '11 at 16:06
  • 7
    @ceejayoz So don't answer it if you think it'll take too long to answer. It's not discussion oriented and it's certainly a question, no matter what you want to call it. – MDMarra Nov 16 '11 at 16:19
  • 2
    Per the close reasons, "This question is ambiguous, **vague**, incomplete, **overly broad**, or rhetorical and cannot be reasonably answered in its current form." – ceejayoz Nov 16 '11 at 16:22
  • 4
    @ceejayoz I don't think it's vague or overly broad at all. It's a well-defined question that may have a complicated answer, but it's not far reaching like `"Halp! My thingz don't boot rite!"`. You're more than welcome to jump into [chat] and discuss if you'd like though. There are a bunch of us talking about it in there, including the OP. – MDMarra Nov 16 '11 at 16:25
  • 1
    Can the OP better describe the circular boot dependencies? – Freiheit Nov 16 '11 at 17:24
  • 1
    NFS is evil. Using very important infrastructure servers that depend on that is even more evil. So why do you have that weird design? – Nils Nov 16 '11 at 21:19
  • 1
    What are you mounting from NFS anyway? Are you mounting part of the root filesystem, or just /home, or some other data filesystem? If nothing in NFS is required for the dhcp server to boot, then you could just setup the automounter, so that filesystems are mounted as needed. – Zoredache Nov 16 '11 at 21:57
  • mostly just /home and so on. – Tom O'Connor Nov 16 '11 at 22:36
  • @Nils NFS is fantastic on a stable network. NFS is evil if it's not set up right. – Tom O'Connor Nov 16 '11 at 22:37
  • I agree with @Nils - NFS is evil. No matter how great the setup is eventually it will degrade into cross-mounted hell. And some idjit will hard-mount things that aren't critical, causing the entire network to implode. – voretaq7 Nov 16 '11 at 23:51
  • @voretaq7 NFS is also evil when it comes down to security (at least V2 and V3). To get secure NFS you have to set up V4 in a proper way. – Nils Nov 17 '11 at 21:32
  • Since NFS is evil, what do you recommend then? – JohannesM Jun 25 '12 at 13:41

5 Answers

29

Isn't this a situation where you should engineer around the circular dependencies? Set power-on delays in the server BIOS. You have multiple DNS servers, so that's a plus. DNS caching? Would this be as simple as using IP addresses or hosts files for your NFS or storage network? You didn't mention the particular virtualization technology, but it's possible to set VM boot priority in VMware, for instance... Is this across multiple host servers?
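
To make the IP-address/hosts-file idea concrete, here is a rough Python sketch that lists which NFS entries in an fstab name their server by hostname rather than by IP address. The function name, the sample entries, and the addresses are all made up for illustration, and it only understands IPv4 literals or hostnames in the `server:/export` field. A flagged hostname may already be covered by /etc/hosts, so treat the output as a list of things to check, not a verdict.

```python
import ipaddress

# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# <device>                      <mount> <type> <options>          <dump> <pass>
/dev/sda1                       /       ext4   errors=remount-ro  0 1
nfs1.example.com:/export/home   /home   nfs    rw,hard,intr       0 0
10.0.0.5:/export/data           /data   nfs    rw,hard,intr       0 0
"""


def nfs_mounts_needing_dns(fstab_text):
    """Return (server, mount_point) pairs for NFS entries that name a host."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split()
        if len(fields) < 3 or fields[2] not in ("nfs", "nfs4"):
            continue  # not an NFS mount
        # The NFS device field looks like "server:/export/path".
        server, sep, _export = fields[0].partition(":")
        if not sep:
            continue
        try:
            ipaddress.ip_address(server)  # parses only for literal IPs
        except ValueError:
            # Not an IP literal, so mounting needs name resolution.
            flagged.append((server, fields[1]))
    return flagged


if __name__ == "__main__":
    for server, mount_point in nfs_mounts_needing_dns(SAMPLE_FSTAB):
        print(f"{mount_point} depends on resolving {server}")
```

Anything it reports is a candidate for either a literal IP in fstab or a matching hosts-file entry, so the mount no longer waits on the DNS servers coming up.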

Otherwise, SSD-based boot drives can help. Use a distro with Upstart boot processes. Trim down daemons.

ewwhite
  • 5
    I think this is a good answer. Fix circular dependencies as best you can. – mfinni Nov 16 '11 at 16:45
  • Good answer. This is the whole reason hosts files are still around. They provide a solution for before DNS is available, or for cases where you *need* lookups even if DNS isn't available (i.e. Oracle RAC cluster). – Christopher Cashell Mar 28 '12 at 16:22
11

Depending on your UPS status, this could be one of the few use-cases where an ACPI hibernate may be a good idea. Generally restore-from-hibernate beats out a boot-from-scratch, especially in the case of low-RAM SSD-based systems. If you have the ability, the 'shutdown' step for your UPS software can be set to hibernate the DNS server.

sysadmin1138
  • That's actually an interesting idea -- the only downside is powering the machine back on (or waking it from sleep). If the PSU is set to "automatically power on after power loss" this should work as long as the machine actually loses power, otherwise you just need to be aware of the corner case where the hibernate signal is sent but then AC power comes back before the UPS dies. – voretaq7 Nov 16 '11 at 16:22
  • 2
    @Voretaq7 If I were to do this, I'd have the Primary DNS behave normally (no ACPI trickery), and the Secondary DNS do this trick. It'll slow down service startup elsewhere due to the DNS fail-back, but services would still start. Otherwise, some wake-on-LAN magic may need to happen. – sysadmin1138 Nov 16 '11 at 16:26
  • WoL would actually solve this nicely – voretaq7 Nov 16 '11 at 16:28
  • We have a fat UPS, seriously fat.. and it's got Apcupsd software in some kind of network configuration.. I do quite like this idea actually. We're gonna roll out WOL across the network for other purposes too, later on in the month. – Tom O'Connor Nov 16 '11 at 16:29
6

I can recommend a very tiny NetBSD system on SSDs, but if you have your heart set on Linux there are two options that spring immediately to mind:

  • Damn Small Linux is one of the big-name little Linux distros. I don't know what their boot time is, but it's gotta be relatively short.
  • Slax is a bit more customizable from the initial outset, and may be a tad faster.

There's also the option of really tiny custom/embedded solutions like this one ($99 ARM-based system on a module with a 1-second(ish) bootup time). It isn't commodity hardware but it could be tucked away in a quiet corner of a datacenter and left to just run forever...

voretaq7
3

In most setups DNS is the most important infrastructure service. If it breaks everything else will break, too. The conclusion is that the DNS-server(s) should not depend on other servers.

If you really need NFS for booting, make your DNS servers those NFS servers (this is breaking a rule, too), but make sure to export read-only (ro) and make sure your NFS servers can't be exposed to a DoS attack.

Probably the better solution is a different (HA) approach for providing the NFS service needed for booting, thus breaking the circular dependency (nscd may help on the NFS servers as well).

Update 2011-11-17 on NFS: From one of your comments I see that NFS is being used for /home directories. Local technical users should not have those. Anything else should be mounted via autofs with bg,hard,intr.

Nils
2

You might want to use bootchart to see where the boot-time hotspots are.

There's also readahead: https://fedorahosted.org/readahead/ , which I haven't tried.

alex