
I'm dealing with a known issue in RHEL 7 whereby services that specify an address to bind to will not start correctly. I've found a number of similar reports, and many say the problem was resolved with updates to systemd, but I still face it. This affects every service on my box (sshd, vsftpd, nginx) that binds to a specific address instead of 0.0.0.0.

I've found all sorts of supposed workarounds but none of them work for me consistently. Taking sshd as an example, config looks like this:

Port 22
ListenAddress 192.168.242.225
...

Here's what I've tried, alone and in combinations:

From https://bugzilla.redhat.com/show_bug.cgi?id=1352214#c4 (I've also tried sys-subsystem-net-devices-eth1.device in place of network-online.target but I suspect this doesn't wait for addressing to happen.)

mkdir /etc/systemd/system/sshd.service.d
tee /etc/systemd/system/sshd.service.d/wait.conf << 'EOF'
[Unit]
After=network-online.target
EOF

From https://bugzilla.redhat.com/show_bug.cgi?id=1352214#c11

mkdir /etc/systemd/system/sshd.service.d
tee /etc/systemd/system/sshd.service.d/wait.conf << 'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF

From https://bugzilla.redhat.com/show_bug.cgi?id=1438749#c0

systemctl add-wants multi-user.target network.target

From somewhere

mkdir /etc/systemd/system/sshd.service.requires
ln -s /usr/lib/systemd/system/network-online.target /etc/systemd/system/sshd.service.requires/

No matter what I try, I usually end up with "error: bind to port 22 on 192.168.242.225 failed: Cannot assign requested address". Sometimes everything starts up perfectly, which I'm guessing comes down to timing.

Running Scientific Linux (RHEL) 7.5 with NetworkManager enabled; all IP addressing is static. If there are any other details that might help, please let me know. Here is the output of journalctl after a failed startup, with After=network-online.target in the sshd unit file. Relevant stuff starts down around line 1700. Hoping someone has come across this issue and solved it successfully!

miken32

2 Answers


It may be better to not configure system services to listen on specific IP addresses, and to control access to them via the host firewall if necessary.

If you really need to be able to bind to specific IP addresses before they are configured on a network interface, you can work around the timing issue by setting the sysctl net.ipv4.ip_nonlocal_bind for IPv4 and the sysctl net.ipv6.ip_nonlocal_bind for IPv6. Services can then bind to IP addresses not configured on any network interface, but they will not be accessible until those IP addresses are configured on an interface.
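To make that setting survive a reboot, it can go in a sysctl drop-in file. This is only a rough sketch: `write_nonlocal_bind_conf` is a made-up helper name, the file name `90-nonlocal-bind.conf` is an arbitrary choice, and the optional directory argument exists only so the function can be tried out somewhere other than /etc/sysctl.d.

```shell
# Hypothetical helper: write a sysctl drop-in that enables non-local bind.
# The file name 90-nonlocal-bind.conf is an assumption; any *.conf name works.
write_nonlocal_bind_conf() {
    local dir="${1:-/etc/sysctl.d}"
    printf '%s\n' \
        'net.ipv4.ip_nonlocal_bind = 1' \
        'net.ipv6.ip_nonlocal_bind = 1' \
        > "$dir/90-nonlocal-bind.conf"
}
```

Run as root, `write_nonlocal_bind_conf && sysctl --system` should write the file and load it immediately. If your kernel does not expose the IPv6 key, drop that line.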

Michael Hampton
  • For most of the services, listening on 0.0.0.0 would be fine, but nginx needs to run different services on different IP addresses. Are there any negative implications to allowing the services to attempt to bind to arbitrary IP addresses? (Assuming, of course, that the system administrator is halfway competent and doesn't configure services with bogus addresses!) – miken32 Nov 24 '18 at 01:59
  • It might be confusing if an admin isn't expecting it. It's not normally done to configure services this way. Most of the time it isn't necessary as a host usually isn't on two or more networks at once. In the case of a web server, configuring a specific IP address to listen on means that a virtual host is inaccessible from other IP addresses (including the localhost addresses), which can make host agent based monitoring tricky. Overall it's just something to be avoided unless you really, really have a need for it. – Michael Hampton Nov 24 '18 at 02:05
  • This is a FreePBX box which includes a web-based administration tool and a user-facing portal. The "recommended" setup runs both these through the same virtual host, just sticks the admin stuff under an `/admin` directory. But I do not want a web app running as the asterisk user on a public IP address! So I need a completely separate vhost, php-fpm instance, etc. Admin runs on the private IP, which we access via VPN, user portal runs on the public IP. – miken32 Nov 24 '18 at 02:11
  • FreePBX has its own internal controls for this, as part of its special firewall. Consider saving yourself a lot of trouble and using that instead. – Michael Hampton Nov 24 '18 at 02:13
  • We're not running their distro – miken32 Nov 24 '18 at 02:14
  • Oh, well in that case... I'll buy you a beer, you're going to need it! – Michael Hampton Nov 24 '18 at 02:14
  • LOL I've been doing this a few years now, currently have about 60 PBX VMs running, but am still using the template I set up on EL 6. Trying to move to EL 7 on new instances, and this is the last hurdle. – miken32 Nov 24 '18 at 02:19
  • Hmm. I think if I were you I would just make Apache listen on localhost, and stick a copy of nginx in front of it, make that public facing, and filter by URL and IP address/network in the nginx configuration. Then I wouldn't have to disturb FreePBX's Apache customizations much or at all. – Michael Hampton Nov 24 '18 at 02:22
  • Memory's pretty tight, don't have the space for two web servers to be running. I'll give your answer a try tomorrow though, and mark accepted if nobody else comes up with a way to delay that service start. – miken32 Nov 24 '18 at 02:25

If you're using NetworkManager, then in order for network-online.target to work as expected, you need to enable NetworkManager-wait-online.service, which is the unit that actually waits for the network to be online to satisfy that target.

The network-online.target needs to be "hooked" into whatever manages your network. NetworkManager is not the only option; systemd-networkd can also be used to manage the network.

For network-online.target to work with NetworkManager, you need to have a symlink under /etc/systemd/system/network-online.target.wants/ pointing to /usr/lib/systemd/system/NetworkManager-wait-online.service.

You can create that symlink by enabling the service:

$ sudo systemctl enable NetworkManager-wait-online.service
Created symlink from /etc/systemd/system/network-online.target.wants/NetworkManager-wait-online.service to /usr/lib/systemd/system/NetworkManager-wait-online.service.

Once that's in place, dependencies on network-online.target should start working, waiting until NetworkManager is done bringing up all interfaces it's supposed to bring up at boot.
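One quick way to confirm the hook is in place is to test for that symlink directly. A minimal sketch, where `wait_online_hooked` is a made-up helper name and the optional argument exists only so the check can be pointed at a different directory:

```shell
# Hypothetical helper: succeed if network-online.target has
# NetworkManager-wait-online.service symlinked into its .wants/ directory.
wait_online_hooked() {
    local wants="${1:-/etc/systemd/system/network-online.target.wants}"
    [ -L "$wants/NetworkManager-wait-online.service" ]
}
```

For example, `wait_online_hooked && echo hooked || echo "not hooked"`. Running `systemctl is-enabled NetworkManager-wait-online.service` should tell you the same thing.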

To help diagnose any issues with that setup, look at the output of systemctl status network-online.target and systemctl status NetworkManager-wait-online.service, as they might have more clues about what is going on. The timestamps in particular can be helpful: if the daemons that depend on network-online.target are starting before NetworkManager-wait-online.service has finished, then you might have an issue with your configuration.
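To compare those timestamps side by side, something along these lines may help. It is only a sketch: `show_start_times` is a made-up helper name, and it relies on the ActiveEnterTimestamp property that systemctl show reports for a unit.

```shell
# Hypothetical helper: print when each given unit last became active,
# so the start order can be compared at a glance.
show_start_times() {
    local unit
    for unit in "$@"; do
        printf '%s: ' "$unit"
        systemctl show -p ActiveEnterTimestamp "$unit"
    done
}
```

For example, `show_start_times NetworkManager-wait-online.service network-online.target sshd.service`. If sshd.service shows an earlier time than the other two, the ordering isn't taking effect.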


Of the solutions you listed, I'd recommend this one:

# mkdir /etc/systemd/system/sshd.service.d
# tee /etc/systemd/system/sshd.service.d/wait.conf << 'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF

network-online.target is the one you actually want (to ensure all IPs are up, etc.), and including Wants= makes sure its startup will be requested.
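After adding the drop-in and running systemctl daemon-reload, you can check that systemd actually picked up the ordering. A small sketch, where `ordered_after_network_online` is a made-up helper name that simply parses the After= list reported by systemctl show:

```shell
# Hypothetical helper: succeed if the given unit lists network-online.target
# in its After= property as reported by systemd.
ordered_after_network_online() {
    # Output looks like "After=a.target b.service ..."; split it into
    # one word per line and look for an exact match.
    systemctl show -p After "$1" | tr ' =' '\n\n' | grep -qxF 'network-online.target'
}
```

For example, `ordered_after_network_online sshd.service && echo ok`. systemctl cat sshd.service will also show whether wait.conf is being read at all.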

Of the other methods, this one won't work: systemctl add-wants multi-user.target network.target. It doesn't create any dependency between the services themselves (SSH daemon, etc.) and the network being fully up. It just says you want the network to be up...

The one involving the /etc/systemd/system/sshd.service.requires/ directory is missing the After= dependency, which I believe is essential and is not implied by a .requires/ symlink. If you think Requires= is better than Wants= (it's stronger: it causes the unit to fail if the dependency fails), then I'd recommend just using that in /etc/systemd/system/sshd.service.d/wait.conf instead. The override file is definitely a more flexible way to manage this configuration.

Adding a dependency on sys-subsystem-net-devices-eth1.device doesn't help either, since that only indicates that the device exists (from the point of view of udev), which says nothing about it being up and configured yet.
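If you do go with Requires=, the drop-in would look something like the sketch below. `write_wait_dropin` is a made-up helper name, and the optional second argument exists only so it can be tried against a scratch directory instead of /etc/systemd/system. Taking the unit name as an argument means the same thing can be applied to vsftpd.service or nginx.service.

```shell
# Hypothetical helper: create <unit>.d/wait.conf with Requires= and After=
# on network-online.target.
write_wait_dropin() {
    local unit="$1" root="${2:-/etc/systemd/system}"
    mkdir -p "$root/$unit.d"
    cat > "$root/$unit.d/wait.conf" << 'EOF'
[Unit]
Requires=network-online.target
After=network-online.target
EOF
}
```

For example, as root: `write_wait_dropin sshd.service && systemctl daemon-reload`.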

filbranden
  • No symlink, and NetworkManager-wait-online.service is disabled. Going to enable and create the symlink. Hopefully you're on to something! – miken32 Nov 24 '18 at 19:20
  • Didn't even need to create the symlink. Output from `systemctl enable NetworkManager-wait-online.service` was "Created symlink from /etc/systemd/system/network-online.target.wants/NetworkManager-wait-online.service to /usr/lib/systemd/system/NetworkManager-wait-online.service." – miken32 Nov 24 '18 at 20:27
  • That's done the trick, thanks. Not sure why that service is disabled by default. – miken32 Nov 25 '18 at 19:07
  • Great to hear this worked for you @miken32! I have reordered and rephrased the answer a little, since the `NetworkManager-wait-online.service` part was the most important one, I brought it first. I also incorporated your suggestion on `systemctl enable` working to activate it. Let me know if you think the answer can still be improved somehow. Cheers! – filbranden Nov 27 '18 at 00:50