9

I have a script that installs and sets up traffic server:

yum install -y trafficserver
systemctl start trafficserver

traffic_line -s proxy.config.url_remap.remap_required -v 0
traffic_line -s proxy.config.reverse_proxy.enabled -v 0

The problem is, traffic_line fails with:

[connect] ERROR (main_socket_fd 3): No such file or directory error: could not connect to management port, make sure traffic_manager is running

This is because systemctl start returns immediately, without waiting for traffic server to be actually started.

Is there a way to tell systemctl start to only return once the service is started?

If this is not possible, is there a command that I can run after systemctl start to actually wait for the service to be started?

Julien Jm
BenMorel
  • Why not create a .socket unit that will let systemd hold `traffic_line` until `trafficserver` has finished starting? – Ignacio Vazquez-Abrams Dec 05 '17 at 15:20
  • @IgnacioVazquez-Abrams Is a .socket unit something that can be run only once? Also, I'm concerned that even when traffic server is considered active by systemd, it's not actually accepting connections from `traffic_line` just yet. Please see my answer below! – BenMorel Dec 05 '17 at 15:22

4 Answers

9

This is because systemctl start returns immediately, without waiting for traffic server to be actually started.

Is there a way to tell systemctl start to only return once the service is started?

systemctl start does wait for the service to be ready (unless invoked with --no-block); the service just needs to signal its readiness properly (i.e., not use Type=simple). If the service doesn’t tell systemd when it’s ready, no variation of systemctl is-active, systemctl show, etc. will help you.

The most elegant solution, as mentioned in the comments, would be a socket unit. systemd starts the socket, traffic_line connects to it, systemd starts the service, and traffic_line blocks until the service starts to accept connections on the file descriptor it inherited from systemd.

Alternatively, you can use either Type=forking (the service forks, and the main PID exits once the forked service is ready) or Type=notify (the service calls sd_notify(0, "READY=1") once it’s ready).

Unfortunately, all of these solutions require some support from trafficserver – use systemd’s socket instead of allocating its own, fork and wait appropriately in the main process, or call sd_notify. systemd can’t magically guess when the server is ready if the server doesn’t cooperate :)


After looking at trafficserver’s source code a bit, it looks like it might actually support Type=forking – the server is spawned by a dedicated traffic_cop command, which seems to wait until the server is up and perform some basic testing (at least the code looks like it). So if you change the service type, it might just work:

# /etc/systemd/system/trafficserver.service.d/type-forking.conf
[Service]
Type=forking
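
If you want to create that drop-in from the install script itself, something like the following might do. This is only a sketch: the function name write_type_forking_dropin is made up for illustration, and it assumes the packaged unit is named trafficserver.service and really does launch traffic_cop, which I have not verified.

```shell
# Writes the Type=forking drop-in. The target directory can be passed as $1
# (handy for a dry run); it defaults to the drop-in path shown above.
write_type_forking_dropin() {
    dropin_dir=${1:-/etc/systemd/system/trafficserver.service.d}
    mkdir -p "$dropin_dir" &&
        printf '[Service]\nType=forking\n' > "$dropin_dir/type-forking.conf"
}

# On the real host (as root), systemd has to reload its configuration
# before the new type takes effect:
# write_type_forking_dropin
# systemctl daemon-reload
# systemctl restart trafficserver
```

If the type change works, systemctl start should only return once the initial process has exited, so the traffic_line calls could follow it directly.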
3

I finally got it to work, after several attempts.

First attempt

After digging into systemctl help I found the is-active command:

$ systemctl is-active trafficserver
active

I therefore wrote a shell script to wait until the service becomes active:

while true; do
    if [ "$(systemctl is-active trafficserver)" = "active" ]; then
        break
    fi

    sleep 1
done

Unfortunately, even though this script works as expected when I test it with start/stop, I was still getting the same error when running the traffic_line commands right after it. I think that the service is reported as active before the actual processes have fully started (probably a matter of milliseconds).

Second attempt

So I tried another way. Knowing that this is the very first start of the service, I can wait until the PID file of the trafficserver manager exists. Here is what I tried:

while [ ! -f /run/trafficserver/manager.lock ]; do
  sleep 1
done

Same problem: when the trafficserver manager's PID file is written, the manager is not actually ready to receive orders yet, so I'm still getting the error.

Damn, I don't want to use a blind sleep.

Third attempt

So I ended up checking that the traffic_line command itself does not fail:

while ! traffic_line --status &> /dev/null; do
    sleep 1
done

And this works!
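
One caveat with the loop above is that it spins forever if traffic_manager never comes up. A possible variant with an upper bound on the number of attempts is sketched below; the helper name wait_until and the limit of 60 attempts are arbitrary choices of mine, not anything provided by trafficserver or systemd.

```shell
# wait_until MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND up to MAX_TRIES times, sleeping 1 second between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
wait_until() {
    max_tries=$1
    shift
    tries=0
    until "$@" > /dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Possible use in the install script:
# wait_until 60 traffic_line --status || { echo "traffic_manager not ready" >&2; exit 1; }
```

Since the probe command is passed as an argument, the same helper can be reused with whatever readiness check makes sense for another service.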

Nice, but...

Unfortunately, the answer is very specific to the service I'm using (trafficserver), and would not directly apply to other services.

If you know a more generic answer to this question, please feel free to share it.

BenMorel
  • Lucas's answer actually addresses the question asked (how to make `systemctl start` wait for the service), you should accept that instead – rvalue Oct 24 '18 at 03:30
  • Put that in a helper service or in an ExecStartPost by calling bash https://unix.stackexchange.com/a/324035/8337 – rogerdpack Jul 09 '19 at 17:19
2

I'm not good at shell scripting, but I think you'd want to test both the ActiveState and the SubState properties and check that they return active and running respectively.

$ systemctl show trafficserver -p SubState,ActiveState
ActiveState=active
SubState=running

After that you should be able to run the second portion of your script.
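
A rough way to script that check might look like this. The function name is_active_and_running is hypothetical; it just reads the KEY=VALUE lines printed by systemctl show on stdin. Keep in mind that this only reflects systemd's view of the unit, so it may still report success before the management port accepts connections.

```shell
# Succeeds only if the "systemctl show" output given on stdin contains
# both ActiveState=active and SubState=running as whole lines.
is_active_and_running() {
    props=$(cat)
    printf '%s\n' "$props" | grep -qx 'ActiveState=active' &&
        printf '%s\n' "$props" | grep -qx 'SubState=running'
}

# Possible use in the script:
# until systemctl show trafficserver -p ActiveState -p SubState | is_active_and_running; do
#     sleep 1
# done
```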

Daniel
-1

Simplest way would be to add a sleep to the script:

sleep 30

Or you could do some job control as here

Simon Greenwood
  • I'd prefer to avoid a sleep-based solution, as it's not 100% safe (although I agree that something really weird would need to happen for the first start to take more than 30s). I checked your link regarding job control, but can't get the PID with `$!` after `systemctl start`. – BenMorel Dec 05 '17 at 13:11
  • I was thinking that you would want `$!` after installation, not after `systemctl start`. Once it's in the realm of `systemctl` it's a lot easier to query the startup process. – Simon Greenwood Dec 06 '17 at 07:50