
I'm having a hard time figuring out what HEALTHCHECK is really used for when running Docker in swarm mode.

One place suggests that Docker will restart a task which is considered unhealthy. Another place explains that Docker will stop sending traffic to tasks that are unhealthy. The Docker documentation itself only explains what the HEALTHCHECK directive is, and how to configure it. It makes no attempt to explain what happens when a task goes unhealthy.

In other words, I'm struggling to find a clear and trustworthy explanation of what HEALTHCHECK really does.

Furthermore, looking at the Docker REST API, this particular piece of data (is a task healthy or not) is not even exposed for tasks (it is exposed for containers though). This makes it hard to use this metric for monitoring a Docker Swarm, so it doesn't seem to me that this is the primary purpose of the metric either.

What really happens when a task becomes unhealthy when running Docker in swarm mode?

sbrattla

1 Answer


You set up healthchecks the same way your first link suggests. All of those methods tell Docker what command to run, how often to run it, and so on.
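To make the "what command, how often, how many retries" part more concrete, here is a rough Python sketch of how the health status moves between `starting`, `healthy` and `unhealthy`. It is a toy model, not Docker's actual code: the `HealthState` class and its fields are names I made up, and it ignores the interval, timeout and start period settings entirely.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """Toy model of a container's health status (hypothetical, not Docker code)."""

    retries: int = 3          # consecutive failures needed to become "unhealthy"
    status: str = "starting"  # initial status before any check has passed
    failing_streak: int = 0   # number of consecutive failed checks so far

    def record(self, exit_code: int) -> str:
        """Feed in one healthcheck result: 0 means success, anything else failure."""
        if exit_code == 0:
            # A single passing check resets the streak and marks the container healthy.
            self.failing_streak = 0
            self.status = "healthy"
        else:
            self.failing_streak += 1
            # Only `retries` failures in a row flip the status to unhealthy.
            if self.failing_streak >= self.retries:
                self.status = "unhealthy"
        return self.status


state = HealthState(retries=3)
for code in [1, 1, 0, 1, 1, 1]:
    print(state.record(code))
```

With three retries, the loop prints `starting`, `starting`, `healthy`, `healthy`, `healthy`, `unhealthy`: occasional failures are tolerated, and only three failures in a row mark the container as unhealthy.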

If you use `docker run` to start a container, Docker will report the container as `unhealthy` (for example in `docker ps`) when healthchecks fail, but it will do nothing to the container. It's up to you or some higher-level monitoring solution to act on it.
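If you want to act on it yourself, a minimal sketch in Python could read the status from the `docker inspect` output, where it lives under `State.Health.Status`. The function names and the `"none"` fallback are my own choices, and the `sample` dict is a hypothetical, trimmed-down stand-in for real inspect output.

```python
import json
import subprocess


def health_status(inspect_entry: dict) -> str:
    """Return the health status from one `docker inspect` entry.

    Containers without a healthcheck have no State.Health section, so this
    returns "none" for them (a sentinel chosen for this sketch).
    """
    health = inspect_entry.get("State", {}).get("Health")
    return health["Status"] if health else "none"


def inspect_container(name: str) -> dict:
    """Run `docker inspect NAME` and return the first (only) entry."""
    out = subprocess.run(
        ["docker", "inspect", name],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)[0]


# Hypothetical sample, so the parsing can be tried without a Docker daemon.
sample = {
    "State": {
        "Status": "running",
        "Health": {"Status": "unhealthy", "FailingStreak": 4},
    }
}
print(health_status(sample))
```

Against a real daemon you would call `health_status(inspect_container("my-container"))`, where `my-container` is a placeholder name, and then decide what to do when it returns `unhealthy`.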

If you use `docker service create` (or `docker stack deploy`) to create a Swarm service and that healthcheck fails, Swarm will stop the task (container) and schedule a new task to replace that replica of the service. It first tries to stop the container gracefully, then kills it after 10s by default, like all Docker containers. While the task is stopping, Swarm stops sending inbound overlay traffic to it, as it does for all stopping tasks.

Bret Fisher
  • Could I replace jwilder's "Dockerize" with service level healthchecks? For example, would utilizing a healthcheck grace period for a DB service block other services in the swarm from communicating with the DB service until the healthcheck returned successfully? – user83948 Jan 17 '19 at 00:00
  • Yes, that's what makes zero downtime updates possible: https://www.youtube.com/watch?v=dLBGoaMz7dQ – Bret Fisher Jan 17 '19 at 05:24
  • @BretFisher Do you have any idea how docker-compose or Docker Swarm handles the `starting` health status? For example, say we have a web app running as a scaled service with an exposed port, and one container of this service has the `starting` health status. Will this container be included in the pool of available instances? – Panagiotis Simakis Oct 09 '19 at 15:53
  • @SimakisPanagiotis docker-compose doesn't do anything with healthchecks other than report their status. In Swarm, it shouldn't send connections to that task (container) while it's in any state other than healthy. – Bret Fisher Oct 13 '19 at 02:19