I use Docker to implement a Nextcloud service. For this, I use the official nextcloud-apache image, an Nginx reverse proxy, certbot, and MariaDB. Nothing special, really.

My Docker instance runs in swarm mode. All containers run together with the manager on the same host, with only one replica for each service and standard overlay network(s). The swarm is started from a standard compose file.

My setup ran stably for many months until last night, when it mysteriously broke. As far as I can see, there were no updates or restarts whatsoever, neither for the OS (Ubuntu Server LTS) nor for Docker-CE or any of the images (I do all my updates manually at regular intervals and I certainly didn't do them at 4 am last night). I tracked the cause down to the Nextcloud container (but I think this is a Docker problem, hence my question here...):

The log for the Nginx reverse proxy shows the following line:

2022/04/06 20:16:45 [error] 10#10: *3 nextcloud-app could not be resolved (3: Host not found), client: 10.135.40.1, server: <redacted>, request: "GET / HTTP/1.1", host: "<redacted>"

Nginx cannot resolve the backend server and throws a 502/Bad Gateway back to the client.

I checked, and the host name for the Nextcloud container ("nextcloud-app") is indeed not registered in the Docker-internal DNS (available at 127.0.0.11 in each container). I can log in to any of the containers and fire off a DNS request (after running apt-get update && apt-get install iputils-ping dnsutils inside the container(s)); the name "nextcloud-app" is not resolved anywhere. Example:

root@nextcloud-app:/var/www/html# nslookup nextcloud-app
Server:     127.0.0.11
Address:    127.0.0.11#53

** server can't find nextcloud-app: NXDOMAIN

All other container names resolve as they should. Resolving external addresses works as well. "nextcloud-app" is the only unresolvable container name.
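
The manual nslookup check can also be repeated for several service names in one go. Below is a minimal Python sketch, assuming Python is available inside the container. The function name `check_names` and its `resolver` parameter are my own illustration (the parameter exists only so the function can be tried without a live Docker DNS). Unlike nslookup, `socket.gethostbyname` goes through the system resolver, which also consults /etc/hosts, so it is best run from a container other than the one being checked.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def check_names(
    names: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Map each name to its resolved IPv4 address, or None if the lookup fails."""
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (e.g. "Name or service not known") is an OSError subclass.
            results[name] = None
    return results
```

Calling `check_names(["nextcloud-app"])` from the proxy container should return `None` for that name in the broken state described here, and an overlay-network IP address once the name is registered again.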

However, I can ping to and from the nextcloud-app container using the docker-internal IP addresses directly. The connectivity is there, only the DNS resolution fails.

I have no idea how to debug this further. I haven't touched my compose.yml file for weeks. As far as I know, nothing has changed. Yet the setup stopped working overnight.

How can I force the Nextcloud container to register its own host name with the Docker-internal DNS? Any suggestions are appreciated.

Andy

1 Answer

I found the solution. This answer is for the poor guy who may face the same problem in the future.

Turns out it was not a Docker problem after all. Nextcloud went into maintenance mode during the night (for reasons still unknown, I will have to investigate that next). Somehow the Nextcloud Docker image could not register itself in Docker's DNS while in maintenance mode (because of a bug in the image?), and the situation got deadlocked: no DNS resolution, no access to Nextcloud through the reverse proxy --> not noticing that Nextcloud is in maintenance mode --> staying in maintenance mode forever, no DNS resolution, ...

If only I had thought to check if Nextcloud was in maintenance mode, it could have saved me several hours of debugging and head-scratching. Getting Nextcloud out of maintenance mode was/is a matter of a few minutes. Feeling a bit stupid right now. ;-)
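
For anyone who wants to run that check up front: Nextcloud serves a status.php endpoint whose JSON body includes a "maintenance" field. Here is a minimal Python sketch for reading it. The function names are my own, and the handling of an error status that still carries the JSON body is an assumption I have not verified against every Nextcloud version. Because the proxy cannot resolve the container name in this situation, the base URL would have to use the container's internal IP address.

```python
import json
import urllib.error
import urllib.request


def parse_maintenance(status_body: str) -> bool:
    """Return True if a status.php JSON body reports maintenance mode."""
    return bool(json.loads(status_body).get("maintenance", False))


def fetch_maintenance(base_url: str, timeout: float = 5.0) -> bool:
    """Fetch <base_url>/status.php and report whether maintenance mode is on."""
    url = base_url.rstrip("/") + "/status.php"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Assumption: an error status (such as 503) may still carry the JSON body.
        body = err.read().decode("utf-8")
    return parse_maintenance(body)
```

If this reports maintenance mode, it can be switched off with Nextcloud's occ tool (php occ maintenance:mode --off, run as the web server user inside the Nextcloud container).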

Andy