
When the sudo certbot renew command is run, the nginx server crashes. The systemd journal shows the following:

- The job identifier is 48862.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Main process exited, code=dumped, status=11/SEGV
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- An ExecStart= process belonging to unit nginx.service has exited.
-- 
-- The process' exit code is 'dumped' and its exit status is 11.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046516 (PassengerAgent) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046532 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046535 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046538 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046539 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046557 (PassengerAgent) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046516 (PassengerAgent) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046532 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046535 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046538 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046539 (nginx) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2046557 (PassengerAgent) with signal SIGKILL.
Sep 01 11:31:52 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Failed with result 'core-dump'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- The unit nginx.service has entered the 'failed' state with result 'core-dump'.

The output of the renew command looks like this:

$ sudo certbot renew --dry-run
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/atp.flexgrid-project.eu.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Attempting to parse the version 1.18.0 renewal configuration file found at /etc/letsencrypt/renewal/atp.flexgrid-project.eu.conf with version 0.40.0 of Certbot. This might not work.
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for atp.flexgrid-project.eu
Waiting for verification...
Cleaning up challenges
nginx: [error] open() "/run/nginx.pid" failed (2: No such file or directory)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
new certificate deployed with reload of nginx server; fullchain is
/etc/letsencrypt/live/atp.flexgrid-project.eu/fullchain.pem
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/db.flexgrid-project.eu.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Attempting to parse the version 1.18.0 renewal configuration file found at /etc/letsencrypt/renewal/db.flexgrid-project.eu.conf with version 0.40.0 of Certbot. This might not work.
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for db.flexgrid-project.eu
Waiting for verification...
Cleaning up challenges
nginx: [alert] kill(2046610, 1) failed (3: No such process)
Attempting to renew cert (db.flexgrid-project.eu) from /etc/letsencrypt/renewal/db.flexgrid-project.eu.conf produced an unexpected error: nginx restart failed:
b''
b''. Skipping.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/phoenix.medialab.ntua.gr.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Attempting to parse the version 1.18.0 renewal configuration file found at /etc/letsencrypt/renewal/phoenix.medialab.ntua.gr.conf with version 0.40.0 of Certbot. This might not work.
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for phoenix.medialab.ntua.gr
nginx: [alert] kill(2046610, 1) failed (3: No such process)
Cleaning up challenges
nginx: [alert] kill(2046610, 1) failed (3: No such process)
Encountered exception during recovery: 
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 70, in handle_authorizations
    resps = self.auth.perform(achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1134, in perform
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 919, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1202, in nginx_restart
    raise errors.MisconfigurationError(
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/error_handler.py", line 124, in _call_registered
    self.funcs[-1]()
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 243, in _cleanup_challenges
    self.auth.cleanup(achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1152, in cleanup
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 919, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1202, in nginx_restart
    raise errors.MisconfigurationError(
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''
Attempting to renew cert (phoenix.medialab.ntua.gr) from /etc/letsencrypt/renewal/phoenix.medialab.ntua.gr.conf produced an unexpected error: nginx restart failed:
b''
b''. Skipping.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/rabit.socialenergy-project.eu.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Attempting to parse the version 1.18.0 renewal configuration file found at /etc/letsencrypt/renewal/rabit.socialenergy-project.eu.conf with version 0.40.0 of Certbot. This might not work.
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for rabit.socialenergy-project.eu
http-01 challenge for rat.socialenergy-project.eu
nginx: [alert] kill(2046610, 1) failed (3: No such process)
Cleaning up challenges
nginx: [alert] kill(2046610, 1) failed (3: No such process)
Encountered exception during recovery: 
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 70, in handle_authorizations
    resps = self.auth.perform(achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1134, in perform
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 919, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1202, in nginx_restart
    raise errors.MisconfigurationError(
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/certbot/error_handler.py", line 124, in _call_registered
    self.funcs[-1]()
  File "/usr/lib/python3/dist-packages/certbot/auth_handler.py", line 243, in _cleanup_challenges
    self.auth.cleanup(achalls)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1152, in cleanup
    self.restart()
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 919, in restart
    nginx_restart(self.conf('ctl'), self.nginx_conf)
  File "/usr/lib/python3/dist-packages/certbot_nginx/configurator.py", line 1202, in nginx_restart
    raise errors.MisconfigurationError(
certbot.errors.MisconfigurationError: nginx restart failed:
b''
b''
Attempting to renew cert (rabit.socialenergy-project.eu) from /etc/letsencrypt/renewal/rabit.socialenergy-project.eu.conf produced an unexpected error: nginx restart failed:
b''
b''. Skipping.
The following certs could not be renewed:
  /etc/letsencrypt/live/db.flexgrid-project.eu/fullchain.pem (failure)
  /etc/letsencrypt/live/phoenix.medialab.ntua.gr/fullchain.pem (failure)
  /etc/letsencrypt/live/rabit.socialenergy-project.eu/fullchain.pem (failure)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates below have not been saved.)

The following certs were successfully renewed:
  /etc/letsencrypt/live/atp.flexgrid-project.eu/fullchain.pem (success)

The following certs could not be renewed:
  /etc/letsencrypt/live/db.flexgrid-project.eu/fullchain.pem (failure)
  /etc/letsencrypt/live/phoenix.medialab.ntua.gr/fullchain.pem (failure)
  /etc/letsencrypt/live/rabit.socialenergy-project.eu/fullchain.pem (failure)
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates above have not been saved.)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
3 renew failure(s), 0 parse failure(s)

The certbot renewal configuration files look like this:

$ sudo cat /etc/letsencrypt/renewal/atp.flexgrid-project.eu.conf 
# renew_before_expiry = 30 days
version = 1.18.0
archive_dir = /etc/letsencrypt/archive/atp.flexgrid-project.eu
cert = /etc/letsencrypt/live/atp.flexgrid-project.eu/cert.pem
privkey = /etc/letsencrypt/live/atp.flexgrid-project.eu/privkey.pem
chain = /etc/letsencrypt/live/atp.flexgrid-project.eu/chain.pem
fullchain = /etc/letsencrypt/live/atp.flexgrid-project.eu/fullchain.pem

# Options used in the renewal process
[renewalparams]
account = XXXXXXXXXXXXXXXXXXXXXXXXXXxx
authenticator = nginx
installer = nginx
server = https://acme-v02.api.letsencrypt.org/directory

The versions are:

  • certbot 0.40.0
  • nginx 1.18.0-0ubuntu1.2
  • Ubuntu 20.04.3 LTS

This error is really annoying because the server shuts down with no warning whenever a certificate needs to be renewed.

I found several forum threads and issues about this, but they are a few years old and their fixes do not seem to work on newer versions. One thread suggested using the snap version of certbot instead of the apt one, but the error remains.

The nginx error log contains the following:

==> /var/log/nginx/error.log <==
[ N 2021-09-02 08:07:25.5247 2298270/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
2021/09/02 08:07:25 [notice] 2298303#2298303: signal process started

[ N 2021-09-02 08:07:25.5352 2298270/T9 age/Cor/CoreMain.cpp:670 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ N 2021-09-02 08:07:25.5352 2298270/T1 age/Cor/CoreMain.cpp:1245 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ N 2021-09-02 08:07:25.5353 2298270/Tb Ser/Server.h:901 ]: [ServerThr.2] Freed 0 spare client objects
[ N 2021-09-02 08:07:25.5353 2298270/Tc Ser/Server.h:901 ]: [ServerThr.3] Freed 0 spare client objects
[ N 2021-09-02 08:07:25.5353 2298270/Te Ser/Server.h:901 ]: [ServerThr.4] Freed 0 spare client objects
[ N 2021-09-02 08:07:25.5353 2298270/Tb Ser/Server.h:558 ]: [ServerThr.2] Shutdown finished
[ N 2021-09-02 08:07:25.5353 2298270/Tc Ser/Server.h:558 ]: [ServerThr.3] Shutdown finished
[ N 2021-09-02 08:07:25.5353 2298270/Te Ser/Server.h:558 ]: [ServerThr.4] Shutdown finished
[ N 2021-09-02 08:07:25.5354 2298270/T9 Ser/Server.h:901 ]: [ServerThr.1] Freed 0 spare client objects
[ N 2021-09-02 08:07:25.5354 2298270/T9 Ser/Server.h:558 ]: [ServerThr.1] Shutdown finished
[ N 2021-09-02 08:07:25.5355 2298270/Tg Ser/Server.h:901 ]: [ApiServer] Freed 0 spare client objects
[ N 2021-09-02 08:07:25.5355 2298270/Tg Ser/Server.h:558 ]: [ApiServer] Shutdown finished
2021/09/02 08:07:28 [notice] 2298312#2298312: signal process started
2021/09/02 08:07:28 [error] 2298312#2298312: open() "/run/nginx.pid" failed (2: No such file or directory)
[ N 2021-09-02 08:07:28.8243 2298314/T1 age/Wat/WatchdogMain.cpp:1373 ]: Starting Passenger watchdog...
[ N 2021-09-02 08:07:28.8529 2298317/T1 age/Cor/CoreMain.cpp:1340 ]: Starting Passenger core...
[ N 2021-09-02 08:07:28.8530 2298317/T1 age/Cor/CoreMain.cpp:256 ]: Passenger core running in multi-application mode.
[ N 2021-09-02 08:07:28.8674 2298317/T1 age/Cor/CoreMain.cpp:1015 ]: Passenger core online, PID 2298317
2021/09/02 08:07:29 [info] 2298343#2298343: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:63
[ N 2021-09-02 08:07:31.1196 2298317/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
2021/09/02 08:07:32 [notice] 2298349#2298349: signal process started

[ N 2021-09-02 08:07:32.2357 2298317/T9 age/Cor/CoreMain.cpp:670 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ N 2021-09-02 08:07:32.2358 2298317/T1 age/Cor/CoreMain.cpp:1245 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ N 2021-09-02 08:07:32.2358 2298317/Ta Ser/Server.h:901 ]: [ServerThr.2] Freed 0 spare client objects
[ N 2021-09-02 08:07:32.2358 2298317/Tc Ser/Server.h:901 ]: [ServerThr.3] Freed 0 spare client objects
[ N 2021-09-02 08:07:32.2359 2298317/Ta Ser/Server.h:558 ]: [ServerThr.2] Shutdown finished
[ N 2021-09-02 08:07:32.2359 2298317/Tc Ser/Server.h:558 ]: [ServerThr.3] Shutdown finished
[ N 2021-09-02 08:07:32.2359 2298317/T9 Ser/Server.h:901 ]: [ServerThr.1] Freed 0 spare client objects
[ N 2021-09-02 08:07:32.2359 2298317/T9 Ser/Server.h:558 ]: [ServerThr.1] Shutdown finished
[ N 2021-09-02 08:07:32.2359 2298317/Te Ser/Server.h:901 ]: [ServerThr.4] Freed 0 spare client objects
[ N 2021-09-02 08:07:32.2359 2298317/Te Ser/Server.h:558 ]: [ServerThr.4] Shutdown finished
[ N 2021-09-02 08:07:32.2360 2298317/Tg Ser/Server.h:901 ]: [ApiServer] Freed 0 spare client objects
[ N 2021-09-02 08:07:32.2360 2298317/Tg Ser/Server.h:558 ]: [ApiServer] Shutdown finished
[ N 2021-09-02 08:07:32.2765 2298352/T1 age/Wat/WatchdogMain.cpp:1373 ]: Starting Passenger watchdog...
[ N 2021-09-02 08:07:32.3006 2298356/T1 age/Cor/CoreMain.cpp:1340 ]: Starting Passenger core...
[ N 2021-09-02 08:07:32.3008 2298356/T1 age/Cor/CoreMain.cpp:256 ]: Passenger core running in multi-application mode.
[ N 2021-09-02 08:07:32.3151 2298356/T1 age/Cor/CoreMain.cpp:1015 ]: Passenger core online, PID 2298356
[ N 2021-09-02 08:07:32.5218 2298317/T1 age/Cor/CoreMain.cpp:1325 ]: Passenger core shutdown finished

[ N 2021-09-02 08:07:34.5746 2298356/T5 age/Cor/SecurityUpdateChecker.h:519 ]: Security update check: no update found (next check in 24 hours)
2021/09/02 08:07:34 [notice] 2298386#2298386: signal process started

[ N 2021-09-02 08:07:34.8582 2298356/T9 age/Cor/CoreMain.cpp:670 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ N 2021-09-02 08:07:34.8583 2298356/T1 age/Cor/CoreMain.cpp:1245 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ N 2021-09-02 08:07:34.8583 2298356/Tc Ser/Server.h:901 ]: [ServerThr.3] Freed 0 spare client objects
[ N 2021-09-02 08:07:34.8583 2298356/Tc Ser/Server.h:558 ]: [ServerThr.3] Shutdown finished
[ N 2021-09-02 08:07:34.8583 2298356/T9 Ser/Server.h:901 ]: [ServerThr.1] Freed 0 spare client objects
[ N 2021-09-02 08:07:34.8584 2298356/T9 Ser/Server.h:558 ]: [ServerThr.1] Shutdown finished
[ N 2021-09-02 08:07:34.8584 2298356/Tb Ser/Server.h:901 ]: [ServerThr.2] Freed 0 spare client objects
[ N 2021-09-02 08:07:34.8584 2298356/Tb Ser/Server.h:558 ]: [ServerThr.2] Shutdown finished
[ N 2021-09-02 08:07:34.8584 2298356/Te Ser/Server.h:901 ]: [ServerThr.4] Freed 0 spare client objects
[ N 2021-09-02 08:07:34.8584 2298356/Te Ser/Server.h:558 ]: [ServerThr.4] Shutdown finished
[ N 2021-09-02 08:07:34.8584 2298356/Tg Ser/Server.h:901 ]: [ApiServer] Freed 0 spare client objects
[ N 2021-09-02 08:07:34.8584 2298356/Tg Ser/Server.h:558 ]: [ApiServer] Shutdown finished
panic: memory wrap at /usr/share/perl/5.30/constant.pm line 20.
Compilation failed in require at /usr/lib/x86_64-linux-gnu/perl5/5.30/nginx.pm line 61.
BEGIN failed--compilation aborted at /usr/lib/x86_64-linux-gnu/perl5/5.30/nginx.pm line 61.
Compilation failed in require.
BEGIN failed--compilation aborted.
2021/09/02 08:07:34 [alert] 2298325#2298325: perl_parse() failed: 255
[ N 2021-09-02 08:07:35.1369 2298356/T1 age/Cor/CoreMain.cpp:1325 ]: Passenger core shutdown finished
2021/09/02 08:07:38 [notice] 2298396#2298396: signal process started
2021/09/02 08:07:39 [info] 2298398#2298398: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:63
2021/09/02 08:07:40 [notice] 2298401#2298401: signal process started
2021/09/02 08:07:40 [alert] 2298401#2298401: kill(2298325, 1) failed (3: No such process)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:40 [emerg] 2298402#2298402: still could not bind()
2021/09/02 08:07:43 [notice] 2298406#2298406: signal process started
2021/09/02 08:07:43 [alert] 2298406#2298406: kill(2298325, 1) failed (3: No such process)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:43 [emerg] 2298407#2298407: still could not bind()
2021/09/02 08:07:46 [info] 2298408#2298408: Using 32768KiB of shared memory for nchan in /etc/nginx/nginx.conf:63
2021/09/02 08:07:49 [notice] 2298414#2298414: signal process started
2021/09/02 08:07:49 [alert] 2298414#2298414: kill(2298325, 1) failed (3: No such process)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:49 [emerg] 2298416#2298416: still could not bind()
2021/09/02 08:07:52 [notice] 2298420#2298420: signal process started
2021/09/02 08:07:52 [alert] 2298420#2298420: kill(2298325, 1) failed (3: No such process)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to 0.0.0.0:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:80 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: bind() to [::]:443 failed (98: Address already in use)
2021/09/02 08:07:52 [emerg] 2298421#2298421: still could not bind()
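
The repeated bind() failures above suggest that some process still holds ports 80 and 443 after the crash. A minimal sketch for listing the PIDs involved, assuming ss from iproute2 is available; listening_pids is a hypothetical helper name, not part of nginx or certbot:

```shell
#!/bin/bash
# Hypothetical helper: reads `ss -H -ltnp` output on stdin and prints the
# unique PIDs of processes listening on port 80 or 443, one per line.
listening_pids() {
  # Field 4 of ss output is the local address:port column.
  awk '$4 ~ /:(80|443)$/' | grep -o 'pid=[0-9]*' | cut -d= -f2 | sort -un
}
```

It could be used as sudo ss -H -ltnp | listening_pids, and the result compared with the main PID that systemctl status nginx reports.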

user000001
  • 1
    Could you try if `certbot renew --installer none` works? I think that your problem is very similar to the one discussed here: https://github.com/certbot/certbot/issues/5486 – digijay Sep 01 '21 at 14:11
  • 1
    @digijay: Unfortunately there is the same behavior, `nginx` is still killed. I think the reason is that certbot tries to start nginx by directly invoking the binary, without systemd, and thus it fails to start. The reason is that after it is terminated some `nginx` processes are listening on 80 and 443, and they have to be killed before starting nginx from systemd – user000001 Sep 01 '21 at 14:28
  • Is Certbot 0.40 new enough to be aware of systemd? I think the current version is 1.0. – tsc_chazz Sep 01 '21 at 23:12
  • @tsc_chazz: The snap version I tried is `certbot 1.18.0`, and the behavior is identical – user000001 Sep 02 '21 at 05:15
  • You are running Phusion Passenger, I don't think it is supported by certbot. – AlexD Sep 02 '21 at 18:36

1 Answer


I think I found a (hacky) workaround for the issue. The steps are as follows:

  1. Create an executable script at /etc/letsencrypt/nginx_fix.sh with the following contents:

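    #!/bin/bash
    # Wrapper used as certbot's nginx_ctl: turn a "reload" request into a
    # systemd restart, and pass every other invocation through to nginx.
    for arg
    do
      if [ "$arg" = "reload" ]
      then
        # A reload was requested; let systemd restart the unit instead
        exec systemctl restart nginx
      fi
    done
    # No reload requested, so run the real nginx binary with the same arguments
    exec nginx "$@"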
    
  2. Add the line nginx_ctl = /etc/letsencrypt/nginx_fix.sh to the [renewalparams] block of each renewal configuration file in /etc/letsencrypt/renewal/*.conf that uses the nginx authenticator, as follows:

    # Options used in the renewal process
    [renewalparams]
    account = XXXXXX # Use your own account key
    authenticator = nginx
    installer = nginx
    server = https://acme-v02.api.letsencrypt.org/directory
    nginx_ctl = /etc/letsencrypt/nginx_fix.sh
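
Step 2 can probably be scripted instead of editing each file by hand. This is a minimal sketch, assuming GNU sed and that every file has a [renewalparams] header on its own line; add_nginx_ctl is a hypothetical helper name, not something certbot provides:

```shell
#!/bin/bash
# Hypothetical helper: add "nginx_ctl = <wrapper>" right after the
# [renewalparams] header of every renewal config in <renewal_dir> that uses
# the nginx authenticator and does not already set nginx_ctl.
add_nginx_ctl() {
  local renewal_dir="$1" wrapper="$2" conf
  for conf in "$renewal_dir"/*.conf; do
    [ -e "$conf" ] || continue                             # glob matched nothing
    grep -q '^authenticator = nginx$' "$conf" || continue  # other authenticator
    grep -q '^nginx_ctl = ' "$conf" && continue            # already patched
    # GNU sed: append the line after the [renewalparams] header, in place
    sed -i "/^\[renewalparams\]\$/a nginx_ctl = $wrapper" "$conf"
  done
}
```

It could be invoked as root with add_nginx_ctl /etc/letsencrypt/renewal /etc/letsencrypt/nginx_fix.sh, ideally after backing up the directory. Running it a second time leaves already patched files unchanged.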
    

I am not marking this answer as accepted, because it is suboptimal in two ways:

  1. You have to remember to edit the renewal configuration file for every domain that you add.

  2. You still get some errors in the systemd journal for nginx.service, as follows:

    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351182 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351186 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351227 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351630 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351655 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351673 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351692 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351755 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351191 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351194 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351197 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351198 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351200 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351201 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351205 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351210 (PassengerAgent) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351618 (file_store.rb:*) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351651 (worker-1) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351656 (ruby) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351659 (connection_poo*) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351664 (n/a) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351665 (utils.rb:110) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351677 (n/a) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351682 (utils.rb:110) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351683 (n/a) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351696 (connection_poo*) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351701 (n/a) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351708 (worker-2) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351709 (worker-3) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351710 (worker-2) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351711 (worker-3) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351729 (io-worker-1) with signal SIGKILL.
    Sep 02 09:37:24 phoenix.medialab.ntua.gr systemd[1]: nginx.service: Killing process 2351756 (PassengerAgent) with signal SIGKILL.
    

But it seems to be working, at least with the dry-run option.

user000001