
I run Symfony and Drupal websites on two Debian servers, with Nginx listening on 443, and Varnish listening on 80 and passing requests to Nginx on a custom 80** port for each vhost.

Recently I added a new website to one of the servers. Then I began to run into this well-documented error: nginx: [emerg] bind() to [::]:80 failed (98: Address already in use).

Although no Nginx server block listens on port 80, and no server block lacks a listen directive, Nginx began listening on port 80 alongside the custom ports.

sudo netstat -tlpn| grep nginx
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 0.0.0.0:8081            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 x.x.x.x:8082            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 y.y.y.y:8083            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 z.z.z.z:8084            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      4191/nginx: master  
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      4191/nginx: master  
tcp6       0      0 :::8080                 :::*                    LISTEN      4191/nginx: master  
tcp6       0      0 :::80                   :::*                    LISTEN      4191/nginx: master  
tcp6       0      0 :::8081                 :::*                    LISTEN      4191/nginx: master  
tcp6       0      0 :::443                  :::*                    LISTEN      4191/nginx: master  
tcp6       0      0 :::8000                 :::*                    LISTEN      4191/nginx: master
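
With this many sockets, the port-80 rows are easy to miss among 8080, 8000 and the rest. A minimal sketch of a filter, assuming the column layout shown above (local address in column 4, state in column 6); `listeners_on_port` is a hypothetical helper name, not part of netstat:

```shell
#!/bin/sh
# Hypothetical helper: keep only LISTEN rows whose local address ends in
# ":<port>" exactly, so that 8080 or 8000 do not match when asking for 80.
# Assumes netstat -tlpn style columns: $4 = local address, $6 = state.
listeners_on_port() {
    awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$")'
}

# Usage: sudo netstat -tlpn | listeners_on_port 80
```

Fed the output above, this should leave only the 0.0.0.0:80 and :::80 rows, both owned by the nginx master process.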

I have already read a number of questions and posts about the correct current syntax for dual-stack IPv4 and IPv6 listening and tried, AFAIK, every possible syntax, such as the ones below, without success.

Working directive before the crash: listen x.x.x.x:8082; I tried adding listen [::]:8082 ipv6only=on; with no change.

I listed and killed the process many times with sudo fuser -k 80/tcp before restarting Varnish and Nginx with systemctl, and even ran daemon-reload...

Last, I checked my shell history but can't find what could have caused this sudden behavior. The only point I'm not sure about is that I changed a couple of sysctl.conf parameters, but I believe I reverted them, just in case, since I'm not used to this part of administration: cat /etc/sysctl.conf | grep net.ipv4.conf

#net.ipv4.conf.default.rp_filter=1
#net.ipv4.conf.all.rp_filter=1
#net.ipv4.conf.all.accept_redirects = 0
# net.ipv4.conf.all.secure_redirects = 1
#net.ipv4.conf.all.send_redirects = 0
#net.ipv4.conf.all.accept_source_route = 0
#net.ipv4.conf.all.log_martians = 1

Here's my configuration.

cat /etc/nginx/nginx.conf (relevant 2 lines, no server block in it)

include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;

cat /etc/nginx/conf.d/default.conf

server {
        listen 8000 default_server;
        listen [::]:8000 ipv6only=on default_server;
        server_name _;

        listen 443 ssl default_server;
        listen [::]:443 ssl ipv6only=on default_server;
}

One of the sites-available vhosts (they all follow exactly the same pattern):

server { # this block only redirects www to non www
        listen x.x.x.x:443 ssl;
        server_name www.example.com;

        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_certificate /var/www/clients/client0/web3/ssl/example.com-le.crt;
        ssl_certificate_key /var/www/clients/client0/web3/ssl/example.com-le.key;

        return 301 https://example.com$request_uri;
        }

server {
        listen x.x.x.x:443 ssl;
        server_name example.com;

        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_certificate /var/www/clients/client0/web3/ssl/example.com-le.crt;
        ssl_certificate_key /var/www/clients/client0/web3/ssl/example.com-le.key;

        location / {
            # Pass the request on to Varnish.
            proxy_pass  http://127.0.0.1;
 
            # Pass some headers to the downstream server, so it can identify the host.
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 
            # Tell any web apps like Drupal that the session is HTTPS.
            proxy_set_header X-Forwarded-Proto https;
            proxy_redirect     off;
        }
        
}
server {
        listen x.x.x.x:8082;
#       listen [::]:8082 ipv6only=on;

        server_name example.com www.example.com;

        root   /var/www/example.com/web/public;

        location / {
            # try to serve file directly, fallback to index.php
            try_files $uri /index.php$is_args$args;
        }

       location ~ ^/index\.php(/|$) {
            fastcgi_pass 127.0.0.1:8998;
            fastcgi_split_path_info ^(.+\.php)(/.*)$;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
            fastcgi_param DOCUMENT_ROOT $realpath_root;
            internal;
        }
        location ~ \.php$ {
           # return 404;
        }

        error_log /var/log/ispconfig/httpd/example.com/error.log;
        access_log /var/log/ispconfig/httpd/example.com/access.log combined;

        location ~ /\. {
                        deny all;
        }

        location ^~ /.well-known/acme-challenge/ {
             access_log off;
             log_not_found off;
             root /usr/local/ispconfig/interface/acme/;
             autoindex off;
             try_files $uri $uri/ =404;
        }

        location = /favicon.ico {
            log_not_found off;
            access_log off;
            expires max;
            add_header Cache-Control "public, must-revalidate, proxy-revalidate";
        }

        location = /robots.txt {
            allow all;
            log_not_found off;
            access_log off;
        }
}

cat /etc/default/varnish (relevant part)

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,3G"

I'm wondering what could have caused a configuration I have been using for years to break?

I carefully studied these Q&As and a number of docs and posts, with no success: Nginx tries to run on port 80 but the configs have been removed ; Nginx will not start (Address already in use) ; nginx - bind() to 0.0.0.0:80 failed (98: Address already in use)

EDIT

Here's the output of nginx -T. (Since the body is limited to 30000 characters, I had to paste it on Pastebin.)

Kojo
  • Please post the output of `nginx -T`. – Michael Hampton Jun 27 '20 at 16:00
  • @MichaelHampton Proceeding, thanks, just give me a couple of min please, since I need to remove all the real IP and domains – Kojo Jun 27 '20 at 16:03
  • We prefer that posts not be obfuscated when possible. See [here for discussion and guidance](https://meta.serverfault.com/q/963/126632). – Michael Hampton Jun 27 '20 at 16:18
  • Well, I had to obfuscate IPs and domains, but did it in a way that respects 100% the logic of the configuration (domain1, domain2...) and 1.2.3.4 / 6.7.8.9 IPs... thanks a lot if you can have a look, I've been stumped since yesterday morning :-(( – Kojo Jun 27 '20 at 16:29
  • What's this `include /etc/letsencrypt/le_http_01_cert_challenge.conf`? Leftover from some old certbot run? It probably shouldn't be there. – Michael Hampton Jun 27 '20 at 16:39
  • Not sure! I first added the certificates with the ISPConfig UI, then added subdomains to them with the CLI... But you're a grand master, if you permit me. Here's the content of this file: `server{listen 80;listen [::]:80;server_name example.com;root /var/lib/letsencrypt/http_01_nonexistent;location = /.well-known/acme-challenge/PlsQNg7nOVxIe6CwwGpcoKTbSudji44JNZVQA57EyNE{default_type text/plain;return 200 PlsQNg7nOVxIe6CwwGpcoKTbSudji44JNZVQA57EyNE.7nkyfxInEw24UW4P7xfgJQGTMXYGQH_mzIOz6F0641Y;}}` – Kojo Jun 27 '20 at 16:43
  • The content was included in your paste, so no need to paste it again. Of course that challenge is long since expired, which is why I think it's old and unnecessary (and probably causing the problem). – Michael Hampton Jun 27 '20 at 16:45
  • Moving it to another place gives nginx: [emerg] open() "/etc/letsencrypt/le_http_01_cert_challenge.conf" failed (2: No such file or directory) – Kojo Jun 27 '20 at 16:49
  • But you didn't remove the `include` that calls it! – Michael Hampton Jun 27 '20 at 16:49
  • My bad, I need a rest :-(. OK, fine, I could reload the services, phew. You deserve a couple of beers! But in two words, can you tell me the purpose of this challenge inclusion please? I mean, I don't remember when it arrived in my conf... By the way, many, many thanks already – Kojo Jun 27 '20 at 16:53
  • certbot does a challenge response when you get a Let's Encrypt certificate. – Michael Hampton Jun 27 '20 at 16:56
  • Last: next time I'll run nginx -T and save 2 days of my life, and a couple of white hairs. Be that as it may, I'll keep your external contact in case of emergency... As you said, my DEVOPS is 10% OPS... ;-) Thank you and stay safe! – Kojo Jun 27 '20 at 17:03

1 Answer


Well, thanks to @MichaelHampton, I realized that the listen directive I was searching for was hidden in a certbot challenge file, pulled into nginx.conf by an include:

# configuration file /etc/letsencrypt/le_http_01_cert_challenge.conf:
server{listen 80;listen [::]:80;server_name example.org;root /var/lib/letsencrypt/http_01_nonexistent;location = /.well-known/acme-challenge/PlsQNg7nOVxIe6CwwGpcoKTbSudji44JNZVQA57EyNE{default_type text/plain;return 200 PlsQNg7nOVxIe6CwwGpcoKTbSudji44JNZVQA57EyNE.7nkyfxInEw24UW4P7xfgJQGTMXYGQH_mzIOz6F0641Y;}}
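
Deleting the challenge file alone is not enough: the include that references it has to go too, otherwise nginx fails with an open() error. A minimal sketch for locating such an include, assuming it sits on its own line in the file; `find_includes` is a hypothetical helper name, and its first argument is used as a regular expression fragment:

```shell
#!/bin/sh
# Hypothetical helper: print file name, line number and text of every
# uncommented "include" directive whose path contains the given pattern.
# $1 = pattern to look for in the included path, remaining args = files.
find_includes() {
    pattern="$1"
    shift
    grep -nH -E "^[[:space:]]*include[[:space:]]+[^;#]*${pattern}" "$@"
}

# Usage: find_includes letsencrypt /etc/nginx/nginx.conf
```

Once the offending line is removed or commented out, `nginx -t` checks the configuration before the services are reloaded.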

This underlined two basic lessons (at least for me):

  1. In an emergency, don't search faster but slower, taking the time to really evaluate each line you read: this include was the first line of my http block!
  2. Specifically for Nginx, the simple command nginx -T is a killer tool: it dumps every line of every configuration file it loads in a single output, which makes it easy to find the culprit immediately. nginx -T | grep ':80' would have put me on the right track in seconds!
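
A plain grep for ':80' also matches 8080, 8000 and so on, and it does not say which file a line came from. Below is a minimal sketch that goes one step further, assuming the `# configuration file <path>:` header lines that nginx -T prints before each file (as in the excerpt above). `listen_sources` is a hypothetical helper name, and the parsing is deliberately naive (no quoted strings, no paths containing spaces):

```shell
#!/bin/sh
# Hypothetical helper: read `nginx -T` output on stdin and print, for each
# "listen" directive on the given port, the config file that declares it.
listen_sources() {
    awk -v port="$1" '
        # Remember which file the following lines belong to.
        /^# configuration file / {
            file = $4
            sub(/:$/, "", file)
            next
        }
        {
            line = $0
            sub(/#.*/, "", line)              # drop comments
            # Split on ; { } so one-line server blocks are handled too.
            n = split(line, parts, /[;{}]/)
            for (i = 1; i <= n; i++) {
                d = parts[i]
                gsub(/^[ \t]+|[ \t]+$/, "", d)
                # Match "listen 80" or "listen <addr>:80", but not 8080.
                if (d ~ /^listen[ \t]/ && d ~ ("(^listen[ \t]+|:)" port "([ \t]|$)"))
                    print file ": " d
            }
        }
    '
}

# Usage (as root): nginx -T 2>/dev/null | listen_sources 80
```

On a dump like mine, this should report the two listen lines together with the path of the certbot challenge file, while ignoring the 8000 and 8082 listeners.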
Kojo