
I'm confused about where the problem is located, but basically I have nginx proxying websocket connections to a backend Ruby thin server, which serves the connections with the websocket-rails module in a Ruby on Rails application. It all works fine, except that many of the sockets, maybe all of them, never get closed, so the thin server fairly quickly runs out of file descriptors.

I'm using nginx 1.4.2 and this is my config:

map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}
server {
    listen       my.ip.num.ber:80;  
    server_name admin3.mydomain.com;
    root /home/apps/mydomain/current/public;
    try_files $uri/index.html $uri @admin3.mydomain.com;  
    access_log  /var/log/nginx/admin3.access.log  combined;
    error_log  /var/log/nginx/admin3.error.log error;
    location /websocket {  
        proxy_redirect off;
        proxy_pass http://localhost:3008;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        keepalive_timeout 90;
        proxy_connect_timeout 10;
        proxy_read_timeout 60;
        proxy_send_timeout 60;
    }
}

I'm using thin 1.5.1 and this is the configuration:

port: 3008
user: ploy
group: ploy
pid: /home/apps/mydomain/shared/pids/thin.pid
timeout: 90
wait: 30
log: /home/apps/mydomain/shared/log/thin.log
max_conns: 1024
require: []
environment: production
max_persistent_conns: 512
servers: 1
threaded: false
#no-epoll: false
daemonize: true
chdir: /home/apps/mydomain/current
tag: admin3

There are only a couple of dozen active websocket connections at any given time, and they seem to be established and terminated fine from the perspective of both the client browser and the websocket-rails backend. But the thin server still ends up with 1025 open file descriptors, mostly sockets.

ls -l /proc/`ps aux | grep "thin server" | grep -v grep | head -n 1 | awk '{print $2}'`/fd

gives this kind of thing:

lrwx------. 1 root root 64 Aug 31 15:15 993 -> socket:[1319549665]
lrwx------. 1 root root 64 Aug 31 15:15 994 -> socket:[1319549762]
lrwx------. 1 root root 64 Aug 31 15:15 995 -> socket:[1319549850]
lrwx------. 1 root root 64 Aug 31 15:15 996 -> socket:[1319549974]
lrwx------. 1 root root 64 Aug 31 15:15 997 -> socket:[1319846052]
lrwx------. 1 root root 64 Aug 31 15:15 998 -> socket:[1319549998]
lrwx------. 1 root root 64 Aug 31 15:15 999 -> socket:[1319550000]

A similar thing seems to subsequently happen for nginx:

ls -l /proc/`ps aux | grep "nginx: worker" | grep -v grep | head -n 1 | awk '{print $2}'`/fd

although the number of socket file descriptors creeps up more slowly there, and it takes much longer to reach 1025. In fact, I've only seen that happen once.
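Counting the descriptors instead of listing them makes the climb easier to watch; for thin, for example, using the pid file from the configuration above:

ls /proc/$(cat /home/apps/mydomain/shared/pids/thin.pid)/fd | wc -l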

So I'm a little at a loss as to whether the problem is in my nginx config, in thin, or in the websocket-rails backend. I hope some of your trained eyes might spot something obviously wrong, even if you're not familiar with the backend pieces.

1 Answer


Let me answer my own question... It turned out there was nothing wrong with the configuration laid out above, which still seems perfectly reasonable.

The author of the websocket-rails module pointed out to me that I was opening a new connection to Redis for every action triggered in the websocket module. Those connections were never closed properly, which left sockets open and eventually ground thin to a halt. Setting up a single Redis connection once and reusing it changed everything.
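For anyone hitting the same thing, the difference looks roughly like this. This is only a minimal sketch, not my actual code: the controller and action names are made up, and I'm assuming the plain redis gem; the only point is the connection-handling pattern.

# Leaky pattern: every triggered action builds a fresh Redis client,
# and the underlying socket never gets closed.
class EventsController < WebsocketRails::BaseController
  def new_event
    redis = Redis.new(host: 'localhost', port: 6379)  # opens a new socket on every action
    redis.publish('events', message.to_json)
  end
end

# Fix: create the connection once (e.g. in a Rails initializer) and reuse it.
$redis = Redis.new(host: 'localhost', port: 6379)

class EventsController < WebsocketRails::BaseController
  def new_event
    $redis.publish('events', message.to_json)  # reuses the single shared socket
  end
end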

So, a rather obscure situation, and I'm a little embarrassed to even have presented it as a server configuration issue.