In a Kubernetes cluster, I have an Nginx server acting as a reverse proxy and TLS terminator that forwards requests (via proxy_pass) to a backend Tomcat application, some of whose functionality is powered by WebSockets (SockJS / STOMP). Unfortunately, the WebSocket handshake never completes successfully.

On the client side, in my browser, I can see the following messages in the console:

Opening Web Socket...
websockets-0.1.min.js:116 Whoops! Lost connection to https://myhost/stomp

Followed by an HTTP 504 Gateway Timeout:

websockets-0.1.min.js:72 WebSocket connection to 'wss://myhost/stomp/673/ugvpxc1lwmfjnung/websocket' failed: Error during WebSocket handshake: Unexpected response code: 504

--

On the Tomcat side, I have the following entry in the access log:

0:0:0:0:0:0:0:1,2017-06-01 16:53:36.915 +0000,4,GET,HTTP/1.1,"/stomp/673/ugvpxc1lwmfjnung/websocket",101,-,O,-,blablablabla,-,-,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",-,,-,-,-,-,-,-

Whereas in the Nginx access log, I have the corresponding entry:

10.2.89.0 - - [01/Jun/2017:16:54:41 +0000] "GET /stomp/673/ugvpxc1lwmfjnung/websocket HTTP/1.1" 499 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "24.5.136.13"

Now, according to what I've researched, Nginx logs a 499 when the client closes the connection before a response has been sent, but I can't figure out why it would take so long for the response to return to the client. According to the timestamps of these two entries, the Tomcat 101 response and the Nginx 499 are separated by about one minute. What's going on here?

Here is a snippet from my nginx.conf. Any assistance at this point is deeply appreciated:

server { 
    listen 9965 default_server ssl; 
    listen [::]:9965 default_server ssl; 

    resolver 127.0.0.1; 
    server_name _; 

    ssl_certificate /etc/ssl/certs/certificate.pem; 
    ssl_certificate_key /etc/ssl/certs/key.pem; 
    ssl_dhparam /etc/ssl/certs/dhparam.pem; 

    client_max_body_size 2000M; 

    location / { 
        proxy_read_timeout 900; 

        proxy_pass_header Server; 

        proxy_http_version 1.1; 
        proxy_set_header Host $host; 
        proxy_set_header X-Real-IP $remote_addr; 
        proxy_set_header Upgrade 'websocket'; 
        proxy_set_header Connection "upgrade"; 

        proxy_pass http://localhost:15010; 
    }
}
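
For comparison, here is a minimal sketch of the header-forwarding pattern described in the Nginx WebSocket proxying documentation, which passes the client's own Upgrade header through instead of hardcoding 'websocket' on every request. The variable name $connection_upgrade is just a label chosen for the map, and the port and timeout are copied from the snippet above. This is a fragment that assumes it sits inside the usual http and server blocks, and it is not necessarily the cause of the problem here, since Tomcat is already answering with a 101.

```nginx
# Sketch only. The map belongs in the http {} context, outside any server block.
# $connection_upgrade is an arbitrary variable name defined by this map.
map $http_upgrade $connection_upgrade {
    default upgrade;   # client sent an Upgrade header: forward "Connection: upgrade"
    ''      close;     # ordinary HTTP request: forward "Connection: close"
}

# The location belongs inside the existing server {} block.
location / {
    proxy_read_timeout 900;

    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;

    # Forward whatever the client asked for rather than a fixed value.
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;

    proxy_pass http://localhost:15010;
}
```

With this in place, plain HTTP requests to the same location are no longer sent to Tomcat with upgrade headers attached, which makes the access logs on both sides easier to compare.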

--

Any ideas on how to troubleshoot this further?

theMarceloR

1 Answer

Fixed. Had to replace the Classic AWS ELB with an ALB.

theMarceloR
  • I've actually read in some other place that I could have switched the inbound rules for the ELB to use TCP as opposed to HTTPS. But the ALB option worked just fine. – theMarceloR May 25 '18 at 12:27
  • Why did this work/help? – rogerdpack Nov 18 '19 at 23:39
  • @rogerdpack, I guess it's because WebSockets (WS) and HTTP are different protocols, and the ELB was accepting HTTPS traffic only. I didn't really have to change the LB; changing the inbound rule to a lower-level protocol like TCP would've solved the problem. But, anyway, ALB is a cheaper option and it solved the problem. – theMarceloR Nov 19 '19 at 04:44