
The Docker container works just fine when tested locally, but upon deployment to Cloud Run I get a 502 Bad Gateway. It takes approximately 50–60 minutes and then, for no apparent reason, starts working, causing site downtime. I have been on this for approximately a week now with no success in figuring out why this is happening.

History:

I was deploying a Vue.js static build before, with the same backend, and everything was working fine. I recently re-wrote my front end in Nuxt.js; that's when the deployment problems started.

How I am testing the Dockerfile build:

As recommended by Google, I am using: `PORT=8080 && docker run -p 9090:${PORT} -e PORT=${PORT} myImage:latest`

Dockerfile:

# Alpine Deployment Server
FROM nginx:stable-alpine as alpine-server
# install necessary packages
....
# Start server
CMD pm2 start > pm2.log && \
    gunicorn -b 0.0.0.0:5000 --workers 1 --threads 8 --timeout 0 api:app --daemon && \
    gunicorn -b 0.0.0.0:5001 --workers 1 --threads 1 --timeout 0 extras:app --daemon && \
    nohup sh -c "scrapyrt -p 7000 -i 0.0.0.0" > /dev/null 2>&1 & \
    nginx -g 'daemon off;'

NB: Nuxt is also configured to start on 0.0.0.0:3000; I am using pm2 to start Nuxt.

NGINX CONFIG

server {
  listen 8080;
  server_name _;

  charset utf-8;

   location / {
    proxy_redirect                      off;
    proxy_set_header Host               $host;
    proxy_set_header X-Real-IP          $remote_addr;
    proxy_set_header X-Forwarded-For    $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto  $scheme;
    ....
  }

  location ~^/api/v1/(.+)/search/(.+)$ {
    set $allowspace2 $2;
    proxy_pass          http://127.0.0.1:5000/api/v1/$1/search/$allowspace2;
    proxy_http_version  1.1;
    proxy_redirect      ~^/api/v1/(.+)/search/(.+)$ http://127.0.0.1:5000/api/v1/$1/search/$allowspace2;
    ....
  }

  location ~^/api/v1/(.+)/view/(.+)$ {
    set $allowspace2 $2;
    proxy_pass          http://127.0.0.1:5000/api/v1/$1/view/$allowspace2;
    proxy_http_version  1.1;
    proxy_redirect      ~^/api/v1/(.+)/view/(.+)$ http://127.0.0.1:5000/api/v1/$1/view/$allowspace2;
    ...
  }
  location ~^/extras/v1/(.+)/(.+)$ {
    set $allowspace2 $2;
    proxy_pass          http://127.0.0.1:5001/extras/v1/$1/$allowspace2;
    proxy_http_version  1.1;
    proxy_redirect      ~^/extras/v1/(.+)/(.+)$ http://127.0.0.1:5001/extras/v1/$1/$allowspace2;
   ....
  }
  ....

    gzip on;
    gzip_types      text/plain application/xml text/css application/javascript;
    gzip_comp_level 6;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_vary on;
}

What error does Cloud Run log when I try to navigate to the URL during the 502 Bad Gateway?

*13 connect() failed (111: Connection refused) while connecting to upstream,....

What bums me out is why:

  1. Why are the upstreams offline? All the services are started first, and NGINX is started last.

  2. Why does it take so long before the container eventually starts working?

  3. Why does it work locally, while deployment to Cloud Run is just hocus-pocus?

How do I resolve this ??

xaander1
  • Which url takes long to respond? --> which proxy_pass doesn't work? – ppuschmann Feb 22 '21 at 07:46
  • As for the Dockerfile: WTH? What about splitting this to single containers? – ppuschmann Feb 22 '21 at 07:47
  • All requests take long. After 50 minutes of `502 bad gateway` errors, the services start and are up and running on their own. The reason I have bundled everything together is that you don't have many options on Cloud Run. https://stackoverflow.com/questions/64330293/possible-to-deploy-or-use-several-containers-as-one-service-in-google-cloud-run – xaander1 Feb 22 '21 at 12:28

1 Answer


Update

The problem had nothing to do with GCP. I finally solved the issue after I came across this answer; just luck.

Niko Solihin's solution:

  1. Set the Node HTTP server to listen strictly for IPv4 by including the host explicitly: server.listen(5000, 'localhost');
  2. Removed any IPv6 listen directives (listen [::]:80; or listen [::]:443 ssl default_server;).
  3. Changed the location block proxy_pass to use IPs: proxy_pass http://127.0.0.1:5000 (not proxy_pass http://localhost:5000).

In my case, the Nuxt host config was localhost instead of 127.0.0.1.

Cheers. Hope it helps someone else.

How I solved the problem (old):

You can still use these as optimization tips.

Summary:


  • Deploying both a static and an SSR build of Nuxt.

The static build is used as an nginx fallback when the Nuxt upstream fails. This evidently solves the problem of users getting the nginx 502 Bad Gateway error when the APIs and other services are not working; instead, the static build is served.

  • Reducing Docker startup time.

The Google documentation specifies that the maximum container startup time is 4 minutes. See here

  1. Using ENTRYPOINT instead of CMD: for some reason this reduced how regularly the container failed during deployment, and instead of 50 minutes it took around 20 minutes to start on its own.

  2. Increase your container instance memory and vCPU.

  • Bandwidth issue: limit how often you deploy (probably opinion-based).

I don't know if this is just me, but I did note that when the container exceeds the free allocated bandwidth, I start running into 502 Bad Gateway problems. If you have a very large container, limit how often you build it.

In my case, the first deployment of the day falls within the free quota and runs fine. The second falls between the free and paid tiers. I sometimes get the 502, which eventually resolves itself when I interact with the static build. Anyway, it's just a good idea to avoid unnecessary deployments.

NB: I also noted that after deleting the Cloud Run artifacts storage and the rest of the storage, you should wait a while before deploying. Deletion is usually done in the background; quick deletion and redeployment of my project to Cloud Run sometimes led to 502s.

(Will update in case I have more insights)

xaander1