
I have a number of small services that are running spread across two servers:

  • Server A manages high-security services: user accounts and personal messaging.
  • Server B manages low-security services: image uploads and public content.

Here is the issue: at somewhat unpredictable intervals, server A starts timing out, without any noticeable CPU, memory or disk activity.

I soon found out that server A is running two services that rely on each other to work. For this example I'll call them Service A.A and A.B. When service A.B receives a request, it will curl A.A to retrieve data about the user's account in an OAuth-like fashion.
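To make the call chain concrete, the crucial part of A.B looks roughly like this (endpoint URL, token variable, and timeout are placeholders, not my actual code). The important point is that the curl call blocks the PHP-FPM worker running it until A.A responds, and A.A is served by PHP-FPM on the same machine:

```php
<?php
// Sketch of what service A.B does per request (names are hypothetical).
$ch = curl_init('https://server-a.example.com/service-aa/userinfo');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Authorization: Bearer ' . $accessToken]);
// A timeout at least turns a permanent deadlock into a slow failure:
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
// This blocks the current PHP-FPM worker until A.A (same machine) answers.
$userInfo = curl_exec($ch);
curl_close($ch);
```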

I determined that the issue is a deadlock within PHP-FPM. When service A.B receives n requests (n being the maximum number of simultaneous processes PHP-FPM is allowed to spawn) before any of them has managed to complete its request to the service on the same machine, PHP-FPM starts queueing the new requests.

Obviously, once all of its processes are allocated, PHP-FPM stops processing new requests. Unfortunately, this includes the curl requests A.B is making to A.A on the same server: every worker is blocked waiting on a request that can never be served. The server is therefore effectively dead (taking server B with it).

At first the solution seemed rather simple: I created separate PHP-FPM pools, allowing the applications to run in parallel. This alleviates the problem, since PHP-FPM can now create more processes overall, but it doesn't fix it.
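For reference, the pool split I mean looks roughly like this (pool names, paths, and limits are placeholders, not my exact config):

```ini
; /etc/php/fpm/pool.d/service-aa.conf  -- example values only
[service-aa]
listen = /run/php/service-aa.sock
pm = static
pm.max_children = 10   ; this is the per-pool "n"

; /etc/php/fpm/pool.d/service-ab.conf
[service-ab]
listen = /run/php/service-ab.sock
pm = static
pm.max_children = 10
```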

I am uncertain of the exact cause, but the symptoms are unchanged: the server idles while one of the pools has spawned the maximum number of PHP-FPM processes for its service. I am assuming that the deadlock is now on nginx's side.

I don't want to claim to understand how nginx works, but as far as I can tell, if there are n+1 requests (n still being the PHP-FPM process limit for one of the pools) for service A.B, which depends on A.A, nginx will wait for PHP-FPM to accept the extra request.

I would love to know whether there is an option to have two separate request queues for the two services, or whether there is anything else wrong with my approach.
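A minimal sketch of what I mean by separate queues at the nginx level, assuming each service is routed to its own pool socket (locations, socket paths, and timeouts are placeholders):

```nginx
# Hypothetical routing -- not my actual config.
location /service-aa/ {
    include fastcgi_params;
    fastcgi_pass unix:/run/php/service-aa.sock;
    fastcgi_read_timeout 10s;  # bound the wait so a saturated pool fails fast
}

location /service-ab/ {
    include fastcgi_params;
    fastcgi_pass unix:/run/php/service-ab.sock;
    fastcgi_read_timeout 10s;
}
```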

-- Of course, if there is any diagnostic I should run or log I should provide, I will be happy to do so.
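For example, I could enable the PHP-FPM status page (this assumes `pm.status_path = /status` is set in the pool config and exposed through nginx) and capture its output alongside the FPM log, which warns when a pool hits its process limit:

```shell
# Assumes pm.status_path is enabled and reachable via nginx.
curl -s 'http://localhost/status?full'

# PHP-FPM logs a warning when a pool saturates; path varies by install:
grep 'max_children' /var/log/php-fpm.log
```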

César
  • You've already created two request queues of a sort with your second php-fpm pool, which has solved the problem in a simple and reasonably elegant way. Can you be more clear on what your current problem is? Also when you say "two requests queues" what level / layer / software are you referring to? – Tim Aug 19 '17 at 18:16
  • @Tim The issue is that the deadlock still persists. I am unsure what software layer is affected. But I suppose it's the fact that nginx is trying to pass along the requests to PHP-FPM, therefore getting blocked when one pool is saturated (while the other isn't) and therefore getting deadlocked just like it did before. – César Aug 20 '17 at 11:55
  • What does `php-fpm.log` show? – Tero Kilkanen Aug 20 '17 at 16:32
  • I wonder if reducing the number of requests that need to hit PHP would help. Is it feasible to cache pages for some / anonymous users? That makes things a lot faster, reduces PHP load, and could work around this problem. Given your setup you may need to scale to more hardware. Alternately you could include the code from the dependent service in the first service, but that's not very clean. – Tim Aug 20 '17 at 20:13

0 Answers