I have a 2 node IIS cluster using Microsoft Network Load Balancing under Windows Server 2019 hosting multiple websites/application pools. The cluster ran without problems for a year. However, beginning a few months ago, around once a month, one of the two web servers will stop responding to all http/https requests. The NLB doesn't detect the server as down so half the requests to the website are failing during this time. When this happens, IIS manager locks up and an iisreset at the command line responds with "restart attempt failed". During this time, the HTTP Error logs are filled with 503 response codes/QueueFill messages. There are no specific events in the Windows System/Application logs indicating an issue. Performance on the server(s) is not at all an issue. A reboot of the troubled server makes the issue go away.
What can I do to track down the cause of this issue?