7

I am running lsof -i | wc -l periodically, and out of roughly 420 lines, between 240 and 255 are in the CLOSE_WAIT state. How do TCP connections enter this state?

Should I be worried and how should I troubleshoot it?

7ochem
user20414

4 Answers

12

(I was going to edit mikegrb's answer, but decided I was butchering it a little too much)

CLOSE_WAIT means pretty much exactly what it says -- the kernel is waiting for the local process to close its file descriptor before removing the entry. The far end has already closed its side of the connection (it sent a FIN, which your kernel acknowledged) and may be under the impression that the connection is finito, but your end is holding onto things.

The only concern is that a lot of CLOSE_WAIT entries consume kernel memory and file descriptor table entries, which can be a problem if there are great piles of them. If the entries you're looking at are transient, then it's probably just that you're cycling through a lot of TCP connections, and you're seeing a small fraction of them in the short window between when the remote end closes the connection and when the process gets around to closing the file descriptor. On the other hand, if they're permanent (the ports and IP addresses don't change over time), then something is leaking descriptors, and it needs to be fixed so that it always closes its fds when it's finished with them. As mikegrb said, a newer version may already have fixed the problem, so a question on the relevant mailing list or an examination of the changelogs is probably warranted.
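
One way to tell transient entries from permanent ones is to compare two lsof snapshots taken a few minutes apart and keep only the CLOSE_WAIT sockets whose process and endpoints appear in both. The following is a minimal sketch, not a definitive tool: it assumes the usual `lsof -i -n -P` column layout, and the function names and sample snapshots are made up for illustration.

```python
from collections import Counter


def close_wait_entries(lsof_text):
    """Return {(command, pid, endpoints)} for every CLOSE_WAIT line of lsof -i output."""
    entries = set()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Assumed columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME (STATE)
        if len(fields) < 10 or fields[-1] != "(CLOSE_WAIT)":
            continue
        entries.add((fields[0], fields[1], fields[-2]))
    return entries


def persistent_close_wait(earlier, later):
    """Count, per (command, pid), the CLOSE_WAIT sockets present in both snapshots."""
    stuck = close_wait_entries(earlier) & close_wait_entries(later)
    return Counter((command, pid) for command, pid, _ in stuck)


# Made-up sample snapshots, for illustration only.
SNAPSHOT_1 = """\
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    1234  app   45u  IPv4  98765      0t0  TCP 10.0.0.5:8080->10.0.0.9:51234 (CLOSE_WAIT)
java    1234  app   46u  IPv4  98766      0t0  TCP 10.0.0.5:8080->10.0.0.9:51300 (CLOSE_WAIT)
sshd     900 root    3u  IPv4  11111      0t0  TCP 10.0.0.5:22->10.0.0.7:40022 (ESTABLISHED)
"""

SNAPSHOT_2 = """\
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    1234  app   45u  IPv4  98765      0t0  TCP 10.0.0.5:8080->10.0.0.9:51234 (CLOSE_WAIT)
java    1234  app   47u  IPv4  98790      0t0  TCP 10.0.0.5:8080->10.0.0.9:51411 (CLOSE_WAIT)
sshd     900 root    3u  IPv4  11111      0t0  TCP 10.0.0.5:22->10.0.0.7:40022 (ESTABLISHED)
"""

if __name__ == "__main__":
    for (command, pid), count in persistent_close_wait(SNAPSHOT_1, SNAPSHOT_2).items():
        print(command, pid, count)
```

In practice the two strings would come from real lsof runs rather than the samples. A process whose count stays high across snapshots is the likely leaker; entries that appear in only one snapshot are probably just normal connection churn.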

womble
3

The CLOSE_WAIT state means that the other end has sent a FIN segment to close the connection. The connection is still sort of established. It's in a mode you could think of as half duplex (half-closed, in TCP terms): this end can still flush its buffers and send the last bits of data to the end that requested the close, before closing the connection from its own side.
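
The half-closed behaviour can be reproduced on the loopback interface. This is a small sketch under assumptions of my own (the function name, payloads and 5-second timeouts are arbitrary): the "remote" socket shuts down its sending side, which sends a FIN and puts the "local" socket into CLOSE_WAIT, and the local side can still send data before it closes.

```python
import socket


def half_close_demo():
    """Show that a socket in CLOSE_WAIT can still send before closing."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    remote = socket.create_connection(listener.getsockname(), timeout=5)
    local, _ = listener.accept()
    local.settimeout(5)

    remote.sendall(b"request")
    remote.shutdown(socket.SHUT_WR)  # remote sends FIN; local enters CLOSE_WAIT

    request = b""
    while True:
        chunk = local.recv(1024)
        if not chunk:  # an empty read means the peer's FIN has arrived
            break
        request += chunk

    local.sendall(b"last bits")  # sending is still allowed in CLOSE_WAIT
    local.close()                # sends our FIN, so the socket leaves CLOSE_WAIT

    reply = b""
    while True:
        chunk = remote.recv(1024)
        if not chunk:
            break
        reply += chunk

    remote.close()
    listener.close()
    return request, reply


if __name__ == "__main__":
    print(half_close_demo())
```

If the `local.close()` call were skipped, the local socket would sit in CLOSE_WAIT for as long as the process kept the descriptor open.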

If you have lots of connections staying in CLOSE_WAIT it means that the process responsible is not closing the socket once it goes into CLOSE_WAIT. You could use tcpdump, or other network traffic capture tools, to look at the packets.

Also take a look at the process responsible. Out of curiosity what is the responsible process? It may have a newer fixed version available or maybe it's time to file a bug report ;)

mikegrb
0

You are probably not closing a resource (file handle, network connection) somewhere in an application that is running on the server.
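
As an illustration of what closing the resource looks like in application code, here is a hedged Python sketch; the affected application may be written in something else entirely, and `handle_client` and `demo` are hypothetical names. The point is that the descriptor is released on every path, including when the peer closes first or an exception is raised. The demo uses `socket.socketpair()` as a stand-in for a real TCP connection.

```python
import socket


def handle_client(conn):
    """Echo data until the peer closes, then always close our descriptor."""
    with conn:  # closes conn on exit, even if recv() or sendall() raises
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the peer has closed its side
                break
            conn.sendall(data)


def demo():
    """Run the handler against a connected socket pair (stand-in for TCP)."""
    ours, peer = socket.socketpair()
    peer.sendall(b"ping")
    peer.shutdown(socket.SHUT_WR)  # peer signals it has finished sending
    handle_client(ours)
    echoed = peer.recv(4096)
    peer.close()
    # A closed socket object reports a file descriptor of -1.
    return echoed, ours.fileno()


if __name__ == "__main__":
    print(demo())
```

A handler that returns on the empty read without closing `conn` is the pattern that leaves sockets stuck in CLOSE_WAIT.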

0

If you operate on an unreliable network, you may want to tune:

  • The maximum number of file descriptors, via ulimit (per process) and via /proc (system-wide)
  • The TCP wait time, via /proc
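
Before changing any limits, it may help to see where they currently stand. The sketch below is only an assumption-laden illustration for Linux: it reads the per-process limit through Python's `resource` module, the system-wide limit from /proc/sys/fs/file-max, and the current descriptor count from /proc/<pid>/fd. The function names are made up, and the /proc lookups return None where those paths are unavailable.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def system_wide_fd_limit(path="/proc/sys/fs/file-max"):
    """Return the system-wide limit on Linux, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def open_fd_count(pid="self"):
    """Return how many descriptors a process has open (Linux /proc), or None."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("per-process soft/hard limit:", soft, hard)
    print("system-wide limit:", system_wide_fd_limit())
    print("descriptors open in this process:", open_fd_count())
```

Comparing `open_fd_count()` for the suspect process against its soft limit gives a rough idea of how much headroom remains while the underlying leak is being investigated.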
jscott