Friday, January 14, 2011

Numerous TIME-WAIT connections

I am running on a Solaris system. I wrote a Korn shell script that every 30 seconds runs this line: netstat -a | grep TIME-WAIT | wc -l
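
A minimal sketch of such a polling script, for anyone who wants to reproduce the measurement. The function names are hypothetical, and the pattern accepts both TIME_WAIT and TIME-WAIT because netstat usually prints the state with an underscore; that is an assumption about the output, not something stated in the question.

```shell
#!/bin/sh
# Count lines of netstat-style text (read from stdin) that mention TIME_WAIT.
# The bracket expression accepts either spelling, TIME_WAIT or TIME-WAIT.
count_time_wait() {
    grep -c 'TIME[-_]WAIT'
}

# Print one timestamped sample of the current TIME_WAIT count.
sample_once() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(netstat -a | count_time_wait)"
}

# Polling loop (left commented out so that sourcing this file has no side effects):
# while :; do sample_once; sleep 30; done
```

Logging a timestamp with each count makes it easier to see how quickly the number climbs after a restart.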

This has been working fine for a year. I have now moved to a new lab, and when I run it there, the number of connections in TIME-WAIT grows from 80 to 32000.

Most of these connections are to an LDAP server running on a different box on the local network.

Has anybody seen this behavior before? How did you fix it?

Thank you.

  • Is nscd running? I'm guessing that previously it was and now it's not. nscd caches certain types of directory data (notably group and passwd entries). If it's not running, every lookup that needs the directory server may have to open a new connection to the LDAP host instead of querying the cache first.

    Since you don't state which version of Solaris you're on, I'll assume it's 10. You can check whether nscd is running with:

    svcs -l name-service-cache

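    and see whether it's in the online state. If it is disabled, you can start it with:

    svcadm enable name-service-cache

    (If it shows the maintenance state instead, run svcadm clear name-service-cache once the cause is fixed.)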

    If you still get failures, check the service's log file; its path is shown in the output of the svcs -l command above.

    Mike Jr : I am checking it out and will get back to you. Thank you!
  • A pile-up in TIME-WAIT is often caused by the application's socket handling, typically opening and closing a new connection for each request instead of reusing one. You need to identify the processes on each end, check what each one is doing, and check the other server as well.

    Try to establish whether the individual TIME-WAIT connections remain for a long time, or whether a lot of connections are being created and then dropped within a short period.
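
One way to start on both suggestions is to tally the TIME-WAIT entries by remote endpoint. The sketch below is only an illustration: the function name is hypothetical, and the default column number assumes the Solaris "netstat -an" TCP layout, where the remote address is the second field (Linux "netstat -ant" would need column 5).

```shell
#!/bin/sh
# Tally TIME_WAIT connections by remote endpoint, busiest first.
# Reads netstat-style text on stdin. $1 is the column holding the remote
# address; the default of 2 is an assumption about the Solaris layout.
time_wait_by_remote() {
    col=${1:-2}
    # Splice the column number into the awk program so that no -v option
    # is needed (older awk implementations lack it).
    awk '/TIME[-_]WAIT/ { print $'"$col"' }' | sort | uniq -c | sort -rn
}

# Usage (not run here):
#   netstat -an | time_wait_by_remote 2 | head
```

If one address and port dominates the list, that points at the client talking to that service. Running the tally twice, a minute or so apart, and comparing the counts gives a rough idea of whether the same entries linger or new ones keep arriving.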
