5

My ASP.NET and SQL Server 2012 application is running on Windows Server 2008 R2. Suddenly, the internet connection on the server stopped working and my application started throwing:

An operation on a socket could not be performed because the system lacked sufficient buffer 
space or because a queue was full

Running netstat shows that PID 0 has a lot of ports open. netstat reports:

Process Id = 0, State = TIME_WAIT has 130,053 ports open
Process Id = 38840, State = CLOSE_WAIT has 5 ports open
Process Id = Any, State = LISTENING has 30 ports open
Process Id = Any, State = ESTABLISHED has 10 ports open

Stats 22 Dec 2015,

CLOSE_WAIT  5   
ESTABLISHED 146
TIME_WAIT   646750
LAST_ACK    1
LISTENING   30
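
Per-state and per-PID totals like the ones above can be produced by tallying the rows of `netstat -ano`. The following is a minimal sketch, assuming the usual Windows column layout (protocol, local address, foreign address, state, PID); the function name `count_tcp_states` and the sample rows are made up for illustration.

```python
from collections import Counter

def count_tcp_states(netstat_output: str) -> Counter:
    """Tally TCP rows of `netstat -ano` output by (state, PID)."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have five fields: proto, local, foreign, state, PID.
        # UDP rows have no state column, so they have four and are skipped.
        if len(fields) == 5 and fields[0] == "TCP":
            state, pid = fields[3], int(fields[4])
            counts[(state, pid)] += 1
    return counts

# Hypothetical sample rows, not taken from the server in question.
SAMPLE = """\
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       4
  TCP    10.0.0.5:80            10.0.0.9:50111         TIME_WAIT       0
  TCP    10.0.0.5:80            10.0.0.9:50112         TIME_WAIT       0
  TCP    10.0.0.5:51500         10.0.0.7:1433          CLOSE_WAIT      38840
  UDP    0.0.0.0:123            *:*                                    456
"""

for (state, pid), n in sorted(count_tcp_states(SAMPLE).items()):
    print(f"Process Id = {pid}, State = {state}: {n}")
```

On the server itself, the input could come from `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout` instead of the sample string.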

2 Answers

7

You are running a web server that is accessed by browsers from multiple mobile devices.

Due to the way TCP/IP works, connections cannot be closed immediately. Packets may arrive out of order or be retransmitted after the connection has been closed. CLOSE_WAIT indicates that the remote endpoint (the other side of the connection) has closed the connection. TIME_WAIT indicates that the local endpoint (this side) has closed the connection. The connection is kept around so that any delayed packets can be matched to it and handled appropriately. The connections are removed when they time out after the default period of four minutes.

Nevertheless, the number next to your TIME_WAIT statistic, 646750, is far too high. It means that 646750 connections were closed in the last 4 minutes (240 seconds), which is about 2694 per second! Evidently, either some of these mobile devices are malfunctioning badly and bombarding your server with connections that are not properly closed from the client side, or you are serving an enormous number of clients (which makes no sense for a single server).

If you cannot isolate and fix the mobile devices or applications causing the problem, or if you don't control the client side, you can only alleviate the problem on the server side.

One parameter that can improve this congestion is TcpTimedWaitDelay, described as:

Determines the time that must elapse before TCP can release a closed connection and reuse its resources. This interval between closure and release is known as the TIME_WAIT state or 2MSL state. During this time, the connection can be reopened at much less cost to the client and server than establishing a new connection.

Reducing the value of this entry allows TCP to release closed connections faster, providing more resources for new connections. However, if the value is too low, TCP might release connection resources before the connection is complete, requiring the server to use additional resources to reestablish the connection.

TcpTimedWaitDelay can be modified with regedit under HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters. It is a DWORD value containing the number of seconds to wait. The default is 240 seconds (4 minutes). A reboot is required after changing it.
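
Before editing the registry, it can help to check whether the value is set at all. The following is a minimal sketch using Python's standard `winreg` module; the function name `read_tcp_timed_wait_delay` is made up for illustration, and returning `None` when the value is absent (meaning the system default applies) is a choice made for this example.

```python
try:
    import winreg  # standard library, available only on Windows
except ImportError:
    winreg = None

TCPIP_PARAMETERS = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

def read_tcp_timed_wait_delay():
    """Return the configured TcpTimedWaitDelay in seconds.

    Returns None when the value is not set (the system default applies)
    or when this is not running on Windows.
    """
    if winreg is None:
        return None
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMETERS) as key:
            value, _value_type = winreg.QueryValueEx(key, "TcpTimedWaitDelay")
            return int(value)
    except FileNotFoundError:
        # The key or the value does not exist.
        return None

delay = read_tcp_timed_wait_delay()
print("TcpTimedWaitDelay:", "not set" if delay is None else f"{delay} seconds")
```

This only reads the value; writing it requires administrator rights and, as noted above, a reboot to take effect.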

For example, changing it to 30 seconds at 2694 connections per second means that only 80820 connections (30 × 2694) will be sitting in TIME_WAIT at any time. This number is still enormous, but the change will still reduce the usage of connection resources.
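
The arithmetic behind these figures is just rate × delay. The sketch below reproduces it for a few delay values; it assumes a constant closing rate, which is a simplification, and the function name is made up for illustration.

```python
def steady_state_time_wait(closes_per_second: int, delay_seconds: int) -> int:
    """Connections sitting in TIME_WAIT at any moment, assuming a constant close rate."""
    return closes_per_second * delay_seconds

OBSERVED_TIME_WAIT = 646750   # from the question's netstat statistics
DEFAULT_DELAY = 240           # seconds, the default quoted in this answer

# Whole connections closed per second, rounded down as in the text above.
rate = OBSERVED_TIME_WAIT // DEFAULT_DELAY

for delay in (240, 120, 60, 30):
    print(f"TcpTimedWaitDelay={delay:>3}s -> about "
          f"{steady_state_time_wait(rate, delay)} connections in TIME_WAIT")
```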

harrymc
  • 498,455
-1

Same question here : https://serverfault.com/questions/661476/getting-an-operation-on-a-socket-could-not-be-performed-because-the-system-lack/

It's a Windows maximum-connection problem. Some KB articles say to raise the maximum ephemeral port or add memory :/

http://blogs.msdn.com/b/sql_protocols/archive/2009/03/09/understanding-the-error-an-operation-on-a-socket-could-not-be-performed-because-the-system-lacked-sufficient-buffer-space-or-because-a-queue-was-full.aspx

I saw this problem on a physical server with a very long uptime (8+ months); a reboot resolved the problem...

YuKYuK
  • 99