I am troubleshooting a socket connection issue where a peer irregularly gets WSAETIMEDOUT (10060) from socket send() and I would like to understand the detail on the actual TCP level.
The actual implementation is done with Winsock blocking socket and has the following call pattern:
auto result = ::send(...);
if (result == SOCKET_ERROR)
{
auto err = ::WSAGetLastError();
// err can be WSAETIMEDOUT
}
As far as I understand the send returns immediately if the outgoing data is copied to the kernel buffer [as asked in another SO].
On the other hand, I assume that the error (see Steffen Ullrich's answer)WSAETIMEDOUT should be caused by missing TCP ACK from the receiving side. Right?
What I am not sure is if such WSAETIMEDOUT only happens when option SO_SNDTIMEO is set.
The default value of SO_SNDTIMEO is 0 for never timeout. Does it mean that an unsuccessful send would block forever? or is there any built-in/hard-coded timeout on Windows for such case?
And how TCP retransmission come into play?
I assume that unacknowledged packet would trigger retransmission. But what happen if all retransmission attempts fail? is the socket connection just stall? or WSAETIMEDOUT would be raised (independent from SO_SNDTIMEO)?
My assumption for my connection issue would be like this:
A current send operation returns SOCKET_ERROR and has error code with WSAETIMEDOUT because the desired outgoing data cannot be copied to kernel buffer which is still occupied with old outgoing data which is either lost or cannot be ACKed from socket peer in time. Is my understanding right?
Possible causes may be: intermediate router drops packets, intermediate network just gets disconnected or peer has problem to receive. What else?
What can be wrong on receiving side? Maybe the peer application hangs and stops reading data from socket buffer. The receive buffer (on receiver side) is full and block sender to send data.
Thanks you for clarifying all my questions.