"search1" is an AWS elasticsearch service. It has an access policy that only lets traffic through from selected IP addresses. My understanding is that AWS implements this as an ELB in front of a VPC that I cannot access.
"esproxy" is a AWS EC2 instance to act as a proxy to search1. On esproxy, nginx is configured to require (https) basic auth, and anything with that gets proxied to search1. 
It works, for a while: hours, or a day. Then every request starts returning "504 Gateway Time-out" errors. nginx still responds instantly with 401 auth-required errors, but an authenticated request takes two minutes to come back with the timeout. Neither side seems to be under much load when this happens, and a restart of nginx fixes it. And really, traffic through the proxy is not heavy: a few thousand hits a day.
Trying to understand the problem, I used openssl like telnet:
openssl s_client -connect search1:443
[many ssl headers including certs shown rapidly]
GET / HTTP/1.1
Host: search1
HTTP/1.1 408 REQUEST_TIMEOUT
Content-Length:0
Connection: Close
It takes about a minute for that 408 timeout to come back to me. Aha, I think, this particular server is having issues. But then I tried that openssl test from another host. Same delay.
Then I think: hey, curl can test HTTPS too, now that I know the SSL layer is snappy. And with curl, access works, even while nginx and my openssl test are timing out from esproxy at the same time.
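For reference, the working test was just something like this (same endpoint name as in the conf below; -v to watch the handshake and headers):

curl -v https://search1-abcdefghijklmnopqrstuvwxyz.us-east-1.es.amazonaws.com/

That comes back immediately with a response from Elasticsearch.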
So I think: maybe something about the headers? curl sends different headers than what I'm typing into openssl.
I modified a low-level HTTP/HTTPS tool to let me easily send specific headers. And I found it doesn't seem to be missing or extra headers, but the line endings. nginx (and Apache) don't care whether you use DOS-style CRLF line endings (correct per the HTTP spec) or Unix-style bare LF (incorrect). The search1 instance (either Elasticsearch itself or the ELB) apparently cares a lot.
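If anyone wants to reproduce that without a modified tool, something like this shows the same split (a sketch; "search1" standing in for the full endpoint name as in the transcript above, and -quiet just suppresses the certificate dump):

# CRLF line endings, per the HTTP spec -- answers promptly:
printf 'GET / HTTP/1.1\r\nHost: search1\r\nConnection: close\r\n\r\n' \
  | openssl s_client -quiet -connect search1:443

# bare-LF line endings, like typing into s_client -- hangs until the 408:
printf 'GET / HTTP/1.1\nHost: search1\nConnection: close\n\n' \
  | openssl s_client -quiet -connect search1:443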
Without knowing a whole lot about nginx, I have these questions:
- Could the source of my proxy timeouts be a bunch of existing connections caught up with bad request line endings?
- How can I tell? (One idea is sketched after this list.) It might not be the cause, since the timeouts are different (one vs. two minutes).
- Does nginx correct line endings on proxied requests by default? If not, can it be forced to?
- And if the line endings are a red herring, how can I get nginx to help me figure this out? All I see in the log is "upstream timed out (110: Connection timed out) while reading response header from upstream", which doesn't improve my understanding of the issue.
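For "how can I tell", the one thing I can think to try while the proxy is wedged (a sketch; run on esproxy, -p needs root, and :443 assumes the upstream port from the conf below):

# how many established connections does nginx hold to port-443 upstreams right now?
ss -tnp state established '( dport = :443 )' | grep -c nginx

If a pile of established connections just sits there while every proxied request 504s, the "existing connections caught up in bad requests" theory gets more plausible; if nginx keeps opening fresh connections and still times out, it's something else.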
I found this issue earlier in my debugging:
nginx close upstream connection after request
And I've already fixed the nginx conf to use HTTP/1.1 proxying as outlined there. Relevant conf:
upstream search1 {
  server search1-abcdefghijklmnopqrstuvwxyz.us-east-1.es.amazonaws.com:443;
  # number of connections to keep alive, not time
  keepalive 64;
}
location / {
  proxy_set_header   X-Forwarded-For $remote_addr;
  proxy_set_header   Host "search1-abcdefghijklmnopqrstuvwxyz.us-east-1.es.amazonaws.com";
  # must suppress auth header from being forwarded
  proxy_set_header   Authorization "";
  # allow keep-alives on proxied connections
  proxy_http_version 1.1;
  proxy_set_header Connection "";
  proxy_pass         https://search1;
}
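In case it helps with the last question, here is what I'm thinking of adding to get more visibility (a sketch, not in my conf yet; the log path and timeout values are arbitrary):

# in the http block: log how long the upstream takes per request
log_format upstream_timing '$remote_addr [$time_local] "$request" $status '
                           'upstream=$upstream_addr '
                           'request_time=$request_time '
                           'upstream_response_time=$upstream_response_time';

# in the location block above: use that log, and fail faster than two minutes
access_log            /var/log/nginx/search1-timing.log upstream_timing;
proxy_connect_timeout 5s;
proxy_read_timeout    30s;

That should at least show whether the slow part is connecting to the upstream or waiting on its response.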
