I've run into an issue that seems similar to this one: https://forums.docker.com/t/cant-access-service-in-swarm/63876. My setup is a little different, though, and I haven't found a solution to my problem yet.

The minimal, reproducible example

  1. Build a swarm cluster of at least 3 Ubuntu 20.04 Docker swarm managers.

  2. Deploy a service: docker service create --name test_web --replicas 3 --publish published=8080,target=80 nginxdemos/hello

  3. Check that the containers and services were created properly, and observe that connecting to the service fails:

demi-ubu01:~/stacks$ docker ps

CONTAINER ID   IMAGE                     COMMAND                  CREATED              STATUS              PORTS     NAMES
d4a12a3c5448   nginxdemos/hello:latest   "nginx -g 'daemon of…"   About a minute ago   Up About a minute   80/tcp    test_web.2.yul33wdycarig3qoxnehgrjrz

demi-ubu01:~/stacks$ docker service ls

ID             NAME      MODE         REPLICAS   IMAGE                     PORTS
0yqd7gvggwuh   test_web      replicated   3/3        nginxdemos/hello:latest   *:8080->80/tcp
# External test:
demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080     
curl: (7) Failed to connect to 10.100.4.5 port 8080: Connection refused

# Inside container to published service port:
demi-ubu01:~/stacks$ docker exec -it d4a12a3c5448 wget http://test_web:8080
Connecting to test_web:8080 (10.0.4.2:8080)
wget: can't connect to remote host (10.0.4.2): Host is unreachable

# Inside container to apps exposed port:
demi-ubu01:~/stacks$ docker exec -it d4a12a3c5448 wget http://localhost:80
Connecting to localhost:80 (127.0.0.1:80)
index.html    100% |****************************|  7217   0:00:00 ETA

The first curl command should return HTTP 200 OK.

The detailed report

My setup is 4 nodes in total. They are identical Ubuntu 20.04 KVM virtual machines, all on the same network. There are no firewalls between them. I have 3 managers and 1 worker (which I only added as a troubleshooting step).

:~/stacks$ docker node ls 
ID                            HOSTNAME     STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
kcm5v64psntjxngnqkfdj1jzh *   demi-ubu01   Ready     Active         Reachable        20.10.1
uo3rljg6ax5qkjm898pyym9t1     demi-ubu02   Ready     Active         Leader           20.10.1
pysnl8sohdp4fv67gui156z4k     demi-ubu03   Ready     Active         Reachable        20.10.1
rp2otsqpnxkgbmxbpkv21yjs6     demi-ubu04   Ready     Active                          20.10.1

I can run a container normally and reach it on the local host fine.

demi-ubu01:~/stacks$ docker run -p 8080:80 -d nginxdemos/hello
de4d0a937710acb1d6d8ae3b7eb9175860b6614dfd9ce92bc972efe619ae095f

demi-ubu01:~/stacks$ docker ps
CONTAINER ID   IMAGE              COMMAND                  CREATED         STATUS         PORTS                  NAMES
de4d0a937710   nginxdemos/hello   "nginx -g 'daemon of…"   4 seconds ago   Up 2 seconds   0.0.0.0:8080->80/tcp   pedantic_wiles

demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Sat, 19 Dec 2020 17:59:23 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Sat, 19 Dec 2020 17:59:22 GMT
Cache-Control: no-cache

However the same app deployed as a service using the following compose file:

demi-ubu01:~/stacks$ cat test.yml 
version: "3.6"

services:
  web:
    image: nginxdemos/hello:latest
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - target: 80
        published: 8080
        protocol: tcp
        mode: ingress
    networks:
      - webnet

networks:
  webnet:
    driver: overlay

It does not become reachable from any of the hosts at all:

demi-ubu01:~/stacks$ docker stack deploy -c test.yml test
Creating network test_webnet
Creating service test_web

demi-ubu01:~/stacks$ docker ps
CONTAINER ID   IMAGE                     COMMAND                  CREATED          STATUS         PORTS     NAMES
05030ef897a1   nginxdemos/hello:latest   "nginx -g 'daemon of…"   10 seconds ago   Up 7 seconds   80/tcp    test_web.1.kobrpkp68f2qbs4jhd6o8aebg

Trying on all of the hosts in the cluster. No firewalls here.

demi-ubu01:~/stacks$ curl -I 10.100.4.5:8080
curl: (7) Failed to connect to 10.100.4.5 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.9:8080
curl: (7) Failed to connect to 10.100.4.9 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.10:8080
curl: (7) Failed to connect to 10.100.4.10 port 8080: Connection refused
demi-ubu01:~/stacks$ curl -I 10.100.4.11:8080
curl: (7) Failed to connect to 10.100.4.11 port 8080: Connection refused

demi-ubu01:~/stacks$ docker service ls
ID             NAME       MODE         REPLICAS   IMAGE                     PORTS
elvfm7o4v4zo   test_web   replicated   3/3        nginxdemos/hello:latest   *:8080->80/tcp

I also don't see any port bindings being made on those hosts at all, so it doesn't look like any ports are being published.


INeed2Poo@demi-ubu01:~/stacks$ docker service inspect test_web
[
    ## https://pastebin.com/WqqyDnVS ##
]

demi-ubu01:~/stacks$ netstat -na | grep LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:24007 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN
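
To check for a listener on the published port without eyeballing the full list, a small filter along these lines could help. This is only a sketch: has_listener is a hypothetical helper name, and it assumes netstat -nat style lines where the local address is the fourth field and the state is the last field.

```shell
# Hypothetical helper (not part of Docker or netstat).
# Reads netstat-style lines on stdin and succeeds (exit 0) if any line in
# LISTEN state has a local address ending in ":<port>".
has_listener() {
  port="$1"
  awk -v p="$port" '
    $NF == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
    END { exit !found }
  '
}

# Usage on a swarm node (8080 is the published port from this post):
#   netstat -nat | has_listener 8080 && echo "listening on 8080" || echo "nothing on 8080"
```

Matching on the end of the local address keeps port 22 from being mistaken for port 2, and works for both 0.0.0.0:PORT and :::PORT forms.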

demi-ubu01:~/stacks$ docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
6e5f7e7cebc3   bridge            bridge    local
7a1155f87a62   docker_gwbridge   bridge    local
ab32da8ac1ec   host              host      local
46id8wzw4ayf   ingress           overlay   swarm
a24a40ef78f4   none              null      local
d9l7msysdx8m   test_webnet       overlay   swarm

INeed2Poo@demi-ubu01:~/stacks$ docker network inspect 46id8wzw4ayf
[
    https://pastebin.com/JPA0ZBjE
]

I also can't reach the service while exec'ed into a container for that service. From inside a container, I can hit the LOCAL app port, but I cannot hit the service by name. The container CAN resolve the service name.

Testing the app's service from the local container fails:

demi-ubu01:~/stacks$ docker exec -it 05030ef897a1 wget http://test_web:8080
Connecting to test_web:8080 (10.0.4.2:8080)
wget: can't connect to remote host (10.0.4.2): Host is unreachable

Testing the app's local port from the local container is successful:

demi-ubu01:~/stacks$ docker exec -it 05030ef897a1 wget http://localhost:80
Connecting to localhost:80 (127.0.0.1:80)
index.html    100% |****************************|  7217   0:00:00 ETA

demi-ubu01:~/stacks$ docker --version
Docker version 20.10.1, build 831ebea

I've also changed the default-addr-pool for the swarm cluster from the original 10.0.0.0/8 network to:

demi-ubu01:~$ docker info --format '{{json .Swarm.Cluster.DefaultAddrPool}}'
["10.135.0.0/16"]
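
One rough way to sanity-check an address pool against the host network is to compare the two CIDR blocks under the shorter prefix. The sketch below assumes IPv4, a POSIX shell with 64-bit arithmetic, and a host network of 10.100.4.0/24. That prefix length is a guess, since only the host addresses (10.100.4.x) appear in this post. ip_to_int and cidr_overlap are hypothetical helper names.

```shell
# Hypothetical helpers for illustration only (IPv4, no input validation).

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs="$IFS"
  IFS=.
  set -- $1            # split the address into its four octets
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) if the two CIDR blocks share at least one address.
cidr_overlap() {
  a_ip="${1%/*}"; a_len="${1#*/}"
  b_ip="${2%/*}"; b_len="${2#*/}"
  # Two blocks overlap exactly when they agree under the shorter prefix.
  len=$(( a_len < b_len ? a_len : b_len ))
  mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$a_ip") & mask )) -eq $(( $(ip_to_int "$b_ip") & mask )) ]
}

# Original default pool vs. the ASSUMED host network: prints "overlap".
cidr_overlap 10.0.0.0/8 10.100.4.0/24 && echo "overlap"
# Changed pool vs. the ASSUMED host network: prints "no overlap".
cidr_overlap 10.135.0.0/16 10.100.4.0/24 || echo "no overlap"
```

The same check can be repeated for each subnet reported by docker network inspect on the overlay and bridge networks.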

I've made sure that I'm not using any overlapping networks that might be causing this, and have gone so far as to completely redeploy the cluster. I've just about exhausted all of my troubleshooting ideas. Any ideas?

Edit: I redeployed using Ubuntu 18.04 as my base image, and the exact same setup on that (deployed using Ansible) seems to work fine. So this is an issue with the current version of Docker on Ubuntu 20.04.

AlexV