One container intermittently cannot access another container in the swarm

I use Docker Swarm (17.06 CE) to orchestrate my microservices. The cluster has 3 managers and 1 worker.

I have an Nginx image running in global mode on the swarm managers. I also have a Java-based microservice with two replicas on the same overlay network.

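Now I find that one of the Nginx containers cannot access the microservice, while the other two Nginx containers can access it without problems.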

    ### there are three nginx containers in swarm
    ➜ ~ docker service ps pilipa-prod-nginx
    ID            NAME                                         IMAGE                                            NODE    DESIRED STATE  CURRENT STATE         ERROR  PORTS
    qufld0uu8tk9  pilipa-prod-nginx.4r2p0t892qn55n4uewoymxbp0  registry.i-counting.cn/pilipa/prod/nginx:latest  node02  Running        Running 21 hours ago
    bwjw9c9dm8e1  pilipa-prod-nginx.ixw4urfkdcnkm326vgkw92x8n  registry.i-counting.cn/pilipa/prod/nginx:latest  node01  Running        Running 21 hours ago
    2w2gg83xt6g4  pilipa-prod-nginx.5t63dl8dcj603iyw5l5vv0xvx  registry.i-counting.cn/pilipa/prod/nginx:latest  node03  Running        Running 21 hours ago

    ### log in the normal Nginx, it can access the micro service without problem
    ➜ ~ docker exec --interactive --tty pilipa-prod-nginx.4r2p0t892qn55n4uewoymxbp0.qufld0uu8tk9ieubcimed8fgw sh
    / # ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    10901: eth0@if10902: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
        link/ether 02:42:0a:00:00:2c brd ff:ff:ff:ff:ff:ff
        inet 10.0.0.44/24 scope global eth0
           valid_lft forever preferred_lft forever
        inet 10.0.0.11/32 scope global eth0
           valid_lft forever preferred_lft forever
    10903: eth1@if10904: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
        link/ether 02:42:ac:13:00:09 brd ff:ff:ff:ff:ff:ff
        inet 172.19.0.9/16 scope global eth1
           valid_lft forever preferred_lft forever
    / # wget 10.0.0.71:8080
    Connecting to 10.0.0.71:8080 (10.0.0.71:8080)
    wget: server returned error: HTTP/1.1 401 Unauthorized

    ### log in the problematic Nginx container which can ping the host of micro service, but can NOT access the service
    ➜ ~ docker exec --interactive --tty pilipa-prod-nginx.ixw4urfkdcnkm326vgkw92x8n.bwjw9c9dm8e1qlx64z5sniw7h sh
    / #
    / #
    / # wget 10.0.0.71:80
    Connecting to 10.0.0.71:80 (10.0.0.71:80)
    wget: can't connect to remote host (10.0.0.71): Connection refused
    / # ping 10.0.0.71
    PING 10.0.0.71 (10.0.0.71): 56 data bytes
    64 bytes from 10.0.0.71: seq=0 ttl=64 time=0.066 ms
    64 bytes from 10.0.0.71: seq=1 ttl=64 time=0.076 ms
    64 bytes from 10.0.0.71: seq=2 ttl=64 time=0.073 ms
    ^C
    --- 10.0.0.71 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0.066/0.071/0.076 ms

Update:

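I tried to capture traffic on the microservice host with tcpdump. When I use ping 10.0.0.71 and wget 10.0.0.71:8080 to access the service, I can capture the traffic coming from the normal Nginx container. However, no ping or wget traffic is captured from the problematic Nginx container!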
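For anyone who wants to repeat the capture, here is a rough sketch of the kind of tcpdump filter involved. It is not the exact command I ran. The helper names `capture_filter` and `capture_from` are made up for illustration, and the filter matches the containers' overlay addresses, so it assumes tcpdump runs inside the microservice container's network namespace. On the host itself, overlay traffic is normally VXLAN-encapsulated (UDP port 4789 by default), so a filter on the inner addresses would likely not match there.

```shell
#!/bin/sh
# Hypothetical helper: build a pcap filter that matches ICMP (ping) and
# TCP traffic to the service port coming from one client IP.
capture_filter() {
    client_ip="$1"
    port="$2"
    printf 'src host %s and (icmp or tcp dst port %s)' "$client_ip" "$port"
}

# Hypothetical wrapper: capture on all interfaces without name resolution.
# Needs root; assumed to run in the microservice container's namespace.
capture_from() {
    tcpdump -n -i any "$(capture_filter "$1" "$2")"
}

# Example (10.0.0.44 is the overlay IP of the working Nginx container):
#   capture_from 10.0.0.44 8080
```

Running the same capture once per Nginx container IP, while that container runs ping and wget, shows whether its packets reach the microservice side at all.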

Is this a known bug in swarm overlay networking, or a misconfiguration in my environment?