Mesos + ZooKeeper not working properly

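I have been building a Mesos cluster of 3 nodes (A, B, C). Each node runs the Mesos Master, Mesos Slave, and ZooKeeper processes, each in its own Docker container.

The whole cluster setup, including `docker run`, is executed with Ansible, so apart from node-specific configuration (hostname, zookeeper_myid, etc.) there should be no differences among the 3 nodes.

The problems are…

ZooKeeper warnings on node A

ZooKeeper shows the following messages only on node A.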

    2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeA>:58391
    2015-05-25 03:28:06,060 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeA>:58391; will be dropped if server is in ro mode
    2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@841] - Refusing session request for client /<ip-nodeA>:58391 as it has seen zxid 0x44 our last zxid is 0xc client must try another server
    2015-05-25 03:28:06,060 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /<ip-nodeA>:58391 (no session established for client)
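The "Refusing session request" line compares the last transaction id (zxid) the client has seen with the server's own last zxid, and the server refuses the session because the client is ahead of it. As a minimal sketch for pulling those two values out of such a line and comparing them (the helper name `parse_refusal` is made up for illustration):

```python
import re

# Matches the zxid pair in ZooKeeper's "Refusing session request" log line.
REFUSAL_RE = re.compile(
    r"as it has seen zxid (0x[0-9a-fA-F]+) our last zxid is (0x[0-9a-fA-F]+)"
)


def parse_refusal(line):
    """Return (client_zxid, server_zxid) as ints, or None if the line does not match."""
    match = REFUSAL_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Message text taken from the node A log above.
line = (
    "Refusing session request for client /<ip-nodeA>:58391 as it has seen "
    "zxid 0x44 our last zxid is 0xc client must try another server"
)
client_zxid, server_zxid = parse_refusal(line)
print(client_zxid, server_zxid, client_zxid > server_zxid)  # 68 12 True
```

Here the client has already seen zxid 0x44 on some other server, while the ZooKeeper on node A is only at 0xc.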

ZooKeeper on node B shows the following messages.

    2015-05-25 03:12:18,594 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /<ip-nodeB>:42784 which had sessionid 0x14d89037c1e0000
    2015-05-25 03:12:30,000 [myid:] - INFO [SessionTracker:ZooKeeperServer@347] - Expiring session 0x14d89037c1e0000, timeout of 10000ms exceeded
    2015-05-25 03:12:30,001 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x14d89037c1e0000
    2015-05-25 03:12:30,987 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeB>:42853
    2015-05-25 03:12:30,987 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeB>:42853; will be dropped if server is in ro mode
    2015-05-25 03:12:30,988 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /<ip-nodeB>:42853
    2015-05-25 03:12:30,997 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14d89037c1e0002 with negotiated timeout 10000 for client /<ip-nodeB>:42853

ZooKeeper on node C shows the following messages.

    2015-05-25 03:12:31,183 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /<ip-nodeA>:56496
    2015-05-25 03:12:31,184 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /<ip-nodeA>:56496; will be dropped if server is in ro mode
    2015-05-25 03:12:31,184 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /<ip-nodeA>:56496
    2015-05-25 03:12:31,191 [myid:] - INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x14d89037ccd0002 with negotiated timeout 10000 for client /<ip-nodeA>:56496
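In all three logs the `[myid:]` field is empty, so it may be worth confirming what mode each ZooKeeper server is actually running in and what its last zxid is. A minimal sketch, assuming the `srvr` four-letter command is available and enabled in the ZooKeeper version inside the image, and that port 2181 is reachable from where the script runs (the host names are placeholders, and `check_ensemble` is a hypothetical helper):

```python
import socket


def four_letter_word(host, port=2181, cmd=b"srvr", timeout=3.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(text):
    """Pull the 'Mode' and 'Zxid' fields out of a srvr reply, if present."""
    fields = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if sep and key.strip() in ("Mode", "Zxid"):
            fields[key.strip()] = value.strip()
    return fields


def check_ensemble(hosts):
    """Print the reported mode and zxid for each host, or the connection error."""
    for host in hosts:
        try:
            print(host, parse_srvr(four_letter_word(host)))
        except OSError as exc:
            print(host, "unreachable:", exc)
```

Calling `check_ensemble(["node-a", "node-b", "node-c"])` with the real addresses should report one mode per server. Three servers that each report `standalone`, rather than one `leader` and two `follower`, would not be sharing state.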

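"No master is currently leading …" on node B

Node C was elected as the leading master. Opening the Mesos admin page on node A redirects to node C as expected.

Opening it on node B, however, does not redirect to node C and instead shows "No master is currently leading …".

The master detects only 2 of the 3 slaves

On the leading master (currently node C), only 2 of the 3 slaves are detected. The 2 detected slaves are nodes A and C.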
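To see exactly which slaves the leading master has registered, its state endpoint can be queried. The following is only a sketch: it assumes the master listens on port 5050 and serves its state at `/master/state.json` (the path differs between Mesos versions), and the helper names and host names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_state(master, port=5050, path="/master/state.json", timeout=5.0):
    """Download and decode the master's state document (path is version-dependent)."""
    with urlopen(f"http://{master}:{port}{path}", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def registered_slaves(state):
    """Return the sorted hostnames of the slaves listed in a master state document."""
    return sorted(s.get("hostname", "?") for s in state.get("slaves", []))


def missing_slaves(state, expected):
    """Return the expected hostnames that the master has not registered."""
    return sorted(set(expected) - set(registered_slaves(state)))
```

For example, `missing_slaves(fetch_state("node-c"), ["node-a", "node-b", "node-c"])` would be expected to return `["node-b"]` in the situation described above, provided the slaves register under those host names.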

So, what are the possible causes of these problems?

OS: CentOS 6.5

Docker images:

  • Mesos Master: redjack/mesos-master
  • Mesos Slave: redjack/mesos-slave
  • ZooKeeper: digitalwonderland/zookeeper

Docker version:

    Client version: 1.5.0
    Client API version: 1.17
    Go version (client): go1.3.3
    Git commit (client): a8a31ef/1.5.0
    OS/Arch (client): linux/amd64
    Server version: 1.5.0
    Server API version: 1.17
    Go version (server): go1.3.3
    Git commit (server): a8a31ef/1.5.0