Docker Swarm managed services not starting after reboot

We have a swarm set up on a server, and it ran out of space and hung. We rebooted and cleared some space, and now none of the services are starting.

I'm really new to Docker / Swarm, since it normally just runs and was set up by someone else.

Here is the output of docker info:

    Containers: 53
     Running: 0
     Paused: 0
     Stopped: 53
    Images: 69
    Server Version: 1.12.3
    Storage Driver: devicemapper
     Pool Name: docker-253:0-101188511-pool
     Pool Blocksize: 65.54 kB
     Base Device Size: 10.74 GB
     Backing Filesystem: xfs
     Data file: /dev/loop0
     Metadata file: /dev/loop1
     Data Space Used: 23.55 GB
     Data Space Total: 107.4 GB
     Data Space Available: 2.646 GB
     Metadata Space Used: 36.02 MB
     Metadata Space Total: 2.147 GB
     Metadata Space Available: 2.111 GB
     Thin Pool Minimum Free Space: 10.74 GB
     Udev Sync Supported: true
     Deferred Removal Enabled: false
     Deferred Deletion Enabled: false
     Deferred Deleted Device Count: 0
     Data loop file: /var/lib/docker/devicemapper/devicemapper/data
     WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
     Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
     Library Version: 1.02.135-RHEL7 (2016-09-28)
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
     Volume: local
     Network: null bridge host overlay
    Swarm: pending
     NodeID: 8vcqmfxgbmsisybt3l4cy6wd7
     Is Manager: true
     ClusterID: cfezpbtj3dyrcknx98g72creu
     Managers: 1
     Nodes: 1
     Orchestration:
      Task History Retention Limit: 5
     Raft:
      Snapshot Interval: 10000
      Heartbeat Tick: 1
      Election Tick: 3
     Dispatcher:
      Heartbeat Period: 5 seconds
     CA Configuration:
      Expiry Duration: 3 months
     Node Address: <ip>
    Runtimes: runc
    Default Runtime: runc
    Security Options: seccomp
    Kernel Version: 3.10.0-514.el7.x86_64
    Operating System: Red Hat Enterprise Linux
    OSType: linux
    Architecture: x86_64
    CPUs: 2
    Total Memory: 15.51 GiB
    Name: <name>
    ID: SRS5:ZXV7:33PS:D4QS:C3QX:DJCD:DPRG:OAVJ:4QDG:BPSG:3K34:JBC4
    Docker Root Dir: /var/lib/docker
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Insecure Registries:
     docker.<name>.com
     127.0.0.0/8
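
One detail in this output may be worth checking given the earlier disk-space problem: Data Space Available (2.646 GB) is lower than Thin Pool Minimum Free Space (10.74 GB). The following is a minimal sketch for comparing those two fields. The function name `check_thin_pool` is hypothetical, and the script assumes the field names shown above and decimal units (1 GB = 1000 MB).

```shell
#!/usr/bin/env bash
# Sketch: compare devicemapper free data space with the configured minimum.
# Reads `docker info` text on stdin. check_thin_pool is a hypothetical helper;
# units are assumed to be decimal (kB, MB, GB, TB) as printed above.
check_thin_pool() {
    awk -F': ' '
        function to_mb(s,    p) {
            split(s, p, " ")
            if (p[2] == "kB") return p[1] / 1000
            if (p[2] == "GB") return p[1] * 1000
            if (p[2] == "TB") return p[1] * 1000000
            return p[1] + 0          # MB, or an unit this sketch does not know
        }
        { key = $1; sub(/^[ \t]+/, "", key) }   # strip the leading indent
        key == "Data Space Available"         { avail = to_mb($2); seen++ }
        key == "Thin Pool Minimum Free Space" { min = to_mb($2); seen++ }
        END {
            if (seen < 2)         print "thin pool fields not found"
            else if (avail < min) print "below minimum: " avail " MB free, " min " MB required"
            else                  print "ok: " avail " MB free, " min " MB required"
        }'
}
```

It could be used as `docker info 2>/dev/null | check_thin_pool`. A "below minimum" result would only suggest that the thin pool still needs more free space; it does not by itself explain the swarm errors further down.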

As you can see, all the containers are stopped. If I try to start a container again, I get

    docker start d601a004a6b7
    Error response from daemon: network ingress not found
    Error: failed to start containers: d601a004a6b7

So I get that the swarm is what manages these containers, but for some reason it isn't coming up. If I look at

    docker service ls
    ID            NAME                 REPLICAS  IMAGE                                          COMMAND
    0pydqzx86y1k  <name>-server-4-2-2  0/1       docker.<name>.com/<name>/<name>-server:4.2.2
    0yusqmt4apag  <name>-4-2-5         0/1       docker.<name>.com/<name>/<name>:4.2.5

I can see the services, but none of them seem to start, and there is no direct start command? https://docs.docker.com/engine/reference/commandline/service/

What command do I use to actually start these? And where should I look to diagnose what the problem is?
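
For context: swarm services have no start command because the manager keeps trying to bring each service up to its declared replica count, so the practical question is why the tasks fail. A minimal sketch of read-only commands that show that state follows. The wrapper name `inspect_swarm_state` is hypothetical; the docker subcommands themselves are standard CLI commands.

```shell
#!/usr/bin/env bash
# Sketch: read-only look at what the swarm manager currently reports.
# inspect_swarm_state is a hypothetical wrapper; nothing here changes state.
inspect_swarm_state() {
    docker node ls        # node status as seen by the manager
    docker network ls     # check whether the "ingress" network is listed
    local id
    for id in $(docker service ls -q); do
        docker service ps "$id"   # per-task state and last error for the service
    done
}
```

Run `inspect_swarm_state` on the manager node. The per-task error column from `docker service ps` is usually the quickest pointer to why a replica stays at 0/1.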

I thought it might be something like this:

    docker swarm init --advertise-addr <ip>

But that returns:

    Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.

Any help or direction would be greatly appreciated.

I also just noticed this:

    [root@usalvlhlpool1 /]# docker node ls
    ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
    8vcqmfxgbmsisybt3l4cy6wd7 *  <server>  Down    Active        Leader

And this in /var/log/messages:

    Oct 17 16:43:01 usalvlhlpool1 dockerd: time="2017-10-17T16:43:01.232624716-04:00" level=error msg="controller resolution failed" module=worker task.id=03tr5rcal954b5bc4ouzeqb3k
    Oct 17 16:43:01 usalvlhlpool1 dockerd: time="2017-10-17T16:43:01.232667687-04:00" level=error msg="failed to start taskManager" error="invalid volume mount source, must not be an absolute path: /root/hl-mediawiki/data/static" module=worker

I think this is the problem, but where is the swarm getting this path from? Where can I look?
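
A minimal sketch of one way to trace that path, assuming it comes from a mount declared in a service definition: pull the offending source out of the dockerd log lines, then print the mounts each service spec declares. The function names `extract_mount_sources` and `show_service_mounts` are hypothetical, and the template path `.Spec.TaskTemplate.ContainerSpec.Mounts` is an assumption about where this Docker version keeps the mount list in `docker service inspect` output.

```shell
#!/usr/bin/env bash
# Sketch: find which path the log complains about, then which service declares it.
# Both function names are hypothetical helpers for this post.

# Reads dockerd log lines on stdin and prints each distinct path that follows
# "must not be an absolute path: " inside the error="..." field.
extract_mount_sources() {
    sed -n 's/.*must not be an absolute path: \([^"]*\)".*/\1/p' | sort -u
}

# Prints the mount list of every service spec as JSON.
# Assumption: mounts live under .Spec.TaskTemplate.ContainerSpec.Mounts.
show_service_mounts() {
    local id
    for id in $(docker service ls -q); do
        echo "== $id"
        docker service inspect \
            --format '{{json .Spec.TaskTemplate.ContainerSpec.Mounts}}' "$id"
    done
}
```

For example, `grep 'invalid volume mount source' /var/log/messages | extract_mount_sources` lists the paths, and `show_service_mounts` shows which service has a mount whose source matches.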