Docker commands no longer respond

Most docker commands never finish, and I have to interrupt them manually with CTRL+C. Even simple commands such as docker ps and docker info do not respond.

However, docker help and docker version still work.
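To check which commands hang without pressing CTRL+C each time, each one can be run under a timeout. The following is a minimal sketch in Python, assuming the docker CLI is on PATH. The helper name probe and the 10-second limit are illustrative choices, not part of Docker.

```python
import subprocess


def probe(cmd, timeout=10.0):
    """Run cmd and classify the outcome as 'ok', 'failed', 'timeout' or 'missing'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # The four commands mentioned in the question.
    for args in (["help"], ["version"], ["ps"], ["info"]):
        print("docker " + " ".join(args), "->", probe(["docker"] + args))
```

On a host in the state described here, I would expect the commands that only print local information to report ok and the ones that query the daemon to report timeout, but that is an expectation, not a measured result.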

I suspect something like a deadlock on one particular container, so that container-related commands never complete.

How can I deal with this situation?


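My Docker version is 1.12.3 and I am not using Swarm mode. The docker logs command does not work either. With dmesg I can see many I/O errors, but I don't know whether they are related to my problem: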

    [12898.121287] loop: Write error at byte offset 8882749440, length 4096.
    [12898.122837] loop: Write error at byte offset 8883666944, length 4096.
    [12898.124685] loop: Write error at byte offset 8882814976, length 4096.
    [12898.126459] loop: Write error at byte offset 8883404800, length 4096.
    [12898.128201] loop: Write error at byte offset 8883470336, length 4096.
    [12898.129921] loop: Write error at byte offset 8883535872, length 4096.
    [12898.131774] loop: Write error at byte offset 8883601408, length 4096.
    [12898.133594] loop: Write error at byte offset 8883732480, length 4096.
    [12917.269786] loop: Write error at byte offset 8883798016, length 4096.
    [12917.270331] quiet_error: 632 callbacks suppressed
    [12917.270334] Buffer I/O error on device dm-6, logical block 1313320
    [12917.270540] lost page write due to I/O error on dm-6
    [12917.270543] Buffer I/O error on device dm-6, logical block 1313321
    [12917.270740] lost page write due to I/O error on dm-6
    [12917.270742] Buffer I/O error on device dm-6, logical block 1313322
    [12917.270957] lost page write due to I/O error on dm-6
    [12917.270959] Buffer I/O error on device dm-6, logical block 1313323
    [12917.271177] lost page write due to I/O error on dm-6
    [12917.271179] Buffer I/O error on device dm-6, logical block 1313324
    [12917.271377] lost page write due to I/O error on dm-6
    [12917.271379] Buffer I/O error on device dm-6, logical block 1313325
    [12917.271573] lost page write due to I/O error on dm-6
    [12917.301759] loop: Write error at byte offset 8883863552, length 4096.
    [12917.312038] loop: Write error at byte offset 8883929088, length 4096.
    [12917.312396] Buffer I/O error on device dm-6, logical block 1313328
    [12917.312635] lost page write due to I/O error on dm-6
    [12917.312638] Buffer I/O error on device dm-6, logical block 1313329
    [12917.312867] lost page write due to I/O error on dm-6
    [12917.312869] Buffer I/O error on device dm-6, logical block 1313330
    [12917.313121] lost page write due to I/O error on dm-6
    [12917.313123] Buffer I/O error on device dm-6, logical block 1313331
    [12917.313346] lost page write due to I/O error on dm-6
    [13090.853726] INFO: task kworker/u8:0:17212 blocked for more than 120 seconds.
    [13090.854055] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
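Since the dmesg output is long and repetitive, a summary makes it easier to see which device is affected and whether any kernel task is blocked. This is a rough sketch in Python that only recognizes the three message shapes visible in the output above. The function name summarize_dmesg and the keys of the returned dictionary are my own, and other kernels may word these messages differently.

```python
import re
from collections import Counter

# Patterns taken from the message formats shown in the dmesg output above.
LOOP_RE = re.compile(r"loop: Write error at byte offset (\d+), length (\d+)")
BUF_RE = re.compile(r"Buffer I/O error on device ([^,\s]+), logical block (\d+)")
HUNG_RE = re.compile(r"INFO: task (\S+) blocked for more than (\d+) seconds")


def summarize_dmesg(text):
    """Count loop write errors, buffer I/O errors per device, and hung tasks."""
    loop_offsets = [int(m.group(1)) for m in LOOP_RE.finditer(text)]
    devices = Counter(m.group(1) for m in BUF_RE.finditer(text))
    hung_tasks = [m.group(1) for m in HUNG_RE.finditer(text)]
    return {
        "loop_write_errors": len(loop_offsets),
        # Lowest and highest failing byte offset, or None if there were none.
        "loop_offset_range": (
            (min(loop_offsets), max(loop_offsets)) if loop_offsets else None
        ),
        "buffer_errors_by_device": dict(devices),
        "hung_tasks": hung_tasks,
    }
```

The text to analyze can be captured with, for example, subprocess.run(["dmesg"], capture_output=True, text=True).stdout, which may require root on some systems. The summary only groups what the kernel already reported; it does not by itself show whether the I/O errors cause the Docker hang.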

The command sudo systemctl status -l docker prints the messages below, but I don't know whether they are relevant:

    dockerd[1344]: time="2016-11-24T17:49:01.184874648+01:00" level=warning msg="libcontainerd: container c9f35af1836bf856001ca6156663f713c1217a697e8d2451927c67797fb5a770 restart canceled"
    dockerd[1344]: time="2016-11-24T17:49:02.627116016+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]"
    dockerd[1344]: time="2016-11-24T17:49:02.627152661+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
    dockerd[1344]: time="2016-11-24T18:19:51.472701647+01:00" level=warning msg="libcontainerd: container c9f35af1836bf856001ca6156663f713c1217a697e8d2451927c67797fb5a770 restart canceled"
    dockerd[1344]: time="2016-11-24T18:19:56.712126199+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]"
    dockerd[1344]: time="2016-11-24T18:19:56.712159759+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"
    dockerd[1344]: time="2016-11-24T18:34:24.301786606+01:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4]"
    dockerd[1344]: time="2016-11-24T18:34:24.302208751+01:00" level=info msg="IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844]"

The Docker commands started hanging after I removed a container.

The dockerd daemon is in an abnormal state: once stopped (service docker stop), it cannot be started again (sudo service docker start).

    # sudo service docker start
    Redirecting to /bin/systemctl start docker.service
    Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
    # journalctl -xe
    kernel: device-mapper: ioctl: unable to remove open device docker-253:0-19468577-d6f74dd67f106d6bfa483df4ee534dd9545dc8ca ...
    systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
    systemd[1]: Failed to start Docker Application Container Engine.
    systemd[1]: Unit docker.service entered failed state.
    systemd[1]: docker.service failed.
    polkitd[896]: Unregistered Authentication Agent for unix-process:22551:34177094 (system bus name :1.290, object path /org
    kernel: dev_remove: 41 callbacks suppressed
    kernel: device-mapper: ioctl: unable to remove open device docker-253:0-19468577-fc63401af903e22d05a4518e02504527f0d7883f9d997d7d97fdfe72ba789863 ...
    dockerd[22566]: time="2016-11-28T10:18:09.840268573+01:00" level=fatal msg="Error starting daemon: timeout"
    systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
    systemd[1]: Failed to start Docker Application Container Engine.

In addition, ps -eax | grep docker shows many zombie Docker processes ("Z" in the "STAT" column), for example docker-proxies.
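Spotting the "Z" entries by eye gets tedious when there are many processes, so the check can be scripted. Below is a small sketch in Python that filters the full ps -eax output (not the grep-filtered one, because the header line is needed to locate the columns). It assumes the header contains PID and STAT and that the command is the last column. The function name find_zombies and its name parameter are hypothetical.

```python
def find_zombies(ps_output, name="docker"):
    """Return (pid, command) pairs for zombie processes whose command contains name.

    ps_output must include the header line; a header without PID or STAT
    raises ValueError.
    """
    lines = ps_output.strip().splitlines()
    if not lines:
        return []
    header = lines[0].split()
    pid_idx = header.index("PID")
    stat_idx = header.index("STAT")
    ncols = len(header)
    zombies = []
    for line in lines[1:]:
        # Split into at most ncols fields so the command keeps its spaces.
        fields = line.split(None, ncols - 1)
        if len(fields) < ncols:
            continue
        command = fields[-1]
        # A process state starting with "Z" marks a zombie (defunct) process.
        if fields[stat_idx].startswith("Z") and name in command:
            zombies.append((int(fields[pid_idx]), command))
    return zombies
```

Passing name="" lists every zombie regardless of its command. A zombie only disappears once its parent reaps it or the parent exits, which fits the observation that the entries were gone after a reboot.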

After rebooting the server and restarting Docker, the zombie processes disappeared and the Docker commands worked again.