k8s:从ECR中拉出图像时出错

CI升级期间,我们经常得到Waiting: ImagePullBackOff 。 任何人都知道发生了什么? k8s群集1.6.2通过kops安装。 在升级过程中,我们做了kubectl set image ,在过去的2天,我们看到以下错误Failed to pull image "********.dkr.ecr.eu-west-1.amazonaws.com/backend:da76bb49ec9a": rpc error: code = 2 desc = net/http: request canceled Error syncing pod, skipping: failed to "StartContainer" for "backend" with ErrImagePull: "rpc error: code = 2 desc = net/http: request canceled"

journalctl -r -u kubelet Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: W0726 09:32:40.731903 840 docker_sandbox.go:263] NetworkPlugin kubenet failed on the status hook for pod "backend-1277054742-bb8zm_default": Unexpected command output nsenter: cannot open : No such file or directory Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724387 840 generic.go:239] PLEG: Ignoring events for pod frontend-1493767179-84rkl/default: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724371 840 kuberuntime_manager.go:858] getPodContainerStatuses for pod "frontend-1493767179-84rkl_default(0fff3b22-71c8-11e7-9679-02c1112ca4ec)" failed: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724358 840 kuberuntime_container.go:385] ContainerStatus for 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d error: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724329 840 remote_runtime.go:269] ContainerStatus "2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d" from runtime service failed: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: with error: exit status 1 Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: W0726 09:32:40.731903 840 docker_sandbox.go:263] NetworkPlugin kubenet failed on the status hook for pod "backend-1277054742-bb8zm_default": Unexpected command output nsenter: cannot open : No such file or directory Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724387 840 generic.go:239] PLEG: Ignoring events for pod frontend-1493767179-84rkl/default: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724371 840 kuberuntime_manager.go:858] getPodContainerStatuses for pod "frontend-1493767179-84rkl_default(0fff3b22-71c8-11e7-9679-02c1112ca4ec)" failed: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724358 840 kuberuntime_container.go:385] ContainerStatus for 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d error: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: E0726 09:32:40.724329 840 remote_runtime.go:269] ContainerStatus "2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d" from runtime service failed: rpc error: code = 2 desc = Error: No such container: 2421109e0d1eb31242c5088b547c0f29377816ca068a283b8fe6c2d8e7e5874d Jul 26 09:32:40 ip-10-0-49-227 kubelet[840]: with error: exit status 1

尝试运行kubectl create configmap -n kube-system kube-dns

有关更多上下文,请查看kubernetes 1.6的已知问题https://github.com/kubernetes/kops/releases/tag/1.6.0

这可能是由于在创build图层的内容同步到磁盘之前发生closures的已知的docker bug造成的。 该修补程序包含在docker v1.13中。

解决办法是删除空文件并重新拉图像。