Kubernetes kubelet cannot access Docker

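I have a 5-node Kubernetes cluster, one node of which is the master (set up with kubeadm). When I first deployed the master node, I also deployed the Kubernetes dashboard so that it runs on the same machine. After that, I joined the other nodes to the cluster.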

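Now, when I deploy a container from a YAML file, it stays in the ContainerCreating state. So I described the pod to see which machine it was scheduled on. I SSH'd into that machine and first checked docker ps -a, which confirmed that the image was not started, or even pulled. So I looked at the kubelet logs (I did not copy everything, but this should give a good idea):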

    E0131 11:05:40.486422 2873 server.go:459] Kubelet needs to run as uid `0`. It is being run as 1000
    W0131 11:05:40.486616 2873 server.go:469] write /proc/self/oom_score_adj: permission denied
    W0131 11:05:40.486978 2873 server.go:669] No api server defined - no events will be sent to API server.
    W0131 11:05:40.491423 2873 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
    I0131 11:05:40.491498 2873 kubelet.go:477] Hairpin mode set to "hairpin-veth"
    W0131 11:05:40.495353 2873 plugins.go:210] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: permission denied
    I0131 11:05:40.503259 2873 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
    I0131 11:05:40.503308 2873 docker_manager.go:260] Setting cgroupDriver to cgroupfs
    I0131 11:05:40.506028 2873 server.go:770] Started kubelet v1.5.2
    E0131 11:05:40.506209 2873 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
    E0131 11:05:40.506300 2873 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
    I0131 11:05:40.506413 2873 server.go:123] Starting to listen on 0.0.0.0:10250
    W0131 11:05:40.506445 2873 kubelet.go:1224] No api server defined - no node status update will be sent.
    E0131 11:05:40.507209 2873 kubelet.go:1228] error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied
    I0131 11:05:40.509613 2873 status_manager.go:125] Kubernetes client is nil, not starting status manager.
    I0131 11:05:40.509656 2873 kubelet.go:1714] Starting kubelet main sync loop.
    I0131 11:05:40.509710 2873 kubelet.go:1725] skipping pod synchronization - [error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied container runtime is down]
    F0131 11:05:40.509522 2873 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use

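There are a lot of permission problems, and I don't know how to fix them. I already added root and my user account to the docker group to see whether that would fix it, but it did not.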

Update

Above I ran kubelet logs as a regular user, which is why the uid message appears. When I run sudo kubelet logs I get this output:

    I0201 15:36:01.386564 5082 feature_gate.go:181] feature gates: map[]
    W0201 15:36:01.386861 5082 server.go:400] No API client: no api servers specified
    I0201 15:36:01.386953 5082 docker.go:356] Connecting to docker on unix:///var/run/docker.sock
    I0201 15:36:01.386991 5082 docker.go:376] Start docker client with request timeout=2m0s
    I0201 15:36:01.401737 5082 manager.go:143] cAdvisor running in container: "/user.slice"
    W0201 15:36:01.415664 5082 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
    I0201 15:36:01.431725 5082 fs.go:117] Filesystem partitions: map[/dev/mmcblk0p2:{mountpoint:/var/lib/docker/aufs major:179 minor:2 fsType:ext4 blockSize:0}]
    I0201 15:36:01.440439 5082 manager.go:198] Machine: {NumCores:4 CpuFrequency:1920000 MemoryCapacity:3519315968 MachineID:a9807123b38d1f069a44f0b7588b5884 SystemUUID:03000200-0400-0500-0006-000700080009 BootID:7e71fe9b-a9d8-4921-80c7-9d09e49ed1ef Filesystems:[{Device:/dev/mmcblk0p2 Capacity:57295605760 Type:vfs Inodes:3563520 HasInodes:true}] DiskMap:map[179:0:{Name:mmcblk0 Major:179 Minor:0 Size:62545461248 Scheduler:deadline} 179:8:{Name:mmcblk0boot0 Major:179 Minor:8 Size:4194304 Scheduler:deadline} 179:16:{Name:mmcblk0boot1 Major:179 Minor:16 Size:4194304 Scheduler:deadline} 179:24:{Name:mmcblk0rpmb Major:179 Minor:24 Size:4194304 Scheduler:deadline}] NetworkDevices:[{Name:datapath MacAddress:72:36:99:b2:ba:be Speed:0 Mtu:1410} {Name:dummy0 MacAddress:ea:c7:5e:6d:29:75 Speed:0 Mtu:1500} {Name:enp1s0 MacAddress:00:07:32:3e:17:8c Speed:1000 Mtu:1500} {Name:vxlan-6784 MacAddress:5a:81:bb:f6:00:d7 Speed:0 Mtu:1500} {Name:weave MacAddress:92:64:f5:c5:57:a7 Speed:0 Mtu:1410}] Topology:[{Id:0 Memory:3519315968 Cores:[{Id:0 Threads:[0] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:1 Threads:[1] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:2 Threads:[2] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:3 Threads:[3] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]}] Caches:[]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
    I0201 15:36:01.442170 5082 manager.go:204] Version: {KernelVersion:4.4.0-31-generic ContainerOsVersion:Ubuntu 16.04.1 LTS DockerVersion:1.12.3 CadvisorVersion: CadvisorRevision:}
    I0201 15:36:01.444559 5082 cadvisor_linux.go:152] Failed to register cAdvisor on port 4194, retrying. Error: listen tcp :4194: bind: address already in use
    W0201 15:36:01.449146 5082 container_manager_linux.go:205] Running with swap on is not supported, please disable swap! This will be a fatal error by default starting in K8s v1.6! In the meantime, you can opt-in to making this a fatal error by enabling --experimental-fail-swap-on.
    W0201 15:36:01.449653 5082 server.go:669] No api server defined - no events will be sent to API server.
    W0201 15:36:01.457574 5082 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
    I0201 15:36:01.457658 5082 kubelet.go:477] Hairpin mode set to "hairpin-veth"
    I0201 15:36:01.471512 5082 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
    I0201 15:36:01.471570 5082 docker_manager.go:260] Setting cgroupDriver to cgroupfs
    I0201 15:36:01.474678 5082 server.go:770] Started kubelet v1.5.2
    E0201 15:36:01.474926 5082 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
    E0201 15:36:01.475062 5082 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
    W0201 15:36:01.475208 5082 kubelet.go:1224] No api server defined - no node status update will be sent.
    I0201 15:36:01.475702 5082 kubelet_node_status.go:204] Setting node annotation to enable volume controller attach/detach
    I0201 15:36:01.479587 5082 server.go:123] Starting to listen on 0.0.0.0:10250
    F0201 15:36:01.481605 5082 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use

You need to run kubelet as root (see the first line of your log). This is currently a known limitation:

https://github.com/kubernetes/kubernetes/issues/4869

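The kubelet binary has no logs subcommand, so when you run kubelet logs you are actually starting another kubelet process without any valid arguments. The missing arguments are where most of the messages come from, and the process eventually exits with bind: address already in use, probably because an existing kubelet process (the one running as root) is already bound to that port.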
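The port conflicts can be read straight out of the output quoted in the question. Here is a minimal sketch, assuming the output has been saved to a file (the file name kubelet.log and the helper name bind_conflicts are made up for illustration, they are not part of kubelet):

```shell
#!/bin/sh
# bind_conflicts: hypothetical helper, not a kubelet feature.
# Reads kubelet output on stdin and prints every address that failed
# to bind with "address already in use", one per line, de-duplicated.
bind_conflicts() {
    # grep -o keeps only the matching fragment, so this also works when
    # several log entries have been pasted onto a single line.
    grep -o 'listen tcp [^ ]*: bind: address already in use' |
        sed 's/^listen tcp \(.*\): bind: address already in use$/\1/' |
        LC_ALL=C sort -u
}

# Example (kubelet.log is an assumed file name):
#   bind_conflicts < kubelet.log
```

Every address it prints is a port that some other process, most likely the kubelet that is already running as root, was holding when the second kubelet started. On a Linux host with iproute2 installed, sudo ss -ltnp should show which process owns those ports.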

How you view the kubelet logs depends on how your kubelet process is set up. In my setup (kops) you can use journalctl -u kubelet; in others you may need to look at /var/log/<kubelet-log-file>.log or similar.
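Once you have the real log stream, it helps to cut it down to the error and fatal entries. The sketch below assumes the kubelet runs as a systemd unit named kubelet, as in the journalctl case above; klog_problems is a made-up helper name. It relies only on the log line format visible in the question, where each entry starts with a severity letter (I, W, E or F) followed by the date as MMDD:

```shell
#!/bin/sh
# klog_problems: hypothetical helper that keeps only error (E) and
# fatal (F) entries from kubelet log text read on stdin.
# The pattern also matches lines that journalctl has prefixed with its
# own timestamp and unit name, because the severity letter only has to
# follow the start of the line or a whitespace character.
klog_problems() {
    grep -E '(^|[[:space:]])[EF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.'
}

# Example on a systemd-managed node (assumes a unit called "kubelet"):
#   sudo journalctl -u kubelet --no-pager | klog_problems
```

The same filter can be pointed at a plain log file, for example klog_problems < /var/log/<kubelet-log-file>.log, with the placeholder replaced by whatever file your setup writes.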