Kubernetes keeps removing the Heapster & Grafana services because their NodePort is already in use

I am running a Kubernetes cluster locally via Docker on Ubuntu.

Since I use Vagrant to build the Ubuntu VM, I had to modify the docker run command from the official Kubernetes guide:

    docker run -d \
        --volume=/:/rootfs:ro \
        --volume=/sys:/sys:ro \
        --volume=/var/lib/docker/:/var/lib/docker:rw \
        --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
        --volume=/var/run:/var/run:rw \
        --net=host \
        --privileged=true \
        --pid=host \
        gcr.io/google_containers/hyperkube:v1.3.0 \
        /hyperkube kubelet \
            --allow-privileged=true \
            --api-servers=http://localhost:8080 \
            --v=2 \
            --address=0.0.0.0 \
            --enable-server \
            --hostname-override=192.168.10.30 \
            --config=/etc/kubernetes/manifests-multi \
            --containerized \
            --cluster-dns=10.0.0.10 \
            --cluster-domain=cluster.local

Additionally, running a proxy allows me to access the cluster's services from a browser outside the VM:

    docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v1.3.0 \
        /hyperkube proxy --master=http://127.0.0.1:8080 --v=2

These steps work fine, and I can eventually access the Kubernetes UI in a browser.

    vagrant@trusty-vm:~$ kubectl cluster-info
    Kubernetes master is running at http://localhost:8080
    KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Now I want to run Heapster in this Kubernetes cluster with an InfluxDB backend and the Grafana UI, as described in this guide. To do so, I cloned the Heapster repo and configured grafana-service.yaml to use an external IP by adding type: NodePort:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        kubernetes.io/cluster-service: 'true'
        kubernetes.io/name: monitoring-grafana
      name: monitoring-grafana
      namespace: kube-system
    spec:
      # In a production setup, we recommend accessing Grafana through an external Loadbalancer
      # or through a public IP.
      type: NodePort
      ports:
        - port: 80
          targetPort: 3000
      selector:
        name: influxGrafana

Then I created the services, replication controllers, etc.:

    vagrant@trusty-vm:~/heapster$ kubectl create -f deploy/kube-config/influxdb/
    You have exposed your service on an external port on all nodes in your cluster.
    If you want to expose this service to the external internet, you may need to set up
    firewall rules for the service port(s) (tcp:30593) to serve traffic.
    See http://releases.k8s.io/release-1.3/docs/user-guide/services-firewalls.md for more details.
    service "monitoring-grafana" created
    replicationcontroller "heapster" created
    service "heapster" created
    replicationcontroller "influxdb-grafana" created
    service "monitoring-influxdb" created

    vagrant@trusty-vm:~/heapster$ kubectl cluster-info
    Kubernetes master is running at http://localhost:8080
    Heapster is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/heapster
    KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
    monitoring-grafana is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

    vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
    NAMESPACE     NAME                                READY     STATUS    RESTARTS   AGE
    kube-system   heapster-y2yci                      1/1       Running   0          32m
    kube-system   influxdb-grafana-6udas              2/2       Running   0          32m
    kube-system   k8s-master-192.168.10.30            4/4       Running   0          58m
    kube-system   k8s-proxy-192.168.10.30             1/1       Running   0          58m
    kube-system   kube-addon-manager-192.168.10.30    2/2       Running   0          57m
    kube-system   kube-dns-v17-y4cwh                  3/3       Running   0          58m
    kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running   0          58m

    vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
    NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
    default       kubernetes             10.0.0.1     <none>        443/TCP             18m
    kube-system   heapster               10.0.0.234   <none>        80/TCP              3s
    kube-system   kube-dns               10.0.0.10    <none>        53/UDP,53/TCP       18m
    kube-system   kubernetes-dashboard   10.0.0.58    <none>        80/TCP              18m
    kube-system   monitoring-grafana     10.0.0.132   <nodes>       80/TCP              3s
    kube-system   monitoring-influxdb    10.0.0.197   <none>        8083/TCP,8086/TCP   16m

As you can see, everything appears to be running smoothly, and I can also reach the Grafana UI via the browser at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/.

However, after about a minute, the Heapster and Grafana endpoints disappear from kubectl cluster-info:

    vagrant@trusty-vm:~/heapster$ kubectl cluster-info
    Kubernetes master is running at http://localhost:8080
    KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

Browser output:

    {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {},
      "status": "Failure",
      "message": "endpoints \"monitoring-grafana\" not found",
      "reason": "NotFound",
      "details": {
        "name": "monitoring-grafana",
        "kind": "endpoints"
      },
      "code": 404
    }

The pods are still running…

    vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
    NAMESPACE     NAME                                READY     STATUS    RESTARTS   AGE
    kube-system   heapster-y2yci                      1/1       Running   0          32m
    kube-system   influxdb-grafana-6udas              2/2       Running   0          32m
    kube-system   k8s-master-192.168.10.30            4/4       Running   0          58m
    kube-system   k8s-proxy-192.168.10.30             1/1       Running   0          58m
    kube-system   kube-addon-manager-192.168.10.30    2/2       Running   0          57m
    kube-system   kube-dns-v17-y4cwh                  3/3       Running   0          58m
    kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running   0          58m

…but the Heapster and Grafana services are gone:

    vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
    NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
    default       kubernetes             10.0.0.1     <none>        443/TCP             19m
    kube-system   kube-dns               10.0.0.10    <none>        53/UDP,53/TCP       19m
    kube-system   kubernetes-dashboard   10.0.0.58    <none>        80/TCP              19m
    kube-system   monitoring-influxdb    10.0.0.197   <none>        8083/TCP,8086/TCP   17m

When inspecting the output of kubectl cluster-info dump, I found these errors:

    I0713 09:31:09.088567 1 proxier.go:427] Adding new service "kube-system/monitoring-grafana:" at 10.0.0.227:80/TCP
    E0713 09:31:09.273385 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    I0713 09:31:09.395280 1 proxier.go:427] Adding new service "kube-system/heapster:" at 10.0.0.111:80/TCP
    E0713 09:31:09.466306 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    I0713 09:31:09.480468 1 proxier.go:502] Setting endpoints for "kube-system/monitoring-grafana:" to [172.17.0.5:3000]
    E0713 09:31:09.519698 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    I0713 09:31:09.532026 1 proxier.go:502] Setting endpoints for "kube-system/heapster:" to [172.17.0.4:8082]
    E0713 09:31:09.558527 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    E0713 09:31:17.249001 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:22.252280 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:27.257895 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:31.126035 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    E0713 09:31:32.264430 1 server.go:294] Starting health server failed: $
    E0709 09:32:01.153168 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" ($
    E0713 09:31:37.265109 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:42.269035 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:47.270950 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:52.272354 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:31:57.273424 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    E0713 09:32:01.153168 1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
    E0713 09:32:02.276318 1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
    I0713 09:32:06.105878 1 proxier.go:447] Removing service "kube-system/monitoring-grafana:"
    I0713 09:32:07.175025 1 proxier.go:447] Removing service "kube-system/heapster:"
    I0713 09:32:07.210270 1 proxier.go:517] Removing endpoints for "kube-system/monitoring-grafana:"
    I0713 09:32:07.249824 1 proxier.go:517] Removing endpoints for "kube-system/heapster:"

Apparently, the services and endpoints for Heapster & Grafana are being removed because the nodePort is already in use. I did not specify a fixed nodePort in grafana-service.yaml, which means Kubernetes is free to pick one that is not yet in use, so how can this result in an error? And is there a way to fix it?
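As a first diagnostic step, it may help to check from inside the VM whether some other process already holds the allocated NodePort (a hedged sketch, not part of the original question; 30593 is the port from the logs above, and `ss` comes from the iproute2 package):

```shell
# Check whether any process is already listening on the NodePort
# that kube-proxy complained about in the logs (30593).
sudo ss -tlnp | grep -w 30593 && echo "port 30593 is taken" || echo "port 30593 is free"
```

The repeated failures to bind 127.0.0.1:10249 (kube-proxy's health server) in the same log also hint that more than one kube-proxy instance may be competing for the same ports, which would be worth ruling out.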


OS: Ubuntu 14.04.4 LTS (Trusty) | Kubernetes: v1.3.0 | Docker: v1.11.2

I ran into a very similar problem.

In the grafana-service.yaml file (and probably the heapster-service.yaml file as well) there is this line: kubernetes.io/cluster-service: 'true'. This label means the service is managed by the addon manager. When the addon manager runs its periodic check, it sees that no Grafana/Heapster service is defined in /etc/kubernetes/addons and removes the services.

To work around this, you have two options:

  1. Change the label to kubernetes.io/cluster-service: 'false'
  2. Move the controller and service yaml files into /etc/kubernetes/addons on the master node (or wherever addon-manager is configured to look for yaml files).
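Option 1 can be scripted against the cloned repo; a minimal sketch, assuming the deploy/kube-config/influxdb/ layout from the question:

```shell
# Flip the cluster-service label so the addon manager stops reaping the service
sed -i "s|kubernetes.io/cluster-service: 'true'|kubernetes.io/cluster-service: 'false'|" \
    deploy/kube-config/influxdb/grafana-service.yaml

# Recreate the service with the updated label
kubectl delete -f deploy/kube-config/influxdb/grafana-service.yaml
kubectl create -f deploy/kube-config/influxdb/grafana-service.yaml
```

If you also see Heapster itself disappearing, the same substitution would apply to its service file.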

Hope that helps.

Same problem in our environment. K8s version = 1.3.4, Docker 1.12, Heapster from the latest master branch.