1. Node status is NotReady (Part 1)
Container Runtime Version: docker://Unknown
Check the node status: the node is NotReady.

kubectl get nodes
NAME   STATUS     ROLES    AGE   VERSION
vm1    Ready      master   13d   v1.12.5
vm2    NotReady   <none>   13d   v1.12.5
Describe the node: the docker version reported for the node is broken.

kubectl describe node vm2
System Info:
  ......
  Container Runtime Version:  docker://Unknown
Log in to the node: docker has been removed. Reinstall docker on the node (a reinstall sketch follows the commands below), then restart it and confirm it is running.

systemctl restart docker
systemctl status docker
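If docker really has been removed rather than just stopped, a restart alone will not bring it back. A minimal reinstall sketch, assuming an Ubuntu node that uses apt (the package name and repository may differ on your distro):

# reinstall docker, then enable and start it
apt-get update
apt-get install -y docker.io        # or docker-ce from Docker's own repository
systemctl enable --now docker
docker version                      # should now report a real version instead of Unknown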
Back on the master, the node status has changed to Ready, and Container Runtime Version under System Info shows the real docker version again.

$ kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
vm1    Ready    master   13d   v1.12.5
vm2    Ready    <none>   13d   v1.12.5
$ kubectl describe node vm2
Name:               vm2
.......
System Info:
  .....
  Container Runtime Version:  docker://18.6.3
2. Node status is NotReady (Part 2)
runtime network not ready: NetworkReady=false
Check the node status: the node is NotReady. kubectl describe node shows the Conditions:

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 01 Apr 2020 13:35:29 +0800   Tue, 31 Mar 2020 22:22:18 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 01 Apr 2020 13:35:29 +0800   Tue, 31 Mar 2020 22:22:18 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 01 Apr 2020 13:35:29 +0800   Tue, 31 Mar 2020 22:22:18 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Wed, 01 Apr 2020 13:35:29 +0800   Tue, 31 Mar 2020 22:22:18 +0800   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
So the network is the problem. kubeadm init had already pointed this out:
You should now deploy a pod network to the cluster. Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/
The pod network was never deployed. Deploy weave:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
Then restart kubelet:
systemctl restart kubelet
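Before checking the node again, it can help to confirm that the weave pod is running and that a CNI config has actually been written on the node. A quick check, assuming weave is the chosen pod network:

# on the master: the weave-net DaemonSet pod for this node should be Running
kubectl -n kube-system get pods -o wide | grep weave

# on the node: the network plugin should have written its config here
ls /etc/cni/net.d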
Check the node status again:
kubectl describe node vm2
The events at the end show the node components starting up:

Events:
  Type    Reason                   Age    From               Message
  ----    ------                   ----   ----               -------
  Normal  Starting                 41m    kube-proxy, vm2    Starting kube-proxy.
  Normal  Starting                 11m    kubelet, vm2       Starting kubelet.
  Normal  NodeHasSufficientMemory  11m    kubelet, vm2       Node vm2 status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    11m    kubelet, vm2       Node vm2 status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     11m    kubelet, vm2       Node vm2 status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  11m    kubelet, vm2       Updated Node Allocatable limit across pods
  Normal  NodeReady                7m49s  kubelet, vm2       Node vm2 status is now: NodeReady
Finally, check the status once more; the node is now Ready:
kubectl get nodes
3. Node status is NotReady (Part 3)
Kubelet stopped posting node status.
On node vm2, check the kubelet status:
systemctl status kubelet
It reports: kubelet.service: Failed with result 'exit-code'.
Troubleshooting shows that swap is enabled. Comment out the swap entry in /etc/fstab, turn swap off for the current session, and restart kubelet:

sed -i 's|/swap.img|#/swap.img|g' /etc/fstab
swapoff -a
systemctl restart kubelet
systemctl status kubelet
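To confirm that swap is really off now and will stay off after a reboot, a quick check (standard Linux commands, nothing specific to this setup):

swapon --show          # no output means swap is off
free -h                # the Swap line should read 0B
grep swap /etc/fstab   # the swap entry should now be commented out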
Back on the master, check the node: it is back to normal.
4. Node status is NotReady (Part 4)
failed to find plugin "portmap" in path [/opt/cni/bin]
Check kubelet on the node: the cluster network is deployed with weave, but the portmap CNI plugin is missing on this node. It needs to be copied over from the master, and then kubelet on the node restarted.

root@gua-vm3:~# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Mon 2020-11-30 08:01:36 UTC; 24h ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 4555 (kubelet)
      Tasks: 16 (limit: 9451)
     Memory: 42.3M
     CGroup: /system.slice/kubelet.service
             └─4555 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registr>
Dec 01 08:47:34 gua-vm3 kubelet[4555]: },
Dec 01 08:47:34 gua-vm3 kubelet[4555]: {
Dec 01 08:47:34 gua-vm3 kubelet[4555]:     "type": "portmap",
Dec 01 08:47:34 gua-vm3 kubelet[4555]:     "capabilities": {"portMappings": true},
Dec 01 08:47:34 gua-vm3 kubelet[4555]:     "snat": true
Dec 01 08:47:34 gua-vm3 kubelet[4555]: }
Dec 01 08:47:34 gua-vm3 kubelet[4555]: ]
Dec 01 08:47:34 gua-vm3 kubelet[4555]: }
Dec 01 08:47:34 gua-vm3 kubelet[4555]: : [failed to find plugin "portmap" in path [/opt/cni/bin]]
Dec 01 08:47:34 gua-vm3 kubelet[4555]: W1201 08:47:34.898238    4555 cni.go:239] Unable to update cni config: no valid networks found in /etc/cni/net.d
Fix:

# on Master
cd /opt/cni/bin
scp portmap root@nodeIP:/opt/cni/bin

# on node
ls /opt/cni/bin/portmap
systemctl restart kubelet
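Copying the single binary from the master works; if several standard plugins are missing, it may be simpler to install the whole containernetworking plugins bundle on the node instead. A sketch, assuming version v0.8.7 and an amd64 node (pick the release that matches your cluster):

# download the reference CNI plugins (portmap included) and unpack them into /opt/cni/bin
curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz
mkdir -p /opt/cni/bin
tar -xzf cni-plugins.tgz -C /opt/cni/bin
systemctl restart kubelet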
5. ImagePullBackOff caused by network issues
When deploying helloworld, the task pod cannot start: it sits in ImagePullBackOff because the required images cannot be pulled over the network.
Use a registry mirror to speed up image pulls. Reset and re-run kubeadm init, specifying the registry that the control-plane images are pulled from:

kubeadm reset
kubeadm init --config config.yaml
root@vm1:~/k8ssetup# cat config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.17.3
networking:
  podSubnet: 192.168.0.0/16
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
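To verify that the mirror actually works before running init again, the required images can be pulled in advance with the same config file, using kubeadm's standard image subcommands:

# list the images kubeadm needs, then pull them from the configured repository
kubeadm config images list --config config.yaml
kubeadm config images pull --config config.yaml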
6. Error from not allowing pods to be scheduled on the master
1 node(s) had taints that the pod didn't tolerate.
root@vm1:~/k8ssetup# kubectl -n tekton-pipelines get pods
NAME                                           READY   STATUS    RESTARTS   AGE
tekton-dashboard-6d9f5b4fc5-vj7q2              0/1     Pending   0          2m33s
tekton-pipelines-controller-7f66b8bd95-c7kn8   0/1     Pending   0          3m5s
tekton-pipelines-webhook-7cddbc485f-m6t7f      0/1     Pending   0          3m3s
root@vm1:~/k8ssetup# kubectl -n tekton-pipelines describe pod tekton-dashboard-6d9f5b4fc5-vj7q2
Name:         tekton-dashboard-6d9f5b4fc5-vj7q2
Namespace:    tekton-pipelines
............
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  11s (x4 over 2m47s)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Remove the master taint so pods can be scheduled on it:

root@vm1:~/k8ssetup# kubectl taint nodes --all node-role.kubernetes.io/master-
node/vm1 untainted
root@vm1:~/k8ssetup# kubectl -n tekton-pipelines describe pod tekton-dashboard-6d9f5b4fc5-vj7q2
Name:         tekton-dashboard-6d9f5b4fc5-vj7q2
........
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  41s (x5 over 4m48s)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled         18s                  default-scheduler  Successfully assigned tekton-pipelines/tekton-dashboard-6d9f5b4fc5-vj7q2 to vm1
root@vm1:~/k8ssetup# kubectl -n tekton-pipelines get pods
NAME                                           READY   STATUS              RESTARTS   AGE
tekton-dashboard-6d9f5b4fc5-vj7q2              0/1     ContainerCreating   0          4m57s
tekton-pipelines-controller-7f66b8bd95-c7kn8   0/1     ContainerCreating   0          5m29s
tekton-pipelines-webhook-7cddbc485f-m6t7f      0/1     ContainerCreating   0          5m27s
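To see which taint was blocking scheduling before removing it, the node's taints can be listed directly:

# show the taints on every node; a kubeadm master typically carries
# node-role.kubernetes.io/master:NoSchedule
kubectl describe nodes | grep -A2 Taints
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'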
7. Node ROLES shows <none>

The worker node's ROLES column shows <none>. Label the node to give it a role:

root@gua-vm2:/etc/cni/net.d# kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
gua-vm1   Ready    master   25h   v1.19.4
gua-vm2   Ready    worker   25h   v1.19.4
gua-vm3   Ready    <none>   25h   v1.19.4
root@gua-vm2:/etc/cni/net.d# kubectl label nodes gua-vm3 kubernetes.io/role=worker
node/gua-vm3 labeled
root@gua-vm2:/etc/cni/net.d# kubectl get nodes
NAME      STATUS   ROLES    AGE   VERSION
gua-vm1   Ready    master   25h   v1.19.4
gua-vm2   Ready    worker   25h   v1.19.4
gua-vm3   Ready    worker   25h   v1.19.4
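kubectl derives the ROLES column from node labels, so the node-role.kubernetes.io/<role> label form works here as well. A hedged alternative to the command above:

# equivalent way to make ROLES show "worker", and how to undo it again
kubectl label node gua-vm3 node-role.kubernetes.io/worker=
kubectl label node gua-vm3 node-role.kubernetes.io/worker-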