How to remove a master node from the HA cluster and from the etcd cluster
I'm new to k8s and I've run into a problem I can't solve.
I'm building an HA cluster of master nodes. I'm running some tests (removing one node and adding it back). During this process I noticed that the etcd cluster does not update its member list.
Example of the problem below:
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
cri-o-metrics-exporter cri-o-metrics-exporter-77c9cf9746-qlp4d 0/1 Pending 0 16h
haproxy-controller haproxy-ingress-769d858699-b8r8q 0/1 Pending 0 16h
haproxy-controller ingress-default-backend-5fd4986454-kvbw8 0/1 Pending 0 16h
kube-system calico-kube-controllers-574d679d8c-tkcjj 1/1 Running 3 16h
kube-system calico-node-95t6l 1/1 Running 2 16h
kube-system calico-node-m5txs 1/1 Running 2 16h
kube-system coredns-7588b55795-gkfjq 1/1 Running 2 16h
kube-system coredns-7588b55795-lxpmj 1/1 Running 2 16h
kube-system etcd-masterNode1 1/1 Running 2 16h
kube-system etcd-masterNode2 1/1 Running 2 16h
kube-system kube-apiserver-masterNode1 1/1 Running 3 16h
kube-system kube-apiserver-masterNode2 1/1 Running 3 16h
kube-system kube-controller-manager-masterNode1 1/1 Running 4 16h
kube-system kube-controller-manager-masterNode2 1/1 Running 4 16h
kube-system kube-proxy-5q6xs 1/1 Running 2 16h
kube-system kube-proxy-k8p6h 1/1 Running 2 16h
kube-system kube-scheduler-masterNode1 1/1 Running 3 16h
kube-system kube-scheduler-masterNode2 1/1 Running 6 16h
kube-system metrics-server-575bd7f776-jtfsh 0/1 Pending 0 16h
kubernetes-dashboard dashboard-metrics-scraper-6f78bc588b-khjjr 1/1 Running 2 16h
kubernetes-dashboard kubernetes-dashboard-978555c5b-9jsxb 1/1 Running 2 16h
$ kubectl exec etcd-masterNode2 -n kube-system -it -- sh
sh-5.0# etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
| 4c209e5bc1ca9593 | started | masterNode1 | https://IP1:2380 | https://IP1:2379 | false |
| 676d4bfab319fa22 | started | masterNode2 | https://IP2:2380 | https://IP2:2379 | false |
| a9af4b00e33f87d4 | started | masterNode3 | https://IP3:2380 | https://IP3:2379 | false |
+------------------+---------+----------------------------+---------------------------+---------------------------+------------+
sh-5.0# exit
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
masterNode1 Ready master 16h v1.19.0
masterNode2 Ready master 16h v1.19.0
I assumed I had removed the node from the cluster correctly. The steps I perform:
- kubectl drain <nodeName> --ignore-daemonsets --delete-local-data
- kubectl delete node <nodeName>
- kubeadm reset
- rm -f /etc/cni/net.d/* # remove the CNI configuration
- rm -rf /var/lib/kubelet # remove the /var/lib/kubelet directory
- rm -rf /var/lib/etcd # remove /var/lib/etcd
- iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X && iptables -t filter -F && iptables -t filter -X # flush iptables
- ipvsadm --clear
- rm -rf /etc/kubernetes # remove /etc/kubernetes (in case of changing certificates)
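Taken together, the removal steps above can be sketched as one script (the node name `masterNode3` is a placeholder; the `kubectl` commands are run from a surviving master, the cleanup commands on the node being removed):

```shell
#!/bin/sh
NODE=masterNode3   # hypothetical node name, substitute your own

# On a surviving control-plane node: evict workloads and deregister the node.
kubectl drain "$NODE" --ignore-daemonsets --delete-local-data
kubectl delete node "$NODE"

# On the node being removed: tear down kubeadm state and local data.
kubeadm reset
rm -f /etc/cni/net.d/*      # CNI configuration
rm -rf /var/lib/kubelet     # kubelet state
rm -rf /var/lib/etcd        # local etcd data
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
iptables -t filter -F && iptables -t filter -X
ipvsadm --clear             # IPVS tables, if kube-proxy runs in IPVS mode
rm -rf /etc/kubernetes      # certificates and kubeconfigs
```

Note that none of these steps touches the etcd member list, which is exactly why the stale member lingers, as described below.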
I'm running Kubernetes version v1.19.0 with etcd 3.4.9-1.
The cluster runs on bare-metal nodes.
Is this a bug, or did I not remove the node from the etcd cluster correctly?
Solution
Thanks to Mariusz K. I found the answer to my problem. In case anyone else runs into the same issue, here is how I solved it.
First, query the etcd members in the (HA) cluster (code example):
$ kubectl exec etcd-<nodeNameMasterNode> -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list
1863b58e85c8a808,started,nodeNameMaster1,https://IP1:2380,https://IP1:2379,false
676d4bfab319fa22,started,nodeNameMaster2,https://IP2:2380,https://IP2:2379,false
b0c50c50d563ed51,nodeNameMaster3,https://IP3:2380,https://IP3:2379,false
Then, once you have the member list, you can remove whichever member you need. Code example:
kubectl exec etcd-nodeNameMaster1 -n kube-system -- etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member remove b0c50c50d563ed51
Member b0c50c50d563ed51 removed from cluster d1e1de99e3d19634
I wanted to be able to remove a member from the etcd cluster without having to attach to the pod and run a second command there, so I pass the command to the pod via exec.
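The two exec calls can be combined into a small helper that looks up the member ID by node name and feeds it to `member remove`. A sketch, assuming the comma-separated `member list` output shown above (ID, status, name, peer URLs, client URLs, learner flag); the IDs, node names, and pod name are placeholders:

```shell
#!/bin/sh
# Print the etcd member ID whose name field matches the given node name.
# $1 = member list output (comma-separated fields), $2 = node name to find
member_id_for() {
    printf '%s\n' "$1" | awk -F, -v name="$2" '$3 == name { print $1 }'
}

# Example against a captured member list (hypothetical IDs and names):
LIST='1863b58e85c8a808,started,nodeNameMaster1,https://IP1:2380,https://IP1:2379,false
676d4bfab319fa22,started,nodeNameMaster2,https://IP2:2380,https://IP2:2379,false
b0c50c50d563ed51,started,nodeNameMaster3,https://IP3:2380,https://IP3:2379,false'

ID=$(member_id_for "$LIST" nodeNameMaster3)
echo "$ID"   # b0c50c50d563ed51

# Against a live cluster, LIST would come from the exec shown earlier:
#   LIST=$(kubectl exec etcd-nodeNameMaster1 -n kube-system -- etcdctl \
#       --cacert /etc/kubernetes/pki/etcd/ca.crt \
#       --cert /etc/kubernetes/pki/etcd/peer.crt \
#       --key /etc/kubernetes/pki/etcd/peer.key member list)
# and the removal would be the same exec with: member remove "$ID"
```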