Backing Up and Restoring an etcd Cluster in Kubernetes

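etcd is a core component of Kubernetes: all cluster data is stored in it, and an etcd failure makes the whole cluster unavailable. In production, etcd must be deployed for high availability, with data backup and restore in place.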

The etcd version here is 3.2.26 and Kubernetes is 1.14.2, so the etcd v3 API (ETCDCTL_API=3) is used throughout.

Backup

ETCDCTL_API=3 etcdctl --endpoints=${endpoints} --cert=/usr/local/kubernetes/ssl/etcd.pem --key=/usr/local/kubernetes/ssl/etcd-key.pem --cacert=/usr/local/kubernetes/ssl/ca.pem snapshot save back.db
Snapshot saved at back.db
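
The backup command above writes to a fixed file name, so each run overwrites the previous snapshot. A minimal sketch of a timestamped variant follows. The backup directory /var/backups/etcd, the file-name pattern, and the function names snapshot_path and backup_etcd are assumptions for illustration, not part of the original setup. It reuses the ${endpoints} variable and certificate paths from the command above, and finishes with `snapshot status` to print the hash, revision, key count, and size of the saved file.

```shell
#!/usr/bin/env bash
# Sketch only: directory, file-name pattern and function names are assumptions.

# Build a timestamped snapshot path.
# $1 = backup directory, $2 = optional timestamp (defaults to the current time)
snapshot_path() {
  local dir=${1%/} stamp=${2:-$(date +%Y%m%d-%H%M%S)}
  printf '%s/etcd-snapshot-%s.db\n' "$dir" "$stamp"
}

# Save a snapshot into the given directory, then print its status.
backup_etcd() {
  local file
  file=$(snapshot_path "${1:-/var/backups/etcd}") || return 1
  mkdir -p "$(dirname "$file")" || return 1
  ETCDCTL_API=3 etcdctl --endpoints="${endpoints:?set endpoints first}" \
    --cert=/usr/local/kubernetes/ssl/etcd.pem \
    --key=/usr/local/kubernetes/ssl/etcd-key.pem \
    --cacert=/usr/local/kubernetes/ssl/ca.pem \
    snapshot save "$file" || return 1
  # Sanity check: hash, revision, total keys and size of the snapshot file
  ETCDCTL_API=3 etcdctl --write-out=table snapshot status "$file"
}

# Example call (left commented out so that loading the file has no side effects):
# backup_etcd /var/backups/etcd
```

Keeping several dated snapshots makes it possible to fall back to an older one if the most recent file turns out to be unusable.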

Restore

  • Stop the etcd cluster (on every node)
systemctl stop etcd

  • Delete the etcd data directory
rm -rf /opt/etcd

The whole directory must be removed; it is recreated automatically during the restore.
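
The step above deletes the data directory outright. A more cautious variant, offered here as an assumption rather than the author's method, renames the directory so the old data stays available until the restore has been verified. The helper name set_aside_data_dir and the .bak-<timestamp> suffix are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper name and ".bak-<timestamp>" suffix are assumptions.

# Rename the data directory instead of deleting it and print the new location.
# $1 = data directory, $2 = optional timestamp (defaults to the current time)
set_aside_data_dir() {
  local dir=${1%/} stamp=${2:-$(date +%Y%m%d-%H%M%S)}
  local target="${dir}.bak-${stamp}"
  mv -- "$dir" "$target" && printf '%s\n' "$target"
}

# Example call, to be run only after etcd has been stopped:
# set_aside_data_dir /opt/etcd
```

Because the original path no longer exists afterwards, the restore step can still create /opt/etcd from scratch. The renamed copy can be deleted once the cluster is confirmed healthy.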


  • Copy the backup file to every node in the cluster
scp back.db 10.0.20.12:~/
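
The scp command above covers a single node. A small loop can send the snapshot to each remaining member. The sketch below assumes the snapshot was taken on 10.0.20.11, so the targets are 10.0.20.12 and 10.0.20.13. The function name copy_snapshot and the SCP override variable (handy for a dry run with echo) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function name and the SCP override variable are assumptions.

# Copy a snapshot file into the home directory of each listed node.
# $1 = snapshot file, remaining arguments = node addresses
copy_snapshot() {
  local file=$1 node
  shift
  for node in "$@"; do
    # SCP defaults to scp; set SCP=echo to print the arguments instead of copying
    "${SCP:-scp}" "$file" "${node}:~/" || return 1
  done
}

# Example call:
# copy_snapshot back.db 10.0.20.12 10.0.20.13
```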

  • Restore the data (run on every node, using that node's own values)
ETCDCTL_API=3 etcdctl --endpoints=https://10.0.20.11:2379,https://10.0.20.12:2379,https://10.0.20.13:2379 --cert=/usr/local/kubernetes/ssl/etcd.pem --key=/usr/local/kubernetes/ssl/etcd-key.pem --cacert=/usr/local/kubernetes/ssl/ca.pem --initial-cluster etcd1=https://10.0.20.11:2380,etcd2=https://10.0.20.12:2380,etcd3=https://10.0.20.13:2380 --initial-advertise-peer-urls https://10.0.20.12:2380 snapshot restore back.db --data-dir=/opt/etcd/ --name etcd2
  • --initial-advertise-peer-urls (different on every node) and --initial-cluster: fill these in from your own etcd configuration file
  • --data-dir=/opt/etcd/: the etcd data directory
  • --name etcd2: the name of this etcd member
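
Since only --name and --initial-advertise-peer-urls change from node to node, the three restore commands can be generated from one template. The sketch below only prints the commands; each one still has to be executed on its own node. The function name restore_cmd is hypothetical. The TLS and --endpoints flags are left out on the assumption that `snapshot restore` reads only the local snapshot file and never contacts the cluster.

```shell
#!/usr/bin/env bash
# Sketch only: prints one restore command per member; nothing is executed.

# Member list taken from the restore command in this article
INITIAL_CLUSTER="etcd1=https://10.0.20.11:2380,etcd2=https://10.0.20.12:2380,etcd3=https://10.0.20.13:2380"

# Print the restore command for one member.
# $1 = member name, $2 = member IP address
restore_cmd() {
  local name=$1 ip=$2
  printf 'ETCDCTL_API=3 etcdctl snapshot restore back.db --name %s --initial-cluster %s --initial-advertise-peer-urls https://%s:2380 --data-dir=/opt/etcd/\n' \
    "$name" "$INITIAL_CLUSTER" "$ip"
}

restore_cmd etcd1 10.0.20.11
restore_cmd etcd2 10.0.20.12
restore_cmd etcd3 10.0.20.13
```

Printing rather than executing keeps the sketch safe to run anywhere and makes it easy to compare each generated line against the node's etcd configuration file before using it.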

  • Start etcd again on every node (systemctl start etcd), then verify
ETCDCTL_API=3 etcdctl --endpoints=https://10.0.20.11:2379,https://10.0.20.12:2379,https://10.0.20.13:2379 --cert=/usr/local/kubernetes/ssl/etcd.pem --key=/usr/local/kubernetes/ssl/etcd-key.pem --cacert=/usr/local/kubernetes/ssl/ca.pem member list
b40a71b8cf44c74, started, etcd3, https://10.0.20.13:2380, https://10.0.20.13:2379
a9027edffe4ef2d2, started, etcd1, https://10.0.20.11:2380, https://10.0.20.11:2379
c1e9eb55fcf40d38, started, etcd2, https://10.0.20.12:2380, https://10.0.20.12:2379
kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
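
The member list check above can also be scripted instead of read by eye. The sketch below assumes the comma-separated output format shown earlier, where the second field is the member status. The function name all_members_started is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the "id, status, name, peer URL, client URL" layout shown above.

# Read `etcdctl member list` output on stdin.
# Succeeds only if at least one member is listed and every member is "started".
all_members_started() {
  awk -F', ' '
    NF { n++; if ($2 != "started") bad = 1 }
    END { if (n == 0 || bad) exit 1 }
  '
}

# Example call (same connection flags as the member list command above):
# ETCDCTL_API=3 etcdctl --endpoints=... --cert=... --key=... --cacert=... member list \
#   | all_members_started && echo "all members started"
```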

The etcd data has been restored successfully.
