> Author: 丁辉

# Connecting Kubernetes to an External Ceph Cluster with Helm
[GitHub repository](https://github.com/ceph/ceph-csi)

| Node name | IP |
| :---------: | :----------: |
| ceph-node-1 | 192.168.1.10 |
| ceph-node-2 | 192.168.1.20 |
| ceph-node-3 | 192.168.1.30 |

**Add the Helm repository**
```bash
helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update
```
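Before installing, it can help to see which chart versions the repository publishes, since the CephFS section later pins version 3.8.1. A minimal sketch, assuming `helm` is on the PATH and the repository was added under the name `ceph-csi` as above; the function name `list_chart_versions` is only illustrative:

```bash
# List every published version of a chart from the ceph-csi repository.
# Usage: list_chart_versions <chart-name>
list_chart_versions() {
  local chart="$1"
  # --versions prints all available versions instead of only the latest one
  helm search repo "ceph-csi/${chart}" --versions
}
```

For example, `list_chart_versions ceph-csi-cephfs` or `list_chart_versions ceph-csi-rbd`.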
## Connecting to the CephFS Shared File System
### CephFS prerequisites
See this article: [Creating a Ceph file system](https://gitee.com/offends/Kubernetes/blob/main/%E5%AD%98%E5%82%A8/Ceph/Ceph%E5%88%9B%E5%BB%BA%E6%96%87%E4%BB%B6%E7%B3%BB%E7%BB%9F.md)
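
The values file in the next section needs the cluster ID and the key of a Ceph user. The comments there use `ceph mon dump` and `ceph auth get`; as an alternative, `ceph fsid` and `ceph auth get-key` print only the bare values. A minimal sketch, assuming it runs on a node that has the `ceph` CLI and an admin keyring; the function name `print_ceph_csi_inputs` is hypothetical:

```bash
# Print the cluster ID and the key of a Ceph user in a copy-friendly form.
# Usage: print_ceph_csi_inputs [user]   (user defaults to "admin")
print_ceph_csi_inputs() {
  local user="${1:-admin}"
  # The fsid of the cluster is the value used as clusterID
  echo "clusterID: $(ceph fsid)"
  # get-key prints only the secret, without the surrounding keyring text
  echo "key: $(ceph auth get-key "client.${user}")"
}
```

For example, `print_ceph_csi_inputs admin` for CephFS, or `print_ceph_csi_inputs kubernetes` for the RBD user configured later in this document.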
### Deployment
[Official documentation](https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/#configure-ceph-csi-plugins) [Official parameter reference](https://github.com/ceph/ceph-csi/tree/devel/charts/ceph-csi-cephfs)
1. Configure the values.yaml file
```bash
vi ceph-csi-cephfs-values.yaml
```
With the following contents:
```yaml
csiConfig:
  # Run `ceph mon dump`; the fsid in its output is the clusterID
  - clusterID: "619ac911-7e23-4e7e-9e15-7329291de385"
    monitors:
      - "192.168.1.10:6789"
      - "192.168.1.20:6789"
      - "192.168.1.30:6789"
secret:
  create: true
  name: csi-cephfs-secret
  adminID: admin
  # Run `ceph auth get client.admin` to look up the user key
  adminKey: AQByaidmineVLRAATw9GO+iukAb6leMiJflm9A==
storageClass:
  create: true
  name: csi-cephfs-sc
  # Run `ceph mon dump`; the fsid in its output is the clusterID
  clusterID: 619ac911-7e23-4e7e-9e15-7329291de385
  fsName: cephfs
  pool: "cephfs_data"
  provisionerSecret: csi-cephfs-secret
  provisionerSecretNamespace: "ceph-csi-cephfs"
  controllerExpandSecret: csi-cephfs-secret
  controllerExpandSecretNamespace: "ceph-csi-cephfs"
  nodeStageSecret: csi-cephfs-secret
  nodeStageSecretNamespace: "ceph-csi-cephfs"
  reclaimPolicy: Delete
  allowVolumeExpansion: true
  mountOptions:
    - discard
cephconf: |
  [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    fuse_set_user_groups = false
    fuse_big_writes = true
provisioner:
  # Replica count for ceph-csi-cephfs-provisioner
  replicaCount: 3
  # Use mirror registries to speed up image pulls
  provisioner:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-provisioner
    # With extra-create-metadata set to false, csi-provisioner does not pass the
    # extra PVC/PV metadata (PVC name, PVC namespace, PV name) to the driver when
    # creating a volume. Disable it only if that metadata is not needed.
    #extraArgs:
    #- extra-create-metadata=false
  resizer:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-resizer
  snapshotter:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-snapshotter
nodeplugin:
  registrar:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-node-driver-registrar
  plugin:
    image:
      repository: quay.dockerproxy.com/cephcsi/cephcsi
```
2. Install
```bash
helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
--namespace ceph-csi-cephfs --create-namespace \
-f ceph-csi-cephfs-values.yaml
```
3. Create a subvolume group named `csi` in the `cephfs` file system
```bash
ceph fs subvolumegroup create cephfs csi
```
Verify that it exists:
```bash
ceph fs subvolumegroup ls cephfs
```
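Once the steps above are done, a quick look at the cluster confirms that the chart actually came up. A minimal sketch, assuming `kubectl` points at the target cluster; the function name `check_csi_release` is illustrative, and the namespace and StorageClass names are the ones used in the values file above:

```bash
# Show the CSI pods in a namespace and confirm the StorageClass exists.
# Usage: check_csi_release <namespace> <storageclass>
check_csi_release() {
  local ns="$1" sc="$2"
  # Provisioner and node plugin pods should all reach Running
  kubectl get pods -n "$ns" -o wide || return 1
  # Fails with a NotFound error if the chart did not create the StorageClass
  kubectl get storageclass "$sc"
}
```

For example, `check_csi_release ceph-csi-cephfs csi-cephfs-sc`.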
### Uninstall
```bash
helm uninstall ceph-csi-cephfs -n ceph-csi-cephfs
```
### CephFS mount test
#### Deploy the test container
1. Create the PVC
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: csi-cephfs-pvc
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
storageClassName: csi-cephfs-sc
EOF
```
2. Create the Pod
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: csi-cephfs-pod
spec:
containers:
- name: nginx
image: nginx:latest
volumeMounts:
- name: pvc
mountPath: /usr/share/nginx/html
volumes:
- name: pvc
persistentVolumeClaim:
claimName: csi-cephfs-pvc
readOnly: false
EOF
```
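To confirm that the volume was provisioned and is writable, the PVC phase can be checked and a small file written through the Pod. A minimal sketch, assuming the Pod is already Running; the function name `verify_mount` and the file name `probe.txt` are hypothetical, while the mount path is the one from the Pod manifest above:

```bash
# Check that a PVC is Bound, then write and read back a file via the Pod.
# Usage: verify_mount <pvc-name> <pod-name>
verify_mount() {
  local pvc="$1" pod="$2" phase
  phase="$(kubectl get pvc "$pvc" -o jsonpath='{.status.phase}')"
  if [ "$phase" != "Bound" ]; then
    echo "PVC $pvc is '$phase', expected 'Bound'" >&2
    return 1
  fi
  # Write a probe file into the mounted directory and print it back
  kubectl exec "$pod" -- sh -c \
    'echo ok > /usr/share/nginx/html/probe.txt && cat /usr/share/nginx/html/probe.txt'
}
```

For example, `verify_mount csi-cephfs-pvc csi-cephfs-pod`. If the PVC stays in `Pending`, `kubectl describe pvc csi-cephfs-pvc` shows the provisioning events.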
#### Remove the test container
1. Delete the Pod
```bash
kubectl delete pod csi-cephfs-pod
```
2. Delete the PVC
```bash
kubectl delete pvc csi-cephfs-pvc
```
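### Troubleshooting notes
> After connecting the latest `ceph-csi-cephfs` chart to an external Ceph cluster, volumes fail to mount with the error below.

**Environment**

| Ceph deployment mode | Ceph version | Kubernetes version |
| :--------: | :----------------------------------: | :------------: |
| Docker | ceph version 16.2.5 pacific (stable) | v1.23 |

**Error output**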
```bash
Warning FailedMount 3s kubelet MountVolume.MountDevice failed for volume "pvc-342d9156-70f0-42f8-b288-8521035f8fd4" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 192.168.1.10:6789,192.168.1.20:6789,192.168.1.30:6789:/volumes/csi/csi-vol-d850ba82-4198-4862-b26a-52570bcb1320/1a202392-a8cc-4386-8fc7-a340d9389e66 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-342d9156-70f0-42f8-b288-8521035f8fd4/globalmount -o name=admin,secretfile=/tmp/csi/keys/keyfile-99277731,mds_namespace=cephfs,discard,ms_mode=secure,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-05-02T08:12:18.622+0000 7f62cd3e3140 -1 failed for service _ceph-mon._tcp
mount error 22 = Invalid argument
```
**Solution**
> Downgrade the `ceph-csi-cephfs` Helm chart to version 3.8.1 (a conclusion reached after repeated testing).
```bash
helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
--namespace ceph-csi-cephfs --create-namespace \
-f ceph-csi-cephfs-values.yaml \
--version 3.8.1
```
## Connecting to RBD Block Storage
### RBD prerequisites
See this article: [Creating Ceph RBD block storage](https://gitee.com/offends/Kubernetes/blob/main/%E5%AD%98%E5%82%A8/Ceph/Ceph%E5%88%9B%E5%BB%BARBD%E5%9D%97%E5%AD%98%E5%82%A8.md)
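
The values file in the next section refers to a pool named `kubernetes` and a Ceph user `client.kubernetes`. Before deploying, it may be worth confirming that both exist. A minimal sketch, assuming it runs on a node with the `ceph` CLI and an admin keyring; the function name `check_rbd_prereqs` is hypothetical:

```bash
# Verify that the RBD pool and the Ceph user expected by the values file exist.
# Usage: check_rbd_prereqs [pool] [user]   (both default to "kubernetes")
check_rbd_prereqs() {
  local pool="${1:-kubernetes}" user="${2:-kubernetes}"
  # `ceph osd pool ls` prints one pool name per line; match the whole line
  if ! ceph osd pool ls | grep -Fqx -- "$pool"; then
    echo "pool '$pool' not found" >&2
    return 1
  fi
  # `ceph auth get` exits non-zero when the entity does not exist
  if ! ceph auth get "client.${user}" >/dev/null; then
    echo "user 'client.${user}' not found" >&2
    return 1
  fi
}
```

For example, `check_rbd_prereqs kubernetes kubernetes && echo ready`.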
### Deployment
1. Configure the values.yaml file
```bash
vi ceph-csi-rbd-values.yaml
```
With the following contents:
```yaml
csiConfig:
  # Run `ceph mon dump`; the fsid in its output is the clusterID
  - clusterID: "619ac911-7e23-4e7e-9e15-7329291de385"
    monitors:
      - "192.168.1.10:6789"
      - "192.168.1.20:6789"
      - "192.168.1.30:6789"
secret:
  create: true
  name: csi-rbd-secret
  userID: kubernetes
  # Run `ceph auth get client.kubernetes` to look up the user key
  userKey: AQByaidmineVLRAATw9GO+iukAb6leMiJflm9A==
  encryptionPassphrase: kubernetes_pass
storageClass:
  create: true
  name: csi-rbd-sc
  # Run `ceph mon dump`; the fsid in its output is the clusterID
  clusterID: 619ac911-7e23-4e7e-9e15-7329291de385
  pool: "kubernetes"
  imageFeatures: "layering"
  provisionerSecret: csi-rbd-secret
  provisionerSecretNamespace: "ceph-csi-rbd"
  controllerExpandSecret: csi-rbd-secret
  controllerExpandSecretNamespace: "ceph-csi-rbd"
  nodeStageSecret: csi-rbd-secret
  nodeStageSecretNamespace: "ceph-csi-rbd"
  fstype: xfs
  reclaimPolicy: Delete
  allowVolumeExpansion: true
  mountOptions:
    - discard
cephconf: |
  [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
provisioner:
  # Replica count for ceph-csi-rbd-provisioner
  replicaCount: 3
  # Use mirror registries to speed up image pulls
  provisioner:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-provisioner
  attacher:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-attacher
  resizer:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-resizer
  snapshotter:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-snapshotter
nodeplugin:
  registrar:
    image:
      repository: registry.aliyuncs.com/google_containers/csi-node-driver-registrar
  plugin:
    image:
      repository: quay.dockerproxy.com/cephcsi/cephcsi
```
2. Install
```bash
helm install ceph-csi-rbd ceph-csi/ceph-csi-rbd \
--namespace ceph-csi-rbd --create-namespace \
-f ceph-csi-rbd-values.yaml
```
### Uninstall
```bash
helm uninstall ceph-csi-rbd -n ceph-csi-rbd
```
### RBD mount test
#### Deploy the test container
1. Create the PVC
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: csi-rbd-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: csi-rbd-sc
EOF
```
2. Create the Pod
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: csi-rbd-pod
spec:
containers:
- name: nginx
image: nginx:latest
volumeMounts:
- name: pvc
mountPath: /usr/share/nginx/html
volumes:
- name: pvc
persistentVolumeClaim:
claimName: csi-rbd-pvc
readOnly: false
EOF
```
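On the Ceph side, a successfully provisioned PVC should show up as an image in the `kubernetes` pool. A minimal sketch, assuming the `rbd` CLI is available on a Ceph node and that ceph-csi uses its default `csi-vol-` name prefix (the same prefix visible in the CephFS error log earlier); the function name `count_csi_images` is illustrative:

```bash
# Count RBD images in a pool whose names start with the csi-vol- prefix.
# Usage: count_csi_images [pool]   (pool defaults to "kubernetes")
count_csi_images() {
  local pool="${1:-kubernetes}"
  # `rbd ls` prints one image name per line; grep -c counts matching lines
  rbd ls "$pool" | grep -c '^csi-vol-'
}
```

For example, `count_csi_images kubernetes` should print at least `1` once `csi-rbd-pvc` is Bound.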
#### Remove the test container
1. Delete the Pod
```bash
kubectl delete pod csi-rbd-pod
```
2. Delete the PVC
```bash
kubectl delete pvc csi-rbd-pvc
```