
Author: 丁辉

Using Helm to Connect to an External Ceph Cluster

GitHub Repository

Node Name      IP
ceph-node-1    192.168.1.10
ceph-node-2    192.168.1.20
ceph-node-3    192.168.1.30

Add the Helm Repository

helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update
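
To confirm the charts are now visible locally, you can search the repo (a quick sanity check; the versions shown will vary):

helm search repo ceph-csi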

Connecting to the CephFS Shared Filesystem

CephFS Prerequisites

See the companion article on creating a Ceph filesystem (Ceph创建文件系统).
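
For reference, the Ceph-side preparation roughly amounts to creating the filesystem and gathering the values used later in the values file. A minimal sketch, assuming the pool and filesystem names used in this article (the linked article remains the authoritative procedure):

ceph osd pool create cephfs_metadata
ceph osd pool create cephfs_data
ceph fs new cephfs cephfs_metadata cephfs_data

# Values referenced in ceph-csi-cephfs-values.yaml below:
ceph mon dump | grep fsid        # clusterID
ceph auth get-key client.admin   # adminKey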

Deployment

References: Official Documentation · Official Parameter Reference

  1. Configure the values.yaml file

    vi ceph-csi-cephfs-values.yaml
    

    Contents:

    csiConfig:
      # Use the `ceph mon dump` command to find the clusterID
      - clusterID: "619ac911-7e23-4e7e-9e15-7329291de385"
        monitors:
          - "192.168.1.10:6789"
          - "192.168.1.20:6789"
          - "192.168.1.30:6789"
    
    secret:
      create: true
      name: csi-cephfs-secret
      adminID: admin
      # Use the `ceph auth get client.admin` command to retrieve the user key
      adminKey: AQByaidmineVLRAATw9GO+iukAb6leMiJflm9A==
    
    storageClass:
      create: true
      name: csi-cephfs-sc
      # Use the `ceph mon dump` command to find the clusterID
      clusterID: 619ac911-7e23-4e7e-9e15-7329291de385
      fsName: cephfs
      pool: "cephfs_data"
      provisionerSecret: csi-cephfs-secret
      provisionerSecretNamespace: "ceph-csi-cephfs"
      controllerExpandSecret: csi-cephfs-secret
      controllerExpandSecretNamespace: "ceph-csi-cephfs"
      nodeStageSecret: csi-cephfs-secret
      nodeStageSecretNamespace: "ceph-csi-cephfs"
      reclaimPolicy: Delete
      allowVolumeExpansion: true
      mountOptions:
        - discard
    
    cephconf: |
      [global]
        auth_cluster_required = cephx
        auth_service_required = cephx
        auth_client_required = cephx
        fuse_set_user_groups = false
        fuse_big_writes = true
    
    provisioner:
      # Number of ceph-csi-cephfs-provisioner replicas
      replicaCount: 3
    
      # Mirror registries to speed up image pulls
      provisioner:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-provisioner
    
        # When extra-create-metadata is set to false, the plugin does not generate
        # extra metadata when creating PVs or PVCs. This reduces the overhead of
        # storage operations and can improve performance when that metadata is not needed.
        #extraArgs:
        #- extra-create-metadata=false
    
      resizer:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-resizer
    
      snapshotter:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-snapshotter
    
    nodeplugin:
      registrar:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-node-driver-registrar
      plugin:
        image:
          repository: quay.dockerproxy.com/cephcsi/cephcsi
    
  2. Install

    helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
      --namespace ceph-csi-cephfs --create-namespace \
      -f ceph-csi-cephfs-values.yaml
    
  3. Create a subvolume group named csi in the cephfs filesystem

    ceph fs subvolumegroup create cephfs csi
    

    Verify:

    ceph fs subvolumegroup ls cephfs
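
Before moving on to the mount test, it may also be worth confirming that the driver pods are Running and the StorageClass exists (names as defined in the values file above):

kubectl -n ceph-csi-cephfs get pods
kubectl get storageclass csi-cephfs-sc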
    

Uninstall

helm uninstall ceph-csi-cephfs -n ceph-csi-cephfs

CephFS Mount Test

Deploy a test container

  1. Create the PVC

    cat <<EOF | kubectl apply -f -  
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: csi-cephfs-pvc
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-cephfs-sc
    EOF
    
  2. Create the Pod

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-cephfs-pod
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - name: pvc
              mountPath: /usr/share/nginx/html
      volumes:
        - name: pvc
          persistentVolumeClaim:
            claimName: csi-cephfs-pvc
            readOnly: false
    EOF
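
Once the Pod is Running, a quick smoke test is to write a file through the mounted volume and read it back (file name and contents here are arbitrary):

kubectl get pvc csi-cephfs-pvc   # STATUS should be Bound
kubectl exec csi-cephfs-pod -- sh -c 'echo cephfs-ok > /usr/share/nginx/html/index.html'
kubectl exec csi-cephfs-pod -- cat /usr/share/nginx/html/index.html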
    

Remove the test container

  1. Delete the Pod

    kubectl delete pod csi-cephfs-pod
    
  2. Delete the PVC

    kubectl delete pvc csi-cephfs-pvc
    

Troubleshooting

After connecting the latest ceph-csi-cephfs chart to an external Ceph cluster, volumes fail to mount with the error below.

Environment

Ceph Deployment Mode    Ceph Version                            Kubernetes Version
Docker                  ceph version 16.2.5 pacific (stable)    v1.23

The error:

Warning  FailedMount  3s  kubelet  MountVolume.MountDevice failed for volume "pvc-342d9156-70f0-42f8-b288-8521035f8fd4" : rpc error: code = Internal desc = an error (exit status 32) occurred while running mount args: [-t ceph 192.168.1.10:6789,192.168.1.20:6789,192.168.1.30:6789:/volumes/csi/csi-vol-d850ba82-4198-4862-b26a-52570bcb1320/1a202392-a8cc-4386-8fc7-a340d9389e66 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-342d9156-70f0-42f8-b288-8521035f8fd4/globalmount -o name=admin,secretfile=/tmp/csi/keys/keyfile-99277731,mds_namespace=cephfs,discard,ms_mode=secure,_netdev] stderr: unable to get monitor info from DNS SRV with service name: ceph-mon
2024-05-02T08:12:18.622+0000 7f62cd3e3140 -1 failed for service _ceph-mon._tcp
mount error 22 = Invalid argument

Solution

Downgrade the ceph-csi-cephfs Helm chart to version 3.8.1 (a conclusion reached after repeated testing):

helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
  --namespace ceph-csi-cephfs --create-namespace \
   -f ceph-csi-cephfs-values.yaml \
   --version 3.8.1
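
To see which chart versions the repo offers before pinning one, helm can list them (output will vary over time):

helm search repo ceph-csi/ceph-csi-cephfs --versions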

Connecting to RBD Block Storage

RBD Prerequisites

See the companion article on creating Ceph RBD block storage (Ceph创建RBD块存储).
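
For reference, the Ceph-side preparation roughly amounts to creating the pool and a dedicated client. A minimal sketch using the pool and user names assumed by the values file below (the linked article remains the authoritative procedure):

ceph osd pool create kubernetes
rbd pool init kubernetes
ceph auth get-or-create client.kubernetes \
  mon 'profile rbd' \
  osd 'profile rbd pool=kubernetes' \
  mgr 'profile rbd pool=kubernetes'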

Deployment

  1. Configure the values.yaml file

    vi ceph-csi-rbd-values.yaml
    

    Contents:

    csiConfig:
      # Use the `ceph mon dump` command to find the clusterID
      - clusterID: "619ac911-7e23-4e7e-9e15-7329291de385"
        monitors:
          - "192.168.1.10:6789"
          - "192.168.1.20:6789"
          - "192.168.1.30:6789"
    
    secret:
      create: true
      name: csi-rbd-secret
      userID: kubernetes
      # Use the `ceph auth get client.kubernetes` command to retrieve the user key
      userKey: AQByaidmineVLRAATw9GO+iukAb6leMiJflm9A==
      encryptionPassphrase: kubernetes_pass
    
    storageClass:
      create: true
      name: csi-rbd-sc
      # Use the `ceph mon dump` command to find the clusterID
      clusterID: 619ac911-7e23-4e7e-9e15-7329291de385
      pool: "kubernetes"
      imageFeatures: "layering"
      provisionerSecret: csi-rbd-secret
      provisionerSecretNamespace: "ceph-csi-rbd"
      controllerExpandSecret: csi-rbd-secret
      controllerExpandSecretNamespace: "ceph-csi-rbd"
      nodeStageSecret: csi-rbd-secret
      nodeStageSecretNamespace: "ceph-csi-rbd"
      fstype: xfs
      reclaimPolicy: Delete
      allowVolumeExpansion: true
      mountOptions:
        - discard
    
    cephconf: |
      [global]
        auth_cluster_required = cephx
        auth_service_required = cephx
        auth_client_required = cephx
    
    provisioner:
      # Number of ceph-csi-rbd-provisioner replicas
      replicaCount: 3
    
      # Mirror registries to speed up image pulls
      provisioner:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-provisioner
      attacher:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-attacher
      resizer:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-resizer
      snapshotter:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-snapshotter
    
    nodeplugin:
      registrar:
        image:
          repository: registry.aliyuncs.com/google_containers/csi-node-driver-registrar
      plugin:
        image:
          repository: quay.dockerproxy.com/cephcsi/cephcsi
    
  2. Install

    helm install ceph-csi-rbd ceph-csi/ceph-csi-rbd \
      --namespace ceph-csi-rbd --create-namespace \
      -f ceph-csi-rbd-values.yaml
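
As with the CephFS chart, a quick check that the driver pods are Running and the StorageClass exists (names as defined in the values file above):

kubectl -n ceph-csi-rbd get pods
kubectl get storageclass csi-rbd-sc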
    

Uninstall

helm uninstall ceph-csi-rbd -n ceph-csi-rbd

RBD Mount Test

Deploy a test container

  1. Create the PVC

    cat <<EOF | kubectl apply -f -  
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: csi-rbd-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
      storageClassName: csi-rbd-sc
    EOF
    
  2. Create the Pod

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: csi-rbd-pod
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          volumeMounts:
            - name: pvc
              mountPath: /usr/share/nginx/html
      volumes:
        - name: pvc
          persistentVolumeClaim:
            claimName: csi-rbd-pvc
            readOnly: false
    EOF
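
Once the Pod is Running, a quick smoke test: confirm the PVC is bound, check the mounted block device, and write through it (file name and contents here are arbitrary):

kubectl get pvc csi-rbd-pvc   # STATUS should be Bound
kubectl exec csi-rbd-pod -- df -hT /usr/share/nginx/html   # expect an xfs-formatted /dev/rbd* device
kubectl exec csi-rbd-pod -- sh -c 'echo rbd-ok > /usr/share/nginx/html/index.html'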
    

Remove the test container

  1. Delete the Pod

    kubectl delete pod csi-rbd-pod
    
  2. Delete the PVC

    kubectl delete pvc csi-rbd-pvc