Introduction
Ceph delivers block, file, and object storage in one unified system. It is highly reliable, easy to manage, and free, and it scales to thousands of clients accessing petabytes to exabytes of data.
A Ceph storage cluster requires at least one Ceph Monitor, one Ceph Manager, and one Ceph OSD.
Monitors
A Ceph Monitor (ceph-mon) maintains maps of the cluster state, including the monitor map, manager map, OSD map, MDS map, and CRUSH map. Monitors are also responsible for managing authentication between daemons and clients. At least three monitors are normally required for redundancy and high availability.
Managers
A Ceph Manager daemon (ceph-mgr) keeps track of runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics, and system load. At least two managers are normally required for high availability.
Ceph OSDs (Object Storage Daemons)
A Ceph OSD (ceph-osd) stores data, handles data replication, recovery, and rebalancing, and provides some monitoring information to Ceph Monitors and Managers by checking other Ceph OSD daemons for a heartbeat. At least three Ceph OSDs are normally required for redundancy and high availability.
MDS
A Ceph Metadata Server (MDS, ceph-mds) stores metadata on behalf of the Ceph File System (Ceph Block Devices and Ceph Object Storage do not use MDS). Ceph Metadata Servers allow POSIX file system users to run basic commands such as ls and find without placing an enormous burden on the Ceph storage cluster.
Ceph is very convenient to use as shared storage in Kubernetes, so the rest of this article walks through how to use Ceph from a Kubernetes cluster.
Deploying the Ceph cluster
Deploying the Rook operator
Rook is an open-source cloud-native storage orchestrator that provides a platform, a framework, and support for storage solutions to integrate natively with cloud-native environments.
Rook uses Kubernetes extension points to integrate deeply into cloud-native environments and provides a seamless experience for scheduling, lifecycle management, resource management, security, monitoring, and more. Earlier Rook releases also supported NFS, Minio Object Store, and CockroachDB, but current releases focus on Ceph.
# Deploy the Rook operator
cd rook/deploy/charts/rook-ceph/
helm install --create-namespace --namespace rook-ceph rook-ceph .
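Before moving on, it is worth checking that the operator pod comes up (the pod name suffix below is illustrative):
# Verify the operator is running; the exact pod name will differ
kubectl -n rook-ceph get pods
# NAME                                  READY   STATUS    RESTARTS   AGE
# rook-ceph-operator-xxxxxxxxxx-xxxxx   1/1     Running   0          1m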
Creating the Rook Ceph cluster
# Create the Rook Ceph cluster
cd rook/deploy/examples
kubectl apply -f cluster.yaml
# Image list
registry.k8s.io/sig-storage/csi-attacher:v4.8.1
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.13.0
registry.k8s.io/sig-storage/csi-provisioner:v5.2.0
registry.k8s.io/sig-storage/csi-resizer:v1.13.2
registry.k8s.io/sig-storage/csi-snapshotter:v8.2.1
quay.io/ceph/ceph:v19.2.2
quay.io/cephcsi/cephcsi:v3.14.0
rook/ceph:v1.17.2
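The cluster.yaml shipped with the examples is long; the sketch below only shows the fields that matter most for a small test cluster and assumes the defaults from the upstream example (adjust dataDirHostPath, the mon count, and the device selection to your environment):
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v19.2.2
  dataDirHostPath: /var/lib/rook      # must persist across reboots
  mon:
    count: 3                          # matches MONCOUNT in the output below
    allowMultiplePerNode: false
  mgr:
    count: 2
  dashboard:
    enabled: true
  storage:
    useAllNodes: true
    useAllDevices: true               # only empty, unpartitioned disks are consumed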
kubectl get cephcluster -A
NAMESPACE NAME DATADIRHOSTPATH MONCOUNT AGE PHASE MESSAGE HEALTH EXTERNAL FSID
rook-ceph rook-ceph /var/lib/rook 3 136m Ready Cluster created successfully HEALTH_WARN 40a66dd6-69ed-4401-b97f-ef1edaae17b2
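Once the cluster reports Ready, the daemons described in the introduction (mon, mgr, osd, and later mds) each run as pods in the rook-ceph namespace:
# The Ceph daemons run as pods; exact names will differ
kubectl -n rook-ceph get pods | grep -E 'mon|mgr|osd'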
Deploying the Rook Ceph toolbox
# cd rook/deploy/examples
kubectl apply -f toolbox.yaml
# Use the toolbox to check the Ceph cluster status; a healthy cluster needs 3 OSDs (prepare three nodes, each with at least one spare disk)
kubectl exec -it `kubectl get pods -n rook-ceph|grep rook-ceph-tools|awk '{print $1}'` -n rook-ceph -- bash
bash-5.1$ ceph -s
cluster:
id: e7abf175-ad9c-420a-9205-1328f8097a75
health: HEALTH_WARN
mon b is low on available space
services:
mon: 3 daemons, quorum a,b,c (age 12h)
mgr: a(active, since 12h), standbys: b
mds: 1/1 daemons up, 1 hot standby
osd: 3 osds: 3 up (since 12h), 3 in (since 13h)
data:
volumes: 1/1 healthy
pools: 4 pools, 81 pgs
objects: 34 objects, 639 KiB
usage: 203 MiB used, 120 GiB / 120 GiB avail
pgs: 81 active+clean
io:
client: 1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
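A few other toolbox commands are useful for the same kind of check:
# Run from the same toolbox pod
ceph osd tree       # OSD layout across hosts
ceph osd status     # per-OSD usage and state
ceph df             # pool-level capacity and usage
ceph health detail  # explains any HEALTH_WARN, e.g. the low-space mon above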
An OSD can only use storage devices that meet all of the following conditions (a quick way to check is shown after this list):
The device must have no partitions.
The device must not contain any LVM state.
The device must not be mounted.
The device must not contain a file system.
The device must not contain a Ceph BlueStore OSD.
The device must be larger than 5 GB.
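A quick way to check a candidate disk against these conditions is lsblk; the device name below (/dev/sdb) is just an example:
# FSTYPE must be empty and the disk must have no partitions or mountpoints
lsblk -f /dev/sdb
# If the disk was used before, wipe leftover signatures first (destructive!)
# wipefs -a /dev/sdb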
Using Ceph
Block storage
Creating the storage pool, StorageClass, and block-storage PVC
# Create the RBD pool and the StorageClass
cd rook/deploy/examples/csi/rbd/
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
name: replicapool
namespace: rook-ceph # namespace:cluster
spec:
failureDomain: host
replicated:
size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com # csi-provisioner-name
parameters:
clusterID: rook-ceph # namespace:cluster
pool: replicapool
imageFormat: "2"
imageFeatures: layering
csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner # credentials of the Ceph admin user used to connect to the cluster
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete
kubectl apply -f storageclass.yaml # create the pool and StorageClass
kubectl get CephBlockPool -A
NAMESPACE NAME PHASE TYPE FAILUREDOMAIN AGE
rook-ceph replicapool Ready Replicated host 6m57s
# Create the PVC
kubectl apply -f pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: rbd-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rook-ceph-block
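After the PVC is applied it should bind within a few seconds, and a matching RBD image appears in the pool (the image name is generated by the CSI driver, so the commands below only sketch the check):
kubectl get pvc rbd-pvc            # STATUS should be Bound, STORAGECLASS rook-ceph-block
# From the toolbox pod: list the RBD images backing the PVCs
rbd ls -p replicapool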
Mounting the block storage in an application
# Mount the PVC in a pod
---
apiVersion: v1
kind: Pod
metadata:
name: csirbd-demo-pod
spec:
containers:
- name: web-server
image: registry-huadong1.crs-internal.ctyun.cn/open-source/nginx:1.25-alpine # nginx
volumeMounts:
- name: mypvc
mountPath: /var/lib/www/html
volumes:
- name: mypvc
persistentVolumeClaim:
claimName: rbd-pvc
readOnly: false
# kubectl exec -it csirbd-demo-pod -- sh
/ # df -h /var/lib/www/html
Filesystem Size Used Available Use% Mounted on
/dev/rbd0 973.4M 24.0K 957.4M 0% /var/lib/www/html
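Because the StorageClass sets allowVolumeExpansion: true, the volume can later be grown simply by raising the PVC request; a minimal sketch:
# Grow the PVC from 1Gi to 2Gi; the filesystem in the pod is resized online
kubectl patch pvc rbd-pvc -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'
kubectl exec -it csirbd-demo-pod -- df -h /var/lib/www/html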
CephFS file system
Creating the StorageClass and the file system
# Create the StorageClass
# csi/cephfs/storageclass.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com # csi-provisioner-name
parameters:
clusterID: rook-ceph # namespace:cluster
fsName: myfs
pool: myfs-replicated
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
# uncomment the following line for debugging
#- debug
# Create the file system
# deploy/examples/filesystem.yaml
---
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
name: myfs
namespace: rook-ceph # namespace:cluster
spec:
# The metadata pool spec. Must use replication.
metadataPool:
replicated:
size: 3
requireSafeReplicaSize: true
parameters:
compression_mode:
none
dataPools:
- name: replicated
failureDomain: host
replicated:
size: 3
requireSafeReplicaSize: true
parameters:
# Inline compression mode for the data pool
# Further reference: https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#inline-compression
compression_mode:
none
# Whether to preserve filesystem after CephFilesystem CRD deletion
preserveFilesystemOnDelete: true
# The metadata service (mds) configuration
metadataServer:
# The number of active MDS instances
activeCount: 1
activeStandby: true
# The affinity rules to apply to the mds deployment
placement:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
topologyKey: kubernetes.io/hostname
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- rook-ceph-mds
topologyKey: topology.kubernetes.io/zone
priorityClassName: system-cluster-critical
livenessProbe:
disabled: false
startupProbe:
disabled: false
---
# create default csi subvolume group
apiVersion: ceph.rook.io/v1
kind: CephFilesystemSubVolumeGroup
metadata:
name: myfs-csi # keep the subvolume group CRD name the same as `filesystem name + csi` for the default CSI subvolume group
namespace: rook-ceph # namespace:cluster
spec:
name: csi
filesystemName: myfs
pinning:
distributed: 1 # distributed=<0, 1> (disabled=0)
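Apply both manifests (paths relative to rook/deploy/examples) and then verify that the filesystem and the default subvolume group become Ready:
kubectl apply -f csi/cephfs/storageclass.yaml
kubectl apply -f filesystem.yaml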
kubectl get CephFileSystem -A
NAMESPACE NAME ACTIVEMDS AGE PHASE
rook-ceph myfs 1 2m10s Ready
kubectl get CephFilesystemSubVolumeGroup -A
NAMESPACE NAME PHASE FILESYSTEM QUOTA AGE
rook-ceph myfs-csi Ready myfs 2m13s
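The same filesystem can be inspected from the toolbox on the Ceph side:
# Run inside the rook-ceph-tools pod
ceph fs status myfs   # active/standby MDS daemons and the metadata/data pools
ceph fs ls            # filesystems and the pools they use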
Creating a PVC and mounting CephFS
# Create the PVC
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: cephfs-pvc
labels:
group: snapshot-test
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: rook-cephfs
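The example keeps ReadWriteOnce, but CephFS volumes can also be shared by several pods at the same time; a sketch of that variant (the PVC name cephfs-pvc-shared is hypothetical):
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc-shared
spec:
  accessModes:
    - ReadWriteMany        # multiple pods, possibly on different nodes, can mount it
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs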
# Mount the PVC in a pod
---
apiVersion: v1
kind: Pod
metadata:
name: csicephfs-demo-pod
spec:
containers:
- name: web-server
image: registry-huadong1.crs-internal.ctyun.cn/open-source/nginx:1.25-alpine # nginx
volumeMounts:
- name: mypvc
mountPath: /var/lib/www/html
volumes:
- name: mypvc
persistentVolumeClaim:
claimName: cephfs-pvc
readOnly: false
# Check the mount
# kubectl exec -it csicephfs-demo-pod -- sh
/ # df -h /var/lib/www/html
Filesystem Size Used Available Use% Mounted on
10.96.36.10:6789,10.96.102.212:6789,10.96.135.99:6789:/volumes/csi/csi-vol-ffa040c1-c81b-44ba-87d0-a7cd8bfc0055/66004596-55c9-4315-81aa-d7c094ce93f1
1.0G 0 1.0G 0% /var/lib/www/html
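The path shown above is a CSI-managed subvolume; it can also be listed from the toolbox (the generated subvolume name will differ in your cluster):
# List the CSI-managed subvolumes of the filesystem
ceph fs subvolume ls myfs --group_name csi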