Shut down the cluster for maintenance

All clients must stop all I/O against the cluster first (obviously…).

The following flags must be set for the OSD daemons:

  • noout: OSDs will not automatically be marked out after the configured interval
  • nobackfill: Backfilling of PGs is suspended
  • norecover: Recovery of PGs is suspended
  • norebalance: Rebalancing is suspended; an OSD will not backfill a PG unless that PG is also degraded
  • nodown: OSD failure reports are being ignored, such that the monitors will not mark OSDs down
  • pause: Pauses reads and writes
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
ceph osd set norebalance
ceph osd set nodown
ceph osd set pause
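After maintenance, each flag is cleared again with ceph osd unset; a sketch, mirroring the commands above in reverse order so that client I/O (pause) resumes first only once you intend it to:

```shell
# Clear the maintenance flags set before the shutdown
ceph osd unset pause
ceph osd unset nodown
ceph osd unset norebalance
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout
```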


Bootstrap Ceph daemons with systemd and containers


ceph config set global public_network <CIDR>
ceph config set global cluster_network <CIDR>
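Both commands take a network in CIDR notation as their value; a sketch with example subnets (substitute the networks actually used by your cluster):

```shell
# Example subnets only; adjust to your environment
ceph config set global public_network 192.168.1.0/24
ceph config set global cluster_network 192.168.2.0/24
```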


The crash module collects information about daemon crashdumps and stores it in the Ceph cluster for later analysis.

root@cephadmin:/etc/ceph# ceph crash ls
INFO:cephadm:Using recent ceph image ceph/ceph:v15
ID                                                                ENTITY                NEW  
2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8  mgr.cephadmin.ciozlx   *   
root@cephadmin:/etc/ceph# ceph crash archive 2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8
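The crash module also lets you inspect a report before archiving it, or acknowledge everything at once:

```shell
# Show the full report for one crash (ID as printed by `ceph crash ls`)
ceph crash info <crash-id>

# Archive all new crash reports in one step instead of one by one
ceph crash archive-all
```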


ceph log last cephadm


ceph orch ps
ceph orch daemon stop mgr.cephhost01.urlllo
ceph orch daemon restart mgr.cephadmin.ciozlx
root@cephadmin:~# ceph orch stop osd
stop osd.0 from host 'cephhost01'
stop osd.3 from host 'cephhost01'
stop osd.4 from host 'cephhost02'
stop osd.1 from host 'cephhost02'
stop osd.2 from host 'cephhost03'
stop osd.5 from host 'cephhost03'
root@cephadmin:~# ceph orch stop mon
stop mon.cephadmin from host 'cephadmin'
stop mon.cephhost01 from host 'cephhost01'
stop mon.cephhost02 from host 'cephhost02'
stop mon.cephhost03 from host 'cephhost03'
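To bring the daemons back up afterwards, the orchestrator's start verb mirrors stop; a sketch, starting the monitors first so the cluster regains quorum before the OSDs return:

```shell
ceph orch start mon
ceph orch start osd
```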


Important considerations

Every node must:

  • run docker for cephadm to work
  • have either DNS resolution or /etc/hosts entries for all other nodes
  • have all other nodes in their /root/.ssh/known_hosts
  • have the cluster SSH key in their /root/.ssh/authorized_keys
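A minimal /etc/hosts sketch for the four nodes used in this document, with placeholder addresses (use your nodes' real IPs):

```
# /etc/hosts — example addresses only
10.0.0.10  cephadmin
10.0.0.11  cephhost01
10.0.0.12  cephhost02
10.0.0.13  cephhost03
```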


Cache tiering


Ceph runs its daemons in docker containers whose names, such as ceph-55f960fa-af0f-11ea-987f-09d125b534ca-osd.0, contain the fsid.

The fsid is a unique identifier for the cluster, and stands for File System ID from the days when the Ceph Storage Cluster was principally for the Ceph Filesystem. Ceph now supports native interfaces, block devices, and object storage gateway interfaces too, so fsid is a bit of a misnomer.
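The fsid of a running cluster can be printed directly, which is handy for matching container names to the cluster:

```shell
ceph fsid
```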


POSIX Filesystem

(docker-container)@container / $ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]



The Debian package that provides mount.ceph is ceph-common.
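With ceph-common installed, the CephFS filesystem listed above can be mounted with the kernel client; a sketch, where the monitor address, user name, and secret file path are placeholders for your cluster:

```shell
apt install ceph-common

# <mon-host> is any monitor's address; name/secretfile must match a
# CephX user that is allowed to access the filesystem
mount -t ceph <mon-host>:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
```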