CEPH

Actions

Shut down the cluster for maintenance

All clients must stop I/O to the cluster first (obviously…).

The following flags must be set for the OSD daemons:

  • noout: OSDs will not automatically be marked out after the configured interval
  • nobackfill: Backfilling of PGs is suspended
  • norecover: Recovery of PGs is suspended
  • norebalance: Rebalancing of PGs is suspended; OSDs will not backfill a PG unless it is also degraded
  • nodown: OSD failure reports are ignored, so the monitors will not mark OSDs down
  • pause: Pauses reads and writes
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
ceph osd set norebalance
ceph osd set nodown
ceph osd set pause
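
Once maintenance is done and all daemons are back up, the same flags are typically cleared again with ceph osd unset:

ceph osd unset pause
ceph osd unset nodown
ceph osd unset norebalance
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout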

Cephadm

cephadm bootstraps and manages Ceph daemons with systemd and containers.

config

ceph config set global public_network 192.168.122.0/24
ceph config set global cluster_network 10.4.0.0/24
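
To double-check what ended up in the monitors' configuration database (the grep is only for readability; ceph config get reads a single option):

ceph config dump | grep network
ceph config get mon public_network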

crash

The crash module collects information about daemon crashdumps and stores it in the Ceph cluster for later analysis.

root@cephadmin:/etc/ceph# ceph crash ls
INFO:cephadm:Using recent ceph image ceph/ceph:v15
ID                                                                ENTITY                NEW  
2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8  mgr.cephadmin.ciozlx   *   
root@cephadmin:/etc/ceph# ceph crash archive 2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8
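
A few more crash subcommands that tend to be useful (<crash-id> is an ID from ceph crash ls):

ceph crash info <crash-id>    # show the full crash report
ceph crash archive-all        # archive all new crash reports
ceph crash prune 30           # discard crash reports older than 30 days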

log

ceph log last cephadm
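
The cephadm channel can also be followed live; --watch-debug additionally shows debug-level messages:

ceph -W cephadm
ceph -W cephadm --watch-debug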

orchestrator

ceph orch ps
ceph orch daemon stop mgr.cephhost01.urlllo
ceph orch daemon restart mgr.cephadmin.ciozlx
root@cephadmin:~# ceph orch stop osd
stop osd.0 from host 'cephhost01'
stop osd.3 from host 'cephhost01'
stop osd.4 from host 'cephhost02'
stop osd.1 from host 'cephhost02'
stop osd.2 from host 'cephhost03'
stop osd.5 from host 'cephhost03'
root@cephadmin:~# ceph orch stop mon
stop mon.cephadmin from host 'cephadmin'
stop mon.cephhost01 from host 'cephhost01'
stop mon.cephhost02 from host 'cephhost02'
stop mon.cephhost03 from host 'cephhost03'
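
ceph orch start is the counterpart to stop; ceph orch ls and ceph orch host ls give an overview of the deployed services and the hosts managed by the orchestrator:

ceph orch start osd
ceph orch start mon
ceph orch ls
ceph orch host ls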

https://docs.ceph.com/docs/master/mgr/orchestrator/

Important considerations

Every node must:

  • run Docker (cephadm also supports Podman) for cephadm to work
  • have either DNS resolution or /etc/hosts entries for all other nodes
  • have all other nodes in their /root/.ssh/known_hosts
  • have the cluster's public SSH key in their /root/.ssh/authorized_keys (see the example below)
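
After bootstrap the cluster's public key typically lives in /etc/ceph/ceph.pub on the admin node; distributing it to another node looks roughly like this (cephhost01 as an example):

ssh-copy-id -f -i /etc/ceph/ceph.pub root@cephhost01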

Concepts

Cache tiering

Docker

With cephadm, Ceph runs its daemons in Docker containers; the container names, e.g. ceph-55f960fa-af0f-11ea-987f-09d125b534ca-osd.0, contain the cluster's fsid.

The fsid is a unique identifier for the cluster, and stands for File System ID from the days when the Ceph Storage Cluster was principally for the Ceph Filesystem. Ceph now supports native interfaces, block devices, and object storage gateway interfaces too, so fsid is a bit of a misnomer.
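
The fsid of the running cluster is printed by ceph fsid, which makes it easy to match containers to the cluster (the docker filter is just a sketch):

ceph fsid
docker ps --filter "name=ceph-$(ceph fsid)"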

CEPHFS

A POSIX-compliant filesystem on top of the Ceph Storage Cluster.

(docker-container)@container / $ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
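
ceph fs status shows more detail per filesystem, e.g. MDS ranks and pool usage (cephfs is the name from ceph fs ls above):

ceph fs status cephfs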

Tools

mount.ceph

The Debian package that provides mount.ceph is ceph-common.
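
A minimal kernel mount, assuming a monitor reachable at 192.168.122.10 and the client secret stored in /etc/ceph/admin.secret (both are placeholders for this setup):

apt install ceph-common
mkdir -p /mnt/cephfs
mount -t ceph 192.168.122.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret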