Shut down the cluster for maintenance

All clients must stop all I/O against the cluster first (obviously…).

The following flags must be set for the OSD daemons:

  • noout: OSDs will not automatically be marked out after the configured interval
  • nobackfill: Backfilling of PGs is suspended
  • norecover: Recovery of PGs is suspended
  • norebalance: Rebalancing is suspended; an OSD will not backfill a PG unless that PG is also degraded
  • nodown: OSD failure reports are being ignored, such that the monitors will not mark OSDs down
  • pause: Pauses reads and writes
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
ceph osd set norebalance
ceph osd set nodown
ceph osd set pause
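After maintenance, each flag is cleared again with ceph osd unset; a sketch, mirroring the commands above in reverse order so that client I/O (pause) resumes first only once you intend it to:

```shell
# Clear the maintenance flags set before the shutdown
ceph osd unset pause
ceph osd unset nodown
ceph osd unset norebalance
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset noout
```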


Bootstrap Ceph daemons with systemd and containers


ceph config set global public_network <CIDR>
ceph config set global cluster_network <CIDR>
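Both commands take a network in CIDR notation as their value; a sketch with example subnets (substitute the networks actually used by your cluster):

```shell
# Example subnets only; adjust to your environment
ceph config set global public_network 192.168.1.0/24
ceph config set global cluster_network 192.168.2.0/24
```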


The crash module collects information about daemon crashdumps and stores it in the Ceph cluster for later analysis.

root@cephadmin:/etc/ceph# ceph crash ls
INFO:cephadm:Using recent ceph image ceph/ceph:v15
ID                                                                ENTITY                NEW  
2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8  mgr.cephadmin.ciozlx   *   
root@cephadmin:/etc/ceph# ceph crash archive 2020-06-16T20:54:01.009899Z_f4ad9af2-fb8a-4844-b892-c59c53062ff8
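The crash module also lets you inspect a report before archiving it, or acknowledge everything at once:

```shell
# Show the full report for one crash (ID as printed by `ceph crash ls`)
ceph crash info <crash-id>

# Archive all new crash reports in one step instead of one by one
ceph crash archive-all
```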


ceph log last cephadm


ceph orch ps
ceph orch daemon stop mgr.cephhost01.urlllo
ceph orch daemon restart mgr.cephadmin.ciozlx
root@cephadmin:~# ceph orch stop osd
stop osd.0 from host 'cephhost01'
stop osd.3 from host 'cephhost01'
stop osd.4 from host 'cephhost02'
stop osd.1 from host 'cephhost02'
stop osd.2 from host 'cephhost03'
stop osd.5 from host 'cephhost03'
root@cephadmin:~# ceph orch stop mon
stop mon.cephadmin from host 'cephadmin'
stop mon.cephhost01 from host 'cephhost01'
stop mon.cephhost02 from host 'cephhost02'
stop mon.cephhost03 from host 'cephhost03'
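To bring the daemons back up afterwards, the orchestrator's start verb mirrors stop; a sketch, starting the monitors first so the cluster regains quorum before the OSDs return:

```shell
ceph orch start mon
ceph orch start osd
```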


Important considerations

Every node must:

  • run docker for cephadm to work
  • have either DNS resolution or /etc/hosts entries for all other nodes
  • have all other nodes in their /root/.ssh/known_hosts
  • have the cluster SSH key in their /root/.ssh/authorized_keys
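A minimal /etc/hosts sketch for the four nodes used in this document, with placeholder addresses (use your nodes' real IPs):

```
# /etc/hosts — example addresses only
10.0.0.10  cephadmin
10.0.0.11  cephhost01
10.0.0.12  cephhost02
10.0.0.13  cephhost03
```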


Cache tiering


Ceph runs its daemons in docker containers whose names, such as ceph-55f960fa-af0f-11ea-987f-09d125b534ca-osd.0, contain the fsid.

The fsid is a unique identifier for the cluster, and stands for File System ID from the days when the Ceph Storage Cluster was principally for the Ceph Filesystem. Ceph now supports native interfaces, block devices, and object storage gateway interfaces too, so fsid is a bit of a misnomer.
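The fsid of a running cluster can be printed directly, which is handy for matching container names to the cluster:

```shell
ceph fsid
```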


POSIX Filesystem

(docker-container)@container / $ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]



The Debian package that provides mount.ceph is ceph-common.
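With ceph-common installed, the CephFS filesystem listed above can be mounted with the kernel client; a sketch, where the monitor address, user name, and secret file path are placeholders for your cluster:

```shell
apt install ceph-common

# <mon-host> is any monitor's address; name/secretfile must match a
# CephX user that is allowed to access the filesystem
mount -t ceph <mon-host>:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
```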