Distributed ceph @ CSE notes
Goal
To run a ceph distributed storage cluster primarily on CSE lab computers using unused local disk storage to create one OSD per host.
In addition to running ceph-osd, the cluster will also run ceph-mon and ceph-mgr on one host per lab.
conform product
The product is Local.plceph.
Gaol environment
The ceph software (currently ceph-12.1.0) is not compatible with the current CSE conform software set, though it is compatible with the kernel. To run it anyway, the necessary binaries and their required libraries are placed in a chroot gaol at /usr/local/ceph/gaol. Scripts to support the gaol are located in /usr/local/ceph/scripts.
As far as possible, only the files necessary to support the ceph daemons are in the gaol. Likewise, as far as possible, binaries are in /bin and libraries are in /lib, both relative to the gaol root.
Three primary binaries are supported:
- ceph-osd
- ceph-mon
- ceph-mgr
The binaries and libraries are copied from a dedicated, i386, cleanly-installed Debian host.
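These notes do not record how the required libraries were identified. One common approach, shown here only as a sketch, is to run ldd on each binary on the build host and collect the resolved paths. The function name ldd_paths is hypothetical and is not one of the scripts in /usr/local/ceph/scripts.

```shell
# Hypothetical helper (not part of /usr/local/ceph/scripts): read ldd output
# on stdin and print the absolute path of every resolved library.
ldd_paths() {
    awk '
        # "libfoo.so.1 => /lib/libfoo.so.1 (0x...)": the path is field 3.
        $2 == "=>" && $3 ~ /^\// { print $3; next }
        # "/lib/ld-linux.so.2 (0x...)": the dynamic loader, path is field 1.
        $1 ~ /^\//               { print $1 }
    '
}
```

On the build host this could be used as ldd /usr/bin/ceph-osd | ldd_paths | sort -u, where the binary's location is an assumption. Virtual entries such as linux-gate.so.1 have no path and are skipped.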
The ceph cluster operates on CSE conformed hosts with, at most, one OSD daemon per host. The OSD daemon has its datastore on an XFS-formatted filesystem on a loop(back) device whose associated file is /export/<hostname>/1/SCRATCH/plinich/plceph_osd.dat.
See also /etc/fstab.
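For orientation, an fstab entry for such a loop mount might look like the line printed by the sketch below. The mount point /usr/local/ceph/gaol/osd is the one used in the OSD creation steps; the mount options loop,noauto and the trailing 0 0 are assumptions, as is the function name osd_fstab_line. The real /etc/fstab on a conformed host is authoritative.

```shell
# Print an fstab line for the OSD loop mount of one host.
# ASSUMPTION: the options "loop,noauto" and the "0 0" fields are illustrative;
# check the real /etc/fstab for the values actually in use.
osd_fstab_line() {
    host=$1
    printf '%s %s xfs loop,noauto 0 0\n' \
        "/export/$host/1/SCRATCH/plinich/plceph_osd.dat" \
        /usr/local/ceph/gaol/osd
}
```

With the loop option, mount attaches a free loop device to the file itself, so no separate losetup call is needed at mount time.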
ceph user
In the gaol environment, programs run as the user games. This user was chosen because it already existed in /etc/passwd and was not used by anything else. See also the product spec file and /usr/local/ceph/scripts/rungaol.
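The contents of rungaol are not reproduced in these notes. As a rough idea of what such a wrapper does, the sketch below uses the --userspec option of GNU chroot to enter the gaol and drop to the games user in one step. The function name run_in_gaol and the CHROOT override are hypothetical; the real script may work differently.

```shell
GAOL=${GAOL:-/usr/local/ceph/gaol}
GAOL_USER=${GAOL_USER:-games}

# Hypothetical stand-in for rungaol: run a command inside the gaol as
# GAOL_USER. Paths in the command are relative to the gaol root.
# Set CHROOT=echo to print the arguments instead of executing (dry run).
run_in_gaol() {
    ${CHROOT:-chroot} --userspec="$GAOL_USER" "$GAOL" "$@"
}
```

For example, run_in_gaol /bin/ceph-mon -i <hostname> would need to be run as root, since chroot itself requires root privileges.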
Create a new MON
The first MON is created using the bootstrap instructions on www.ceph.com.
Subsequent MONs are created as follows:
- On the management host:
  - Use monmaptool to create a new monitor map, with the actual fsid, containing the current MON hosts/IP addresses plus the new host
  - Copy the new monitor map to the new MON host as /usr/local/ceph/gaol/monmap
- On the new MON host:
  - Copy /usr/local/ceph/gaol/var/lib/ceph/mon/ceph-<hostname>/keyring from a running MON host to /usr/local/ceph/gaol/
  - chown games /usr/local/ceph/gaol/monmap
  - /usr/local/ceph/scripts/rungaol /bin/ceph-mon -i <hostname> --mkfs --keyring /keyring --monmap /monmap -f -d
  - mv /usr/local/ceph/gaol/keyring /usr/local/ceph/gaol/var/lib/ceph/mon/ceph-<hostname>/keyring
  - /usr/local/ceph/scripts/rungaol /bin/ceph-mon -i <hostname>
  - rm /usr/local/ceph/gaol/monmap
Notes
The /usr/local/ceph/gaol/var/lib/ceph/mon/ceph-<hostname>/keyring file is the same on all MON hosts.
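The monmaptool step above can be sketched as a small helper that turns a list of name/address pairs into a single monmaptool --create call. The function name make_monmap and the MONMAPTOOL override are hypothetical; --create, --fsid and --add are standard monmaptool options.

```shell
# Hypothetical helper: build a monitor map from name/address pairs.
# Usage: make_monmap <fsid> <output-file> <name> <ip> [<name> <ip> ...]
# Set MONMAPTOOL=echo to print the arguments instead of executing (dry run).
make_monmap() {
    fsid=$1
    out=$2
    shift 2
    # Names and addresses must come in pairs.
    if [ $(( $# % 2 )) -ne 0 ]; then
        echo "make_monmap: name without address" >&2
        return 2
    fi
    # Rewrite "name ip name ip ..." into "--add name ip --add name ip ...".
    n=$#
    while [ "$n" -ge 2 ]; do
        set -- "$@" --add "$1" "$2"
        shift 2
        n=$((n - 2))
    done
    ${MONMAPTOOL:-monmaptool} --create --fsid "$fsid" "$@" "$out"
}
```

The fsid of the running cluster can be read on the management host with ceph -c /etc/plceph/ceph.conf fsid. Note that monmaptool --create refuses to overwrite an existing map file unless --clobber is added.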
Create a new OSD
- Add the host to the plceph_cluster_osd class, add the xfsprogs package and run conform on the host.
- On OSD host:
  - Create a sparse disk file for the filesystem using dd if=/dev/zero of=/export/<hostname>/1/SCRATCH/plinich/plceph_osd.dat count=1 bs=1 seek=<SIZE>G
  - losetup /dev/loop0 /export/<hostname>/1/SCRATCH/plinich/plceph_osd.dat
  - mkfs.xfs /dev/loop0
  - losetup -d /dev/loop0
  - mount /usr/local/ceph/gaol/osd
  - chown games:games /usr/local/ceph/gaol/osd
- On management host:
  - ceph -c /etc/plceph/ceph.conf osd create (outputs the new OSD number)
- On OSD host:
  - /usr/local/ceph/scripts/rungaol /bin/ceph-osd --mkfs --mkkey -i <osdnum> --osd-data /osd --osd-journal /osd/journal
  - Copy /usr/local/ceph/gaol/osd/keyring to the management host
- On management host:
  - ceph -c /etc/plceph/ceph.conf auth add osd.<osdnum> osd 'allow *' mon 'allow profile osd' -i <path-to-keyring-file>
- On OSD host:
  - Start the OSD daemon: /usr/local/ceph/scripts/rungaol /bin/ceph-osd -i <osdnum> --osd-data /osd --osd-journal /osd/journal
See also /etc/plceph/create_new_osd_on_host on management host for script implementation.
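As a small illustration of the sparse-file step above, the sketch below wraps the same dd invocation in a function so that it can be tried safely on a scratch file. The function name make_sparse is hypothetical and is not taken from create_new_osd_on_host.

```shell
# Create a sparse file the same way as the dd step above: write a single
# zero byte at offset <size>, leaving everything before it as a hole.
make_sparse() {
    file=$1
    size=$2    # dd size suffix, powers of 1024, e.g. 16M or 200G
    dd if=/dev/zero of="$file" bs=1 count=1 seek="$size" 2>/dev/null
}
```

Running make_sparse /tmp/demo.dat 16M and then ls -ls /tmp/demo.dat shows a file whose apparent size is 16 MiB plus one byte while, on filesystems that support sparse files, only a few blocks are allocated. The extra byte is a side effect of writing one byte after the seek.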
Create a new MGR
ceph-mgr should run on the same hosts as ceph-mon.
- On MGR host:
  - Create the directory /usr/local/ceph/gaol/var/lib/ceph/mgr/ceph-<hostname>. Make it root-owned, with mode 755 on it and on its parent directories.
- On management host:
  - ceph -c /etc/plceph/ceph.conf auth get-or-create mgr.<hostname> mon 'allow profile mgr' osd 'allow *' mds 'allow *' > /tmp/keyring
  - Copy /tmp/keyring to the MGR host directory created above, then chown it root:games and chmod it 640.
- On MGR host:
  - /usr/local/ceph/scripts/rungaol /bin/ceph-mgr -i <hostname>
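The directory step above is easy to get subtly wrong, because mkdir -p -m 755 applies the mode only to the last directory and leaves the parents to the umask. The sketch below creates each level and sets its mode explicitly. The function name make_mgr_dir is hypothetical; run as root, the directories it creates are root-owned.

```shell
GAOL=${GAOL:-/usr/local/ceph/gaol}

# Hypothetical helper: create the MGR data directory for <hostname> under
# the gaol, forcing mode 755 on it and on each parent below the gaol root.
# Prints the resulting directory path on success.
make_mgr_dir() {
    host=$1
    dir=$GAOL
    for part in var lib ceph mgr "ceph-$host"; do
        dir=$dir/$part
        mkdir -p "$dir" && chmod 755 "$dir" || return 1
    done
    printf '%s\n' "$dir"
}
```

Setting GAOL to a scratch directory first allows a trial run without touching the real gaol.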