Ceph: Commands and Cheatsheet
== Working with Ceph pools ==

=== Check the pools ===
<pre>
[root@deploy-ext kolla]# ceph osd pool ls
device_health_metrics
images
volumes
backups
manila_data
manila_metadata
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
default.rgw.buckets.index
default.rgw.buckets.data
default.rgw.buckets.non-ec
</pre>
=== Create a pool ===
<pre>
ceph osd pool create dptest 128 128

[root@deploy-ext kolla]# rbd create --size 20480 --pool dptest vol01
[root@deploy-ext kolla]# rbd info dptest/vol01
rbd image 'vol01':
        size 20 GiB in 5120 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 180b9ee11c2183
        block_name_prefix: rbd_data.180b9ee11c2183
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Thu Sep  2 07:28:16 2021
        access_timestamp: Thu Sep  2 07:28:16 2021
        modify_timestamp: Thu Sep  2 07:28:16 2021
</pre>
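A pool without an application tag will trigger the "1 pool(s) do not have an application enabled" health warning that shows up later on this page. Since dptest holds RBD images, it should be tagged accordingly (assuming the dptest pool from the example above):

<pre>
ceph osd pool application enable dptest rbd
</pre>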
=== Resize a volume (RBD image) ===
<pre>
[root@deploy-ext kolla]# rbd resize dptest/vol01 --size 51200
Resizing image: 100% complete...done.
[root@deploy-ext kolla]# rbd info dptest/vol01
rbd image 'vol01':
        size 50 GiB in 12800 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 180b9ee11c2183
        block_name_prefix: rbd_data.180b9ee11c2183
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Thu Sep  2 07:28:16 2021
        access_timestamp: Thu Sep  2 07:28:16 2021
        modify_timestamp: Thu Sep  2 07:28:16 2021
</pre>
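Note that rbd resize only grows the image; if the image is mapped and carries a mounted filesystem, the filesystem has to be grown separately, e.g. for the XFS mount from the next section (mountpoint assumed from that example):

<pre>
xfs_growfs /root/kolla/ceph-vol
</pre>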
== Working with Volumes ==

=== Map a device / mount a volume ===
<pre>
[root@deploy-ext kolla]# rbd map dptest/vol01
/dev/rbd0
[root@deploy-ext kolla]# mkfs.xfs /dev/rbd/dptest/vol01
[root@deploy-ext kolla]# mount /dev/rbd/dptest/vol01 ./ceph-vol/
[root@deploy-ext kolla]# df -h | grep ceph-vol
/dev/rbd0        50G  390M   50G   1% /root/kolla/ceph-vol
</pre>
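To tear this down again, unmount first and then unmap the RBD device (the reverse of the commands above):

<pre>
umount /root/kolla/ceph-vol
rbd unmap /dev/rbd0
</pre>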
== Ceph maintenance ==

=== Stop rebalancing (useful for rebooting systems etc) ===
<pre>
ceph osd set noout; ceph osd set norebalance
# perform reboot / maintenance
ceph osd unset norebalance; ceph osd unset noout
</pre>
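To check which flags are currently set (they also show up as a health warning in ceph -s):

<pre>
ceph osd dump | grep flags
</pre>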
== Ceph locks ==
<pre>
# here we had an instance lose connectivity to ceph
# in openstack set state to active (node reset-state --active UUID)
# openstack server stop UUID
# then remove the lock as below
# then we could snapshot
ceph osd unset norebalance; ceph osd unset noout

# checking volume info (openstack env timed out, went into error)
root@str-237:~# rbd info volumes/volume-417feeef-d79d-4a31-af13-f1bee971284b
rbd image 'volume-417feeef-d79d-4a31-af13-f1bee971284b':
        size 3.9 TiB in 512000 objects
        order 23 (8 MiB objects)
        snapshot_count: 0
        id: 5ba8f5a87a6a8
        block_name_prefix: rbd_data.5ba8f5a87a6a8
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features:
        flags:
        create_timestamp: Tue Feb 14 19:37:14 2023
        access_timestamp: Thu Feb 23 19:18:17 2023
        modify_timestamp: Thu Feb 23 19:18:21 2023
        parent: volumes/volume-7aedaaa5-f547-401a-b4cb-1d2271eb1c7d@snapshot-509cbdc3-4eeb-4acf-ae0e-0cf64aa9f824
        overlap: 3.9 TiB

root@str-237:~# rbd lock ls volumes/volume-417feeef-d79d-4a31-af13-f1bee971284b
There is 1 exclusive lock on this image.
Locker          ID                     Address
client.6007059  auto 139760250032464  10.16.31.12:0/3416256037

root@str-237:~# rbd lock rm volumes/volume-417feeef-d79d-4a31-af13-f1bee971284b "auto 139760250032464" client.6007059
root@str-237:~# rbd lock ls volumes/volume-417feeef-d79d-4a31-af13-f1bee971284b
</pre>
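Before removing a lock it can be worth checking whether some client still has the image open; rbd status lists the current watchers, and a lock whose owner is still watching will usually just be re-acquired. Using the same volume as above:

<pre>
rbd status volumes/volume-417feeef-d79d-4a31-af13-f1bee971284b
</pre>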
== Ceph drain a host ==
<pre>
cephadm shell
ceph orch host ls
ceph orch host drain node003
ceph orch osd rm status
ceph orch ps node003

# outputs
[ceph: root@node003 /]# ceph orch host ls
HOST     ADDR         LABELS  STATUS
deploy   10.10.15.1   _admin
node001  10.10.15.14  _admin
node002  10.10.15.15  _admin
node003  10.10.15.16  _admin
[ceph: root@node003 /]# ceph orch host drain node003
Scheduled to remove the following daemons from host 'node003'
type                 id
-------------------- ---------------
osd                  12
osd                  13
osd                  14
crash                node003
mon                  node003
node-exporter        node003
osd                  16
[ceph: root@node003 /]# ceph orch osd rm status
OSD_ID  HOST     STATE     PG_COUNT  REPLACE  FORCE  DRAIN_STARTED_AT
12      node003  draining  71        False    False  2023-10-09 15:05:52.022780
13      node003  draining  63        False    False  2023-10-09 15:05:52.832505
14      node003  draining  63        False    False  2023-10-09 15:05:53.833922
16      node003  draining  61        False    False  2023-10-09 15:05:51.014545
[ceph: root@node003 /]# ceph orch ps node003
NAME         HOST     PORTS  STATUS        REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.node003  node003         running (5M)  82s ago    22M  1570M    2048M    16.2.6   02a72919e474  e96bc70dbff9
osd.12       node003         running (5M)  82s ago    22M  20.0G    4096M    16.2.6   02a72919e474  930a4def245a
osd.13       node003         running (5M)  82s ago    22M  17.0G    4096M    16.2.6   02a72919e474  87c6bd55a2ac
osd.14       node003         running (5M)  82s ago    22M  30.7G    4096M    16.2.6   02a72919e474  5c6c858a2bd3
osd.16       node003         running (5M)  82s ago    22M  33.1G    4096M    16.2.6   02a72919e474  76708b8a5a9e

# later, once the OSDs have drained, only the mon is left:
[ceph: root@node001 /]# ceph orch ps node003
NAME         HOST     PORTS  STATUS        REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.node003  node003         running (5M)  5m ago     22M  1642M    2048M    16.2.6   02a72919e474  e96bc70dbff9
[ceph: root@node001 /]# ceph orch rm node003
Failed to remove service. <node003> was not found. Running service names can be found with "ceph orch ls"
? someone already removed?
</pre>
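The failure at the end is because ceph orch rm removes a service, not a host, which is what the error message is hinting at. To take the drained host out of the cluster entirely, the host-level command should be:

<pre>
ceph orch host rm node003
</pre>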
== Re-add a drained node ==
If you have drained a node and want to re-add it, first add the host back to Ceph:
<pre>
ceph orch host add node003 10.10.15.16
</pre>
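The ceph orch host ls output earlier shows the hosts carrying the _admin label; if the re-added node should have it again, add it back (assuming the same labels as before the drain):

<pre>
ceph orch host label add node003 _admin
</pre>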
Check whether the disks are available to Ceph:
<pre>
# ceph orch device ls --wide
Hostname  Path          Type  Transport  RPM      Vendor  Model             Serial               Size  Health   Ident  Fault  Available  Reject Reasons
deploy    /dev/sdc      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45447       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
deploy    /dev/sdd      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43506       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
deploy    /dev/sde      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43517       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
deploy    /dev/sdf      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45440       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node001   /dev/sdc      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43508       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node001   /dev/sdd      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB48184       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node001   /dev/sde      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45448       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node001   /dev/sdf      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB48182       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node002   /dev/nvme0n1  ssd   Unknown    Unknown  N/A     T408-AIC          TA19-09-02-C33-0030  536G  Unknown  N/A    N/A    Yes
node002   /dev/sdc      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43509       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node002   /dev/sdd      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45445       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node002   /dev/sde      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43514       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node002   /dev/sdf      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB48189       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node003   /dev/nvme0n1  ssd   Unknown    Unknown  N/A     T408-AIC          TA19-09-02-C33-0016  536G  Unknown  N/A    N/A    Yes
node003   /dev/sdc      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43507       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node003   /dev/sdd      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB43499       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node003   /dev/sde      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45446       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
node003   /dev/sdf      ssd   Unknown    Unknown  ATA     SAMSUNG MZ7LH960  S45NNC0NB45461       960G  Unknown  N/A    N/A    No         Insufficient space (<10 extents) on vgs, LVM detected, locked
</pre>
Here they are not available, because the old LVM volume groups still exist.
SSH into the node, enter the cephadm shell, and zap the drives. First run ceph-volume lvm list to see which drives held the old OSDs, then zap each one, after making sure you are on the right node:
<pre>
[root@node003 ~]# cephadm shell
Inferring fsid 40d3a83a-4639-11ec-8388-3cecef04bf3c
Using recent ceph image quay.io/ceph/ceph@sha256:5755c3a5c197ef186b8186212e023565f15b799f1ed411207f2c3fcd4a80ab45
[ceph: root@node003 /]# ceph-volume lvm list
====== osd.12 ======

  [block]       /dev/ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c/osd-block-24b9b57d-f351-46dc-949f-8d28282e9724

      block device              /dev/ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c/osd-block-24b9b57d-f351-46dc-949f-8d28282e9724
      block uuid                kYkioH-r7BC-c3b5-dVLT-EKRp-3QNh-m2CDI5
      cephx lockbox secret
      cluster fsid              40d3a83a-4639-11ec-8388-3cecef04bf3c
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  24b9b57d-f351-46dc-949f-8d28282e9724
      osd id                    12
      osdspec affinity          dashboard-admin-1636997944996
      type                      block
      vdo                       0
      devices                   /dev/sdc

====== osd.13 ======

  [block]       /dev/ceph-a30f427b-8300-4feb-b630-eb49e55ec34e/osd-block-d11072db-2d8b-4668-a391-f0f68816f7fd

      block device              /dev/ceph-a30f427b-8300-4feb-b630-eb49e55ec34e/osd-block-d11072db-2d8b-4668-a391-f0f68816f7fd
      block uuid                TScZzF-ej2H-cpWM-dPUH-fioK-XXHu-Zbe942
      cephx lockbox secret
      cluster fsid              40d3a83a-4639-11ec-8388-3cecef04bf3c
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  d11072db-2d8b-4668-a391-f0f68816f7fd
      osd id                    13
      osdspec affinity          dashboard-admin-1636997944996
      type                      block
      vdo                       0
      devices                   /dev/sdd

====== osd.14 ======

  [block]       /dev/ceph-1780c956-58c3-4d3e-9975-e5712e5d50ce/osd-block-7c783e2b-3c7b-4b5c-b57c-c5b2e824959f

      block device              /dev/ceph-1780c956-58c3-4d3e-9975-e5712e5d50ce/osd-block-7c783e2b-3c7b-4b5c-b57c-c5b2e824959f
      block uuid                8PsGoM-AP1r-0FLC-BOUZ-T4Np-1BRe-KEuMwE
      cephx lockbox secret
      cluster fsid              40d3a83a-4639-11ec-8388-3cecef04bf3c
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  7c783e2b-3c7b-4b5c-b57c-c5b2e824959f
      osd id                    14
      osdspec affinity          dashboard-admin-1636997944996
      type                      block
      vdo                       0
      devices                   /dev/sde

====== osd.16 ======

  [block]       /dev/ceph-bfd69196-153f-46e8-9398-9134fa307733/osd-block-46046b45-03d1-46e9-9cc7-a2a442bc6d02

      block device              /dev/ceph-bfd69196-153f-46e8-9398-9134fa307733/osd-block-46046b45-03d1-46e9-9cc7-a2a442bc6d02
      block uuid                ed9iDy-MVGC-o8VM-06v5-P8e2-CxnK-5OxFQx
      cephx lockbox secret
      cluster fsid              40d3a83a-4639-11ec-8388-3cecef04bf3c
      cluster name              ceph
      crush device class        None
      encrypted                 0
      osd fsid                  46046b45-03d1-46e9-9cc7-a2a442bc6d02
      osd id                    16
      osdspec affinity          dashboard-admin-1636997944996
      type                      block
      vdo                       0
      devices                   /dev/sdf
[ceph: root@node003 /]# ceph-volume lvm zap /dev/sdc --destroy
--> Zapping: /dev/sdc
--> Zapping lvm member /dev/sdc. lv_path is /dev/ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c/osd-block-24b9b57d-f351-46dc-949f-8d28282e9724
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c/osd-block-24b9b57d-f351-46dc-949f-8d28282e9724 bs=1M count=10 conv=fsync
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.072238 s, 145 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c
Running command: /usr/sbin/vgremove -v -f ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c
stderr: Removing ceph--cb7a9da8--e813--4783--87cb--5333d5b8d51c-osd--block--24b9b57d--f351--46dc--949f--8d28282e9724 (253:5)
stderr: Archiving volume group "ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c" metadata (seqno 5).
stderr: Releasing logical volume "osd-block-24b9b57d-f351-46dc-949f-8d28282e9724"
stderr: Creating volume group backup "/etc/lvm/backup/ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c" (seqno 6).
stdout: Logical volume "osd-block-24b9b57d-f351-46dc-949f-8d28282e9724" successfully removed
stderr: Removing physical volume "/dev/sdc" from volume group "ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c"
stdout: Volume group "ceph-cb7a9da8-e813-4783-87cb-5333d5b8d51c" successfully removed
Running command: /usr/bin/dd if=/dev/zero of=/dev/sdc bs=1M count=10 conv=fsync
stderr: 10+0 records in
10+0 records out
stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0708403 s, 148 MB/s
--> Zapping successful for: <Raw Device: /dev/sdc>
[ceph: root@node003 /]# ceph -s
  cluster:
    id:     40d3a83a-4639-11ec-8388-3cecef04bf3c
    health: HEALTH_WARN
            1/4 mons down, quorum node001,node002,node003
            1 pool(s) do not have an application enabled

  services:
    mon: 4 daemons, quorum node001,node002,node003 (age 10h), out of quorum: deploy
    mgr: node001.thgtmt(active, since 9h), standbys: node002.gsprmz
    osd: 12 osds: 12 up (since 17h), 12 in (since 5M)

  data:
    pools:   9 pools, 584 pgs
    objects: 401.05k objects, 1.8 TiB
    usage:   5.3 TiB used, 5.2 TiB / 10 TiB avail
    pgs:     584 active+clean

  io:
    client:   16 MiB/s rd, 570 KiB/s wr, 4.01k op/s rd, 86 op/s wr
[ceph: root@node003 /]# ceph-volume lvm zap /dev/sdd --destroy
<snip>
--> Zapping successful for: <Raw Device: /dev/sdd>
[ceph: root@node003 /]# ceph-volume lvm zap /dev/sde --destroy
--> Zapping: /dev/sde
<snip>
--> Zapping successful for: <Raw Device: /dev/sde>
[ceph: root@node003 /]# ceph-volume lvm zap /dev/sdf --destroy
--> Zapping: /dev/sdf
</pre>
Assuming the Ceph orchestrator is still running, it should create and start the OSDs for you:
<pre>
[root@deploy ~]# ceph orch ps
NAME                   HOST     PORTS        STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID
alertmanager.node001   node001  *:9093,9094  running (5M)   7m ago     22M  75.8M    -        0.20.0     0881eb8f169f  016d9674735d
crash.deploy           deploy                running (5M)   9m ago     22M  28.9M    -        16.2.6     02a72919e474  4bc3ee42dd35
crash.node001          node001               running (5M)   7m ago     22M  23.4M    -        16.2.6     02a72919e474  c1d2dde0dde4
crash.node002          node002               running (5M)   4s ago     22M  7105k    -        16.2.6     02a72919e474  0955608ff18d
crash.node003          node003               running (60m)  3m ago     60m  6413k    -        16.2.6     02a72919e474  f7f8bb1676da
grafana.node001        node001  *:3000       running (5M)   7m ago     22M  151M     -        6.7.4      557c83e11646  3ccda42d7d9f
mgr.node001.thgtmt     node001  *:8443,9283  running (5M)   7m ago     22M  643M     -        16.2.6     02a72919e474  d6a1964a7879
mgr.node002.gsprmz     node002  *:8443,9283  running (5M)   4s ago     22M  485M     -        16.2.6     02a72919e474  bdd0924ab48e
mon.deploy             deploy                error          9m ago     18M  -        2048M    <unknown>  <unknown>     <unknown>
mon.node001            node001               running (5M)   7m ago     22M  1635M    2048M    16.2.6     02a72919e474  abe841493d7a
mon.node002            node002               running (5M)   4s ago     22M  1666M    2048M    16.2.6     02a72919e474  cfbacb26d1a7
mon.node003            node003               running (12h)  3m ago     22M  977M     2048M    16.2.6     02a72919e474  edde532df630
node-exporter.deploy   deploy   *:9100       running (5M)   9m ago     22M  66.6M    -        0.18.1     e5a616e4b9cf  fb4d8a3007a7
node-exporter.node001  node001  *:9100       running (5M)   7m ago     22M  67.3M    -        0.18.1     e5a616e4b9cf  100daf55dc10
node-exporter.node002  node002  *:9100       running (5M)   4s ago     22M  67.0M    -        0.18.1     e5a616e4b9cf  bcc94fa5029b
node-exporter.node003  node003  *:9100       running (60m)  3m ago     60m  42.1M    -        0.18.1     e5a616e4b9cf  e28345452f6a
osd.0                  deploy                running (5M)   9m ago     22M  42.8G    4096M    16.2.6     02a72919e474  b3b83e70e39e
osd.1                  deploy                running (5M)   9m ago     22M  44.7G    4096M    16.2.6     02a72919e474  62875261b781
osd.10                 node002               running (5M)   4s ago     22M  16.2G    4096M    16.2.6     02a72919e474  b93e2c9f522a
osd.11                 node002               running (5M)   4s ago     22M  15.9G    4096M    16.2.6     02a72919e474  b0a6708c217e
osd.12                 node003               running (59m)  3m ago     59m  2942M    4096M    16.2.6     02a72919e474  d8a9e6b34e8e
osd.13                 node003               running (59m)  3m ago     59m  3081M    4096M    16.2.6     02a72919e474  3e5129206a99
osd.14                 node003               running (59m)  3m ago     59m  3033M    4096M    16.2.6     02a72919e474  5e7eaece5ac8
osd.15                 node003               running (59m)  3m ago     59m  3259M    4096M    16.2.6     02a72919e474  670189a8ba11
osd.2                  deploy                running (5M)   9m ago     22M  27.9G    4096M    16.2.6     02a72919e474  797fec7a5249
osd.3                  deploy                running (5M)   9m ago     13M  68.6G    4096M    16.2.6     02a72919e474  20bdf4cd7e21
osd.4                  node001               running (5M)   7m ago     22M  13.5G    4096M    16.2.6     02a72919e474  e920a032c446
osd.5                  node001               running (5M)   7m ago     22M  10.4G    4096M    16.2.6     02a72919e474  cfd25cd7297d
osd.6                  node001               running (5M)   7m ago     22M  15.4G    4096M    16.2.6     02a72919e474  6a09fd60fa9c
osd.7                  node001               running (5M)   7m ago     22M  25.4G    4096M    16.2.6     02a72919e474  4e429076d3f5
osd.8                  node002               running (5M)   4s ago     22M  14.7G    4096M    16.2.6     02a72919e474  a97212b40475
osd.9                  node002               running (5M)   4s ago     22M  9731M    4096M    16.2.6     02a72919e474  c655f1983800
prometheus.node001     node001  *:9095       running (59m)  7m ago     22M  291M     -        2.18.1     de242295e225  dfe09178132a
</pre>
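If the orchestrator does not recreate the OSDs on its own (for example because no OSD service spec matches the zapped devices), they can be created by hand; a sketch, using the host and device names from the example above:

<pre>
ceph orch daemon add osd node003:/dev/sdc
</pre>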
== Ceph node appears offline but has functioning components ==

This looks like the following:
<pre>
[root@deploy ~]# ceph orch host ls
HOST     ADDR         LABELS  STATUS
deploy   10.10.15.1   _admin  Offline
node001  10.10.15.14  _admin
node002  10.10.15.15  _admin
node003  10.10.15.16  _admin

[root@deploy ~]# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
 -1         13.97253  root default
-13         13.97253      room testroom
-12         13.97253          rack rack1
-11         13.97253              chassis AMD_twin_1
 -3          3.49316                  host deploy
  0    ssd   0.87329                      osd.0     up   1.00000  1.00000
  1    ssd   0.87329                      osd.1     up   1.00000  1.00000
  2    ssd   0.87329                      osd.2     up   1.00000  1.00000
  3    ssd   0.87329                      osd.3     up   1.00000  1.00000
<snip>
</pre>
I looked in the journal on this node and noticed that there were SSH login failures from the active Ceph mgr node. It turned out the ceph public key had been lost on this node, and adding it back allowed the node to be re-detected.
<pre>
Oct 10 09:55:23 deploy sshd[3610000]: Failed password for root from 10.10.15.14 port 54210 ssh2
Oct 10 09:55:23 deploy sshd[3610000]: Connection closed by authenticating user root 10.10.15.14 port 54210 [preauth]
</pre>
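To put the key back, the public key cephadm uses for SSH can be exported from a working admin node and re-installed on the affected host; a sketch, assuming root SSH access to the host still works by some other route (the deploy host's address is taken from the host list above):

<pre>
# on a working admin node: export the key the orchestrator uses for SSH
ceph cephadm get-pub-key > ~/ceph.pub
# push it back into root's authorized_keys on the affected host
ssh-copy-id -f -i ~/ceph.pub root@10.10.15.1
</pre>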
There was a scary moment when the re-detection briefly interrupted the connection to the mons, but this only affected this node, and the orchestrator immediately started applying the OSD changes from the previous section.