Lustre: General steps for debugging Lustre (IEEL) problems
In this situation we had the following Lustre setup:
- 2x MDS nodes in an HA configuration
- 4x OSS nodes not in an HA configuration (direct-attached storage)
Verify Network Connectivity
- Can the systems ping one another? (a quick check for this is sketched below)
- Can the client ping all of the Lustre nodes?
- Check LNET (the Lustre networking layer) on all nodes as well, to confirm that Lustre-level networking is working.
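A quick reachability sweep from any node might look like the following (the IPs are just this example's MDS nodes; substitute your own MDS, OSS and client addresses):
# basic ICMP check before looking at LNET itself
for ip in 10.10.17.193 10.10.17.194; do ping -c1 -W2 $ip >/dev/null && echo "$ip reachable" || echo "$ip NOT reachable"; done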
# check the IPs are reported correctly on each node
[root@lustre01-mds1 ~]# lctl list_nids
10.10.17.193@tcp
[root@lustre02-mds1 ~]# lctl list_nids
10.10.17.194@tcp
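# If list_nids prints nothing, LNET is probably not loaded or configured on that node.
# A minimal check/bring-up sketch, assuming a single TCP LNET network is declared in the module
# options (e.g. a line like: options lnet networks="tcp0(eth0)" - the interface is an assumption):
cat /etc/modprobe.d/lustre.conf
modprobe lnet
lctl network up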
# Can we ping through LNET/lctl
[root@lustre02-mds1 ~]# lctl ping 10.10.17.194
12345-0@lo
12345-10.10.17.194@tcp
[root@lustre02-mds1 ~]# lctl ping 10.10.17.195
failed to ping 10.10.17.195@tcp: Input/output error
# note .195 doesn't exist on the fabric, so the above is just to demonstrate the output to expect
Check the disks / arrays are reported and mounted
- Verify the RAID arrays are being reported correctly and are healthy (using the LSI StorCLI utility)
- Depending on where StorCLI was installed and whether it's set up in your $PATH, the commands below may need to be updated (a shortcut for the long path is sketched after this list).
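To save typing the full install path each time, one option (assuming the install location used in this article) is to stash it in a shell variable:
# convenience variable for the StorCLI binary; the path is this article's install location
STORCLI="/usr/local/MegaRAID Storage Manager/StorCLI/storcli64"
"$STORCLI" /c0 show all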
# check everything the controller reports. (LOT of output)
/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 show all
# check the drives and their status
/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /eall /sall show
# Note
# their state should be ONLINE
#
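# a quick filter for anything that is not healthy (the state abbreviations below are typical
# StorCLI output such as UBad/UGood/Offln/Rbld; adjust if your version reports them differently)
/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /eall /sall show | grep -Ei 'ubad|ugood|offln|rbld'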
# check if there are any rebuilds in place
/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /eall /sall show rebuild
- Verify that the MDT / MGT and OST are all mounted on the systems
[root@lustre01-mds1 ~]# df -h | grep lustre
/dev/sda 9.5G 24M 9.0G 1% /lustre/mgt
/dev/sdc 1.3T 92M 1.2T 1% /lustre/lfs2-mdt
/dev/sdb 1.3T 92M 1.2T 1% /lustre/lfs1-mdt
[root@lustre02-oss1 ~]# df -h | grep lustre
/dev/sdb 59T 27G 56T 1% /lustre/lfs2-ost00
[root@lustre02-oss2 ~]# df -h | grep lustre
/dev/sdb 59T 31G 56T 1% /lustre/lfs2-ost01
Example process for replacing drives
- In this scenario we ended up with 4x UBad drives and 1x UGood drive, which was a replacement drive that had been inserted. (If a disk is improperly removed and then re-attached to the RAID controller, it will be recognised as UBad (Unconfigured Bad). This does not necessarily mean the drive itself is bad; it means its configuration state is bad, or possibly both. Re-attaching a disk that is new or was previously working should have no negative effect, but before using it you need to change its state to good.)
# get the IDs of the UBad drives. IDs are reported as enclosure:slot, e.g. 4:16 is enclosure 4, slot 16.
[root@lustre02-oss2 ~]# /usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /eall /sall show | grep UBad
4:16 21 UBad - 3.637 TB SATA HDD N N 512B HGST HUS724040ALA640 U
4:17 22 UBad - 3.637 TB SATA HDD N N 512B HGST HUS724040ALA640 U
4:18 23 UBad - 3.637 TB SATA HDD N N 512B HGST HUS724040ALA640 U
4:19 24 UBad - 3.637 TB SATA HDD N N 512B HGST HUS724040ALA640 U
# The enclosure above is 4 and slots 16-19. So we set the disks to GOOD using /e4 and /s16 etc
[root@lustre02-oss2 ~]# /usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /e4 /s16 set good
Controller = 0
Status = Success
Description = Set Drive Good Succeeded.
# repeat for other disks
storcli64 /c0 /e4 /s17 set good
storcli64 /c0 /e4 /s18 set good
storcli64 /c0 /e4 /s19 set good
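The same thing can be scripted rather than typed slot by slot; a small sketch using the enclosure and slot numbers from this example:
# set every affected slot back to good in one pass (enclosure 4, slots 16-19 as in this example)
for s in 16 17 18 19; do
    /usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 /e4 /s$s set good
done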