Lustre SFF

Introduction

Lustre SFF (Small Form Factor) is a compact deployment of Lustre backed by ZFS (Zettabyte File System), intended as an alternative to NFS at comparable capacity and scalability.

http://lustre.ornl.gov/ecosystem-2016/documents/tutorials/Stearman-LLNL-ZFS.pdf

Intel Enterprise Edition for Lustre White Paper

The Intel January 2014 white paper "Architecting a high performance storage system" serves as a good starting point for optimizing Lustre SFF.

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/architecting-lustre-storage-white-paper.pdf

Backend Storage

smartctl

smartctl (part of smartmontools; SMART stands for Self-Monitoring, Analysis and Reporting Technology) is used to uniquely identify devices, run device self-tests, and assess device health.
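A minimal sketch of how those three tasks map onto smartctl options. The device path /dev/sda is a placeholder, and `run` is a hypothetical dry-run helper added here so the commands can be reviewed first; with DRY_RUN=0 they are executed instead of printed.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Identify, test, and health-check one drive.
check_drive() {
    dev="$1"
    run smartctl -i "$dev"            # identity: model, serial number, firmware
    run smartctl -t short "$dev"      # start a short self-test in the background
    run smartctl -l selftest "$dev"   # self-test log (read once the test has finished)
    run smartctl -H "$dev"            # overall health assessment
}

check_drive "${1:-/dev/sda}"
```

The serial number reported by `-i` is what ties a /dev node to a physical drive, which helps when device names change between boots.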

sgpdd-survey

sgpdd-survey (requires sg3_utils and sg3_utils-libs; part of the Lustre iokit, https://downloads.hpdd.intel.com/public/lustre ) is used to analyze backend storage. dd is not suitable because the point of interest is how the storage responds to multiple concurrent IO threads.

rszlo-rszhi
record size in KB
Affects how many blocks can be transferred in each transaction. Simulates Lustre RPC size.
crglo-crghi
number of regions
Simulates multiple Lustre clients per OST. More regions require more seeking and hence lower performance.
thrlo-thrhi
number of threads
Simulates OSS threads.
size
total size in MB
blocksize
512 B
The default size is 8 GB and the default blocksize is 512 B, but a size of 32 GB (or 2x system memory) and a 1 MB blocksize are recommended to simulate a Lustre sequential workload.

Recommended parameters: rszhi=1024, thrhi=16, crghi=16, size=32768 (or twice RAM), dio=1, oflag=direct, iflag=direct, bs=1048576
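sgpdd-survey takes its parameters from environment variables, so the recommended ranges can be passed roughly as sketched below. This assumes the script is on the PATH as `sgpdd-survey` and that the devices are named through `scsidevs`; /dev/sdX is a placeholder and the wrapper function is hypothetical. The dio, oflag, iflag and bs settings are left out because how they are passed is not shown in the text above. The survey writes to the devices it is given, so the sketch only prints the command unless DRY_RUN=0.

```shell
#!/bin/sh
DRY_RUN="${DRY_RUN:-1}"

# Build an sgpdd-survey run with the recommended upper bounds.
# $1: SCSI device to test (placeholder), $2: total size in MB (default 32768).
sgpdd_survey() {
    devs="$1"
    size="${2:-32768}"
    set -- env "size=$size" rszhi=1024 crghi=16 thrhi=16 "scsidevs=$devs" sgpdd-survey
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # print the command for review
    else
        "$@"           # destructive: overwrites data on the device
    fi
}

sgpdd_survey "${1:-/dev/sdX}"
```

To follow the "twice RAM" rule on a machine with more than 16 GB of memory, pass the size explicitly, for example `sgpdd_survey /dev/sdX 65536` for a 32 GB host.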

obdfilter-survey

case
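local-disk, network-echo, network-disk (passed to the script as case=disk, case=network, case=netdisk)
Run the survey against disk-backed local obdfilter instances, against echo instances over the network, or against disk-backed instances over the network.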
thrlo-thrhi
Number of threads.
nobjlo-nobjhi
Number of objects to read/write.
rszlo-rszhi
Record size in KB.
size
Total IO size in MB.
targets
Names of obdfilter instances.

Recommended parameters: rszlo=rszhi=1024, nobjhi=128, thrhi=128
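Like sgpdd-survey, obdfilter-survey reads its parameters from the environment. The sketch below shows one way to pass the recommended values for a local-disk run. It assumes the script is on the PATH as `obdfilter-survey` and that the local-disk case is spelled `case=disk`; the target name lustre-OST0000 is a placeholder and the wrapper function is hypothetical. It prints the command unless DRY_RUN=0.

```shell
#!/bin/sh
DRY_RUN="${DRY_RUN:-1}"

# Build a local-disk obdfilter-survey run with the recommended values.
# $1: name of the obdfilter instance to test (placeholder).
obdfilter_survey() {
    targets="$1"
    set -- env case=disk rszlo=1024 rszhi=1024 nobjhi=128 thrhi=128 \
        "targets=$targets" obdfilter-survey
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # print the command for review
    else
        "$@"
    fi
}

obdfilter_survey "${1:-lustre-OST0000}"
```

Setting rszlo equal to rszhi pins the record size at 1024 KB, so only the object and thread counts vary across the run.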

http://wiki.lustre.org/images/4/40/Wednesday_shpc-2009-benchmarking.pdf

Installing Non-Production Test Lustre (ZFS provided)

yum install epel-release # Provides DKMS.
gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 
# pub  4096R/352C64E5 2013-12-16 Fedora EPEL (7) <epel@fedoraproject.org>
#       Key fingerprint = 91E9 7D7C 4A5E 96F1 7F3E  888F 6A2F AEA2 352C 64E5

https://getfedora.org/keys/


yum install http://download.zfsonlinux.org/epel/zfs-release$(rpm -E %dist).noarch.rpm
gpg --quiet --with-fingerprint /etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
# pub  2048R/F14AB620 2013-03-21 ZFS on Linux <zfs@zfsonlinux.org>
#    Key fingerprint = C93A FFFD 9F3F 7B03 C310  CEB6 A9D5 A1C0 F14A B620
#    sub  2048R/99685629 2013-03-21

https://github.com/zfsonlinux/zfs/wiki/RHEL-%26-CentOS
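The fingerprint printed by gpg is meant to be compared with the one published by the project. A minimal sketch of automating that comparison follows; `fingerprint_ok` and `normalize` are hypothetical helpers, and the parsing assumes gpg prints a "Key fingerprint = ..." line as in the output shown above (newer gpg releases may format it differently).

```shell
#!/bin/sh
# Strip spaces and upper-case the hex digits so formatting differences do not matter.
normalize() {
    tr -d ' ' | tr 'abcdef' 'ABCDEF'
}

# Reads gpg output on stdin; $1 is the expected fingerprint.
# Succeeds only if a fingerprint line is present and matches.
fingerprint_ok() {
    expected="$(printf '%s' "$1" | normalize)"
    actual="$(sed -n 's/.*Key fingerprint = //p' | head -n 1 | normalize)"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

key=/etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
if [ -r "$key" ]; then
    if gpg --quiet --with-fingerprint "$key" |
        fingerprint_ok "C93A FFFD 9F3F 7B03 C310 CEB6 A9D5 A1C0 F14A B620"; then
        echo "fingerprint matches"
    else
        echo "FINGERPRINT MISMATCH: do not install from this repository"
    fi
fi
```

The expected value should be copied from the project's own key page, not from the output of the machine being checked.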

yum install lustre-dkms-* lustre-osd-zfs-mount* # Downloaded from HPDD.

Installing ZFS

http://lustre.ornl.gov/ecosystem-2016/documents/tutorials/Stearman-LLNL-ZFS.pdf

yum localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el7.noarch.rpm
yum install kernel-devel zfs

ZFS Pools

ZFS Best Practices: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

Example pool creation:

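# Striped pool (no redundancy) over every disk whose lsscsi line matches ST1; 4 KiB sectors (ashift=12), 1 MiB records.
zpool create -f -o cachefile=none -o ashift=12 -O recordsize=1M scratchZ $(lsscsi -i | awk '/ST1/ {printf " /dev/disk/by-id/scsi-%s", $7}')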
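Because -f overrides zpool's safety checks, it can help to look at the device list before creating the pool. The sketch below splits the device selection into a hypothetical helper, `pool_devices`, and only prints the resulting command. It assumes, as the pool creation command above does, that the SCSI identifier is the 7th whitespace-separated field of `lsscsi -i`; a model name containing spaces would shift the columns.

```shell
#!/bin/sh
# Reads `lsscsi -i` output on stdin and prints one by-id path per matching disk.
# $1: pattern to match against the whole lsscsi line (e.g. a model prefix).
pool_devices() {
    awk -v pat="$1" '$0 ~ pat && NF >= 7 { print "/dev/disk/by-id/scsi-" $7 }'
}

if command -v lsscsi >/dev/null 2>&1; then
    devs="$(lsscsi -i | pool_devices ST1 | tr '\n' ' ')"
    # Printed for review only; run it by hand once the list looks right.
    echo "zpool create -o cachefile=none -o ashift=12 -O recordsize=1M scratchZ $devs"
fi
```

Using /dev/disk/by-id paths keeps the pool members stable when /dev/sdX names are reassigned after a reboot or a drive swap.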

Lustre

http://doc.lustre.org/lustre_manual.xhtml