FhGFS Installation and Configuration

From Define Wiki
Latest revision as of 16:44, 31 July 2013

System Architecture

  • Architecture diagram: http://www.fhgfs.com/wiki/images/sysarch2.png (the local copy, fhgfs_arch.png, is missing)

File Systems Tips

  • Taken from: http://www.fhgfs.com/wiki/wikka.php?wakka=PartitionAlignment

Partition Alignment

  • Avoid partition-alignment problems entirely by creating the filesystem on the whole device rather than on a partition
  mkfs.[ext4|xfs] /dev/sdX

RAID Optimised FS

  • Create a RAID-optimised filesystem
  • Assuming a 10-drive RAID5 array (9 data drives after parity) with a RAID stripe/chunk size of 64k
  • Note: the su=/sw= options shown are mkfs.xfs syntax; mkfs.ext4 expresses the same geometry via -E stride/stripe-width
  $ mkfs.[xfs|ext4] -d su=64k,sw=9 -l version=2,su=64k /dev/sdc1
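As a quick cross-check of the geometry above (a sketch; CHUNK_KB, DATA_DRIVES and BLOCK_KB are the example values for this particular array, adjust them for yours):

```shell
# Example RAID geometry from the 10-drive RAID5 above; adjust for your array.
CHUNK_KB=64        # RAID chunk (stripe unit) per drive, in KiB
DATA_DRIVES=9      # 10 drives minus one parity drive
BLOCK_KB=4         # ext4 filesystem block size, in KiB

# XFS takes the geometry directly: su = chunk size, sw = number of data drives
echo "xfs:  -d su=${CHUNK_KB}k,sw=${DATA_DRIVES}"

# ext4 wants the same numbers expressed in filesystem blocks
STRIDE=$((CHUNK_KB / BLOCK_KB))           # 64 / 4 = 16 blocks per chunk
STRIPE_WIDTH=$((STRIDE * DATA_DRIVES))    # 16 * 9 = 144 blocks per full stripe
echo "ext4: -E stride=${STRIDE},stripe-width=${STRIPE_WIDTH}"
```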

MetaData FS Optimisation

  • Use EAs (Extended Attributes) and ext4 for the metadata FS
  # -i 2048: one inode per 2 KiB of space; -I 512: large inodes so EAs fit inline;
  # -J size=400: 400 MiB journal; -O: enable dir_index/filetype, disable extents
  $ mkfs.ext4 -i 2048 -I 512 -J size=400 -Odir_index,filetype,^extents /dev/sdX
  • Enable the user_xattr mount option (set persistently here as a default mount option via tune2fs)
  $ tune2fs -o user_xattr /dev/sdX
  • Other mount options
  $ mount -onoatime,nodiratime,nobarrier /dev/sdX <mountpoint>
  • Other MetaData Server Suggestions
  # The deadline scheduler typically yields best results for metadata access.
  $ echo deadline > /sys/block/sdX/queue/scheduler

  # To avoid latencies, the number of schedulable requests should not be too high; the Linux default of 128 is a good value.
  $ echo 128 > /sys/block/sdX/queue/nr_requests
  • Hint: When tuning your metadata servers, keep in mind that the goal is usually not throughput but latency, plus some amount of fairness. There are probably interactive users on your cluster who want to see the results of their ls and other commands in an acceptable time. For instance, you probably don't want to set a high value for /sys/block/sdX/iosched/read_expire on the metadata servers, so that users aren't kept waiting too long for their operations to complete.
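The sysfs settings above do not survive a reboot. One common way to reapply them at boot is a loop in /etc/rc.local (a sketch; sdb and sdc are placeholder device names):

```shell
# /etc/rc.local fragment (assumption: sdb/sdc are the devices backing the metadata FS)
for dev in sdb sdc; do
    echo deadline > /sys/block/$dev/queue/scheduler
    echo 128      > /sys/block/$dev/queue/nr_requests
done
```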

Prepare Hosts for Installation

  • Add the FhGFS repo to your environment
  cd /etc/yum.repos.d/
  wget http://www.fhgfs.com/release/latest-stable/dists/fhgfs-rhel6.repo
  • Repository Signature Keys
  rpm --import http://www.fhgfs.com/release/latest-stable/gpg/RPM-GPG-KEY-fhgfs

Host Installation

Setup the Hosts

  • In this configuration we will be running an 8-node setup
    • Node 1: Management
    • Node 2: Metadata
    • Node 3: Storage Server
    • Node 4: Storage Server
    • Nodes 5-8: Clients
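Matching the roles above to the hostnames and addresses that appear in the verification output later on this page (blade3 management, blade4 metadata, blade5/6 storage, blade7 a client), a hypothetical /etc/hosts fragment for such a setup might look like:

```
# /etc/hosts (addresses taken from the fhgfs-net output below; client
# addresses are not shown there and are omitted here)
101.101.101.3  blade3   # management
101.101.101.4  blade4   # metadata
101.101.101.5  blade5   # storage
101.101.101.6  blade6   # storage
```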

Management Server

  • Install the management server
  yum install fhgfs-mgmtd

MetaData Server

  • Install the metadata server package
  yum install fhgfs-meta
  • Enable IB
  $ fhgfs-opentk-lib-update-ib 
  Running Infiniband auto-detection...

  Setting symlink in /opt/fhgfs/lib: libfhgfs-opentk.so -> libfhgfs-opentk-enabledIB.so

Storage Server(s)

  • On each storage server
  yum install fhgfs-storage
  • Enable IB
  $ fhgfs-opentk-lib-update-ib 
  Running Infiniband auto-detection...

  Setting symlink in /opt/fhgfs/lib: libfhgfs-opentk.so -> libfhgfs-opentk-enabledIB.so

Clients

  • On each client
  yum install fhgfs-client fhgfs-helperd fhgfs-utils
  • For each client, a kernel module must be built.
  • This happens automatically when the service starts
  • Ensure 'Development tools' and 'kernel-devel' are installed
  • Verify that a source tree in /usr/src/kernels matches the output of uname -r
  $ pwd
  /usr/src/kernels
  $ uname -r
  2.6.32-358.el6.x86_64
  $ ls
  2.6.32-358.11.1.el6.x86_64
  $ ln -s 2.6.32-358.11.1.el6.x86_64/ `uname -r`
  $ ls
  2.6.32-358.11.1.el6.x86_64  2.6.32-358.el6.x86_64
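The symlink workaround above can be scripted. ensure_kernel_tree below is a hypothetical helper (not part of FhGFS) that links the first installed source tree into place when no tree matches the running kernel; only do this when the installed tree is ABI-compatible with the running kernel, as in the 358 vs 358.11.1 case shown:

```shell
# Hypothetical helper: ensure $1 contains an entry named after kernel $2,
# symlinking the first available tree if not. Crude heuristic; verify ABI
# compatibility of the chosen tree yourself.
ensure_kernel_tree() {
    dir=$1
    running=$2
    if [ ! -e "$dir/$running" ]; then
        target=$(ls "$dir" | head -n 1)
        [ -n "$target" ] && ln -s "$target" "$dir/$running"
    fi
}

# Usage on a real client (commented out here):
# ensure_kernel_tree /usr/src/kernels "$(uname -r)"
```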
  • Then either build the module manually or simply start the service
  $ /etc/init.d/fhgfs-client rebuild
  - FhGFS module autobuild
  Building fhgfs-client-opentk module
  Building fhgfs client module
  $ modinfo fhgfs
  filename:       /lib/modules/2.6.32-358.el6.x86_64/updates/fs/fhgfs_autobuild/fhgfs.ko
  author:         Fraunhofer ITWM, CC-HPC
  description:    FhGFS parallel file system client (http://www.fhgfs.com)
  license:        GPL v2
  srcversion:     D15ED391FD2704BA5D65B86
  depends:        fhgfs-client-opentk
  vermagic:       2.6.32-358.11.1.el6.x86_64 SMP mod_unload modversions
  • Note: The standard module is built for Ethernet only; it has no IB support
  • Rebuild client for IB (OFED)
  # file: /etc/fhgfs/fhgfs-client-autobuild.conf
  # change:
  buildArgs=-j8
  # to: (if using the kernel ofed source)
  buildArgs=-j8 FHGFS_OPENTK_IBVERBS=1 
  # or: (if using own OFED)
  buildArgs=-j8 FHGFS_OPENTK_IBVERBS=1 OFED_INCLUDE_PATH=/usr/src/openib/include 
  # then rebuild
  # for Mellanox OFED: buildArgs=-j8 FHGFS_OPENTK_IBVERBS=1 OFED_INCLUDE_PATH=/usr/src/ofa_kernel/default/include

Host Configuration

Management Server Configuration

  mkdir /data/fhgfs/fhgfs_mgmtd
  # edit: /etc/fhgfs/fhgfs-mgmtd.conf
  storeMgmtdDirectory=/data/fhgfs/fhgfs_mgmtd
  /etc/init.d/fhgfs-mgmtd restart

MetaData Server

  • ext4 is recommended for the metadata FS; see http://www.fhgfs.com/wiki/wikka.php?wakka=MetaServerTuning for details on its advantages over XFS for metadata

  $ mkdir -p /data/fhgfs/fhgfs_meta
  $ vim /etc/fhgfs/fhgfs-meta.conf

  # Find the option "storeMetaDirectory" and set it to "storeMetaDirectory=/data/fhgfs/fhgfs_meta".
  # Find the option "sysMgmtdHost" and set it to "sysMgmtdHost=blade3"

Storage Server

  • On each storage server
  $ mkdir -p /data/fhgfs/fhgfs_storage
  $ vim /etc/fhgfs/fhgfs-storage.conf

  # Find the option "storeStorageDirectory" and set it to "storeStorageDirectory=/data/fhgfs/fhgfs_storage".
  # Find the option "sysMgmtdHost" and set it to "sysMgmtdHost=blade3"
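The metadata and storage sections above stop at editing the config files. By analogy with the management-server step (an assumption — the original does not show this, though the packages ship matching init scripts), the daemons presumably then need to be (re)started:

```shell
# on the metadata server
/etc/init.d/fhgfs-meta restart
# on each storage server
/etc/init.d/fhgfs-storage restart
```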

Client

  • Start up the helperd (for DNS resolution)
  $ /etc/init.d/fhgfs-helperd restart
  • Configure the client
  $ vim /etc/fhgfs/fhgfs-client.conf

  # Find the option "sysMgmtdHost" and set it to "sysMgmtdHost=blade3".
  $ /etc/init.d/fhgfs-client restart

  # To change the mount point, edit /etc/fhgfs/fhgfs-mounts.conf.
  # You will typically have a line like this in it:
  # /mnt/fhgfs /etc/fhgfs/fhgfs-client.conf
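For example, to mount under /data/fhgfs instead, change the first field of the line (the first field is the mount point, the second the client config to use; the alternative path here is just an illustration):

```
# /etc/fhgfs/fhgfs-mounts.conf
/data/fhgfs /etc/fhgfs/fhgfs-client.conf
```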

Verify the Setup

  • On a client with fhgfs-utils installed
[root@Blade7 ~]# fhgfs-ctl --listnodes --nodetype=meta --details
Blade4 [ID: 2987]
   Ports: UDP: 8005; TCP: 8005
   Interfaces: ib0(RDMA) eth2(RDMA) ib0(TCP) eth2(TCP) eth0(TCP) 

Number of nodes: 1
Root: 2987
[root@Blade7 ~]# fhgfs-ctl --listnodes --nodetype=storage --details
Blade6 [ID: 1969]
   Ports: UDP: 8003; TCP: 8003
   Interfaces: eth2(RDMA) ib0(RDMA) eth2(TCP) ib0(TCP) eth0(TCP) 
Blade5 [ID: 53469]
   Ports: UDP: 8003; TCP: 8003
   Interfaces: eth2(RDMA) ib0(RDMA) eth2(TCP) ib0(TCP) eth0(TCP) 

Number of nodes: 2
[root@Blade7 ~]# fhgfs-net

mgmt_nodes
=============
Blade3 [ID: 1]
   Connections: TCP: 1 (101.101.101.3:8008); 

meta_nodes
=============
Blade4 [ID: 2987]
   Connections: RDMA: 1 (101.101.101.4:8005); 

storage_nodes
=============
Blade6 [ID: 1969]
   Connections: RDMA: 1 (101.101.101.6:8003); 
Blade5 [ID: 53469]
   Connections: RDMA: 1 (101.101.101.5:8003);