Lustre Intel: Install IEEL 1.0.0

From Define Wiki
Revision as of 14:46, 29 January 2014

Download ieel-1.0.0.tar.gz from the Intel Software Centre (the product needs to be registered per user; the packages are also on headnode:/home/david/software/ieel)

Install the IML Master

  [root@st15-iml1 ~]# tar zxvf ieel-latest.tar.gz 
  ieel-1.0.2/install
  ieel-1.0.2/lesskey.out
  ieel-1.0.2/EULA.txt
  ieel-1.0.2/base_managed.profile
  ieel-1.0.2/base_monitored.profile
  ieel-1.0.2/lustre-client-2.4.0-bundle.tar.gz
  ieel-1.0.2/iml-manager-2.0.2.0.tar.gz
  ieel-1.0.2/e2fsprogs-1.42.3.wc3-bundle.tar.gz 
  ieel-1.0.2/iml-agent-2.0.2.0-bundle.tar.gz
  ieel-1.0.2/lustre-2.3.11-bundle.tar.gz
  ieel-1.0.2/hadoop/
  ieel-1.0.2/hadoop/hadoop-lustre-plugin-2.0.4-Intel.tar.gz
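The tarball name (ieel-latest.tar.gz) does not match the directory it unpacks to (ieel-1.0.2 above), so it can help to read the directory name from the archive rather than typing it. A minimal sketch, assuming the archive has a single top-level directory; the function name tarball_top_dir is made up for this example:

```shell
#!/bin/bash
# Print the top-level directory contained in a gzipped tarball.
# Assumption: every entry shares one leading directory, as in the listing above.
tarball_top_dir() {
    local tarball="$1"
    # List entries, keep the first path component, de-duplicate, take the first.
    tar tzf "$tarball" | cut -d/ -f1 | sort -u | head -n 1
}

# Usage (file name taken from the example above):
#   cd "$(tarball_top_dir ieel-latest.tar.gz)"
```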

Run the installation script

  [root@st15-iml1 ieel-1.0.2]# ./install

Check the installation log while you install (once the installer kicks off, it will create the directory /var/log/chroma)

[root@st15-iml1 chroma]# tail -f /var/log/chroma/install.log 
[29/Jan/2014:07:39:44] DEBUG 0.000101:  policycoreutils         x86_64  2.0.83-19.39.el6         base            648 k
[29/Jan/2014:07:39:44] DEBUG 0.000102:  sg3_utils-libs          x86_64  1.28-5.el6               base             51 k
[29/Jan/2014:07:39:44] DEBUG 0.000082: 
[29/Jan/2014:07:39:44] DEBUG 0.000129: Transaction Summary
[29/Jan/2014:07:39:44] DEBUG 0.000103: ================================================================================
[29/Jan/2014:07:39:44] DEBUG 0.000089: Install     102 Package(s)
[29/Jan/2014:07:39:44] DEBUG 0.000089: Upgrade       4 Package(s)
[29/Jan/2014:07:39:44] DEBUG 0.000082:
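Because /var/log/chroma only exists once the installer has started, running tail too early fails. A small sketch that waits for the log file before following it; wait_for_file is a hypothetical helper, and the 120 second timeout in the usage line is an arbitrary choice:

```shell
#!/bin/bash
# Wait up to <timeout> seconds (default 60) for a path to appear.
# Returns 0 once the path exists, 1 if the timeout is reached first.
wait_for_file() {
    local path="$1" timeout="${2:-60}" waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (second terminal, while ./install is running):
#   wait_for_file /var/log/chroma/install.log 120 && tail -f /var/log/chroma/install.log
```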

This is what you'll see during the Installation (provide a user and email)

Starting Intel(R) Manager for Lustre* software installation
Testing YUM
Loaded plugins: fastestmirror
Unpacking installation package
Installing Intel(R) Manager for Lustre*
|
Starting setup...

Setting up PostgreSQL service...
Creating database owner 'chroma'...

Creating database 'chroma'...

Creating database tables...
Loaded 11 default power device types.
Creating groups...
An administrative user account will now be created using the credentials which you provide.
Username: admin
Email: david.power@boston.co.uk
Password: 
Confirm password: 
User 'admin' successfully created.
Building static directory...
NTP Server [localhost]: 
Writing ntp configuration: localhost 
Opening firewall for ntp
Restarting ntp
Starting RabbitMQ...
Restarting rabbitmq-server: RabbitMQ is not running
SUCCESS
rabbitmq-server.
Creating RabbitMQ user...
Creating RabbitMQ vhost...
Enabling daemons
Starting daemons
Checking service configuration...

Setup complete.
Registering profiles
chroma-manager 2.0.2.0-3424 is currently installed

Intel(R) Manager for Lustre* software installation completed successfully

Intel IML is installed! Let's go to the IML web interface

  https://206.221.159.56:8015/ui/
  # by default, the port will be 8080
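To check from the command line that the web interface answers, something like the following could be used. This is a sketch only: iml_ui_url and iml_ui_status are hypothetical helper names, and -k is passed to curl on the assumption that the manager presents a self-signed certificate.

```shell
#!/bin/bash
# Build the UI URL for a given host and port (both must be supplied).
iml_ui_url() {
    local host="$1" port="$2"
    printf 'https://%s:%s/ui/\n' "$host" "$port"
}

# Print the HTTP status code returned by the UI.
# -k skips certificate verification (assumed self-signed certificate).
iml_ui_status() {
    curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "$(iml_ui_url "$1" "$2")"
}

# Usage (address and port from the example above):
#   iml_ui_status 206.221.159.56 8015
```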

Verify the Installation files/directories:

  • /usr/bin/chroma - Command Line Interface
  • /var/log/chroma - IML log files
  • /var/lib/chroma/repo - IML repository
  • /usr/lib/python2.6/site-packages
  • /usr/share/chroma-manager - IML Files

Check the chroma processes that are running

  ps -ef | grep chroma
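Note that the plain grep above also matches its own process line. A small sketch that filters that out and counts what is left; count_chroma_procs is a made-up name and it reads ps -ef style text on stdin:

```shell
#!/bin/bash
# Count lines that mention chroma in `ps -ef`-style text read from stdin,
# ignoring any line that belongs to a grep process.
count_chroma_procs() {
    grep -v 'grep' | grep -c 'chroma'
}

# Usage:
#   ps -ef | count_chroma_procs
```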

Prepare the Rest of the Cluster

Make sure all the below is followed:

  1. Install a vanilla CentOS 6.4 on all the OSS and MDS nodes
  2. Ensure ssh keys are set up between all the hosts (from the IML head to all nodes)
  3. Ensure all the hosts files are consistent (FQDN resolution)
  4. Create an LNET configuration (see below)
  5. Ensure High Availability cables are installed
  6. NTP set up and running on the IML head (Note: someone reported problems with NTP when running on a VM, so don't do that)
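Items 2 and 3 of the list above can be spot-checked from the IML head before adding any servers. A rough sketch, not a complete pre-flight check; hosts_file_has and ssh_key_ok are hypothetical helpers, and the host names in the usage lines are the ones used on this page:

```shell
#!/bin/bash
# Return 0 if <name> appears as a hostname or alias in a hosts file
# (default /etc/hosts), 1 otherwise. Comments are ignored.
hosts_file_has() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }              # skip full-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # stop at a trailing comment
                if ($i == n) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Return 0 if non-interactive (key-based) ssh to <host> works.
ssh_key_ok() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Usage:
#   for h in st15-mds1 st15-oss1; do
#       hosts_file_has "$h" || echo "$h missing from /etc/hosts"
#       ssh_key_ok "$h"     || echo "ssh keys not working for $h"
#   done
```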

Create the /etc/modprobe.d/lustre.conf

# For Ethernet
[root@st15-mds1 ~]# cat /etc/modprobe.d/lustre.conf 
options lnet networks=tcp1(eth1)

# For IB
[root@st15-mds1 ~]# cat /etc/modprobe.d/lustre.conf 
options lnet networks=o2ib(ib0)
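If the same file has to be written on several nodes, the line can be generated rather than typed. A minimal sketch; lnet_options_line is a made-up helper that just formats the two examples above:

```shell
#!/bin/bash
# Print the modprobe options line for one LNET network and interface,
# e.g. "tcp1 eth1" or "o2ib ib0".
lnet_options_line() {
    local net="$1" iface="$2"
    printf 'options lnet networks=%s(%s)\n' "$net" "$iface"
}

# Usage (as root on each MDS/OSS node):
#   lnet_options_line o2ib ib0 > /etc/modprobe.d/lustre.conf
```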

Make sure YOU DO NOT do the following:

  1. Do NOT use or enable the EPEL repos
  2. Do NOT install CMAN (Cluster Manager)
  3. Do NOT configure crossover cable interfaces
  4. Do NOT configure Lustre, Corosync or Pacemaker
  5. Do NOT configure NTP time synchronisation
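The first two points can be checked on each node before it is added. A sketch under assumptions: repolist_has_epel is a hypothetical helper that reads the output of yum repolist enabled on stdin and looks for a repo id starting with "epel"; the rpm -q line assumes the package is simply named cman.

```shell
#!/bin/bash
# Return 0 if `yum repolist enabled` output (on stdin) lists a repo whose
# id starts with "epel", 1 otherwise. yum may prefix ids with "!" or "*".
repolist_has_epel() {
    grep -qiE '^[!*]?epel'
}

# Usage (on each OSS/MDS node):
#   yum repolist enabled | repolist_has_epel && echo "EPEL is enabled, disable it"
#   rpm -q cman >/dev/null && echo "cman is installed, remove it"
```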

Install the MDS / OSS nodes

  • Go to the IML interface (assuming you have completed all the dos and don'ts above)
  • Log in with the details provided during the installation (so admin/admin as above)
  • Once logged in, click on the Configuration tab, then Add Server

http://wiki.bostonlabs.co.uk/w/images/4/44/Ieel_installation_add_server.png

Alternatively, you can do this using the CLI:

# This is how it starts off, providing a list of jobs to complete
[root@st15-iml1 ~]# chroma --username admin --password admin server-add st15-oss1 --server_profile base_managed
Setting up host st15-oss1, waiting on jobs: [13]   

# Then once finished
[root@st15-iml1 ~]# chroma --username admin --password admin server-add st15-oss1 --server_profile base_managed
Setting up host st15-oss1: Finished
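With more than one server, the same command can be wrapped in a loop. A sketch only: add_servers is a made-up wrapper, it reuses exactly the chroma options shown above (including the admin/admin credentials from this install), and it defaults to a dry run that just prints the commands. The second host name in the usage line is hypothetical.

```shell
#!/bin/bash
# Add each host with the given server profile.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
add_servers() {
    local profile="$1"
    shift
    local host
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "chroma --username admin --password admin server-add $host --server_profile $profile"
        else
            chroma --username admin --password admin server-add "$host" --server_profile "$profile"
        fi
    done
}

# Usage (st15-oss2 is a hypothetical second OSS):
#   add_servers base_managed st15-oss1 st15-oss2
#   DRY_RUN=0 add_servers base_managed st15-oss1 st15-oss2
```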

Getting IB to work

Intel EEL only provides the kernel package and not the kernel-devel package. For IB to work we need to rebuild MLNX_OFED.
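Before fetching anything, it may be worth confirming whether a kernel-devel package matching the running kernel is already installed. A minimal sketch, assuming the usual naming where the package is kernel-devel followed by the output of uname -r; kernel_devel_pkg is a hypothetical helper:

```shell
#!/bin/bash
# Print the kernel-devel package name that matches a kernel release string
# (defaults to the running kernel as reported by `uname -r`).
kernel_devel_pkg() {
    printf 'kernel-devel-%s\n' "${1:-$(uname -r)}"
}

# Usage:
#   rpm -q "$(kernel_devel_pkg)" || echo "matching kernel-devel is not installed"
```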

  • Get the kernel-devel RPM
  wget http://downloads.whamcloud.com/public/lustre/lustre-2.1.6/el6/server/RPMS/x86_64/kernel-devel-2.6.32-358.11.1.el6_lustre.x86_64.rpm
  rpm -ivh kernel-devel-2.6.32-358.11.1.el6_lustre.x86_64.rpm  --force
  • Rebuild Mellanox OFED
  • Assuming we're using MLNX_OFED_LINUX-2.0-2.0.5-rhel6.4-x86_64.iso

<syntaxhighlight>

 mkdir MLNX
 mount -o loop MLNX_OFED_LINUX-2.0-2.0.5-rhel6.4-x86_64.iso MLNX
 cd MLNX
 ./mlnx_add_kernel_support.sh -m .
 ## Output
 [root@blade1 MLNX]# ./mlnx_add_kernel_support.sh -m .
 Note: This program will create MLNX_OFED_LINUX TGZ for rhel6.4 under /tmp directory.
     All Mellanox, OEM, OFED, or Distribution IB packages will be removed.
 Do you want to continue?[y/N]:y
 See log file /tmp/mlnx_ofed_iso.8662.log
 Building OFED RPMs. Please wait...
 Removing OFED RPMs...
 C