Lustre intel: Setup Mellanox 10GB Modules for IEEL 2

From Define Wiki
  • The system was a CentOS 6.5 base image deployed with IML
  • The Intel Agent repo was installed as part of the IML deployment
  • No Mellanox drivers for the Ethernet cards were provided


Copy across the Mellanox Bundle

  scp mlnx-en-2.3-1.0.0.tgz oss7:
  ssh oss7
  tar zxvf mlnx-en-2.3-1.0.0.tgz

Install the required dev tools

  • Make sure you are running the Lustre kernel (this should already be the case once IML has been deployed)
[root@oss7 ~]# uname -a
Linux oss7.cm.cluster 2.6.32-431.20.5.el6_lustre.x86_64 #1 SMP Fri Jul 25 16:51:42 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
  • Then add the dev tools
yum -y install kernel-devel-2.6.32-431.20.5.el6_lustre.x86_64 kernel-headers-2.6.32-431.20.5.el6_lustre.x86_64
yum groupinstall 'Development tools'
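
The kernel-devel and kernel-headers packages have to match the running kernel exactly, otherwise the driver build will target the wrong tree. The sketch below derives both package names from a release string instead of typing them by hand; `kernel_build_pkgs` is a hypothetical helper name used only for illustration, not part of yum or IML.

```shell
#!/bin/sh
# kernel_build_pkgs RELEASE
# Print the kernel-devel and kernel-headers package names for one kernel release.
# (Hypothetical helper, shown for illustration only.)
kernel_build_pkgs() {
    release="$1"
    printf 'kernel-devel-%s kernel-headers-%s\n' "$release" "$release"
}

# Example using the release string reported by uname above
kernel_build_pkgs 2.6.32-431.20.5.el6_lustre.x86_64
```

On the node itself this could be combined with the running kernel, for example `yum -y install $(kernel_build_pkgs "$(uname -r)")`, assuming the Lustre kernel is the one currently booted.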

Install the Mellanox Driver

  • Assuming you're in the directory where the MLNX bundle was untarred
[root@oss7 ~]# cd mlnx-en-2.3-1.0.0
[root@oss7 mlnx-en-2.3-1.0.0]# ./install.sh
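
Once the installer has reloaded the modules, it can be worth confirming that the Ethernet driver is actually present before configuring the interface. A minimal sketch: `module_listed` is a hypothetical helper name (not part of the Mellanox bundle) that only scans `lsmod`-style text on stdin for an exact module name.

```shell
#!/bin/sh
# module_listed NAME
# Read lsmod-style lines on stdin and succeed (exit 0) only if the first
# column of some line is exactly NAME. (Hypothetical helper.)
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

On the node this would be used as `lsmod | module_listed mlx4_en && echo "mlx4_en is loaded"`. Matching on the whole first column avoids a false positive from `mlx4_core` when checking for `mlx4_en`.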

Setup the new interface

[root@oss7 network-scripts]# pwd
/etc/sysconfig/network-scripts
[root@oss7 network-scripts]# 
[root@oss7 network-scripts]# cat ifcfg-eth2
DEVICE=eth2
HWADDR=E4:1D:2D:15:4D:70
TYPE=Ethernet
UUID=470563d6-1376-46ea-b5fe-cb70587ec519
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=static
IPADDR=172.23.19.47
NETMASK=255.255.255.0
[root@oss7 network-scripts]# ifup eth2
Determining if ip address 172.23.19.47 is already in use for device eth2...
[root@oss7 network-scripts]# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr E4:1D:2D:15:4D:70  
          inet addr:172.23.19.47  Bcast:172.23.19.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
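
When several nodes need the same change, the ifcfg file shown above can be generated rather than edited by hand. This is only a sketch: `ifcfg_static` is a hypothetical helper name, it writes to stdout, and it deliberately leaves out the UUID line, which is specific to each host.

```shell
#!/bin/sh
# ifcfg_static DEVICE HWADDR IPADDR NETMASK
# Print a static-address ifcfg file for one interface to stdout.
# (Hypothetical helper; the UUID line from the original file is omitted.)
ifcfg_static() {
    cat <<EOF
DEVICE=$1
HWADDR=$2
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=static
IPADDR=$3
NETMASK=$4
EOF
}

# Values taken from the ifcfg-eth2 file shown above
ifcfg_static eth2 E4:1D:2D:15:4D:70 172.23.19.47 255.255.255.0
```

Redirecting the output to `/etc/sysconfig/network-scripts/ifcfg-eth2` and running `ifup eth2` would then follow the same steps as the transcript.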

Change the LNET setup in the IML interface

Output from Install

[root@oss7 mlnx-en-2.3-1.0.0]# ./install.sh 
Installing mlnx-en for Linux
Starting installation at Mon Jun 29 06:52:27 MDT 2015...
Verifying dependencies
Building mlnx-en binary RPMs
no zlib on the machine, skipping mstflint installation
Installing mlnx-en
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX3
  Part Number:      MCX311A-XCA_Ax
  Description:      ConnectX-3 EN network interface card; 10GigE; single-port SFP+; PCIe3.0 x4 8GT/s; RoHS R6
  PSID:             MT_1170110023
  PCI Device Name:  0000:05:00.0
  Port1 MAC:        e41d2d154d70
  Port2 MAC:        e41d2d154d71
  Versions:         Current        Available     
     FW             2.32.5100      2.32.5100     
     PXE            3.4.0306       3.4.0306      

  Status:           Up to date


Log File: /tmp/install-mlx4_en.log.23535_fw_update.log

   In order for newly installed mlx4 modules to load, 
   previous modules must first be unloaded.
   Do you wish to reload the driver now? (y/n) [y] 
Reloading mlx4 modules
Installation finished successfully.
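
The installer prints the port MAC addresses without separators (Port1 MAC: e41d2d154d70), while the HWADDR line in ifcfg-eth2 uses the colon-separated, upper-case form. A small sketch of the conversion follows; `mac_to_hwaddr` is a hypothetical helper name chosen for this example.

```shell
#!/bin/sh
# mac_to_hwaddr MAC
# Convert a bare 12-hex-digit MAC (as printed by the installer) into the
# colon-separated, upper-case form used for HWADDR. (Hypothetical helper.)
mac_to_hwaddr() {
    printf '%s\n' "$1" | sed 's/../&:/g; s/:$//' | tr 'a-f' 'A-F'
}

# Port1 MAC from the install output above
mac_to_hwaddr e41d2d154d70
```

For the Port1 value this prints E4:1D:2D:15:4D:70, which is the HWADDR used in ifcfg-eth2.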