Linux: Checking the Infiniband Fabric


  • This page assumes Platform Cluster Manager 3.2 is installed; otherwise, make sure the latest version of OFED is installed, which includes OpenMPI.
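
If you are not sure what is installed, a quick check along these lines should confirm that OFED and OpenMPI are present (a minimal sketch; exact package names vary between distributions and OFED releases):

# Print the installed OFED version (ofed_info ships with OFED)
ofed_info -s
# Confirm the OpenMPI packages are present
rpm -qa | grep -i openmpi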

Check the IB links

Use the ibstatus command to check the current state of the IB link:

[david@compute000 imb]$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0030:48ff:ffff:e57d
	base lid:	 0x0
	sm lid:		 0x0
	state:		 2: INIT
	phys state:	 5: LinkUp
	rate:		 40 Gb/sec (4X QDR)
	link_layer:	 InfiniBand
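
To check the link state on every node rather than logging in one at a time, a small loop over the hosts does the job. This is a sketch assuming passwordless SSH between the compute nodes:

# Report the port state and rate on each node in the fabric
for h in compute000 compute001; do
    echo "== $h =="
    ssh $h "ibstatus | grep -E 'state|rate'"
done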

In this instance we can see that the state: field is only at INIT. This typically means that the IB link is having trouble with the subnet manager.
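
A quick way to confirm whether any subnet manager is reachable from this host is the sminfo utility from the OFED diagnostics (a sketch; if no subnet manager is running on the fabric the query will simply fail):

# Query the subnet manager; an error here usually means no SM is running on the fabric
sminfo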

Warnings from mpirun?

This will result in warnings when running MPI performance tests (check the output from the OpenMPI mpirun for clues):

WARNING: There is at least one OpenFabrics device found but there are
no active ports detected (or Open MPI was unable to use them).  This
is most certainly not what you wanted.  Check your cables, subnet
manager configuration, etc.  The openib BTL will be ignored for this
job.
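
Rather than digging the warning out of a long run, you can restrict OpenMPI to the InfiniBand (openib) BTL so the job fails loudly instead of silently falling back to TCP. This is a sketch assuming the hosts file and IMB binary shown in the next section:

# Only allow the openib BTL (plus self/sm for local traffic); if the IB ports
# are not ACTIVE the job errors out instead of quietly using Ethernet
mpirun --mca btl openib,self,sm -np 2 -hostfile ./hosts /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong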

Check the fabric performance

OpenMPI will fall back to using Ethernet; you can tell by the high latency and low bandwidth:

[david@compute000 imb]$ module load openmpi-x86_64
[david@compute000 imb]$ which mpirun
/usr/lib64/openmpi/bin/mpirun
[david@compute000 imb]$ pwd
/home/david/benchmarks/imb
[david@compute000 imb]$ cat hosts 
compute000
compute001
[david@compute000 imb]$ /usr/lib64/openmpi/bin/mpirun -np 2 -hostfile ./hosts /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 
# lots of warning cut out
#---------------------------------------------------
# Benchmarking PingPong 
# #processes = 2 
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        47.79         0.00    # <-- This is high Ethernet latency; typical 1GbE (eth0) latency can be as low as 25 usec
            1         1000        44.85         0.02
            2         1000        45.24         0.04
            4         1000        45.87         0.08
            8         1000        44.51         0.17
           16         1000        43.21         0.35
           32         1000        43.76         0.70
           64         1000        43.92         1.39
          128         1000        43.48         2.81
          256         1000        48.91         4.99
          512         1000        52.95         9.22
         1024         1000        96.30        10.14
         2048         1000       403.23         4.84
         4096         1000       262.84        14.86
         8192         1000       279.54        27.95
        16384         1000       333.65        46.83
        32768         1000       686.98        45.49
        65536          640      1364.94        45.79
       131072          320      1668.31        74.93
       262144          160      2683.26        93.17
       524288           80      5044.39        99.12
      1048576           40      9498.91       105.28
      2097152           20     18256.90       109.55
      4194304           10     36169.60       110.59   # <-- Typical 1GbE bandwidth
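
To rule out the MPI layer and measure the raw fabric, the perftest tools that ship with OFED can be run directly between two nodes (a sketch; the package and binary names can differ between OFED releases):

# On compute001, start the bandwidth test server
ib_write_bw
# On compute000, connect to it and measure RDMA write bandwidth over the IB link
ib_write_bw compute001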

Make sure the subnet manager is running

Sometimes the subnet manager will be running on the switch; other times it will need to be started manually on one of the hosts on the IB fabric. OFED provides a utility to run a subnet manager on a host (from the opensm package):

/etc/init.d/opensmd restart
# checking the ibstatus output, we have an ACTIVE link!
[david@compute000 imb]$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0030:48ff:ffff:e57d
	base lid:	 0x1
	sm lid:		 0x1
	state:		 4: ACTIVE
	phys state:	 5: LinkUp
	rate:		 40 Gb/sec (4X QDR)
	link_layer:	 InfiniBand
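
If opensm is the long-term fix (for example because the switch is unmanaged), it is worth making sure the daemon comes back after a reboot. On a RHEL-style system of this era that would be something like:

# Start opensmd at boot so the subnet manager survives a reboot of this host
chkconfig opensmd on
chkconfig --list opensmd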

OK, now we are looking much better; test the performance again.
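
Repeating the same IMB run as before should now show much lower latency and much higher bandwidth over the QDR link, as shown in the figures below:

[david@compute000 imb]$ /usr/lib64/openmpi/bin/mpirun -np 2 -hostfile ./hosts /usr/lib64/openmpi/bin/mpitests-IMB-MPI1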

QDR MPI Performance Figures