Linux: Checking the Infiniband Fabric

  • This assumes Platform Cluster Manager 3.2 is installed; otherwise, make sure the latest version of OFED is installed, which will include OpenMPI (a quick check is sketched below).
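
If you are not sure what is installed, a quick check might look like the following (this is only a sketch and assumes a standard OFED install, which provides the ofed_info utility, plus the environment-modules setup used later on this page):

[david@compute000 ~]$ ofed_info -s                       # prints the installed OFED release string
[david@compute000 ~]$ module avail 2>&1 | grep openmpi   # confirm an OpenMPI module is available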

Check the IB links

Use a command called ibstatus to check the current state of the IB link:

[david@compute000 imb]$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
	default gid:	 fe80:0000:0000:0000:0030:48ff:ffff:e57d
	base lid:	 0x0
	sm lid:		 0x0
	state:		 2: INIT
	phys state:	 5: LinkUp
	rate:		 40 Gb/sec (4X QDR)
	link_layer:	 InfiniBand
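
The same information can also be read straight from sysfs, which is handy for scripting checks across many nodes (the path below assumes the mlx4_0 device and port 1 reported above):

[david@compute000 imb]$ cat /sys/class/infiniband/mlx4_0/ports/1/state        # should match the "state:" line above
[david@compute000 imb]$ cat /sys/class/infiniband/mlx4_0/ports/1/phys_state   # should match the "phys state:" line above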

In this instance we can see that the state is only INIT, even though the physical state is LinkUp. This typically means that the IB link is having trouble reaching the subnet manager. This will result in warnings when running MPI performance tests (check the output from the OpenMPI mpirun for clues):

WARNING: There is at least one OpenFabrics device found but there are
no active ports detected (or Open MPI was unable to use them).  This
is most certainly not what you wanted.  Check your cables, subnet
manager configuration, etc.  The openib BTL will be ignored for this
job.
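
A physical state of LinkUp combined with a logical state of INIT almost always means that no subnet manager is reachable on the fabric. The sketch below shows one way to confirm this and to start a software subnet manager; it assumes the infiniband-diags and opensm packages are installed (on a managed IB switch you would enable the embedded subnet manager instead):

[root@compute000 ~]# sminfo                      # queries the fabric for a subnet manager; an error or timeout means none is reachable
[root@headnode ~]# service opensm start          # start opensm on one node, typically the head node
[root@headnode ~]# chkconfig opensm on           # make it persistent across reboots
[david@compute000 imb]$ ibstatus | grep state    # after a few seconds the state should change from INIT to ACTIVE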

OpenMPI will fall back to using Ethernet; you can tell by the high latency and low bandwidth:

[david@compute000 imb]$ module load openmpi-x86_64
[david@compute000 imb]$ which mpirun
/usr/lib64/openmpi/bin/mpirun
[david@compute000 imb]$ pwd
/home/david/benchmarks/imb
[david@compute000 imb]$ cat hosts 
compute000
compute001
[david@compute000 imb]$ /usr/lib64/openmpi/bin/mpirun -np 2 -hostfile ./hosts /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 
# lots of warnings cut out
#---------------------------------------------------
# Benchmarking PingPong 
# #processes = 2 
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        47.79         0.00    # <-- This is high Ethernet latency; typical 1GbE (eth0) latency can be as low as 25 usec
            1         1000        44.85         0.02
            2         1000        45.24         0.04
            4         1000        45.87         0.08
            8         1000        44.51         0.17
           16         1000        43.21         0.35
           32         1000        43.76         0.70
           64         1000        43.92         1.39
          128         1000        43.48         2.81
          256         1000        48.91         4.99
          512         1000        52.95         9.22
         1024         1000        96.30        10.14
         2048         1000       403.23         4.84
         4096         1000       262.84        14.86
         8192         1000       279.54        27.95
        16384         1000       333.65        46.83
        32768         1000       686.98        45.49
        65536          640      1364.94        45.79
       131072          320      1668.31        74.93
       262144          160      2683.26        93.17
       524288           80      5044.39        99.12
      1048576           40      9498.91       105.28
      2097152           20     18256.90       109.55
      4194304           10     36169.60       110.59   # <-- Typical 1GbE bandwidth
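
Once ibstatus reports the port as 4: ACTIVE, re-run the same benchmark and force the InfiniBand transport so that any remaining fallback to Ethernet shows up as an error rather than a silent slowdown. This sketch uses the openib BTL named in the warning above and applies to the OpenMPI 1.x series installed here; exact figures will vary, but a healthy QDR link should give single-digit microsecond latencies and bandwidth in the GBytes/sec range rather than ~110 MBytes/sec:

[david@compute000 imb]$ /usr/lib64/openmpi/bin/mpirun --mca btl openib,self,sm \
    -np 2 -hostfile ./hosts /usr/lib64/openmpi/bin/mpitests-IMB-MPI1 PingPong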