Benchmarking: MPI Message Rates: SQMR

Download

wget https://asc.llnl.gov/sequoia/benchmarks/phloem_v1.0.tgz

Build

  • Extract the tarball with tar and run make (make sure mpicc is set up in your PATH); for example, see the commands below.
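A minimal sequence, assuming the tarball from the Download step and the Platform MPI 8.1 install used on this cluster:

tar xzf phloem_v1.0.tgz
cd phloem-1.0.0
export PATH=/opt/platform_mpi/v8.1/bin:$PATH
make

The actual build session on this cluster: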
[viglen@fp2-hn phloem-1.0.0]$ which mpicc 
/opt/platform_mpi/v8.1/bin/mpicc
[viglen@fp2-hn phloem-1.0.0]$ make 
================================================================================
Building benchmark presta-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/presta-1.0.0'
mpicc -DPRINT_ENV -O2 -g   -c com.c
mpicc -DPRINT_ENV -O2 -g   -c util.c
mpicc -DPRINT_ENV -O2 -g   -o com com.o util.o  
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/presta-1.0.0'
================================================================================
Building benchmark sqmr-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/sqmr-1.0.0'
mpicc -DPRINT_ENV -O2 -g  -c sqmr.c
mpicc -o sqmr sqmr.o  ../presta*/util.o
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/sqmr-1.0.0'
================================================================================
Building benchmark mpiBench-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/mpiBench-1.0.0'
mpicc -DPRINT_ENV -O2 -g  -DVERS=\"1.0.0\" -o mpiBench_Allreduce mpiBench_Allreduce.c ../presta*/util.o
mpicc -DPRINT_ENV -O2 -g  -DVERS=\"1.0.0\" -o mpiBench_Barrier mpiBench_Barrier.c ../presta*/util.o
mpicc -DPRINT_ENV -O2 -g  -DVERS=\"1.0.0\" -o mpiBench_Bcast mpiBench_Bcast.c ../presta*/util.o
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/mpiBench-1.0.0'
================================================================================
Building benchmark linktest-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/linktest-1.0.0'
mkdir -p linux
mpicc -c bmtime.c -o linux/bmtime.o -DPRINT_ENV -O2 -g 
mpicc -c nodelib-linux.c -o linux/nodelib.o -DPRINT_ENV -O2 -g 
mpicc -c linktest.c -o linux/linktest.o -DPRINT_ENV -O2 -g 
mpicc -o linux/linktest linux/bmtime.o linux/nodelib.o linux/linktest.o  ../presta*/util.o
Compiled linktest with ARCH=linux using nodelib-linux.c
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/linktest-1.0.0'
================================================================================
Building benchmark torustest-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/torustest-1.0.0'
mkdir -p linux
mpicc -c bmtime.c -o linux/bmtime.o -DPRINT_ENV -O2 -g 
mpicc -c torustest.c -o linux/torustest.o -DPRINT_ENV -O2 -g 
mpicc -o linux/torustest linux/bmtime.o linux/torustest.o  -lm ../presta*/util.o
mpicc -o linux/generate -DPRINT_ENV -O2 -g  generate.c
Compiled torustest with ARCH=linux using nodelib-linux.c
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/torustest-1.0.0'
================================================================================
Building benchmark mpiGraph-1.0.0
================================================================================
make[1]: Entering directory `/home/viglen/scratch/phloem-1.0.0/mpiGraph-1.0.0'
rm -rf mpiGraph.o mpiGraph mpiGraph.out mpiGraph.tgz
mpicc -DPRINT_ENV -O2 -g  -o mpiGraph mpiGraph.c ../presta*/util.o
make[1]: Leaving directory `/home/viglen/scratch/phloem-1.0.0/mpiGraph-1.0.0'
[viglen@fp2-hn phloem-1.0.0]$ ls
linktest-1.0.0  Makefile.inc    mpiGraph-1.0.0                          presta-1.0.0  README.sow  sqmr-1.0.0
Makefile        mpiBench-1.0.0  PhloemMPIBenchmarks_summary_v1.0.0.pdf  README        run_script  torustest-1.0.0
  • The build process creates a folder called sqmr-1.0.0 (SQMR: the Sequoia message rate benchmark).


Run

  • Run over QLogic InfiniBand (pass -PSM so the PSM interface is used):
[viglen@fp2-1 sqmr-1.0.0]$ mpirun -np 2 -PSM -hostlist "fp2-1 fp2-2" ./sqmr
################################################################################
# SQMR v1.0.0 - MPI maximal message rate benchmark
# Run at 06/27/11 05:17:43, with rank 0 on fp2-1
#
# MPI tasks per node                 : 1
# Neighbor tasks                     : 1
# Iterations per message size        : 4096
# Send/Recv operations per iteration : 1
#
#                            average               max                  min
# msgsize iters time     msgs/sec MiB/sec     msgs/sec MiB/sec     msgs/sec MiB/sec
       0  4096  0.01    870461.79    0.00    870461.79    0.00    870461.79    0.00
       1  3277  0.01    890722.20    0.85    890722.20    0.85    890722.20    0.85
       2  2622  0.01    890627.23    1.70    890627.23    1.70    890627.23    1.70
       4  2098  0.00    895046.51    3.41    895046.51    3.41    895046.51    3.41
       8  1679  0.00    884425.30    6.75    884425.30    6.75    884425.30    6.75
      16  1344  0.00    740900.91   11.31    740900.91   11.31    740900.91   11.31
      32  1076  0.00    742831.22   22.67    742831.22   22.67    742831.22   22.67
      64   861  0.00    736172.82   44.93    736172.82   44.93    736172.82   44.93
     128   689  0.00    715050.22   87.29    715050.22   87.29    715050.22   87.29
     256   552  0.00    669923.56  163.56    669923.56  163.56    669923.56  163.56
     512   442  0.00    576724.96  281.60    576724.96  281.60    576724.96  281.60
    1024   354  0.00    477038.91  465.86    477038.91  465.86    477038.91  465.86
    2048   284  0.00    374467.88  731.38    374467.88  731.38    374467.88  731.38
    4096   228  0.00    238241.48  930.63    238241.48  930.63    238241.48  930.63
    8192   183  0.00    108473.38  847.45    108473.38  847.45    108473.38  847.45
   16384   147  0.00     71270.68 1113.60     71270.68 1113.60     71270.68 1113.60
   32768   118  0.01     45986.33 1437.07     45986.33 1437.07     45986.33 1437.07
   65536    95  0.01     30769.03 1923.06     30769.03 1923.06     30769.03 1923.06
  131072    76  0.01     18377.51 2297.19     18377.51 2297.19     18377.51 2297.19
  262144    61  0.01     11487.63 2871.91     11487.63 2871.91     11487.63 2871.91
  524288    49  0.01      6572.36 3286.18      6572.36 3286.18      6572.36 3286.18
 1048576    40  0.02      3426.86 3426.86      3426.86 3426.86      3426.86 3426.86
 2097152    32  0.04      1817.88 3635.76      1817.88 3635.76      1817.88 3635.76
 4194304    26  0.06       943.38 3773.52       943.38 3773.52       943.38 3773.52
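The rates above come from timing windows of paired non-blocking send/receive operations between neighbor ranks on the two nodes and dividing messages by elapsed time. A minimal MPI C sketch of that pattern (an illustration only, not the benchmark's actual source; it fixes a single message size where SQMR sweeps from 0 bytes to 4 MiB):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* SQMR-style message-rate measurement sketch: two ranks on two nodes
 * exchange paired non-blocking send/recv operations; the achieved rate
 * is reported in msgs/sec and MiB/sec.  Illustration only -- the real
 * benchmark sweeps message sizes, supports several tasks per node and
 * multiple neighbor tasks, and reports average/max/min across tasks. */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const int    iters   = 4096;  /* iterations per message size (as above) */
    const size_t msgsize = 1024;  /* one fixed size; SQMR sweeps 0..4 MiB   */
    char *sbuf = malloc(msgsize), *rbuf = malloc(msgsize);
    int peer = 1 - rank;
    MPI_Request req[2];

    MPI_Barrier(MPI_COMM_WORLD);          /* start both ranks together */
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        /* one Send/Recv operation pair per iteration, as in the header */
        MPI_Irecv(rbuf, msgsize, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(sbuf, msgsize, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    }
    double elapsed = MPI_Wtime() - t0;

    if (rank == 0) {
        /* rate per direction; SQMR's exact message accounting may differ */
        double rate = (double)iters / elapsed;
        printf("%8zu bytes: %12.2f msgs/sec %10.2f MiB/sec\n",
               msgsize, rate, rate * msgsize / (1024.0 * 1024.0));
    }

    free(sbuf); free(rbuf);
    MPI_Finalize();
    return 0;
}

Built with mpicc and launched with two ranks, one per node, this mirrors the "MPI tasks per node : 1" configuration reported in the headers above.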
  • Run over Ethernet (pass -TCP to force the TCP/IP transport):
[viglen@fp2-1 sqmr-1.0.0]$ mpirun -np 2 -TCP -hostlist "fp2-1 fp2-2" ./sqmr
################################################################################
# SQMR v1.0.0 - MPI maximal message rate benchmark
# Run at 06/27/11 05:19:31, with rank 0 on fp2-1
#
# MPI tasks per node                 : 1
# Neighbor tasks                     : 1
# Iterations per message size        : 4096
# Send/Recv operations per iteration : 1
#
#                            average               max                  min
# msgsize iters time     msgs/sec MiB/sec     msgs/sec MiB/sec     msgs/sec MiB/sec
       0  4096  0.10     83743.78    0.00     83743.78    0.00     83743.78    0.00
       1  3277  0.08     83218.96    0.08     83218.96    0.08     83218.96    0.08
       2  2622  0.06     83505.81    0.16     83505.81    0.16     83505.81    0.16
       4  2098  0.05     82844.40    0.32     82844.40    0.32     82844.40    0.32
       8  1679  0.04     80727.19    0.62     80727.19    0.62     80727.19    0.62
      16  1344  0.03     80052.04    1.22     80052.04    1.22     80052.04    1.22
      32  1076  0.03     78692.11    2.40     78692.11    2.40     78692.11    2.40
      64   861  0.02     75815.79    4.63     75815.79    4.63     75815.79    4.63
     128   689  0.02     60134.95    7.34     60134.95    7.34     60134.95    7.34
     256   552  0.02     58286.49   14.23     58286.49   14.23     58286.49   14.23
     512   442  0.02     36594.24   17.87     36594.24   17.87     36594.24   17.87
    1024   354  0.03     21331.11   20.83     21331.11   20.83     21331.11   20.83
    2048   284  0.05     11034.27   21.55     11034.27   21.55     11034.27   21.55
    4096   228  0.05      8357.93   32.65      8357.93   32.65      8357.93   32.65
    8192   183  0.05      8074.67   63.08      8074.67   63.08      8074.67   63.08
   16384   147  0.04      6941.83  108.47      6941.83  108.47      6941.83  108.47
   32768   118  0.08      3031.12   94.72      3031.12   94.72      3031.12   94.72
   65536    95  0.09      2035.05  127.19      2035.05  127.19      2035.05  127.19
  131072    76  0.12      1248.99  156.12      1248.99  156.12      1248.99  156.12
  262144    61  0.17       725.46  181.37       725.46  181.37       725.46  181.37
  524288    49  0.25       398.37  199.18       398.37  199.18       398.37  199.18
 1048576    40  0.38       208.50  208.50       208.50  208.50       208.50  208.50
 2097152    32  0.59       107.92  215.83       107.92  215.83       107.92  215.83
 4194304    26  0.95        55.01  220.05        55.01  220.05        55.01  220.05
  • As the results above show, QDR InfiniBand delivers roughly a 10x higher small-message rate than 1 Gb Ethernet (about 870,000 vs 84,000 msgs/sec at zero bytes), and an even larger bandwidth advantage at large messages (3773.52 vs 220.05 MiB/sec at 4 MiB).
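The MiB/sec column follows directly from the message rate: MiB/sec = msgs/sec × msgsize / 2^20. For example, the last QDR row above works out as 943.38 msgs/sec × 4194304 bytes / 2^20 = 3773.52 MiB/sec.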