Mellanox:training

From Define Wiki
Revision as of 17:13, 7 November 2023 by Antony (talk | contribs) (save mellanox fw from one card and upload to another)

Check the card is detected

[root@nodeA ~]# lspci | grep -i Mellanox
06:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
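
When checking many nodes, the same check can be scripted. This is a minimal sketch, assuming lspci output in the format shown above. The function name mellanox_pci_addrs is made up for this example and is not part of any Mellanox tool.

```shell
# Print the PCI address of every Mellanox device listed in lspci output.
# Reads lspci output on stdin (hypothetical helper for illustration).
mellanox_pci_addrs() {
    awk 'tolower($0) ~ /mellanox/ { print $1 }'
}
```

Piping the output above through it (lspci | mellanox_pci_addrs) would print 06:00.0. No output means no card was detected.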


Mellanox OFED

Don't change the kernel: MLNX_OFED is built for the default version of the kernel.

If the kernel changes, MLNX_OFED must be rebuilt for the running kernel.


Installation

There are a number of options available. To see them all run:

./mlnxofedinstall --l

Install the prerequisites, then run the installer:

yum install tcl tk libnl-devel gcc-gfortran
./mlnxofedinstall

To install against a non-running kernel (add --without-fw-update if working in a chroot in OpenHPC):

./mlnxofedinstall -k 3.10.0-514.6.1.el7.x86_64

It will try to update the firmware at the end of the install:

Device #1:
----------

  Device:        0000:06:00.0
  Part Number:
  Description:
  PSID:          MT_1060110019

  Versions:      Current        Available
     FW          2.10.0000      N/A

  Status:        No matching image found

Restart the driver

Either reboot the node or run:

/etc/init.d/openibd restart


Check the state

[root@nodeB MLNX_OFED_LINUX-2.1-1.0.6-rhel6.5-x86_64]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
        default gid:     fe80:0000:0000:0000:0030:48ff:ffff:e535
        base lid:        0x0
        sm lid:          0x0
        state:           1: DOWN
        phys state:      4: PortConfigurationTraining
        rate:            10 Gb/sec (4X)
        link_layer:      InfiniBand
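
To check the port state on several nodes without reading the full output, the ibstatus text can be reduced to one line per port. This is a rough sketch that assumes the ibstatus layout shown above. The function name ib_port_states is hypothetical.

```shell
# Summarise ibstatus output as: <device> <port> <state> <phys state>
# Reads ibstatus output on stdin (hypothetical helper for illustration).
ib_port_states() {
    awk -v q="'" '
        /^Infiniband device/ {
            dev = $3
            gsub(q, "", dev)          # strip the single quotes around the name
            port = $5
        }
        /^[ \t]*state:/      { state = $NF }            # e.g. "1: DOWN" -> DOWN
        /^[ \t]*phys state:/ { print dev, port, state, $NF }
    '
}
```

Running ibstatus | ib_port_states on the node above would print "mlx4_0 1 DOWN PortConfigurationTraining".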

Start the Subnet manager

The subnet manager must be running somewhere: on the switch, on a node, or as a service.

Start it on the switch:

IB SM Management
Base SM
SM enable
apply

The state of the connection will become active and a LID will be assigned.

CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 1
        Firmware version: 2.10.0
        Hardware version: 0
        Node GUID: 0x003048ffffffe534
        System image GUID: 0x003048ffffffe537
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 56
                Base lid: 2
                LMC: 0
                SM lid: 1
                Capability mask: 0x02514868
                Port GUID: 0x003048ffffffe535
                Link layer: InfiniBand
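
The output above can also be checked by script: once the subnet manager is up, the port should be Active with a non-zero base LID and SM LID. This is a minimal sketch, assuming a single-port card and the output format shown above. The name ib_lid_check is made up for this example.

```shell
# Succeeds (exit 0) when the port is Active and has both a LID and an SM LID.
# Reads single-port output in the format above on stdin (hypothetical helper).
ib_lid_check() {
    awk -F': *' '
        /^[ \t]*State:/    { state = $2 }
        /^[ \t]*Base lid:/ { lid = $2 }
        /^[ \t]*SM lid:/   { smlid = $2 }
        END {
            if (state == "Active" && lid + 0 > 0 && smlid + 0 > 0) {
                print "ok lid=" lid " sm_lid=" smlid
            } else {
                print "not ready state=" state " lid=" lid " sm_lid=" smlid
                exit 1
            }
        }
    '
}
```

With the output above on stdin it prints "ok lid=2 sm_lid=1". A port without a running subnet manager would fail the check.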

Each link (port) has its own GUID. These are basically the equivalent of a MAC address and should be unique to every device unless someone has been messing around.

Communication is based on the LID.


Subnet manager

Only one subnet manager needs to be running. Any extra instances act as backups and take over if the running one fails. If there are multiple backups, there is an election to decide which one takes over.

The subnet manager assigns the LIDs and builds the routing table. This can take a while depending on how complicated the topology is.

If the SM is running on the switch it can be managed under the IB SM MGMT tab.
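
To see which subnet manager currently holds the master role from a node, the sminfo tool from infiniband-diags can be queried. The sketch below assumes sminfo prints a single line of the form "sminfo: sm lid 1 sm guid 0x..., activity count N priority 0 state 3 SMINFO_MASTER". Check that against the output on your own system, since the format is not shown on this page. The name sm_summary is hypothetical.

```shell
# Reduce an sminfo line to: lid=<n> priority=<n> state=<name>
# Reads sminfo output on stdin; the input format is an assumption (see above).
sm_summary() {
    awk '/^sminfo:/ {
        lid = ""; prio = ""
        for (i = 1; i < NF; i++) {
            if ($i == "lid" && lid == "") lid = $(i + 1)   # first "lid" is the SM LID
            if ($i == "priority")         prio = $(i + 1)
        }
        print "lid=" lid, "priority=" prio, "state=" $NF
    }'
}
```

Usage would be sminfo | sm_summary. A backup instance would be expected to report a standby state rather than master.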


Testing

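Again there are numerous options, but they must be the same on both sides.

Run the client against any IP address of the server node; the server side is started with no arguments, as in the output below:

ib_read_bw <any ip on system>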
[root@nodeB MLNX_OFED_LINUX-2.1-1.0.6-rhel6.5-x86_64]# ib_read_bw

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Read BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x02 QPN 0x0058 PSN 0xfd98dc OUT 0x10 RKey 0x001900 VAddr 0x007f247ebd0000
 remote address: LID 0x03 QPN 0x0058 PSN 0xcdbe1f OUT 0x10 RKey 0x001900 VAddr 0x007f6012610000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
 65536      1000           6041.81            6037.05              0.096593
---------------------------------------------------------------------------------------
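
To compare the result with the link rate, the MB/sec figure can be converted to Gb/sec. The sketch below assumes the tool counts 1 MB as 2^20 bytes. That assumption fits the table above: 6037.05 MB/sec divided by 65536-byte messages gives the reported 0.096593 Mpps only with 2^20-byte megabytes. Both function names are made up for this example.

```shell
# Convert a perftest bandwidth in MB/sec (assumed 2^20 bytes) to Gb/sec.
mb_to_gbit() {
    LC_ALL=C awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb * 1048576 * 8 / 1000000000 }'
}

# Message rate in Mpps for a bandwidth in MB/sec and a message size in bytes.
msg_rate_mpps() {
    LC_ALL=C awk -v mb="$1" -v size="$2" 'BEGIN { printf "%.6f\n", mb * 1048576 / size / 1000000 }'
}
```

For the run above, mb_to_gbit 6037.05 prints 50.64, so roughly 50.6 Gb/sec on a port reporting Rate: 56, and msg_rate_mpps 6037.05 65536 prints 0.096593.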


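ib_read_bw -a -b runs the test over all message sizes (-a) in both directions (-b). If it prints warnings (for example about CPU frequency scaling), stop the cpuspeed service and set the CPU to maximum performance in the BIOS. The result should be around 14.


ib_read_lat -a shows latencies.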


Bad performance?

Verify the fabric using the Mellanox tools. ibdiagnet should be version 2 or higher.

ibdiagnet 


#clear the counters
ibdiagnet  -pc
# run 
ibdiagnet -P all=1
Summary
-I- Stage                     Warnings   Errors     Comment
-I- Discovery                 0          0
-I- Lids Check                0          0
-I- Links Check               0          0
-I- Subnet Manager            0          0
-I- Port Counters             2          0
-I- Nodes Information         0          2
-I- Speed / Width checks      0          0
-I- Partition Keys            0          0
-I- Alias GUIDs               0          0


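# look at the full log and the port counters dump
vim /var/tmp/ibdiagnet2/ibdiagnet2.log
vim /var/tmp/ibdiagnet2/ibdiagnet2.pm
# bit error rate test, pausing 30 seconds between counter samples
ibdiagnet -P all=1 --ber_test --pm_pause_time 30
# collect cable information
ibdiagnet -P all=1 --get_cable_info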
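
On a large fabric it helps to pull out only the stages of the ibdiagnet summary that reported something. This is a minimal sketch, assuming the summary layout shown above (stage name, then warnings, then errors). The function name ibdiag_problem_stages is hypothetical.

```shell
# Print only the summary stages with a non-zero warning or error count.
# Reads ibdiagnet output on stdin (hypothetical helper for illustration).
ibdiag_problem_stages() {
    awk '
        /^-I- / {
            # Stage names can contain spaces, so look for the first pair of
            # consecutive numeric fields: warnings followed by errors.
            for (i = 3; i < NF; i++) {
                if ($i ~ /^[0-9]+$/ && $(i + 1) ~ /^[0-9]+$/) {
                    name = $2
                    for (j = 3; j < i; j++) name = name " " $j
                    if ($i + 0 > 0 || $(i + 1) + 0 > 0)
                        print name ": warnings=" $i " errors=" $(i + 1)
                    break
                }
            }
        }
    '
}
```

For the summary above it would print "Port Counters: warnings=2 errors=0" and "Nodes Information: warnings=0 errors=2". Those are the sections to look for in the log file.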


Firmware updates

mst start
mst status


[root@nodeB MLNX_OFED_LINUX-2.1-1.0.6-rhel6.5-x86_64]# mst status
MST modules:
------------
    MST PCI module loaded
    MST PCI configuration module loaded

MST devices:
------------
/dev/mst/mt4099_pciconf0         - PCI configuration cycles access.
                                   domain:bus:dev.fn=0000:06:00.0 addr.reg=88 data.reg=92
                                   Chip revision is: 01
/dev/mst/mt4099_pci_cr0          - PCI direct access.
                                   domain:bus:dev.fn=0000:06:00.0 bar=0xdf900000 size=0x100000
                                   Chip revision is: 01


flint -d <dev> -i <file> burn
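
The <dev> argument is one of the paths listed by mst status. As a small sketch, assuming the mst status layout shown above, the pciconf device paths can be extracted like this. The function name mst_pciconf_devices is made up for this example.

```shell
# Print the /dev/mst/*_pciconfN device paths found in mst status output.
# Reads mst status output on stdin (hypothetical helper for illustration).
mst_pciconf_devices() {
    awk '$1 ~ /^\/dev\/mst\/.*_pciconf[0-9]+$/ { print $1 }'
}
```

One possible use is dev=$(mst status | mst_pciconf_devices | head -n 1) followed by flint -d "$dev" query, which shows the firmware currently on the card before anything is written.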


OEM firmware

On mellanox.com go to Support, then OEM firmware, then Supermicro.

Extract Firmware from NIC and update another

You will need to do this if you have an order with mixed FW versions, the firmware is a custom SMC (Supermicro) build, and you don't want to wait for SMC support.

This assumes both nodes have the Mellanox firmware tools installed. In this case both nodes are booted from our Ubuntu desktop USB boot stick, which has them.

The donor node has the new firmware on it. Run these commands there to get the device path and show the version info:

mst start
root@ubuntu:~# mlxfwmanager
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX4LX
  Part Number:      Super_Micro_AOC-S25G-m2S
  Description:      ConnectX-4 Lx EN adapter card; 25GbE dual-port SFP28; PCIe3.0 x8
  PSID:             SM_2001000001034
  PCI Device Name:  /dev/mst/mt4117_pciconf0
  Base MAC:         7cc255633c32
  Versions:         Current        Available
     FW             14.32.1250
     PXE            3.6.0502
     UEFI           14.25.0017

Now we can read the firmware off the card and scp it to the second node like this (it takes about 60s to read the FW). Note that the output file does NOT exist yet, so give it a sensible name.

flint -d /dev/mst/mt4117_pciconf0 ri mlnx_fw_14.32.1250.bin
scp mlnx_fw_14.32.1250.bin 172.16.40.8:
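
A firmware image is tied to a board type, identified by the PSID, so it is worth confirming that donor and target report the same PSID before flashing. This is a minimal sketch, assuming the mlxfwmanager output format shown above. The function names fw_psid and fw_current are hypothetical.

```shell
# Extract the PSID of the first device from mlxfwmanager output on stdin.
fw_psid() {
    awk '$1 == "PSID:" { print $2; exit }'
}

# Extract the current FW version of the first device from mlxfwmanager output.
fw_current() {
    awk '$1 == "FW" { print $2; exit }'
}
```

For example, run mlxfwmanager | fw_psid on both nodes and compare the two strings. On the donor above this gives SM_2001000001034, and fw_current gives 14.32.1250.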

Now ssh into the second node, run mlxfwmanager in the same directory you copied the file to, and try the update. Omit the -u to just check versions.

root@ubuntu:~# mlxfwmanager -u
Querying Mellanox devices firmware ...

Device #1:
----------

  Device Type:      ConnectX4LX
  Part Number:      Super_Micro_AOC-S25G-m2S
  Description:      ConnectX-4 Lx EN adapter card; 25GbE dual-port SFP28; PCIe3.0 x8
  PSID:             SM_2001000001034
  PCI Device Name:  /dev/mst/mt4117_pciconf0
  Base MAC:         7cc255633c32
  Versions:         Current        Available
     FW             14.32.1010     14.32.1250
     PXE            3.6.0502       3.6.0502
     UEFI           14.25.0017     14.25.0017

  Status:           Update required

---------
Found 1 device(s) requiring firmware update...

Perform FW update? [y/N]: y
Device #1: Updating FW ...
FSMST_INITIALIZE -   OK
Writing Boot image component -   OK
Fail : The Digest in the signature is wrong

That looks bad, but luckily Mellanox have documented what to do in this case. Re-sign the image with flint:

root@ubuntu:~# flint -i mlnx_fw_14.32.1250.bin sign
Updating SIGNATURE section - OK
Updating ITOC section - OK

After this the update runs perfectly.
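
With a batch of cards it can be handy to script the "Update required" decision by comparing the Current and Available versions. This is a rough sketch that compares dotted version strings numerically, field by field. The function name fw_older_than is made up for this example.

```shell
# Succeeds (exit 0) when version $1 is strictly older than version $2.
# Versions are dotted numbers such as 14.32.1010 (hypothetical helper).
fw_older_than() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0   # older
            if (x[i] + 0 > y[i] + 0) exit 1   # newer
        }
        exit 1                                # identical
    }'
}
```

For the card above, fw_older_than 14.32.1010 14.32.1250 succeeds, so that node still needs the update. Once the card is flashed and both values match, the check fails.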