Graphcore M2000 System Debug
No 100GbE link
During the bring-up of our initial IPU M2000 units, the 100GbE interfaces would not hold a link. More precisely, the link came up when the unit was powered on, but dropped after a minute or so as the fans ramped down.
The management interface showed a link throughout, as expected.
The cause was identified as a software issue on early IPU M2000 units, which meant that the gateway (GW) and IPUs were not being powered on automatically.
Accessing the IPU M2000 BMC
Identify the BMC's IP address
To access the IPU M2000's BMC, first identify the unit's IP address. There are several ways to do this, but it is usually simplest to search /var/log/syslog on the host for the IP address offered by the DHCP server:
ipuuser@ipu-host-2:~$ sudo grep dhcpd /var/log/syslog* | grep -i discover
/var/log/syslog.1:Jan 20 14:23:29 ipu-host-2 dhcpd[3474]: DHCPDISCOVER from 70:69:79:20:13:b4 via enp65s0f1
ipuuser@ipu-host-2:~$ sudo grep 70:69:79:20:13:b4 /var/log/syslog*
/var/log/syslog:Jan 21 06:27:08 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.1.1 from 70:69:79:20:13:b4 via enp65s0f1
/var/log/syslog:Jan 21 06:27:08 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.1.1 to 70:69:79:20:13:b4 via enp65s0f1
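The two grep steps above can be folded into one helper. This is only a sketch: the function name bmc_ip_for_mac is made up for illustration, and it assumes dhcpd logs acknowledgements in the "DHCPACK on <ip> to <mac>" form seen in the output above. It reads log lines on standard input and prints the last acknowledged address for the given MAC.

```shell
# Hypothetical helper: print the last address dhcpd acknowledged for a MAC.
# Reads syslog-style lines on stdin and looks for "DHCPACK on <ip> to <mac>".
# Returns non-zero if no matching DHCPACK line is found.
bmc_ip_for_mac() {
    mac=$1
    awk -v mac="$mac" '
        {
            # Scan the fields so the syslog prefix length does not matter.
            for (i = 1; i <= NF - 4; i++) {
                if ($i == "DHCPACK" && $(i + 1) == "on" && $(i + 3) == "to" &&
                    tolower($(i + 4)) == tolower(mac)) {
                    ip = $(i + 2)   # later matches overwrite earlier ones
                }
            }
        }
        END { if (ip != "") print ip; else exit 1 }
    '
}
```

For example, `sudo cat /var/log/syslog.1 /var/log/syslog | bmc_ip_for_mac 70:69:79:20:13:b4` feeds the older log first, so the address printed is the one from the most recent acknowledgement.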
Access the IPU's BMC using SSH
SSH to the IP address identified above, logging in as root with the default password 0penBmc:
ipuuser@ipu-host-2:~$ ssh 10.1.1.1 -l root
The authenticity of host '10.1.1.1 (10.1.1.1)' can't be established.
RSA key fingerprint is SHA256:SW5Bwfpj/WjkwYu8eQoVefyquJhLmUM3AxCrzW0eYfA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.1.1.1' (RSA) to the list of known hosts.
root@10.1.1.1's password:
=======================================================================
Graphcore IPU-Machine ipum-p2-rev1 - OpenBMC gc-v1.6.0-21-g2ad233f
Some useful commands:
ipum-utils            Collection of Power and Boot Control Commands.
ipum-diags            Enter Maintenance Mode. For low level testing including BIST.
ipum_sysfpga          Utility for Upgrading System FPGA.
obmc-console-client   Serial Console Access to Gateway SOC.
                      To exit press "Enter" and then "~.".
This login is using the default root password.
It is advised to change it, using Redfish or GUI.
=======================================================================
root@ipum-p2-rev1-8204721-0012:~#
Power on GW and IPUs
From the BMC shell, run the following command to power on the gateway and IPUs:
root@ipum-p2-rev1-8204721-0012:~# ipum-utils power_on
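Once the gateway and IPUs are powered, it may be worth confirming from the host that the 100GbE link now stays up. The sketch below assumes a Linux host and reads the standard sysfs operstate file. The function name link_is_up is made up for illustration, and the 100GbE interface name is not given in this page, so it has to be supplied by the reader.

```shell
# Hypothetical helper: succeed if a Linux network interface reports link up.
# $1: interface name (the 100GbE interface name is site-specific)
# $2: sysfs root, defaults to /sys/class/net (overridable for testing)
# Returns 0 if operstate is "up", 1 if it is anything else,
# and 2 if the interface's operstate file cannot be read.
link_is_up() {
    iface=$1
    root=${2:-/sys/class/net}
    state_file="$root/$iface/operstate"
    [ -r "$state_file" ] || return 2
    [ "$(cat "$state_file")" = "up" ]
}
```

Running `link_is_up <interface> && echo "link up"` a few minutes after power-on, once the fans have ramped down, checks for the symptom described at the top of this page.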