Graphcore M2000 Direct Connect Bringup
Latest revision as of 22:25, 2 February 2021
Hardware Spec
At the time of writing, the current spec host system is based around:
- AS-1024US-TRT (or similar H12 Ultra)
- 2 x AMD 7402 CPUs
- 16 x 32GB DIMMs
- 2 x MCX556A-EDAT (though 1 x single port version should suffice)
- At least one local drive for O/S and software install
Operating System
At the time of writing, the latest supported version of Linux is Ubuntu 18.04.4 LTS. Install this as normal (a minimal installation is fine). CentOS 7.2 and CentOS 8 are also supported but, as yet, untested at Boston.
Prerequisite Packages
A number of packages need to be installed using apt-get:
root@ipu-host-2:~# apt-get install apt-transport-https ibverbs-utils openjdk-8-jdk python3-virtualenv autoconf ipmitool php-cli python3-wheel automake jq php-curl qtcreator bc kcachegrind policykit-1 rdma-core build-essential libaio-dev protobuf-compiler screen ccache libboost-all-dev python-boto3 software-properties-common clang libeigen3-dev python-dev sshpass cmake libjson-c-dev python-lxml subversion curl libjson-c-doc python-numpy swig direnv libpci-dev python-pip sysfsutils dkms libpixman-1-dev python-pytest tar emacs libprotobuf-dev python-recommonmark tmux ethtool libtool python-requests u-boot-tools exuberant-ctags lldpad python-setuptools unzip flex m4 python-wheel valgrind g++ minicom python-yaml vim gawk moreutils python2 virtualenv gcc net-tools python3 wdiff gdb netcat python3-dev wget git parallel python3-numpy zip golang-go pciutils python3-pip htop perl python3-setuptools
And a few python packages which can be installed via pip:
root@ipu-host-2:~# pip install autograd paramiko pylint scp jstyleson pep8 pyyaml yapf mock pexpect requests
User Accounts
Account Overview
The following user accounts are required on the host system:
| Accounts | Function |
|---|---|
| root | A root user account secured with a password is recommended. |
| itadmin | An admin account secured with a password is recommended. Home folder located at /home/itadmin using bash shell. |
| ipuuser | An account dedicated to IPU software and IPU-M2000 management software is mandatory. Home folder located at /home/ipuuser using bash shell. |
| poplaruser | An account dedicated to Poplar software is mandatory. Home folder located at /home/poplaruser using bash shell. |
The following user accounts are present on the IPU M2000 system:
| Login to | Username | Password |
|---|---|---|
| IPU-M2000 BMC OS | root | 0penBmc |
| IPU-M2000 GW OS | itadmin | ChangeMeFdh5P |
Create Users on Host system
Create users with useradd:
root@ipu-host-2:~# useradd -m itadmin
root@ipu-host-2:~# useradd -m ipuuser
root@ipu-host-2:~# useradd -m poplaruser
And set passwords with passwd (repeat for all users):
root@ipu-host-2:~# passwd itadmin
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Double check that the default shell for each of these users is bash. Edit the /etc/passwd file as appropriate, but each of these new users should look something like:
ipuuser@ipu-host-2:~$ cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
.
.
.
itadmin:x:1001:1001::/home/itadmin:/bin/bash
ipuuser:x:1002:1002::/home/ipuuser:/bin/bash
poplaruser:x:1003:1003::/home/poplaruser:/bin/bash
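The shell check described above can also be scripted. The following is a minimal sketch rather than part of the original procedure: check_bash_shell is a made-up helper name, and it only reads passwd-format lines from standard input, so it changes nothing on the system.

```shell
#!/bin/sh
# check_bash_shell USER...   (hypothetical helper, not a system tool)
# Reads passwd(5)-format lines on stdin and reports each named account
# whose login shell (field 7) is not /bin/bash.
check_bash_shell() {
  awk -F: -v users="$*" '
    BEGIN { n = split(users, u, " "); for (i = 1; i <= n; i++) want[u[i]] = 1 }
    ($1 in want) && $7 != "/bin/bash" { print $1 " uses " $7 }
  '
}

# Demonstration on two sample lines (the /bin/sh entry is invented for the demo).
printf '%s\n' \
  'itadmin:x:1001:1001::/home/itadmin:/bin/sh' \
  'ipuuser:x:1002:1002::/home/ipuuser:/bin/bash' |
  check_bash_shell itadmin ipuuser
# prints: itadmin uses /bin/sh
```

On a real host, `getent passwd | check_bash_shell itadmin ipuuser poplaruser` prints nothing when all three accounts already use bash. A reported account can be corrected with `usermod -s /bin/bash <user>` instead of editing /etc/passwd by hand.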
Add sudo rights
Edit the /etc/sudoers file (using visudo) so that it looks something like:
ipuuser@ipuhost:~$ sudo cat /etc/sudoers
#
# This file MUST be edited with the 'visudo' command as root.
#
# Please consider adding local content in /etc/sudoers.d/ instead of
# directly modifying this file.
#
# See the man page for details on how to write a sudoers file.
#
Defaults env_reset
Defaults mail_badpass
Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"
# Host alias specification
# User alias specification
# Cmnd alias specification
# User privilege specification
root ALL=(ALL:ALL) ALL
itadmin ALL=(ALL:ALL) ALL
ipuuser ALL=(ALL:ALL) ALL
# Members of the admin group may gain root privileges
%admin ALL=(ALL) ALL
# Allow members of group sudo to execute any command
%sudo ALL=(ALL:ALL) ALL
# See sudoers(5) for more information on "#include" directives:
#includedir /etc/sudoers.d
Networking
Network Configuration Overview
The Host system should have interfaces configured as follows:
| Port | Role | Link Speed | IP address | Configured From |
|---|---|---|---|---|
| enp65s0f0 | External connectivity to IT infrastructure | 1GbE | 192.168.8.20x/20 | Static or DHCP lease |
| enp65s0f1 | Management of IPU-M2000 | 1GbE | 10.1.3.101/22 | Static |
| enp129s0f0 | Unused | 100GbE | N/A | N/A |
| enp129s0f1 | RDMA interface to IPU-M2000 | 100GbE | 10.1.5.5/24 | Static |
The IPU M2000 system should have interfaces configured as follows:
| Port | Role | Link Speed | IP address | Configured From |
|---|---|---|---|---|
| IPU Mgmt Port | BMC+GW management ports | 1GbE | BMC: 10.1.1.1/22 GW: 10.1.2.1/22 | Static lease from DHCP server |
| IPU 100GbE #1 | Host-link data-plane link to IPU-M2000s | 100GbE | 10.1.5.2/30 | Static lease from DHCP server |
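For each link in the two tables to work, the host address and the IPU-M2000 address on that link need to fall within the same subnet. A rough sketch of that arithmetic in plain shell follows; ip_to_int and same_subnet are illustrative names, not existing tools, and the prefix lengths used in the demo are the ones configured on the host side.

```shell
#!/bin/sh
# ip_to_int A.B.C.D : print the dotted-quad address as a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet IP1 IP2 PREFIXLEN : succeed if both addresses share the
# same network part for the given prefix length.
same_subnet() {
  mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

same_subnet 10.1.3.101 10.1.1.1 22 && echo "management link: same subnet"
same_subnet 10.1.5.5 10.1.5.2 24 && echo "RDMA link: same subnet"
```

Both checks succeed for the addresses listed above, so each line is printed.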
Onboard Intel Network Drivers
Build Intel NIC drivers
It may be necessary to install a temporary NIC in order to pull down build tools and drivers.
Install build-essential:
ipuuser@ipuhost:~$ sudo apt-get install build-essential
Copy the Intel NIC driver to the system and decompress the archive:
ipuuser@ipuhost:~$ tar zxvf i40e-2.13.10.tar.gz
Switch to the source directory:
ipuuser@ipuhost:~$ cd nic_temp/i40e-2.13.10/src/
Build and install the module (root privileges are needed to install it):
ipuuser@ipuhost:~$ sudo make install
Modprobe the new driver module:
ipuuser@ipuhost:~$ sudo modprobe i40e
Configure IP address for onboard NIC
Edit the /etc/netplan/01-netcfg.yaml file to look something like this:
ipuuser@ipuhost:~$ cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    enp65s0f0:
      addresses:
        - 192.168.8.201/20
      gateway4: 192.168.5.1
      nameservers:
        addresses: [192.168.5.3, 192.168.5.2]
Apply the Netplan config:
ipuuser@ipuhost:~$ sudo netplan apply
Mellanox Network Drivers
Add Mellanox Repos
Add the Mellanox repositories so we can install the necessary Mellanox drivers:
cd /etc/apt/sources.list.d/
curl -LO https://linux.mellanox.com/public/repo/mlnx_ofed/latest/ubuntu18.04/mellanox_mlnx_ofed.list
Edit the .list file so it looks like this:
ipuuser@ipu-host-2:/etc/apt/sources.list.d$ cat mellanox_mlnx_ofed.list
#
# Mellanox Technologies Ltd. public repository configuration file.
# For more information, refer to http://linux.mellanox.com
#
# [mlnx_ofed_5.2-1.0.4.0_base]
deb [trusted=yes] http://linux.mellanox.com/public/repo/mlnx_ofed/5.2-1.0.4.0/ubuntu18.04/$(ARCH) ./
Pull down a copy of the Mellanox GPG key:
root@ipu-host-3:/etc/apt/sources.list.d# wget -qO - https://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -
Update the apt package index:
root@ipu-host-2:~$ apt update
Install and configure Mellanox drivers
root@ipu-host-2:~$ apt-get install mlnx-ofed-all
Ensure the interfaces are configured as Ethernet rather than InfiniBand:
root@ipu-host-2:~$ mlxconfig -d /dev/mst/mt4121_pciconf0 query
root@ipu-host-2:~$ mlxconfig -d /dev/mst/mt4121_pciconf0 set LINK_TYPE_P1=2
root@ipu-host-2:~$ mlxconfig -d /dev/mst/mt4121_pciconf0 set LINK_TYPE_P2=2
Reboot the server
Netplan configuration
Networking on the Host system is configured by Netplan. The /etc/netplan/01-netcfg.yaml configuration file should look something like this (to reflect the table above):
root@ipu-host-2:~# cat /etc/netplan/01-netcfg.yaml
# This file describes the network interfaces available on your system
# For more information, see netplan(5).
network:
  version: 2
  renderer: networkd
  ethernets:
    enp65s0f0:
      addresses:
        - 192.168.8.202/20
      gateway4: 192.168.5.1
      nameservers:
        addresses: [192.168.5.3, 192.168.5.2]
    enp65s0f1:
      addresses:
        - 10.1.3.101/22
    enp129s0f1:
      addresses:
        - 10.1.5.5/24
Apply the Netplan configuration:
root@ipu-host-2:~# netplan apply
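After applying the configuration, it can be worth confirming that each interface carries the address from the table. A small sketch follows, assuming the one-line-per-address output of `ip -o -4 addr show` (interface name in the second field, `inet` in the third, address/prefix in the fourth); has_addr is a hypothetical helper name.

```shell
#!/bin/sh
# has_addr IFACE ADDR/PREFIX   (hypothetical helper)
# Reads `ip -o -4 addr show` style lines on stdin and succeeds only if
# IFACE carries exactly ADDR/PREFIX.
has_addr() {
  awk -v ifc="$1" -v want="$2" '
    $2 == ifc && $3 == "inet" && $4 == want { found = 1 }
    END { exit !found }
  '
}

# Demonstration on a sample line shaped like the real command output.
sample='3: enp65s0f1    inet 10.1.3.101/22 brd 10.1.3.255 scope global enp65s0f1'
printf '%s\n' "$sample" | has_addr enp65s0f1 10.1.3.101/22 && echo "enp65s0f1 OK"
```

On the host itself the same check would be run as `ip -o -4 addr show | has_addr enp129s0f1 10.1.5.5/24`, once per configured interface.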
DHCP Server
Installation
An isc-dhcp-server is required on the host system to provide the IPU M2000 system with the appropriate IP addresses (detailed above). It can be installed from the standard repos:
root@ipu-host-2:~# apt-get install isc-dhcp-server
Interface Configuration file
The /etc/default/isc-dhcp-server file dictates which interfaces will be serviced by the DHCP server service. It should look something like:
root@ipu-host-2:~# cat /etc/default/isc-dhcp-server
INTERFACESv4="enp65s0f1 enp129s0f1"
INTERFACESv6=""
Main DHCP configuration file
The /etc/dhcp/dhcpd.conf file provides configuration information for the DHCP server service. It should look something like:
root@ipu-host-2:~# cat /etc/dhcp/dhcpd.conf
default-lease-time 600;
max-lease-time 1200;
ddns-update-style none;
authoritative;
log-facility local7;
subnet 10.1.5.0 netmask 255.255.255.0 {
  option subnet-mask 255.255.255.0;
  range 10.1.5.2 10.1.5.2;
}
subnet 10.1.0.0 netmask 255.255.252.0 {
  option subnet-mask 255.255.252.0;
}
host ipum1bmc { hardware ethernet 70:69:79:20:13:b4; fixed-address 10.1.1.1; }
host ipum1gw { hardware ethernet 70:69:79:20:13:b5; fixed-address 10.1.2.1; }
host ipum1mx { hardware ethernet 0c:42:a1:78:89:cd; fixed-address 10.1.5.2; }
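Each IPU-M2000 interface receives its fixed address through a host declaration keyed on its MAC address. If more declarations are needed, a helper along these lines could generate them. This is only a sketch: dhcp_host_entry is a hypothetical name, and the MAC check is a basic format test, not a guarantee that the address belongs to the right device.

```shell
#!/bin/sh
# dhcp_host_entry NAME MAC IP   (hypothetical helper)
# Prints a dhcpd.conf host declaration in the same shape as those above.
dhcp_host_entry() {
  # Reject anything that is not six colon-separated hex octets.
  printf '%s\n' "$2" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$' || {
    echo "invalid MAC address: $2" >&2
    return 1
  }
  printf 'host %s { hardware ethernet %s; fixed-address %s; }\n' "$1" "$2" "$3"
}

dhcp_host_entry ipum1bmc 70:69:79:20:13:b4 10.1.1.1
# prints: host ipum1bmc { hardware ethernet 70:69:79:20:13:b4; fixed-address 10.1.1.1; }
```

After editing the file, `dhcpd -t -cf /etc/dhcp/dhcpd.conf` checks the syntax without starting the server, which catches typos before the service is restarted.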
Start and enable the DHCP Service
Start the DHCP service with:
root@ipu-host-2:~# systemctl start isc-dhcp-server
Enable the DHCP service with:
root@ipu-host-2:~# systemctl enable isc-dhcp-server
Check the status of the DHCP service with:
root@ipu-host-2:~# systemctl status isc-dhcp-server
● isc-dhcp-server.service - ISC DHCP IPv4 server
Loaded: loaded (/lib/systemd/system/isc-dhcp-server.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2021-01-19 16:43:43 GMT; 1 day 22h ago
Docs: man:dhcpd(8)
Main PID: 3474 (dhcpd)
Tasks: 1 (limit: 19660)
CGroup: /system.slice/isc-dhcp-server.service
└─3474 dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf
Jan 21 15:36:08 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.1.1 from 70:69:79:20:13:b4 via enp65s0f1
Jan 21 15:36:08 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.1.1 to 70:69:79:20:13:b4 via enp65s0f1
Jan 21 15:36:11 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.2.1 from 70:69:79:20:13:b5 via enp65s0f1
Jan 21 15:36:11 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.2.1 to 70:69:79:20:13:b5 via enp65s0f1
Jan 21 15:36:24 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.5.2 from 0c:42:a1:78:89:cd via enp129s0f1
Jan 21 15:36:24 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.5.2 to 0c:42:a1:78:89:cd via enp129s0f1
Jan 21 15:41:07 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.1.1 from 70:69:79:20:13:b4 via enp65s0f1
Jan 21 15:41:07 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.1.1 to 70:69:79:20:13:b4 via enp65s0f1
Jan 21 15:41:10 ipu-host-2 dhcpd[3474]: DHCPREQUEST for 10.1.2.1 from 70:69:79:20:13:b5 via enp65s0f1
Jan 21 15:41:10 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.2.1 to 70:69:79:20:13:b5 via enp65s0f1
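The DHCPACK lines in the status output show which address was handed to which MAC address on which interface. To condense a longer log into one line per lease, something like the following could be used. It is a sketch that assumes the log line shape shown above; dhcp_acks is an invented name.

```shell
#!/bin/sh
# dhcp_acks   (hypothetical helper)
# Reads dhcpd log lines on stdin and prints "IP MAC INTERFACE" once for
# each distinct DHCPACK. The interface is taken from the last field.
dhcp_acks() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        if ($i == "DHCPACK" && $(i + 1) == "on") {
          print $(i + 2), $(i + 4), $NF
          break
        }
    }
  ' | sort -u
}

# Demonstration on two lines taken from the status output above.
printf '%s\n' \
  'Jan 21 15:36:08 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.1.1 to 70:69:79:20:13:b4 via enp65s0f1' \
  'Jan 21 15:36:24 ipu-host-2 dhcpd[3474]: DHCPACK on 10.1.5.2 to 0c:42:a1:78:89:cd via enp129s0f1' |
  dhcp_acks
# prints:
# 10.1.1.1 70:69:79:20:13:b4 enp65s0f1
# 10.1.5.2 0c:42:a1:78:89:cd enp129s0f1
```

On the host, the input could come from `journalctl -u isc-dhcp-server | dhcp_acks`. All three IPU-M2000 addresses (BMC, GW and 100GbE) should appear once the unit has booted.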
NTP Server Service
Installation
NTP service is recommended to provide network time configuration to IPU-M2000 systems. It can be installed from the Ubuntu repositories:
root@ipu-host-2:~# apt-get install ntp
Configuration
The /etc/ntp.conf file details the configuration for the NTP server; it should look something like this:
root@ipu-host-2:~# grep -v "#" /etc/ntp.conf
driftfile /var/lib/ntp/ntp.drift
leapfile /usr/share/zoneinfo/leap-seconds.list
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
fudge 127.127.1.0 stratum 10
pool 0.ubuntu.pool.ntp.org iburst
pool 1.ubuntu.pool.ntp.org iburst
pool 2.ubuntu.pool.ntp.org iburst
pool 3.ubuntu.pool.ntp.org iburst
pool ntp.ubuntu.com
restrict 127.0.0.1
restrict ::1
restrict source notrap nomodify noquery
Start and enable the NTP Service
Start the NTP service with:
root@ipu-host-2:~# systemctl start ntp
Enable the NTP service with:
root@ipu-host-2:~# systemctl enable ntp
Check the status of the NTP service with:
root@ipu-host-2:~# systemctl status ntp
● ntp.service - Network Time Service
Loaded: loaded (/lib/systemd/system/ntp.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:ntpd(8)
Syslog
Syslog is a software utility for forwarding log messages in an IP network.
Configuration
The /etc/rsyslog.conf file should look like this:
root@ipu-host-2:~# grep "^[^#;]" /etc/rsyslog.conf
module(load="imuxsock") # provides support for local system logging
module(load="imudp")
input(type="imudp" port="514")
module(load="imtcp")
input(type="imtcp" port="514")
module(load="imklog" permitnonkernelfacility="on")
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$RepeatedMsgReduction on
$FileOwner syslog
$FileGroup adm
$FileCreateMode 0640
$DirCreateMode 0755
$Umask 0022
$PrivDropToUser syslog
$PrivDropToGroup syslog
$WorkDirectory /var/spool/rsyslog
$IncludeConfig /etc/rsyslog.d/*.conf
The /etc/rsyslog.d/99_ipum.conf file should look like this:
root@ipu-host-2:~# grep "^[^#;]" /etc/rsyslog.d/99_ipum.conf
$template precise,"%fromhost-ip%,%HOSTNAME%,%syslogpriority%,%syslogfacility%,%timegenerated::fulltime%,%syslogtag%,%msg%\n"
:HOSTNAME, contains, "ipum" /var/log/ipulogs/ipulogs;precise
& ~
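The "precise" template writes comma-separated records whose first two fields are the sender IP address and hostname, so the resulting log is easy to summarise. The sketch below counts records per hostname. ipulog_counts is a hypothetical name, and the sample records are invented purely to show the field order.

```shell
#!/bin/sh
# ipulog_counts   (hypothetical helper)
# Reads records in the "precise" format on stdin and prints
# "HOSTNAME COUNT" per sending host. Only fields 1 and 2 are relied on,
# so commas inside the message text do not matter.
ipulog_counts() {
  awk -F, 'NF >= 2 { count[$2]++ } END { for (h in count) print h, count[h] }' | sort
}

# Invented sample records, for illustration only.
printf '%s\n' \
  '10.1.2.1,ipum1gw,6,3,15:36:11,sample[1]:, first message' \
  '10.1.1.1,ipum1bmc,6,3,15:36:12,sample[2]:, second message' \
  '10.1.2.1,ipum1gw,5,3,15:36:13,sample[1]:, third, with a comma' |
  ipulog_counts
# prints:
# ipum1bmc 1
# ipum1gw 2
```

With the configuration above, the real input would be `ipulog_counts < /var/log/ipulogs/ipulogs`. A host that never appears in the output is probably not forwarding its logs.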
The /etc/rsyslog.d/99_dhcpd.conf file should look like this:
root@ipu-host-2:~# grep "^[^#;]" /etc/rsyslog.d/99_dhcpd.conf
local7.* /var/log/dhcpd.log
Graphcore Software Installation
The following Graphcore software packages need to be installed on the server:
- V-IPU server contains management and control software for IPU resource control, built-in self-test (BIST) and monitoring of the IPU-M2000s and IPUs. There is a V-IPU Admin Guide and a V-IPU User Guide available.
- IPU-M2000 system software contains the latest IPU-M2000 resident software for update, if required. It also includes the server-resident tool rack_tool, which is required for updating the resident software on the IPU-M2000s and testing the system hardware.
V-IPU server installation
Both the release notes and the V-IPU software release tarball are available from the Graphcore download portal https://downloads.graphcore.ai
An installation script called install.sh is included with the V-IPU tarball. The installation script has been tested and verified to work with Ubuntu and CentOS distros that use systemd as the default service manager. The installation script needs to be executed with root privileges (sudo ./install.sh) as it copies the vipu-server, vipu-admin and vipu binaries to /usr/local/bin.
Ensure you are logged in as ipuuser
The script will configure and start vipu-server.service.
In the following example, the system is cabled according to the standard instructions, where “enp65s0f0” is the host server interface that connects to the management port of the IPU-M2000 at the top of the stack.
ipuuser@ipu-host-2:~$ sudo ./install.sh
Do you want to start the vipu-server as a service in this host?
Note that you should have vipu-server running only in one host and use vipu/vipu-admin to connect to it from all other hosts. (N/y) y
Choose an interface to use for agent auto-discovery:
eno0
enp65s0f0
enp65s0f1
lo
Enter disable to deactivate the auto-discovery
Which interface should be used for auto-discovery? enp65s0f0
- vipu-server will be configured to be run as a service in this host
- Initialising /etc/vipu/config.hcl
IPU-M2000 Server Software Installation
The IPU-M2000 system software contains a set of upgradable software and FPGA sub-components that are targeted to be executed on the IPU-M2000 units. The release also contains the tool rack_tool, which is used for the software upgrade and other rack-related tasks targeting the IPU-M2000s.
Ensure you are logged in as ipuuser
Go to the Graphcore download portal https://downloads.graphcore.ai and download the latest release into your home directory
Unpack the tarball:
ipuuser@ipu-host-2:~$ tar xvf IPU_M_SW-2.0.0.rc.3.tar
Install the software/tools:
ipuuser@ipu-host-2:~$ cd ~/IPU_M_SW-2.0.0-rc.3+a51e75a/maintenance_tools/
ipuuser@ipu-host-2:~$ ./install.sh
rack_tool Configuration
rack_tool requires a config file which contains information on all the IPU-M2000s it will control. The information in the config file defines all IP addresses of the BMC, GW and RNIC interfaces.
Create a directory for the configuration file:
ipuuser@ipu-host-2:~$ mkdir -p $HOME/.rack_tool
Your /home/ipuuser/.rack_tool/rack_config.json should look something like this:
ipuuser@ipu-host-2:~$ cat /home/ipuuser/.rack_tool/rack_config.json
{
  "global_credentials": {
    "bmc_username": "root",
    "bmc_passwd": "0penBmc",
    "gw_username": "itadmin",
    "gw_passwd": "ChangeMeFdh5P"
  },
  "gw_root_overlay": "/home/ipuuser/IPU-M_releases/IPU_M_SW-2.0.0-rc.3+a51e75a/maintenance_tools/ipu_pod_config/root-overlay/",
  "machines": [
    {
      "name": "m01",
      "bmc_ip": "10.1.1.1",
      "gw_ip": "10.1.2.1",
      "mx_ip": "10.1.5.2"
    }
  ]
}
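The addresses in rack_config.json have to agree with the fixed addresses handed out by the DHCP server, otherwise rack_tool will try to reach the wrong hosts. A rough cross-check using only grep, sed and awk is sketched below. The three function names are invented for illustration, and the text matching is deliberately simple: it would also pick up a commented-out fixed-address line.

```shell
#!/bin/sh
# rack_ips FILE : list every bmc_ip / gw_ip / mx_ip value in a rack_config.json.
rack_ips() {
  grep -Eo '"(bmc|gw|mx)_ip"[[:space:]]*:[[:space:]]*"[^"]+"' "$1" |
    sed -E 's/.*"([^"]+)"$/\1/' | sort
}

# dhcp_ips FILE : list every fixed-address value in a dhcpd.conf.
dhcp_ips() {
  grep -Eo 'fixed-address[[:space:]]+[0-9.]+' "$1" | awk '{ print $2 }' | sort
}

# check_rack_against_dhcp RACK_CONFIG DHCPD_CONF : compare the two lists.
check_rack_against_dhcp() {
  if [ "$(rack_ips "$1")" = "$(dhcp_ips "$2")" ]; then
    echo "addresses match"
  else
    echo "addresses differ" >&2
    return 1
  fi
}
```

A typical invocation would be `check_rack_against_dhcp /home/ipuuser/.rack_tool/rack_config.json /etc/dhcp/dhcpd.conf`. jq, which is in the prerequisite package list, would be a more robust way to read the JSON if stricter parsing is wanted.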
Copy root-overlay file system
A root-overlay file system is used to pass the NTP and syslog configuration into the IPU-M2000 software. The rack_config.json file above refers to the path of these files. The path is either relative to the location of rack_config.json or an absolute path. The easiest approach is to copy the files to the default location:
ipuuser@ipu-host-2:~$ cd /home/ipuuser/IPU-M_releases/IPU_M_SW-2.0.0-rc.3+a51e75a/maintenance_tools/ipu_pod_config
ipuuser@ipu-host-2:~$ cp -r root-overlay /home/ipuuser/.rack_tool/