CloudX: Mellanox CloudX Installation

From Define Wiki

Revision as of 23:00, 29 August 2014

Download the CloudX Image

# Standard
wget http://support.mellanox.com/ftp/versions/current/Solutions/cloudX/1.0.0.8/ONE_CLICK_CLOUDX_1.0.0.8-31032014-2146.qcow2
# Resume a partial download if the transfer was interrupted
wget -c http://support.mellanox.com/ftp/versions/current/Solutions/cloudX/1.0.0.8/ONE_CLICK_CLOUDX_1.0.0.8-31032014-2146.qcow2

Base System Setup

  • CentOS 6.4 Base System
  • MLNX_OFED_LINUX-2.2-0.0.2_20140306_1723-rhel6.4-x86_64.tgz package copied across
  • Make sure all the external YUM repos are disabled
  • Ensure times are all in sync across all nodes
  • Setup ssh passwordless access
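
The last two items can be scripted. The following is a minimal sketch rather than part of the CloudX tooling: `clock_skew_ok` is a hypothetical helper, the node names are the lab blades used later on this page, and the 5 second tolerance is an arbitrary assumption.

```shell
# Hypothetical helper: succeed if epoch timestamps $1 and $2 differ by at most $3 seconds.
clock_skew_ok() {
    skew=$(( $1 - $2 ))
    if [ "$skew" -lt 0 ]; then skew=$(( -skew )); fi
    [ "$skew" -le "$3" ]
}

# Example usage (commented out because it needs reachable nodes):
# for node in blade1 blade3 blade4; do
#     ssh-copy-id "root@$node"                 # push the local public key
#     remote=$(ssh "root@$node" date +%s)      # read the remote clock
#     clock_skew_ok "$(date +%s)" "$remote" 5 || echo "$node: clock out of sync"
# done
```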

No External Repos

[root@ft1 ~]# ls /etc/yum.repos.d/
CentOS-Base.repo  CentOS-Debuginfo.repo  CentOS-Media.repo  CentOS-Vault.repo
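
A quick way to spot leftover third-party repos is sketched below. `list_external_repos` is a hypothetical helper, and it assumes that any file not named `CentOS-*.repo` counts as external; adjust the pattern if your site uses a different naming scheme.

```shell
# Print every .repo file in directory $1 that is not part of the stock CentOS set.
# Assumption: anything not named CentOS-*.repo is an external repo.
list_external_repos() {
    for f in "$1"/*.repo; do
        [ -e "$f" ] || continue
        case "$(basename "$f")" in
            CentOS-*.repo) ;;          # stock file, keep quiet
            *) echo "$f" ;;            # anything else is reported
        esac
    done
}

# Example: yum only reads files ending in .repo, so renaming disables them.
# list_external_repos /etc/yum.repos.d | while read -r f; do mv "$f" "$f.disabled"; done
```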

Package Setup

  # Install the additional packages
  yum install -y tcl gcc-gfortran.x86_64 tk 
  tar zxvf MLNX_OFED_LINUX-2.2-0.0.2_20140306_1723-rhel6.4-x86_64.tgz 
  cd MLNX_OFED_LINUX-2.2-0.0.2_20140306_1723-rhel6.4-x86_64
  ./mlnxofedinstall --force --all

  # Set the adapters to Ethernet mode if using VPI cards
  connectx_port_config 

  # Verify the ports 
[root@blade2 ~]# connectx_port_config -s
--------------------------------
Port configuration for PCI device: 0000:07:00.0 is:
eth
eth
--------------------------------
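
The port check can be automated by parsing that output. This is a sketch only: `all_ports_eth` is a hypothetical helper, and it assumes the port type lines start with `eth`, `ib` or `auto`, as in the sample above.

```shell
# Read `connectx_port_config -s` output on stdin. Succeed only if at least one
# port type line was seen and every port type line is "eth".
all_ports_eth() {
    awk '
        $1 == "eth" || $1 == "ib" || $1 == "auto" {
            seen = 1
            if ($1 != "eth") bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example:
# connectx_port_config -s | all_ports_eth || echo "some ports are not in Ethernet mode"
```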

Setup Grub

  • Set up GRUB to boot with SR-IOV support by adding intel_iommu=on to the kernel arguments
title CentOS (2.6.32-358.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-358.el6.x86_64 ro root=/dev/mapper/vg_blade3-lv_root \
                  rd_NO_LUKS rd_LVM_LV=vg_blade3/lv_root LANG=en_US.UTF-8 \
                  rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_blade3/lv_swap \
                  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet intel_iommu=on
        initrd /initramfs-2.6.32-358.el6.x86_64.img
  • NOTE: the BIOS also needs to be set up; the settings will be added here once they are confirmed
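
After the reboot it is worth confirming that the argument reached the running kernel. A minimal sketch, assuming a hypothetical helper `has_kernel_arg` that looks for an exact whitespace-delimited token in a file such as /proc/cmdline:

```shell
# Succeed if file $2 contains $1 as a whole whitespace-delimited word.
has_kernel_arg() {
    tr -s ' \t' '\n\n' < "$2" | grep -qxF -- "$1"
}

# Example:
# has_kernel_arg intel_iommu=on /proc/cmdline || echo "IOMMU is not enabled on the running kernel"
```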

Verify PCI Speeds

  • Install pciutils and make sure the link is running at 8GT/s (PCIe Gen3)
  • Sample output from a ConnectX-2 card; note this card only links at Gen2 speed (5GT/s)
[root@blade1 ~]# lspci -d 15b3: -vv | grep LnkSta
		LnkSta:	Speed 5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
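
To check the speed without reading the output by eye, the negotiated value can be extracted from the `LnkSta` line. This is a sketch; `link_speeds` is a hypothetical helper and assumes the `Speed <value>` layout shown in the sample above.

```shell
# Read `lspci -vv` output on stdin and print the negotiated speed of each
# LnkSta line, one per line (LnkSta2 lines are ignored).
link_speeds() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([^ ,]*\).*/\1/p'
}

# Example:
# lspci -d 15b3: -vv | link_speeds | grep -qxF '8GT/s' || echo "link is not running at Gen3 speed"
```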

Setup Control Node with KVM

  • Verify Host Supports KVM
egrep '(vmx|svm)' --color=always /proc/cpuinfo
# Each core should show the vmx flag (Intel) or the svm flag (AMD)
  • Install KVM
yum install xauth
yum groupinstall Virtualization 'Virtualization Client' 'Virtualization Platform' 'Virtualization Tools'
# modprobe takes one module per call; extra words are treated as module parameters
modprobe kvm
modprobe kvm-intel
/etc/init.d/libvirtd start 
chkconfig libvirtd on
  • Verify the virbr0 interface is set up and ready
[root@blade2 ~]# ifconfig virbr0 
virbr0    Link encap:Ethernet  HWaddr 52:54:00:1D:04:7A  
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
  • Set up a network bridge on the host with DHCP
# create the following file
[root@blade2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0 
TYPE=Bridge 
BOOTPROTO=dhcp 
ONBOOT=yes 
DELAY=0
  • Edit the eth0 configuration to add a bridge: BRIDGE=br0
[root@blade2 ~]# cat  /etc/sysconfig/network-scripts/ifcfg-eth0 
DEVICE=eth0
ONBOOT=yes
HWADDR=00:25:90:C4:E9:8A
TYPE=Ethernet
BOOTPROTO=dhcp
BRIDGE=br0
  • Reboot the node once complete
reboot
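
Before or after the reboot, the bridge assignment can be double-checked from the config files. A sketch only: `ifcfg_get` is a hypothetical helper that assumes simple `KEY=value` lines, as in the files above.

```shell
# Print the value of key $1 from the ifcfg-style file $2,
# with quotes and stray spaces removed.
ifcfg_get() {
    sed -n "s/^$1=//p" "$2" | tr -d "\"' "
}

# Example: confirm eth0 is attached to br0, then list the bridge members.
# [ "$(ifcfg_get BRIDGE /etc/sysconfig/network-scripts/ifcfg-eth0)" = br0 ] && brctl show br0
```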

Configure the CloudX VM

  • Bring up the KVM manager
# X11 forwarding required (e.g. connect with ssh -X)
virt-manager
  • Steps in the manager:
  1. Step 1 of 4
    • Create a new VM
    • Select 'Import existing disk image'
    • Click 'Forward'
  2. Step 2 of 4
    • Select the qcow2 image
    • OS Type: Linux
    • Version: Red Hat Enterprise Linux 6
    • Select 'Forward'
  3. Step 3 of 4
    • RAM: 1024MB
    • CPUs: 1
    • Select 'Forward'
  4. Step 4 of 4
    • Select the advanced options
    • Host device should be br0
    • Virt Type: KVM
    • Arch: x86_64
    • Finish
  • Shut down the VM as soon as it starts; the disk format needs to be edited first
  • Select the 'i' or Information tab, Select the disk option
  • Make sure the Storage Format is qcow2
  • Make sure the Disk Bus is IDE
  • Power on the VM and let it boot
  • Check the VM settings (IP address, DHCP, etc.) through a VNC session
# Note: keys were not working correctly through virt-manager, so we had to use vncviewer instead
yum install tsclient
vncviewer localhost:0

You can switch back to ssh once you have the IP address. The default username is root and the default password is password.
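
If no X11 session is available, the same import can probably be done from the command line with virt-install. The sketch below only prints the command so it can be reviewed first. The VM name `cloudx` and the helper `virt_install_cmd` are assumptions, the other values mirror the wizard steps above, and the command has not been verified against the virt-install version shipped with CentOS 6.4.

```shell
# Print (do not run) a virt-install command that imports the image at path $1.
# Assumption: the VM is named "cloudx"; RAM, CPU, bus and bridge follow the steps above.
virt_install_cmd() {
    echo virt-install --import --name cloudx \
        --ram 1024 --vcpus 1 \
        --disk "path=$1,format=qcow2,bus=ide" \
        --network bridge=br0 \
        --os-variant rhel6 --graphics vnc --noautoconsole
}

# Example: print the command, review it, then paste it into a root shell.
# virt_install_cmd /path/to/image.qcow2
```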

CloudX Setup (on the VM)

  • Log in to the VM (root/password)
  • Verify the configuration settings in: /opt/cloudx_install/conf/cloudx.conf
    • Note: Our first installation did not use a Mellanox switch, so we had to change the Fabric Preparation setting to False
# in the file /opt/cloudx_install/conf/cloudx.conf
fabric_preparation = False

# These are the settings Mellanox recommends if you have a non-Mellanox switch
# Per switch:
dcb priority-flow-control enable force
dcb priority-flow-control priority 3 enable
interface ethernet 1/1-1/{ports} dcb priority-flow-control mode on force
interface ethernet 1/1-1/{ports} mtu {mtu} force
vlan {min_vlan}-{max_vlan}   (default is 10 vlans)
 
# Per port:
interface ethernet 1/{portnum} switchport mode hybrid
interface ethernet 1/{portnum} switchport hybrid allowed-vlan all
 
*** Make sure the switch is configured with the same VLAN range as cloudx.conf (the min_vlan/max_vlan parameters)
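
To avoid typos when filling in the placeholders, the per-switch block can be generated from the values used in cloudx.conf. A sketch only: `render_switch_config` is a hypothetical helper and the example values are made up.

```shell
# Print the per-switch commands for ports 1..$1, MTU $2 and VLAN range $3-$4.
render_switch_config() {
    cat <<EOF
dcb priority-flow-control enable force
dcb priority-flow-control priority 3 enable
interface ethernet 1/1-1/$1 dcb priority-flow-control mode on force
interface ethernet 1/1-1/$1 mtu $2 force
vlan $3-$4
EOF
}

# Example with made-up values (36 ports, MTU 9000, VLANs 1 to 10):
# render_switch_config 36 9000 1 10
```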


  • Verify the servers configuration in: /opt/cloudx_install/conf/servers.csv
# This example is from the blades in the lab, with:
# blade1: Storage
# blade2: Operation Node (CloudX Host)
# blade3: Network node
# blade4: Controller (Openstack Controller)
# cloudx: Installer (this is the CloudX VM). Note: the 192.x address does nothing for this system; the password is the VM default (password)
# blade9/10: Compute
# Notes: IP is the eth0 interface, MAC/GUID is set to x as it does not work yet, Inband is the address on the Mellanox adapter
[david@head-boston cloudx]$ cat servers.csv 
IP,MAC/GUID,Inband,Username,Password,Card,Port,Role,Exclude
172.28.15.10,x,192.168.0.10,root,Boston2014,mlx4_0,2,Compute,n
172.28.15.9,x,192.168.0.9,root,Boston2014,mlx4_0,2,Compute,n
172.28.15.3,x,192.168.0.3,root,Boston2014,mlx4_0,2,Network,n
172.28.15.4,x,192.168.0.4,root,Boston2014,mlx4_0,2,Controller,n
172.28.15.1,x,192.168.0.1,root,Boston2014,mlx4_0,2,Storage,n
172.28.15.192,x,192.168.0.5,root,password,mlx4_0,2,Installer,n
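
A malformed row is easy to miss in this file, so a quick sanity check before launching the installer can help. This is a sketch: `check_servers_csv` is a hypothetical helper, and the list of roles is taken from the example above and may be incomplete.

```shell
# Report rows of servers.csv ($1) that do not have 9 fields or use an unknown role.
# Assumption: the valid roles are exactly those seen in the example file.
check_servers_csv() {
    awk -F, '
        NR == 1 || NF == 0 { next }    # skip the header row and empty lines
        NF != 9 { print "line " NR ": expected 9 fields, got " NF; bad = 1; next }
        $8 !~ /^(Compute|Network|Controller|Storage|Installer)$/ {
            print "line " NR ": unknown role " $8; bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example:
# check_servers_csv /opt/cloudx_install/conf/servers.csv && echo "servers.csv looks sane"
```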
  • Verify the switches configuration in: /opt/cloudx_install/conf/switches.csv
# Note: Mellanox default username/pass: admin/admin
[root@cloudx ~]# cat /opt/cloudx_install/conf/switches.csv 
Role,Hostname,Username,Password
spine_0,172.28.250.103,admin,admin
  • Other Host Preparations
# Apply the CloudX patch file
cd /opt/cloudx_install/conf/
patch -p0 < cloudx.patches
  • Make sure the host has an FQDN, e.g. cloudx.boston.co.uk
# Update the file /etc/sysconfig/network
HOSTNAME=cloudx.boston.co.uk
# Update /etc/hosts
172.28.0.220	cloudx.boston.co.uk	cloudx
# Ensure the following command returns the FQDN
[root@cloudx ~]# hostname --fqdn
cloudx.boston.co.uk
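
The same check can be scripted so it fails loudly before the installer runs. A minimal sketch, where `is_fqdn` is a hypothetical helper that only tests the shape of the name (at least one dot, no empty labels) and does not query DNS:

```shell
# Succeed if $1 looks fully qualified: contains a dot and has no empty labels.
is_fqdn() {
    case "$1" in
        ""|.*|*.|*..*) return 1 ;;   # empty, leading/trailing dot, or empty label
        *.*) return 0 ;;             # at least one dot between labels
        *) return 1 ;;               # short name only
    esac
}

# Example:
# is_fqdn "$(hostname --fqdn)" || echo "hostname is not fully qualified"
```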
  • Once all the above is completed, you can launch the installer script.
# Use screen so the installer survives a dropped ssh session; the installation can take some time
yum install screen 
screen -S cloudx
/opt/cloudx_install/scripts/cloudx_installer.sh