xCAT Installation on CentOS 8

Base Level Setup

Note: The starting point is a CentOS 8.2 minimal install with networking already configured.

We start by disabling SELinux and installing a few base packages (the hostname is set in the next step)

setenforce 0
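# edit the real file: /etc/sysconfig/selinux is a symlink, and sed -i would replace it with a plain copy
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config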
yum -y install vim tmux wget rsyslog
systemctl enable rsyslog
systemctl start rsyslog
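To confirm what the machine will do after the next reboot, the configured mode can be read back from the config file. The sketch below is only an illustration: the function name `selinux_configured_mode` is made up for this page, and the demo runs against a throwaway file rather than the real /etc/selinux/config.

```shell
# Print the SELINUX= value from a config file (default: /etc/selinux/config).
selinux_configured_mode() {
    local cfg
    cfg="${1:-/etc/selinux/config}"
    # Commented lines and SELINUXTYPE= do not match; the last SELINUX= line wins.
    awk -F= '/^[[:space:]]*SELINUX=/ { mode = $2 } END { print mode }' "$cfg"
}

# Demo on a temporary copy so nothing on the host is modified.
demo_cfg="$(mktemp)"
printf '%s\n' '# sample config' 'SELINUX=enforcing' 'SELINUXTYPE=targeted' > "$demo_cfg"
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' "$demo_cfg"
selinux_configured_mode "$demo_cfg"
rm -f "$demo_cfg"
```

Note that `setenforce 0` only switches the running system to permissive mode; the config file change takes effect at the next boot.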


Setup naming and hosts file

Set the hostname

hostnamectl set-hostname deploy.dt.internal 

Set the /etc/hosts file

# Ensure there is a /etc/hosts entry for the internal interface

192.168.102.253 deploy.dt.internal      deploy
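If the hosts entry is added from a script, it helps to make the step safe to re-run. A minimal sketch, assuming the IP and names used above; `ensure_hosts_entry` is a hypothetical helper, and the demo writes to a temporary file instead of /etc/hosts.

```shell
# Append "ip fqdn short" to a hosts file unless the FQDN is already mapped.
ensure_hosts_entry() {
    local file ip fqdn short
    file="$1"; ip="$2"; fqdn="$3"; short="$4"
    # Look for the FQDN as a hostname field on any non-comment line.
    if awk -v name="$fqdn" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\t%s\n' "$ip" "$fqdn" "$short" >> "$file"
}

# Demo: calling it twice leaves a single entry.
demo_hosts="$(mktemp)"
echo '127.0.0.1 localhost' > "$demo_hosts"
ensure_hosts_entry "$demo_hosts" 192.168.102.253 deploy.dt.internal deploy
ensure_hosts_entry "$demo_hosts" 192.168.102.253 deploy.dt.internal deploy
cat "$demo_hosts"
rm -f "$demo_hosts"
```

On the head node the first argument would be /etc/hosts.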

Set up the firewall to masquerade traffic from the compute nodes to the external network

# enp2s0 / enp3s0 are the internal and external interfaces. TODO: test with the external interface in the public zone, locked down to ssh / web only
systemctl enable firewalld
systemctl restart firewalld
firewall-cmd --permanent --zone=trusted --change-interface=enp2s0 
firewall-cmd --permanent --zone=trusted --change-interface=enp3s0 
firewall-cmd --permanent --add-masquerade --zone=trusted 
firewall-cmd --reload 

Setup the software repos

yum -y install yum-utils
wget --no-check-certificate -P /etc/yum.repos.d https://xcat.org/files/xcat/repos/yum/latest/xcat-core/xcat-core.repo

yum -y install centos-release-stream
wget --no-check-certificate -P /etc/yum.repos.d https://xcat.org/files/xcat/repos/yum/xcat-dep/rh8/x86_64/xcat-dep.repo

Add provisioning services on the headnode

yum -y install xCAT 
echo ". /etc/profile.d/xcat.sh" >> ~/.bashrc
source ~/.bashrc 

Setup Networking for provisioning

In this example we are using enp1s0f1 as the internal interface (i.e. the PXE boot / provisioning interface)

# setup networking for provisioning
sms_eth_internal=enp1s0f1
sms_ip=192.168.102.253
internal_netmask=24    # prefix length, i.e. 255.255.255.0

# these were not needed here because the interface was already configured - revisit if issues appear later on
# ("broadcast +" derives the broadcast address by setting all host bits of the prefix)
ip link set dev ${sms_eth_internal} up
ip address add ${sms_ip}/${internal_netmask} broadcast + dev ${sms_eth_internal}

# set the DHCP interface
chdef -t site dhcpinterfaces="xcatmn|enp1s0f1"
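The `broadcast +` argument in the `ip address add` command above tells ip to work out the broadcast address itself, by setting every host bit of the prefix to one. The sketch below does the same arithmetic by hand so the result can be checked; `broadcast_of` is a made-up helper name, not part of iproute2 or xCAT.

```shell
# Compute the IPv4 broadcast address for an address and prefix length.
broadcast_of() {
    local ip prefix a b c d addr hostmask bcast
    ip="$1"; prefix="$2"
    # Split the dotted quad into four octets.
    IFS=. read -r a b c d <<EOF
$ip
EOF
    addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Host mask: the low (32 - prefix) bits all set to one.
    hostmask=$(( (1 << (32 - prefix)) - 1 ))
    bcast=$(( addr | hostmask ))
    printf '%d.%d.%d.%d\n' \
        $(( (bcast >> 24) & 255 )) $(( (bcast >> 16) & 255 )) \
        $(( (bcast >> 8) & 255 )) $(( bcast & 255 ))
}

broadcast_of 192.168.102.253 24    # prints 192.168.102.255
```

After running the real command, `ip address show dev enp1s0f1` should list the same value after `brd`.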

Add the CentOS DVD to the installer

Update the URL for more recent versions if required.

NOTE: Don't use the minimal ISO; grab the larger DVD ISO.


# grab the centos dvd iso to create a local repo
wget http://mirrors.ukfast.co.uk/sites/ftp.centos.org/8.2.2004/isos/x86_64/CentOS-8.2.2004-x86_64-dvd1.iso

copycds ./CentOS-8.2.2004-x86_64-dvd1.iso


Query available images

lsdef -t osimage 
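The image names listed by lsdef follow a pattern, which is where the name centos8-x86_64-install-compute used later on this page comes from. As far as I understand the convention, the fields are OS version, architecture, provisioning method and profile, separated by hyphens; treat that reading as an assumption. A small sketch that splits a name into those fields (`parse_osimage` is a hypothetical helper):

```shell
# Split an osimage name of the form <osver>-<arch>-<provmethod>-<profile>.
parse_osimage() {
    local osver arch method profile
    # Anything after the third hyphen is kept as the profile.
    IFS=- read -r osver arch method profile <<EOF
$1
EOF
    printf 'osver=%s arch=%s provmethod=%s profile=%s\n' \
        "$osver" "$arch" "$method" "$profile"
}

parse_osimage centos8-x86_64-install-compute
# prints: osver=centos8 arch=x86_64 provmethod=install profile=compute
```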

Enable ssh during the installation on the compute nodes

# enables ssh to the node during installation
chdef -t site clustersite xcatdebugmode=2
lsdef -t site clustersite

Enable the time service (chronyd)

# in /etc/chrony.conf, uncomment the allow line and set it to the cluster network so the compute nodes can sync from this host
# Allow NTP client access from local network.
allow 192.168.0.0/16

# add upstream servers to the end of /etc/chrony.conf (the pool below is for Indonesia; pick the pool for your region)
echo "
server 1.id.pool.ntp.org
server 2.id.pool.ntp.org
server 3.id.pool.ntp.org
server 4.id.pool.ntp.org" >> /etc/chrony.conf

systemctl enable chronyd
systemctl start chronyd
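Running the echo above a second time appends the four server lines again. If the setup is scripted, a guard like the one below avoids duplicates. This is a sketch only: `add_chrony_server` is a made-up helper, and the demo writes to a temporary file rather than /etc/chrony.conf.

```shell
# Append "server <name>" to a chrony config unless that server is already listed.
add_chrony_server() {
    local conf server
    conf="$1"; server="$2"
    if awk -v s="$server" '$1 == "server" && $2 == s { found = 1 } END { exit !found }' \
        "$conf" 2>/dev/null; then
        return 0
    fi
    printf 'server %s\n' "$server" >> "$conf"
}

# Demo: two passes over the same list still give four lines.
demo_conf="$(mktemp)"
for pass in 1 2; do
    for n in 1 2 3 4; do
        add_chrony_server "$demo_conf" "$n.id.pool.ntp.org"
    done
done
cat "$demo_conf"
rm -f "$demo_conf"
```

Once chronyd is running, `chronyc sources` shows whether the servers are reachable.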

Add a node to the installation.

# add a node
# note: serialport=0 (ttyS0) AIC Servers default
#       serialport=1 (ttyS1) Supermicro Servers default
[root@deploy ~]# mkdef -t node node0001 groups=compute,all ip=192.168.102.10 mac=00:15:B2:AA:E2:60 netboot=xnba arch=x86_64 bmc=192.168.101.10 bmcusername=admin bmcpassword=admin mgt=ipmi serialport=0 serialspeed=115200 provmethod=centos8-x86_64-install-compute
1 object definitions have been created or modified.

# set the root password
chtab key=system passwd.username=root passwd.password=`openssl rand -base64 12`

# set the domain name
chdef -t site domain=dt.internal 
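For more than a handful of nodes, the mkdef line above can be generated from a list. The sketch below only prints the commands so they can be reviewed first. Assumptions: the input format (name, IP, MAC per line) and the helper name `gen_mkdef` are invented for this example, the second node's MAC is a placeholder, and the BMC and serial attributes from the command above are left out for brevity.

```shell
# Print (not run) one mkdef command per "name ip mac" line read from stdin.
gen_mkdef() {
    local name ip mac
    while read -r name ip mac; do
        # Skip blank lines and comment lines.
        case "$name" in
            ''|'#'*) continue ;;
        esac
        echo "mkdef -t node $name groups=compute,all ip=$ip mac=$mac netboot=xnba arch=x86_64 mgt=ipmi provmethod=centos8-x86_64-install-compute"
    done
}

gen_mkdef <<'EOF'
# name    ip              mac
node0001  192.168.102.10  00:15:B2:AA:E2:60
node0002  192.168.102.11  00:15:B2:AA:E2:61
EOF
```

Once the printed commands look right, the same output can be piped to `sh` on the head node.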


Finalise the setup. The commands below create the /tftpboot files, the /etc/hosts entries, etc.

# finalise the setup 
# Complete network service configurations
# Please note: you will see a warning that the network has no dynamic range. It is safe to ignore this.
makehosts
makenetworks 
makedhcp -n
makedns -n
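The four commands above depend on each other, so when they are scripted it is worth stopping at the first failure. A minimal sketch of that idea; `run_steps` is a hypothetical wrapper, and the demo uses harmless stand-in commands so it can be tried anywhere.

```shell
# Run each step in order; stop and report at the first one that fails.
run_steps() {
    local step
    for step in "$@"; do
        echo "==> $step"
        # Unquoted on purpose: "makedhcp -n" is split into command and argument.
        if ! $step; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}

# On the head node this would be:
#   run_steps makehosts makenetworks "makedhcp -n" "makedns -n"
run_steps true "echo ok"
```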


Check the rcons status

# might not be needed:
# systemctl enable goconserver
# systemctl start goconserver
makegocons
makegocons -q


Set the nodes to pxeboot

# Associate desired provisioning image for computes
nodeset compute osimage=centos8-x86_64-install-compute
# be careful with rinstall on AMD EPYC nodes because of EFI boot support
rinstall node0001 osimage=centos8-x86_64-install-compute

Setup Groups for parallel tools

# at a minimum we should setup groups for controllers and hypervisors

[root@stu-dpy1 partition]# mkdef -t group -o controllers members="stu-hcb1-n1,stu-hcb2-n1,stu-hcb3-n1"                                                                         
1 object definitions have been created or modified.

[root@stu-dpy1 partition]# lsdef -t group  controllers
Object name: controllers
    grouptype=static
    members=stu-hcb1-n1,stu-hcb2-n1,stu-hcb3-n1 

[root@stu-dpy1 partition]# psh controllers uptime 
stu-hcb3-n1:  07:43:47 up 8 days,  1:49,  1 user,  load average: 2.23, 1.14, 0.74
stu-hcb1-n1:  07:43:47 up 17 days, 14:17,  1 user,  load average: 0.44, 0.50, 0.70
stu-hcb2-n1:  07:43:47 up 8 days,  2:01,  0 users,  load average: 2.47, 2.09, 1.22

Useful commands


# check the node stats
lsdef -t node -l
lsdef -o node0001

# check the operating systems stats 
lsdef -t osimage -o centos8-x86_64-install-compute

# change a node mac address
makedhcp -d <nodename>
chdef -t node -o  <nodename> mac=<new-mac>
makedhcp <nodename>
lsdef -o <nodename>
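The MAC change is three commands that must run in order, so it is a natural candidate for a small wrapper. The sketch below checks the MAC format first and, by default, only prints what it would run. `change_mac` and the `DRY_RUN` variable are names invented for this example; the replacement MAC in the demo is a placeholder.

```shell
# Replace a node's MAC address: drop the old DHCP entry, update the node, add the new entry.
# Prints the commands by default; set DRY_RUN=0 to execute them.
change_mac() {
    local node mac run
    node="$1"; mac="$2"; run=echo
    if [ "${DRY_RUN:-1}" = 0 ]; then run=; fi
    # Expect six colon-separated hex octets.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "bad MAC address: $mac" >&2
        return 1
    fi
    $run makedhcp -d "$node" &&
    $run chdef -t node -o "$node" mac="$mac" &&
    $run makedhcp "$node"
}

change_mac node0001 00:15:B2:AA:E2:61
```

On the head node, `DRY_RUN=0 change_mac <nodename> <new-mac>` would run the three commands for real.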

# connect to the serial-over-LAN console
rcons <nodename>
# to exit: press Ctrl-e, then c, then .

# reinstall a node
# currstate=boot means the node boots from its local disk; nodeset switches it to install
nodeset node0001 osimage=centos8-x86_64-install-compute
[root@deploy nets]# lsdef node0001 | grep currstate 
    currstate=install centos8-x86_64-compute
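When checking many nodes it is handy to pull just the state out of the lsdef output. A small sketch, fed here with the sample output shown above; `currstate_of` is a made-up helper name.

```shell
# Read lsdef output on stdin and print the value after "currstate=".
currstate_of() {
    sed -n 's/^[[:space:]]*currstate=//p'
}

# Demo using the output line from above.
printf '%s\n' 'Object name: node0001' '    currstate=install centos8-x86_64-compute' | currstate_of
# prints: install centos8-x86_64-compute
```

On the head node the input would come from lsdef, for example `lsdef node0001 -i currstate | currstate_of`.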

# at a console login and need the root password? it is stored in the passwd table
tabedit passwd

# setup RAID1 for the compute nodes
mkdir -p /install/custom/partition/
wget https://raw.githubusercontent.com/xcat2/xcat-extensions/master/partition/raid1_rh.sh -O /install/custom/partition/raid1_rh.sh

# Set the partition file as part of the provision
chdef -t osimage centos8-x86_64-install-compute partitionfile="s:/install/custom/partition/raid1_rh.sh"

# reprovision the node
rinstall node0001 osimage=centos8-x86_64-install-compute

# disk notes: check status of RAID array
cat /proc/mdstat
mdadm --detail /dev/mdX
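In /proc/mdstat a healthy two-disk mirror shows `[UU]` and a mirror with a missing member shows an underscore, e.g. `[U_]`. The sketch below lists arrays in the second state. The helper name `degraded_arrays` and the sample file contents are made up for illustration; on a node the function would read /proc/mdstat directly.

```shell
# Print the name of every md array whose status field contains "_" (a missing member).
degraded_arrays() {
    local file
    file="${1:-/proc/mdstat}"
    awk '
        /^md[0-9]+ :/ { array = $1 }
        # The status field looks like [UU] or [U_] and follows the array line.
        array != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print array
            array = ""
        }
    ' "$file"
}

# Demo with an illustrative mdstat: md0 is healthy, md1 has lost a disk.
demo_mdstat="$(mktemp)"
cat > "$demo_mdstat" <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      975585280 blocks super 1.2 [2/1] [U_]

unused devices: <none>
EOF
degraded_arrays "$demo_mdstat"    # prints md1
rm -f "$demo_mdstat"
```

Combined with the parallel shell, something like `psh compute cat /proc/mdstat` gives the raw data for a whole group.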


# add a new node to an existing cluster