Orch:Headnode install
In the following we assume the availability of a single head node (master), at least one compute node, and the Intel HPC Orchestrator ISO file (if you don't have it, contact David). The master node is provisioned with CentOS 7.2 and serves as the overall system management server (SMS). In its role as an SMS, the master node is configured to provision the compute nodes in a stateless configuration using Warewulf.
We assume the ISO file has been copied to a location such as /tmp/Intel_HPC_Orchestrator-rhel7.2-16.01.004.ga.iso
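If the image is still on your workstation, one way to get it there is scp. A minimal sketch, assuming the head node is reachable as "master" and you have root access (adjust the hostname and paths to your environment):
# Copy the ISO from the current directory to /tmp on the head node
scp Intel_HPC_Orchestrator-rhel7.2-16.01.004.ga.iso root@master:/tmp/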
Enable local Intel® HPC Orchestrator repository
On the head node, mount the image and enable Orchestrator as a local repository using the "hpc-orch-release" RPM package.
mkdir -p /mnt/hpc_orch_iso
mount -o loop /tmp/Intel_HPC_Orchestrator-rhel7.2-16.01.004.ga.iso /mnt/hpc_orch_iso ; echo $?
rpm -Uvh /mnt/hpc_orch_iso/x86_64/Intel_HPC_Orchestrator_release-*.x86_64.rpm
rpm --import /etc/pki/pgp/HPC-Orchestrator*.asc
rpm --import /etc/pki/pgp/PSXE-keyfile.asc
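As an optional sanity check, the newly enabled local repository should now show up in yum's repository list. The exact repository id is an assumption here and may differ on your system:
# The Orchestrator repository should appear among the enabled repositories
yum repolist enabled | grep -i orchestrator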
Add provisioning services to the master node
With the Intel® HPC Orchestrator repository enabled, we proceed by adding the orch-base and Warewulf provisioning package groups to the master node.
yum -y groupinstall orch-base
yum -y groupinstall orch-warewulf
Provisioning services with Warewulf rely on the DHCP, TFTP, and HTTP network protocols, and the default firewall rules may block these services. We therefore disable the firewall for the duration of the installation. (Once the installation is complete, it is highly recommended to re-enable the firewall on the head node and configure it to allow only SSH access on port 22 from the external interface, while still allowing traffic on the internal interfaces of the system; a sketch of such a configuration is given right after the two commands below.)
rpm -q firewalld && systemctl disable firewalld
rpm -q firewalld && systemctl stop firewalld
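When the system is up and running, the firewall can be brought back in the restricted form described above. A minimal sketch with firewalld, assuming eth0 is the external interface and eth1 the cluster-internal interface (the interface names and zone layout are assumptions; adapt them to your hardware and site policy):
systemctl enable firewalld
systemctl start firewalld
# Public zone faces the outside world: bind the external interface and allow SSH only
firewall-cmd --permanent --zone=public --change-interface=eth0
firewall-cmd --permanent --zone=public --add-service=ssh
firewall-cmd --permanent --zone=public --remove-service=dhcpv6-client
# Trusted zone accepts all traffic from the cluster-internal network
firewall-cmd --permanent --zone=trusted --change-interface=eth1
firewall-cmd --reload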
Intel® HPC Orchestrator relies on synchronized clocks throughout the system and uses the NTP protocol to facilitate this synchronization. To enable NTP services with a specific server ${ntp_server}, issue the following on the head node:
systemctl enable ntpd.service
# Disable default external servers
sed -i 's|^server|#server|' /etc/ntp.conf
echo "server ${ntp_server}" >> /etc/ntp.conf
echo "server 127.127.1.0 # local clock" >> /etc/ntp.conf
echo "fudge 127.127.1.0 stratum 10" >> /etc/ntp.conf
systemctl restart ntpd
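Once ntpd has restarted, its peer status can be inspected to confirm that the daemon is talking to the configured server (the output will depend on your ${ntp_server}):
# List configured peers and their synchronization status
ntpq -p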
Add resource management services to the master node
The following command adds the SLURM workload manager server components to the head node. Later on, client-side components will be added to the compute image.
yum -y groupinstall orch-slurm-server
# Add PDSH support to determine the nodelist of a Slurm job and run a command on those nodes
yum -y install pdsh-mod-slurm-orch
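The slurm module for pdsh adds a -j option that expands a job id into the list of nodes allocated to that job. A purely illustrative example (the job id 42 is an assumption, and this only works once SLURM is configured and a job is running):
# Run uptime on every node allocated to SLURM job 42
pdsh -j 42 uptime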
SLURM requires the designation of a system user that runs the underlying resource management daemons. The default configuration file supplied with the Intel® HPC Orchestrator build of SLURM identifies this SlurmUser as a dedicated user named "slurm", and this user must exist.
getent passwd slurm || useradd slurm
SLURM can also be configured to control which local resource limits get propagated to a user's allocated resources by enabling SLURM's PAM support.
perl -pi -e "s|^#UsePAM=|UsePAM=1|" /etc/slurm/slurm.conf
cat <<- HERE > /etc/pam.d/slurm
account required pam_unix.so
account required pam_slurm.so
auth required pam_localuser.so
session required pam_limits.so
HERE
By default, all resource limits are propagated from the session the user submitted the job from. With PAM support enabled, additional options can be added to SLURM's configuration; for example, adding PropagateResourceLimitsExcept=NOFILE prevents the user's resource limit on open files from being set on their allocated nodes.
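A minimal sketch of such a change, following the same append-to-config style used above (whether NOFILE is the right exception is a site policy decision, not a requirement of this guide):
# Do not force the submitting shell's open-file limit onto allocated nodes
echo "PropagateResourceLimitsExcept=NOFILE" >> /etc/slurm/slurm.conf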
Add genders
This is not required for a basic installation.