LSF Multicluster
- Note: see the lsf.shared bug at the bottom of this page. These changes are not maintained by PCM 3.0 (and previous versions).
Multicluster License
- Ensure your license includes a line with lsf_multicluster, otherwise request one from Platform
FEATURE lsf_multicluster lsf_ld 7.000 31-JUL-2011 0 AD3E3C81D267A3C1C0B6 "Platform" DEMO
Configuration Files
The configuration below:
- uses two PCM 3.0 clusters (pcm30, pcm-mctest)
- pcm30 will forward jobs on to pcm-mctest
- pcm-mctest will receive jobs from pcm30
- All changes were made in /etc/cfm/templates/lsf/
- lsf.cluster (or default.lsf.cluster), two additions: RemoteClusters and PRODUCTS
# Update the PRODUCTS line to include LSF_MultiCluster
PRODUCTS=LSF_Base LSF_Manager LSF_MultiCluster
# Drop the following in at the end of the file to define the remote clusters.
# Note: names are the cluster names as defined by LSF (typically hostname_cluster1)
Begin RemoteClusters
CLUSTERNAME
pcm30_cluster1
pcm-mctest_cluster1
End RemoteClusters
- lsf.shared (or default.lsf.shared)
# Note: replaced XXX_clustername_XXX with the cluster name; this is fine provided the cluster name doesn't change
Begin Cluster
ClusterName Servers
pcm30_cluster1 pcm30
pcm-mctest_cluster1 pcm-mctest
End Cluster
###### NOTE: PROBLEM, THIS INFO IS NOT SYNCED AFTER ADDHOST -U ######
###### SEE RESOLUTION AT THE BOTTOM OF THIS PAGE               ######
- lsf.conf (or default.lsf.conf)
# Enable multicluster, append to the end of the file
MC_PLUGIN_REMOTE_RESOURCE=y
Multicluster Model
- Job forwarding model
In this model, the cluster that is starving for resources sends jobs over to the cluster that has resources to spare. To work together, two clusters must set up compatible send-jobs and receive-jobs queues. With this model, scheduling of MultiCluster jobs is a process with two scheduling phases: the submission cluster selects a suitable remote receive-jobs queue, and forwards the job to it; then the execution cluster selects a suitable host and dispatches the job to it. This method automatically favors local hosts; a MultiCluster send-jobs queue always attempts to find a suitable local host before considering a receive-jobs queue in another cluster.
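A minimal sketch of the forwarding flow, assuming the sendq/receiveq queues configured later on this page (exact bjobs output will vary):
# Submit from the sending cluster; LSF first tries local hosts, then forwards the job
bsub -q sendq sleep 60
# bjobs -l on the job ID should show the forwarding destination,
# e.g. receiveq@pcm-mctest_cluster1
bjobs -l <JOBID>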
- Resource leasing model
In this model, the cluster that is starving for resources takes resources away from the cluster that has resources to spare. To work together, the provider cluster must export resources to the consumer, and the consumer cluster must configure a queue to use those resources. In this model, each cluster schedules work on a single system image, which includes both borrowed hosts and local hosts.
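As a quick illustration of the single system image (using the resourceborrowq queue and exported host configured further down this page):
# On the consumer cluster, the leased host appears next to local hosts as host@cluster
bhosts -w
# Jobs submitted to the borrow queue can be dispatched to local or borrowed slots
bsub -q resourceborrowq sleep 60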
- Choosing a model
- Consider your own goals and priorities when choosing the best resource-sharing model for your site.
- The job forwarding model can make resources available to jobs from multiple clusters; this flexibility allows maximum throughput when each cluster's resource usage fluctuates.
- The resource leasing model can allow one cluster exclusive control of a dedicated resource; this can be more efficient when there is a steady amount of work.
- The lease model is the most transparent to users and supports the same scheduling features as a single cluster.
- The job forwarding model has a single point of administration, while the lease model shares administration between provider and consumer clusters.
Job Forwarding Model
- lsb.queues (or lsbatch/default/configdir/lsb.queues)
- On the cluster sending jobs (pcm30), create a queue with SNDJOBS_TO
Begin Queue
QUEUE_NAME = sendq
PRIORITY = 40
HOSTS = none
SNDJOBS_TO = receiveq@pcm-mctest_cluster1
End Queue
- On the cluster receiving jobs (pcm-mctest), create a queue with RCVJOBS_FROM
Begin Queue
QUEUE_NAME = receiveq
RCVJOBS_FROM = sendq@pcm30_cluster1
HOSTS = all
End Queue
Resource Sharing Model
- In this example, cluster pcm30_cluster1 (head node pcmtest) is exporting a single node (pcmcomp000) to vhpchead
- lsb.resources file on pcmtest
Begin HostExport
PER_HOST = pcmcomp000 # export host list
SLOTS = 12 # for each host, export 12 job slots
DISTRIBUTION = [vhpchead_cluster1, 6] # share distribution for remote clusters:
# cluster <vhpchead_cluster1> has 6 shares,
End HostExport
- lsb.queues file on vhpchead
# resource borrow queue
Begin Queue
QUEUE_NAME = resourceborrowq
PRIORITY = 40
HOSTS = compute005 pcmcomp000@pcm30_cluster1 # 2 hosts on this queue, one remote host pcmcomp000
DESCRIPTION = Resource Borrow Queue
End Queue
- Verify jobs are being run correctly
[root@pcmcomp000 ~]# bclusters
[Job Forwarding Information ]
LOCAL_QUEUE JOB_FLOW REMOTE CLUSTER STATUS
receiveq recv - vhpchead_c ok
[Resource Lease Information ]
REMOTE_CLUSTER RESOURCE_FLOW STATUS
vhpchead_cluste EXPORT ok
# Check the hosts that are being exported:
[root@pcmcomp000 ~]# bhosts -e
HOST_NAME MAX NJOBS RUN SSUSP USUSP RSV
pcmcomp000 12 3 3 0 0 0
- Also check output from vhpchead
[david@vhpchead multicluster]$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
21472 david RUN resourcele vhpchead compute005 sleep 60 Aug 3 17:20
21474 david RUN resourcele vhpchead compute005 sleep 60 Aug 3 17:20
21476 david RUN resourcele vhpchead compute005 sleep 60 Aug 3 17:20
21473 david RUN resourcele vhpchead pcmcomp000@ sleep 60 Aug 3 17:20
21475 david RUN resourcele vhpchead pcmcomp000@ sleep 60 Aug 3 17:20
21477 david PEND resourcele vhpchead sleep 60 Aug 3 17:20
[david@vhpchead multicluster]$ bclusters
[Job Forwarding Information ]
LOCAL_QUEUE JOB_FLOW REMOTE CLUSTER STATUS
sendq send receiveq pcm30_clus ok
[Resource Lease Information ]
REMOTE_CLUSTER RESOURCE_FLOW STATUS
pcm30_cluster1 IMPORT ok
[david@vhpchead ~]$ bhosts -w
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
compute000 ok - 12 0 0 0 0 0
compute001 ok - 12 0 0 0 0 0
compute002 ok - 24 0 0 0 0 0
compute003 ok - 8 0 0 0 0 0
compute004 ok - 8 0 0 0 0 0
compute005 ok - 8 0 0 0 0 0
compute007 ok - 8 0 0 0 0 0
compute008 ok - 8 0 0 0 0 0
compute009 ok - 8 0 0 0 0 0
pcmcomp000@pcm30_cluster1 ok - 12 0 0 0 0 0
vhpchead ok - 8 0 0 0 0 0
Update System
addhost -u
# Or if you edited files in /opt/lsf/conf
lsadmin reconfig
badmin mbdrestart
- Note: if the configuration doesn't apply correctly, run the lsadmin and badmin commands listed above to verify the configuration files (addhost -u does not report configuration errors correctly!)
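- To check the configuration files for errors without reconfiguring anything, the LSF checker subcommands can also be run:
lsadmin ckconfig -v
badmin ckconfig -v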
Setup IPtables
- The LSF processes on the two clusters will try to communicate; by default only SSH traffic is allowed on eth1, so update iptables on both servers (e.g. edit /etc/sysconfig/iptables and run 'service iptables restart')
# Generated by iptables-save v1.3.5 on Fri Jul 1 17:40:27 2011
*nat
:PREROUTING ACCEPT [1339:165189]
:POSTROUTING ACCEPT [205:14830]
:OUTPUT ACCEPT [516:36221]
-A POSTROUTING -o eth1 -j MASQUERADE
COMMIT
# Completed on Fri Jul 1 17:40:27 2011
# Generated by iptables-save v1.3.5 on Fri Jul 1 17:40:27 2011
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [43133:352914090]
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 8080 -j ACCEPT
-A INPUT -i eth0 -p tcp -m state --state NEW -m tcp --dport 8080 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 53 -j ACCEPT
-A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 53 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 873 -j ACCEPT
-A INPUT -i eth1 -p tcp -m state --state NEW -m tcp --dport 5432 -j ACCEPT
# multicluster
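# (assumption: 7869/6878/6881/6882 are the stock LSF ports; if LSF_LIM_PORT, LSF_RES_PORT,
# LSB_MBD_PORT or LSB_SBD_PORT are overridden in lsf.conf, open those values instead)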
-A INPUT -i eth1 --source 172.28.10.0/24 -p tcp -m state --state NEW -m tcp --dport 7869 -j ACCEPT
-A INPUT -i eth1 --source 172.28.10.0/24 -p tcp -m state --state NEW -m tcp --dport 6878 -j ACCEPT
-A INPUT -i eth1 --source 172.28.10.0/24 -p tcp -m state --state NEW -m tcp --dport 6881 -j ACCEPT
-A INPUT -i eth1 --source 172.28.10.0/24 -p tcp -m state --state NEW -m tcp --dport 6882 -j ACCEPT
# end multicluster
-A INPUT -i eth0 -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i eth1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i eth0 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i eth0 -j ACCEPT
COMMIT
# Completed on Fri Jul 1 17:40:27 2011
/etc/hosts - add clusters
- On each node, add the other cluster's head node to the hosts file (if not using external DNS that can resolve both hostnames)
- As these are cluster-external hosts, add them to /etc/hosts.append
# /etc/hosts.append on pcm-mctest
172.28.10.69 pcm30.viglen.co.uk pcm30
- Then update the hosts file and sync it across the cluster
kusu-genconfig hosts > /etc/hosts
cfmsync -f
Check MultiCluster Status
- Use bclusters and lsclusters
- Status should be ok; if you see disc there may be communication problems
[root@pcm-mctest lsf]# bclusters
[Job Forwarding Information ]
LOCAL_QUEUE JOB_FLOW REMOTE CLUSTER STATUS
receiveq recv - pcm30_clus ok
[Resource Lease Information ]
No resources have been exported or borrowed
[root@pcm-mctest lsf]# lsclusters
CLUSTER_NAME STATUS MASTER_HOST ADMIN HOSTS SERVERS
pcm-mctest_clu ok pcm-mctest hpcadmin 2 2
pcm30_cluster1 ok pcmtest.viglen.co. hpcadmin 2 2
Move Files between Clusters
- LSF will use lsrcp to copy files between clusters
- Need to set up both clusters to use SSH for [lsrcp|rsh|rcp]; replace/create links for the binaries on the head node, and create these files in /etc/cfm/[compute-group] so they are pushed to the compute nodes
- Also need to ensure SSH keys are set up between the clusters (passwordless access)
# Either replace lsrcp with a link to scp (create the override under
# /etc/cfm/compute-centos-5.6-x86_64/$LSF_BINDIR so it is pushed out on sync):
/opt/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/lsrcp -> scp
# Or replace rsh/rcp (the default lsrcp will fall back on rcp):
/usr/kerberos/bin/rsh -> ssh
/usr/kerberos/bin/rcp -> scp
# ssh keys
cat ~/.ssh/id_rsa.pub | ssh user@remote.machine.com 'cat >> .ssh/authorized_keys'
- NOTE: Output files created on the remote cluster are not automatically copied back
# Sample script that copies an input file across, and then all output files back (submit with 'bsub < script.lsf')
#BSUB -q sendq
#BSUB -o sendq_output.%J.txt
#BSUB -e sendq_error.%J.txt
#BSUB -f "/home/david/test_input.inp > /home/david/copied_across.inp"
#BSUB -f "/home/david/result_copied.out < /home/david/result.out"
#BSUB -f "/home/david/sendq_output_copied.%J.txt < /home/david/sendq_output.%J.txt"
#BSUB -f "/home/david/sendq_error_copied.%J.txt < /home/david/sendq_error.%J.txt"
echo "hi"
hostname
id
cat /home/david/copied_across.inp
hostname >> result.out
id >> result.out
sleep 30
/etc/cfm/templates/default.lsf.shared Cluster section gets overwritten on sync. The following changes need to be made:
vi /opt/kusu/lib/plugins/genconfig/lsfshared_7_0_6.py
# Change from:
84 if re.compile("^End.*Cluster").search(instr):
85 inClusterSection = False
86 else:
87 if re.compile("^ClusterName").search(instr):
88 pass
89 else:
90 if inClusterSection:
91 print clusterName
92 continue
# Change to:
84 if re.compile("^End.*Cluster").search(instr):
85 inClusterSection = False
86 else:
87 if re.compile("^ClusterName").search(instr):
88 pass
89 else:
90 if inClusterSection and re.compile("^XXX_clustername_XXX").search(instr): # <---- This line!
91 print clusterName
92 continue
93
94 print instr,
- Verify the update has been applied correctly:
kusu-genconfig lsfshared_7_0_6 'insert-cluster-name'
e.g.: kusu-genconfig lsfshared_7_0_6 pcm30_cluster1
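- If the fix is working, the Cluster section in the generated output should keep both entries configured above rather than collapsing every line to the supplied cluster name; something like:
Begin Cluster
ClusterName Servers
pcm30_cluster1 pcm30
pcm-mctest_cluster1 pcm-mctest
End Cluster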
# once confirmed as updating correctly:
addhost -u