Rocks: SGE Commands

From Define Wiki
Revision as of 10:03, 13 May 2013 by Michael
Using qstat
qstat -f                                # full listing: every queue instance and the jobs running in it
qstat -f -q \*@compute-0-0              # full display info for compute-0-0 only
qstat -f -g c                           # display cluster queue summary
qstat -s r -q all.q@compute-0-0         # show all running jobs on compute-0-0
qstat -u viglen                         # show jobs owned by a user (e.g. viglen); note the job ID
qstat -j <jobid>                        # print all information about the job with this numeric job ID
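The job ID noted from qstat -u is usually what gets passed on to qstat -j, qdel or qmod. A minimal sketch of pulling the IDs out of the listing follows; job_ids is a hypothetical helper, and it assumes the default plain qstat layout (a header line, a dashed separator, then one job per line with the job ID in the first column).

```shell
# List the job IDs found in `qstat` output read from standard input.
# Assumption: default plain listing, i.e. two header lines followed by
# one line per job with the numeric job ID in column 1.
job_ids() {
    awk 'NR > 2 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Typical use on the cluster (viglen is the example user from above):
#   qstat -u viglen | job_ids
```

Reading from standard input keeps the helper independent of how qstat is invoked, so the same function works with -u, -s r or -q filters.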
Control jobs using qdel and qmod
qdel <job_id>                           # delete job job_id
qdel -f <job_id>                        # force-delete job job_id: register the status change at sge_qmaster without contacting sge_execd
qdel -u viglen                          # delete all jobs owned by user viglen
qdel 12 -u viglen                       # delete job 12 owned by user viglen
qdel -u "*"                             # delete all jobs of all users: USE WITH LOTS OF CAUTION
qmod -sj <job_id>                       # suspend a running job
qmod -usj <job_id>                      # resume (unsuspend) a suspended job
qmod -rj <job_id>                       # reschedule a running job
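Because a wildcard such as qdel -u "*" removes every user's jobs, it can help to wrap qdel in a small guard. The sketch below is one possible approach, not part of SGE: safe_qdel, the --yes-all-users switch and the QDEL override variable are all made-up names for illustration.

```shell
# Hypothetical wrapper (not an SGE command): refuse any argument that
# contains "*" unless --yes-all-users is given as the first argument.
# QDEL may be overridden, e.g. QDEL="echo qdel", to preview the command.
safe_qdel() {
    local allow_all=no arg
    if [ "$1" = "--yes-all-users" ]; then
        allow_all=yes
        shift
    fi
    for arg in "$@"; do
        case "$arg" in
            *\**)
                if [ "$allow_all" != yes ]; then
                    echo "safe_qdel: refusing wildcard '$arg' without --yes-all-users" >&2
                    return 1
                fi
                ;;
        esac
    done
    # Left unquoted on purpose so that "echo qdel" splits into two words.
    ${QDEL:-qdel} "$@"
}

# Example:
#   safe_qdel -u viglen                  # runs: qdel -u viglen
#   safe_qdel -u "*"                     # refused, returns 1
#   safe_qdel --yes-all-users -u "*"     # runs: qdel -u "*"
```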
Using qconf
qconf -sel                              # show the nodes configured on sge
qconf -spl                              # show the parallel environments configured
qconf -sp orte                          # show one parallel environment's configuration (pre/post scripts etc.); e.g. mpi, mpich or orte

qconf -sq all.q@compute-0-0.local       # show the queue configuration per node
qconf -sq all.q                         # show "all.q" queue info
qconf -mq all.q                         # modify "all.q" queue: update hostlist, number of slots
qconf -aq all.q                         # create queue named "all.q"

qhost -h compute-0-0,compute-0-1        # show host info for multiple nodes
Add headnode to queue
# on headnode
cd /opt/gridengine
./install_execd -auto
Submit to a specific host
qsub -q '*@compute-0-0' script.sh
Queue Configuration Changes
qconf -mhgrp @allhosts     # edit the lists of hosts in SGE
qconf -mq all.q            # modify a queue; you can then remove nodes from its slots list
Disable/Enable a node on the queue
qconf -rattr queue slots 0 all.q@compute-0-0  # slots -> 0 (== pbsnodes -o)
qconf -rattr queue slots 8 all.q@compute-0-0  # slots -> 8 (== pbsnodes -c)

qmod -d all.q@compute-0-0                     # disable compute-0-0 in queue all.q (-d == disable)
qmod -e all.q@compute-0-0                     # enable compute-0-0 in queue all.q (-e == enable)
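When several nodes need to be taken out of service at once, the qmod -d / qmod -e pair can be applied in a loop. The following is a rough sketch only: set_nodes is a hypothetical helper, and the QMOD and QUEUE override variables are assumptions added so the commands can be previewed before running them. Disabling a queue instance stops new jobs from being scheduled there; jobs already running on it continue.

```shell
# Disable or enable the queue instance on each listed host.
# Hypothetical helper; QUEUE defaults to all.q, and QMOD may be set to
# "echo qmod" to print the commands instead of executing them.
set_nodes() {
    local action=$1 flag host
    shift
    case "$action" in
        disable) flag=-d ;;
        enable)  flag=-e ;;
        *) echo "usage: set_nodes disable|enable host..." >&2; return 2 ;;
    esac
    for host in "$@"; do
        # Left unquoted on purpose so that "echo qmod" splits into two words.
        ${QMOD:-qmod} "$flag" "${QUEUE:-all.q}@$host"
    done
}

# Example:
#   QMOD="echo qmod" set_nodes disable compute-0-0 compute-0-1   # preview
#   set_nodes enable compute-0-0 compute-0-1                     # for real
```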
Check sge_execd on nodes
qping -info compute-0-0 537 execd 1          # check status of execd on compute-0-0
Submission flags
#$ -N <name-of-job>     # Give your job a name; useful when using -o and -e options below in file-naming
#$ -cwd                 # Use the Current Working Directory
#$ -o <file-name.out>   # Write standard output to file-name.out
#$ -e <file-name.err>   # Write standard error to file-name.err
##$ -j y                # Join standard error into standard output (disabled here by the extra leading #)
#$ -S /bin/bash         # Interpreting Shell used is BASH
#$ -V                   # Export all environment variables within the qsub utility to the context of the job
#$ -pe orte 32          # Request 32 slots in the orte (Open MPI) parallel environment

echo "Got $NSLOTS processors."
echo "Machines:" 
cat "$PE_HOSTFILE"
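The file named by $PE_HOSTFILE lists one granted host per line as "host slots queue processor-range". An Open MPI build with Grid Engine support normally picks this allocation up by itself; where that is not the case, the file can be converted into an Open MPI style hostfile. A minimal sketch, assuming that line layout (pe_to_hostfile and my_mpi_program are placeholder names):

```shell
# Convert an SGE PE hostfile into "host slots=N" lines.
# Assumption: each input line is "host slots queue processor-range".
pe_to_hostfile() {
    awk '{ print $1 " slots=" $2 }' "$1"
}

# Possible use inside the job script, after the #$ directives above
# (my_mpi_program is a placeholder for your own binary):
#   pe_to_hostfile "$PE_HOSTFILE" > "machines.$JOB_ID"
#   mpirun -np "$NSLOTS" --hostfile "machines.$JOB_ID" ./my_mpi_program
```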