Slurm: Getting job information out of SLURM
Latest revision as of 20:13, 21 October 2015
LOTS OF USEFUL COMMANDS: https://rc.fas.harvard.edu/resources/documentation/convenient-slurm-commands/
Display queue/partition names, runtimes and available nodes
[user1@iitac01 ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up 3:00:00 6 idle iitac-n[142,144,167,197,227,259]
serial up 1-00:00:00 1 alloc iitac-n306
serial up 1-00:00:00 4 idle iitac-n[086-087,305,328]
compute up 4-00:00:00 2 down* iitac-n[206,341]
compute up 4-00:00:00 1 drain iitac-n088
compute up 4-00:00:00 220 alloc iitac-n[001-004,006-007,009-012,014-016,020-021,023-027,031-032,034-036,038-040,042-044,046-059,061,063-064,067-069,071-075,077-085,089-092,094-096,098-104,106-121,123-124,128-130,181-184,186-189,191-196,198-200,202-204,208-210,217-221,224-226,228-232,234,236-238,240-243,245-246,249-258,260-261,263,265-271,273,275,279,281-284,286-302,304,306,308-312,315-316,318,321-327,329-340,342]
compute up 4-00:00:00 37 idle iitac-n[131-132,134-141,143,145-148,150-151,153-157,159-160,162-165,171-179]
compute up 4-00:00:00 2 down iitac-n[233,307]
Display runtimes and available nodes for a particular queue/partition
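When only a node count is needed rather than the full node list, the NODES column of `sinfo` output can be totalled for one partition and state. The following is a minimal sketch, not a Slurm feature: `count_nodes` is a hypothetical helper name, and it assumes the default six-column `sinfo` layout (PARTITION AVAIL TIMELIMIT NODES STATE NODELIST) used on this page.

```shell
# count_nodes PARTITION STATE
# Hypothetical helper (not part of Slurm): reads sinfo output on stdin and
# prints the sum of the NODES column for rows matching PARTITION and STATE.
# Assumes the default column order: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
count_nodes() {
    awk -v part="$1" -v state="$2" '
        $1 == "PARTITION" { next }       # skip the header row
        {
            p = $1
            sub(/\*$/, "", p)            # the default partition carries a trailing "*"
            if (p == part && $5 == state) total += $4
        }
        END { print total + 0 }          # prints 0 when no row matched
    '
}
```

For example, `sinfo | count_nodes compute idle` would print the number of idle nodes in the compute partition. States are compared exactly, so `down` and `down*` are counted separately.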
[user1@iitac01 ~]$ sinfo -p debug
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up 3:00:00 6 idle iitac-n[142,144,167,197,227,259]
Display information about a specific job
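The `scontrol show jobid` output in this section is a sequence of whitespace-separated `Key=Value` fields, so a single field can be pulled out with a small filter. This is only a sketch: `job_field` is a hypothetical helper name, and it assumes the value itself contains no whitespace (which may not hold for fields such as a job name or command line with spaces).

```shell
# job_field KEY
# Hypothetical helper (not part of Slurm): reads "scontrol show job" output
# on stdin and prints the value of the first field that starts with KEY=.
# Assumes values contain no whitespace.
job_field() {
    awk -v key="$1" '
        !found {
            for (i = 1; i <= NF; i++) {
                if (index($i, key "=") == 1) {          # field begins with KEY=
                    print substr($i, length(key) + 2)   # text after the "="
                    found = 1
                    break
                }
            }
        }
    '
}
```

A possible use is `scontrol show jobid 108 | job_field JobState`, which for the job shown in this section would print `RUNNING`.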
[user1@iitac01 ~]$ scontrol show jobid 108
JobId=108 Name=test
UserId=user1(1351) GroupId=trhpc(3114)
Priority=1996 Account=root QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
TimeLimit=00:10:00 Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
SubmitTime=2010-07-27T15:57:18 EligibleTime=2010-07-27T15:57:18
StartTime=2010-07-27T15:57:18 EndTime=2010-07-27T16:07:18
SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=iitac01:8389
ReqNodeList=(null) ExcNodeList=(null)
NodeList=iitac-n[197,227]
NumNodes=2 NumCPUs=4 CPUs/Task=1 ReqS:C:T=65534:65534:65534
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Reservation=(null)
Shared=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/trhpc/user1/job.sh
WorkDir=/home/trhpc/user1
Display only my jobs in the queue
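For a quick summary of a user's jobs, the `squeue` listing can be counted per state code (the ST column, for example `R` for running and `PD` for pending). The sketch below uses a hypothetical helper name, `jobs_by_state`, and assumes the default `squeue` column order used on this page, with job names that contain no spaces.

```shell
# jobs_by_state
# Hypothetical helper (not part of Slurm): reads default squeue output on
# stdin and prints one "STATE COUNT" line per state code, sorted by state.
# Assumes the columns JOBID PARTITION NAME USER ST ... and no spaces in NAME.
jobs_by_state() {
    awk '
        $1 == "JOBID" { next }           # skip the header row
        NF >= 5 { count[$5]++ }          # column 5 is the ST code
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

It would be used as `squeue -u user1 | jobs_by_state`.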
[user1@iitac01 ~]$ squeue -u user1
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
109 debug test-4-c user1 R 0:01 2 iitac-n[197,227]
Display long output about my jobs in the queue
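The long listing reports both the elapsed TIME and the TIMELIMIT of each job. To compare the two (for instance to estimate remaining time), the values first need converting to seconds. The following is a sketch under stated assumptions: `slurm_time_to_seconds` is a hypothetical helper name, and it only handles the display forms that appear on this page (`M:SS`, `H:MM:SS` and `D-HH:MM:SS`); any other value, such as `UNLIMITED`, makes it return a non-zero status and print nothing.

```shell
# slurm_time_to_seconds TIME
# Hypothetical helper (not part of Slurm): converts a displayed time such as
# 0:43, 10:00, 3:00:00 or 1-00:00:00 to a number of seconds.
slurm_time_to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        # An optional leading "D-" part carries whole days.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, f, ":")
        if (n == 3)      secs = f[1] * 3600 + f[2] * 60 + f[3]
        else if (n == 2) secs = f[1] * 60 + f[2]
        else             exit 1          # unrecognised form, e.g. UNLIMITED
        print days * 86400 + secs
    }'
}
```

With the TIME and TIMELIMIT values from the listing in this section, `echo $(( $(slurm_time_to_seconds 10:00) - $(slurm_time_to_seconds 0:43) ))` gives the seconds left before the limit is reached.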
[user1@iitac01 ~]$ squeue -u user1 -l
Tue Jul 27 16:00:07 2010
JOBID PARTITION NAME USER STATE TIME TIMELIMIT NODES NODELIST(REASON)
109 debug test-4-c user1 RUNNING 0:43 10:00 2 iitac-n[197,227]
Display historical information about completed jobs
[user1@iitac01 ~]$ sacct --format=jobid,jobname,account,partition,ntasks,alloccpus,elapsed,state,exitcode -j 66808
JobID           JobName    Account  Partition   NTasks  AllocCPUS    Elapsed      State ExitCode
------------ ---------- ---------- ---------- -------- ---------- ---------- ---------- --------
66808        my_test_j+      acc01    compute                   8   00:02:34  COMPLETED      0:0
66808.batch       batch      acc01                   1          1   00:02:34  COMPLETED      0:0
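Because some columns in the table above can be empty, the aligned output is awkward to parse in scripts. `sacct` can emit `|`-delimited rows instead with `--parsable2`, and `--noheader` drops the header lines. The sketch below is a hypothetical helper (`failed_steps` is not a Slurm command) that assumes exactly the three columns `jobid,state,exitcode` in that order and lists every job or step whose state is not `COMPLETED`.

```shell
# failed_steps
# Hypothetical helper (not part of Slurm): reads "|"-delimited sacct rows
# (JobID|State|ExitCode) on stdin and prints those not in state COMPLETED.
failed_steps() {
    awk -F '[|]' '$2 != "COMPLETED" { print $1, $2, $3 }'
}
```

It would be used as `sacct -j 66808 --parsable2 --noheader --format=jobid,state,exitcode | failed_steps`; for the job shown above, where every step completed, it would print nothing.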