Slurm: Getting job information out of SLURM

From Define Wiki
Revision as of 10:25, 1 July 2015 by David

Display queue/partition names, time limits and node availability

[user1@iitac01 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up    3:00:00      6   idle iitac-n[142,144,167,197,227,259]
serial       up 1-00:00:00      1  alloc iitac-n306
serial       up 1-00:00:00      4   idle iitac-n[086-087,305,328]
compute      up 4-00:00:00      2  down* iitac-n[206,341]
compute      up 4-00:00:00      1  drain iitac-n088
compute      up 4-00:00:00    220  alloc iitac-n[001-004,006-007,009-012,014-016,020-021,023-027,031-032,034-036,038-040,042-044,046-059,061,063-064,067-069,071-075,077-085,089-092,094-096,098-104,106-121,123-124,128-130,181-184,186-189,191-196,198-200,202-204,208-210,217-221,224-226,228-232,234,236-238,240-243,245-246,249-258,260-261,263,265-271,273,275,279,281-284,286-302,304,306,308-312,315-316,318,321-327,329-340,342]
compute      up 4-00:00:00     37   idle iitac-n[131-132,134-141,143,145-148,150-151,153-157,159-160,162-165,171-179]
compute      up 4-00:00:00      2   down iitac-n[233,307]
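
The per-state node counts in a listing like the one above can be totalled with a small filter. This is a minimal sketch, assuming the default six-column sinfo layout shown here; `count_nodes_by_state` is a hypothetical helper name, not part of Slurm. A state with a trailing `*` (node not responding) is counted separately from the plain state.

```shell
# Hypothetical helper (not a Slurm command): sum the NODES column of
# sinfo-style output per STATE. Expects the default six-column layout
# (PARTITION AVAIL TIMELIMIT NODES STATE NODELIST) on standard input.
count_nodes_by_state() {
    awk '$1 != "PARTITION" && NF >= 6 { nodes[$5] += $4 }
         END { for (s in nodes) print s, nodes[s] }' | LC_ALL=C sort
}

# Usage on a cluster where sinfo is available:
#   sinfo | count_nodes_by_state
```

The totals are per state across all partitions, so a node that belongs to more than one partition is counted once for each partition that lists it.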

Display time limits and node availability for a particular queue/partition

[user1@iitac01 ~]$ sinfo -p debug
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up    3:00:00      6   idle iitac-n[142,144,167,197,227,259]

Display information about a specific job

[user1@iitac01 ~]$ scontrol show jobid 108
JobId=108 Name=test
   UserId=user1(1351) GroupId=trhpc(3114)
   Priority=1996 Account=root QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   TimeLimit=00:10:00 Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   SubmitTime=2010-07-27T15:57:18 EligibleTime=2010-07-27T15:57:18
   StartTime=2010-07-27T15:57:18 EndTime=2010-07-27T16:07:18
   SuspendTime=None SecsPreSuspend=0
   Partition=debug AllocNode:Sid=iitac01:8389
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=iitac-n[197,227]
   NumNodes=2 NumCPUs=4 CPUs/Task=1 ReqS:C:T=65534:65534:65534
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/trhpc/user1/job.sh
   WorkDir=/home/trhpc/user1
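
In scripts it is often only one of these Key=Value fields that matters, for example JobState. The following is a minimal sketch of pulling a single field out of the output above; `job_field` is a hypothetical helper name, and it assumes the value contains no spaces (a Command or WorkDir path with spaces would be cut short).

```shell
# Hypothetical helper (not a Slurm command): print the value of one
# Key=Value field from "scontrol show job" output on standard input.
# Assumption: the value itself contains no whitespace.
job_field() {
    tr -s ' \t' '\n' |
        awk -v key="$1" 'index($0, key "=") == 1 {
            # Print everything after "key=" and stop at the first match.
            print substr($0, length(key) + 2)
            exit
        }'
}

# Usage on a cluster where scontrol is available:
#   scontrol show jobid 108 | job_field JobState
```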

Display only my jobs in the queue

[user1@iitac01 ~]$ squeue -u user1
  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
    109     debug test-4-c    user1   R       0:01      2 iitac-n[197,227]

Display long output about my jobs in the queue

[user1@iitac01 ~]$ squeue -u user1 -l
Tue Jul 27 16:00:07 2010
  JOBID PARTITION     NAME     USER    STATE       TIME TIMELIMIT  NODES NODELIST(REASON)
    109     debug test-4-c    user1  RUNNING       0:43     10:00      2 iitac-n[197,227]
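
The TIME and TIMELIMIT columns use Slurm's `[days-]hours:minutes:seconds` notation, which is awkward to compare directly. As a rough sketch, the hypothetical helper `slurm_time_to_seconds` converts the three layouts that appear on this page (`M:SS`, `H:MM:SS` and `D-HH:MM:SS`) to seconds; other values such as `infinite` are deliberately not handled and make it return a non-zero status.

```shell
# Hypothetical helper (not a Slurm command): convert a Slurm time string
# to seconds. Handles only M:SS, H:MM:SS and D-HH:MM:SS; anything else
# produces no output and a non-zero exit status.
slurm_time_to_seconds() {
    printf '%s\n' "$1" | awk '{
        days = 0; t = $0
        # Split off an optional leading "days-" part.
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3)      secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else             exit 1
        print days * 86400 + secs
    }'
}

# Example: time remaining for the job above (TIMELIMIT minus TIME).
#   limit=$(slurm_time_to_seconds 10:00)
#   used=$(slurm_time_to_seconds 0:43)
#   echo $((limit - used))
```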

Display historical information about completed jobs

[user1@iitac01 ~]$ sacct --format=jobid,jobname,account,partition,ntasks,alloccpus,elapsed,state,exitcode -j 66808
       JobID    JobName    Account  Partition   NTasks  AllocCPUS    Elapsed      State ExitCode 
------------ --------- ----------- ---------- -------- ---------- ---------- ---------- -------- 
66808        my_test_j+      acc01    compute                   8   00:02:34  COMPLETED      0:0 
66808.batch       batch      acc01                   1          1   00:02:34  COMPLETED      0:0
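
The column-aligned output above truncates long values (note `my_test_j+`) and is awkward to parse. For scripting, sacct can emit pipe-delimited records with `--parsable2` and suppress the header with `--noheader`. The sketch below assumes exactly the three fields jobid,state,exitcode in that order; `non_completed_steps` is a hypothetical helper name.

```shell
# Hypothetical helper (not a Slurm command): read pipe-delimited sacct
# records with the fields jobid|state|exitcode and print every job or
# step whose state is not COMPLETED. Jobs that are still running are
# listed too.
non_completed_steps() {
    awk -F'|' '$2 != "COMPLETED" { print $1, $2, $3 }'
}

# Usage on a cluster where sacct is available:
#   sacct -j 66808 --format=jobid,state,exitcode --parsable2 --noheader |
#       non_completed_steps
```

For the job shown above, both records are COMPLETED, so the filter would print nothing.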