RUNLIMIT Bug - Patching LSF

From Define Wiki
Revision as of 09:20, 1 May 2013 by Michael
LSF CPULIMIT, RUNLIMIT bug
  • Bug Description
When a job is submitted with the -W option and a very large RUNLIMIT value, LSF rejects it with a misleading error message:
bsub -W 99999:00 sleep 10
Output: Bad CPU limit specification. Job not submitted. 
<<<Confusing: a RUN limit was specified, not a CPU limit as the output indicates.>>>
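To get a feel for how large such a value is, the -W argument can be converted to minutes. bsub -W takes the run limit as [hour:]minute. The helper below is a minimal sketch; runlimit_minutes is a hypothetical name and not part of LSF, and the source does not state which values trigger the error.

```shell
#!/bin/sh
# runlimit_minutes: hypothetical helper, not part of LSF.
# Converts a bsub -W value of the form [hour:]minute into total minutes.
runlimit_minutes() {
    case "$1" in
        *:*)
            hours=${1%%:*}     # text before the first colon
            minutes=${1#*:}    # text after the first colon
            ;;
        *)
            hours=0
            minutes=$1
            ;;
    esac
    # Accept only non-empty strings of decimal digits.
    for part in "$hours" "$minutes"; do
        case "$part" in
            ''|*[!0-9]*)
                echo "invalid run limit: $1" >&2
                return 1
                ;;
        esac
    done
    # awk reads the values as decimal, so leading zeros such as "08" are safe.
    awk -v h="$hours" -v m="$minutes" 'BEGIN { printf "%d\n", h * 60 + m }'
}
```

For the value in the report, runlimit_minutes 99999:00 prints 5999940.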
Solution provided by Platform
Platform provided a patch for LSF in the file lsf7Update6_linux2.6-glibc2.3-x86_64-169404.tar.Z
The file was stored in the directory:
..\pdd_data\Product Development\High Performance Computing\HPC SoftwareInformation\Platform\HPC_EE\Patches
The instructions below describe how to apply the patch.
  • Download LSF patch file
# ---------------------------------------------------------------------------
# Download the LSF patch file and copy it to a shared directory, e.g. /depot/shared
# Do NOT uncompress or untar the patch file.
# ---------------------------------------------------------------------------

ftp ftp://ftp.platform.com

#username: lsfuser
#password: 8sF7G?w
#PATH: patches/7.0.6/patch/build169404
#File: lsf7Update6_linux2.6-glibc2.3-x86_64-169404.tar.Z

cd $LSF_TOP_DIR/7.0/install
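Because the patch file must stay compressed, it can be worth checking that the copy in the shared directory is still a compress(1) archive before going further. The following is a minimal sketch; is_compress_archive is a hypothetical helper and not an LSF tool. It relies on the fact that files written by compress start with the two bytes 0x1f 0x9d.

```shell
#!/bin/sh
# is_compress_archive: hypothetical helper, not an LSF tool.
# Succeeds if the file exists and begins with the compress(1) magic bytes 1f 9d.
is_compress_archive() {
    [ -f "$1" ] || return 1
    # Dump the first two bytes as hex and strip the whitespace od adds.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f9d" ]
}
```

A call such as is_compress_archive /depot/shared/lsf7Update6_linux2.6-glibc2.3-x86_64-169404.tar.Z returns 0 when the file is still a .Z archive; the path shown assumes the example shared directory from the comment above.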
  • Verify the patch file, then apply it to your cluster using the patchinstall script
#-----------------
#Verify patch file
#-----------------

./patchinstall -c full-path-to-patchfile-in-shared-dir

#----------------
#Apply patch file
#----------------

./patchinstall full-path-to-patchfile-in-shared-dir
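The two patchinstall calls above can be chained so that the patch is applied only when the check passes. This is a sketch under a stated assumption: that patchinstall exits with a non-zero status when the -c check fails, which the source does not confirm. apply_patch and the PATCHINSTALL variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# apply_patch: hypothetical wrapper around the two patchinstall calls.
# ASSUMPTION: patchinstall exits non-zero when the -c check fails.
# PATCHINSTALL (hypothetical variable) selects the script; default ./patchinstall.
apply_patch() {
    patchfile=$1
    installer=${PATCHINSTALL:-./patchinstall}
    # The instructions call for a full path to the patch file.
    case "$patchfile" in
        /*) ;;
        *)
            echo "patch file must be given as a full path: $patchfile" >&2
            return 2
            ;;
    esac
    # Step 1: verify the patch file.
    if ! "$installer" -c "$patchfile"; then
        echo "patch check failed; not applying $patchfile" >&2
        return 1
    fi
    # Step 2: apply the patch.
    "$installer" "$patchfile"
}
```

Run from $LSF_TOP_DIR/7.0/install, a call would look like apply_patch /depot/shared/lsf7Update6_linux2.6-glibc2.3-x86_64-169404.tar.Z.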
  • Restart LSF
# On the head node
/etc/init.d/lsf_daemons restart

# On all cluster nodes
pdsh -a "/etc/init.d/lsf_daemons restart"
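After the restart, the daemons may need a short while before they answer again. The sketch below retries a probe command until it succeeds. wait_for_lsf and the LSF_PROBE variable are hypothetical names; the default probe is lsid, and the assumption that lsid exits 0 once the cluster responds is not taken from the source.

```shell
#!/bin/sh
# wait_for_lsf: hypothetical helper that polls until LSF answers.
# ASSUMPTION: the probe command (lsid by default) exits 0 once LIM responds.
# Usage: wait_for_lsf [attempts] [delay-in-seconds]
wait_for_lsf() {
    attempts=${1:-10}
    delay=${2:-5}
    probe=${LSF_PROBE:-lsid}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$probe" >/dev/null 2>&1; then
            echo "LSF responded after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "LSF did not respond after $attempts attempts" >&2
    return 1
}
```

Once the cluster responds, resubmitting bsub -W 99999:00 sleep 10 shows whether the message has changed; the source does not record the output after patching.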