Slurm: Jobs fail with lmod not found error
Problem
Even a basic job script may fail with an error stating that lmod does not exist.
Example Script
#!/bin/bash
#SBATCH --time=30
#SBATCH -o out
#SBATCH -e err
module load bonnie++
Output in error file
[Jon@head-Boston scripts]$ cat err
/cm/local/apps/slurm/var/spool/job01047/slurm_script: /usr/share/lmod/lmod/libexec/lmod: No such file or directory
Cause
- When executing module load, the head node calls /usr/share/lmod/lmod/libexec/lmod to load the module files
- On the compute nodes, module load calls /cm/local/apps/environment-modules/current/bin/modulecmd, and lmod is not present
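To confirm this on a particular node, a check along the following lines could be used. It is only a sketch: the function name check_module_cmds is hypothetical, and the two paths are the ones quoted above, which may differ on other clusters.

```shell
#!/bin/bash
# check_module_cmds (hypothetical helper): for each path given,
# report whether it exists as an executable file on this node.
check_module_cmds() {
    local path
    for path in "$@"; do
        if [ -f "$path" ] && [ -x "$path" ]; then
            echo "present: $path"
        else
            echo "missing: $path"
        fi
    done
}

# Paths taken from this article; adjust them if your installation differs.
check_module_cmds \
    /usr/share/lmod/lmod/libexec/lmod \
    /cm/local/apps/environment-modules/current/bin/modulecmd
```

Running the script once on the head node and once on a compute node (for example through srun) should show lmod as present on the former and missing on the latter if the cause described here applies.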
Resolution
- It can be seen that module uses environment variable LMOD_CMD:
[Jon@head-Boston scripts]$ type module
module is a function
module ()
{
eval $($LMOD_CMD bash "$@");
[ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
[Jon@head-Boston scripts]$ echo $LMOD_CMD
/usr/share/lmod/lmod/libexec/lmod
[Jon@head-Boston scripts]$ export LMOD_CMD=
[Jon@head-Boston scripts]$ type module
module is a function
module ()
{
eval $($LMOD_CMD bash "$@");
[ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
- And, by default, LMOD_CMD points to /usr/share/lmod/lmod/libexec/lmod:
[Jon@head-Boston scripts]$ echo $LMOD_CMD
/usr/share/lmod/lmod/libexec/lmod
- Set LMOD_CMD to use /cm/local/apps/environment-modules/current/bin/modulecmd, which is present on both head and compute nodes:
export LMOD_CMD=/cm/local/apps/environment-modules/current/bin/modulecmd
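One way to apply this inside a job script is sketched below. The helper name use_module_cmd is hypothetical and not part of Slurm, Lmod or environment-modules. It only guards the export so that a missing binary produces a clear message instead of the error shown above.

```shell
#!/bin/bash
# use_module_cmd PATH (hypothetical helper): export LMOD_CMD=PATH if PATH
# is an executable file. Returns 0 on success; otherwise prints a message
# to stderr and returns 1, leaving LMOD_CMD unchanged.
use_module_cmd() {
    local cmd="$1"
    if [ -f "$cmd" ] && [ -x "$cmd" ]; then
        export LMOD_CMD="$cmd"
        return 0
    fi
    echo "module command not found on ${HOSTNAME:-this node}: $cmd" >&2
    return 1
}

# Assumed usage in the job script, before any "module load" line:
#   use_module_cmd /cm/local/apps/environment-modules/current/bin/modulecmd || exit 1
#   module load bonnie++
```

Because the module function shown earlier expands $LMOD_CMD each time it is called, exporting the variable before the first module load should be enough. The function itself does not need to be redefined.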