Rocks: Torque 5.4 - Rocks 5.4
Torque 5.4 and Rocks 5.4
- Possible bug with Torque 5.4?
The following was observed after a Torque 5.4 installation on Rocks 5.4:
When a compute node was installed, its name was added to the /opt/torque/server_priv/nodes file
once the node received the Rocks installation image. However, the node name was added without an
np=XX attribute.
i.e.:
compute-0-0
The correct entry should be:
compute-0-0 np=8
Without np=XX associated with compute-0-0 in the nodes file, pbsnodes lists it with only 1 core!
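One quick way to spot affected entries is to scan the nodes file for lines with no np= attribute. This is only a sketch: nodes.example below is a sample file standing in for /opt/torque/server_priv/nodes, and the node names are made up.

```shell
# Sample file standing in for /opt/torque/server_priv/nodes
NODES_FILE=nodes.example
printf 'compute-0-0\ncompute-0-1 np=8\n' > "$NODES_FILE"

# Print the name of every node entry that lacks an np= attribute
awk '!/np=/ {print $1}' "$NODES_FILE" > missing-np.txt
cat missing-np.txt
```

On a live system, point NODES_FILE at /opt/torque/server_priv/nodes instead of the sample file; any node printed here would be reported by pbsnodes with only 1 core.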
To generate the /opt/torque/server_priv/nodes file correctly (after node installation), run:
rocks sync config
- Job submitted but only gets queued and will not run/execute
- checkjob <jobid> shows no resource available.
- pbsnodes shows the nodes are free
Cause:
- this was caused by the server name being in upper case in /opt/torque/pbs.default:
set server managers = maui@LRC-PS8-RFHEAD.SEEC.LOCAL
set server managers += root@LRC-PS8-RFHEAD.SEEC.LOCAL
Change it to:
set server managers = maui@lrc-ps8-rfhead.seec.local
set server managers += root@lrc-ps8-rfhead.seec.local
Jobs should then submit and run/execute as expected.
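Since the two manager entries differ from the broken ones only in hostname case, the edit can be sketched as a simple lowercasing pass. pbs.sample below is a stand-in for /opt/torque/pbs.default; the directive keywords and the maui/root user names are already lowercase, so lowercasing whole lines only changes the hostname.

```shell
# Sample file standing in for /opt/torque/pbs.default
printf 'set server managers = maui@LRC-PS8-RFHEAD.SEEC.LOCAL\nset server managers += root@LRC-PS8-RFHEAD.SEEC.LOCAL\n' > pbs.sample

# Lowercase the manager entries (only the hostname actually changes)
tr 'A-Z' 'a-z' < pbs.sample > pbs.fixed
cat pbs.fixed
```

After editing the real file this way, restart pbs_server (or reload the configuration) so the corrected manager list takes effect.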