Latest revision as of 11:34, 7 December 2012

Install CUDA 4.x on PCM 2.0.1/2.1

Note:

There is no CUDA 4.x kit for PCM 2.0.1/2.1. The CUDA 4.x kit is only available from HPC 3.0 onwards.

CUDA 4.x therefore has to be installed manually, and the config from the Platform CUDA LSF kit
merged into the existing config on PCM 2.0.1/2.1:
  • GPU node: Install the CUDA 4.0 devdriver
/shared/viglen/4.0/devdriver_4.0_linux_64_270.41.19.run --accept-license --no-questions --silent > /tmp/inst-cuda-driver.log 2>&1
  • GPU node: Install the "Viglen build" CUDA 4.x rpm (see "Create CUDA 4.x rpm" below)
rpm -iv --force --nodeps /shared/viglen/4.0/cuda-4.0-1.x86_64.rpm > /tmp/inst-cuda40-rpm.log 2>&1
  • GPU node: Untar 'lsf7Update6_linux2.6-glibc2.3-x86_64-157853.tar' to /opt/lsf/7.0
cd /opt/lsf/7.0
tar xvf /shared/viglen/lsf7Update6_linux2.6-glibc2.3-x86_64-157853.tar > /tmp/inst-lsf7Update6.log 2>&1
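
Each install command above runs silently and sends its output to a log file, so a failed step is easy to miss. The following is a minimal sketch of one way to surface failures, assuming a POSIX shell; `run_logged` is a hypothetical helper name and is not part of the CUDA or LSF tooling.

```shell
#!/bin/sh
# Hypothetical helper: run a command, send stdout and stderr to a log file,
# and report whether it succeeded. The first argument is the log file,
# the remaining arguments are the command to run.
run_logged() {
    log=$1
    shift
    if "$@" > "$log" 2>&1; then
        echo "OK: $1"
    else
        status=$?
        echo "FAILED ($status): $1, see $log" >&2
        return "$status"
    fi
}

# Usage with the commands from the steps above (GPU node):
# run_logged /tmp/inst-cuda-driver.log \
#     /shared/viglen/4.0/devdriver_4.0_linux_64_270.41.19.run --accept-license --no-questions --silent
# run_logged /tmp/inst-cuda40-rpm.log \
#     rpm -iv --force --nodeps /shared/viglen/4.0/cuda-4.0-1.x86_64.rpm
```

The helper returns the exit status of the wrapped command, so a calling script can stop at the first step that fails.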
  • Headnode: Extract nvjob, elim.nvidia and related files from platform-lsf-gpu-1.0.2.rpm
Note: do not use platform-lsf-gpu-1.0.1.rpm; the nvjob and elim.nvidia from that rpm cause a core dump.

- rpm2cpio platform-lsf-gpu-1.0.2.rpm > /tmp/platform-lsf-gpu-1.0.2.cpio
- cd /tmp
- cpio -idmuv -I platform-lsf-gpu-1.0.2.cpio
- mkdir -p /shared/viglen/lsf-gpu
- cp -r /tmp/opt/lsf_gpu/bin /shared/viglen/lsf-gpu
- cp -r /tmp/opt/lsf_gpu/etc /shared/viglen/lsf-gpu
  • GPU node: Replace the existing nvjob, elim.nvidia and related files, keeping each original as <name>.ORIG
mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/nvjob /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/bin/nvjob.ORIG
cp /shared/viglen/lsf-gpu/bin/nvjob /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/bin


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/elim.nvidia /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/elim.nvidia.ORIG
cp /shared/viglen/lsf-gpu/etc/elim.nvidia /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl2_p /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl2_p.ORIG
cp /shared/viglen/lsf-gpu/etc/esub.gpuexcl2_p /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl2_t /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl2_t.ORIG
cp /shared/viglen/lsf-gpu/etc/esub.gpuexcl2_t /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl_p /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl_p.ORIG
cp /shared/viglen/lsf-gpu/etc/esub.gpuexcl_p /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl_t /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpuexcl_t.ORIG
cp /shared/viglen/lsf-gpu/etc/esub.gpuexcl_t /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc


mv /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpushared /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc/esub.gpushared.ORIG
cp /shared/viglen/lsf-gpu/etc/esub.gpushared /opt/lsf/7.0/linux2.6-glibc2.3-x86_64/etc
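
The mv/cp pairs above all follow the same pattern, so they can be written as a loop. This is a minimal sketch, assuming a POSIX shell; `replace_with_backup` is a hypothetical helper name. Unlike the plain mv, it leaves an existing .ORIG file alone, so a second run does not overwrite the original backup with the already replaced file.

```shell
#!/bin/sh
# Hypothetical helper: for each named file, keep the existing copy in the
# destination directory as <name>.ORIG, then copy in the replacement from
# the source directory.
replace_with_backup() {
    src_dir=$1
    dest_dir=$2
    shift 2
    for f in "$@"; do
        # Only back up once: an existing .ORIG is assumed to be the true original.
        if [ -e "$dest_dir/$f" ] && [ ! -e "$dest_dir/$f.ORIG" ]; then
            mv "$dest_dir/$f" "$dest_dir/$f.ORIG" || return 1
        fi
        cp "$src_dir/$f" "$dest_dir/$f" || return 1
    done
}

# Usage on a GPU node, with the paths and file names from the step above:
# LSF_ARCH=/opt/lsf/7.0/linux2.6-glibc2.3-x86_64
# replace_with_backup /shared/viglen/lsf-gpu/bin "$LSF_ARCH/bin" nvjob
# replace_with_backup /shared/viglen/lsf-gpu/etc "$LSF_ARCH/etc" \
#     elim.nvidia esub.gpuexcl2_p esub.gpuexcl2_t esub.gpuexcl_p esub.gpuexcl_t esub.gpushared
```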
  • Headnode: Merge lsf.cluster, lsf.shared and lsb.applications from 'platform-lsf-gpu-1.0-2.x86_64.rpm' into the existing config
d=`date +%Y%m%d%H%M`
cp -p /etc/cfm/templates/lsf/default.lsf.cluster /etc/cfm/templates/lsf/default.lsf.cluster.${d}
cat lsf.cluster >> /etc/cfm/templates/lsf/default.lsf.cluster

cp -p /etc/cfm/templates/lsf/default.lsf.shared /etc/cfm/templates/lsf/default.lsf.shared.${d}
cat lsf.shared >> /etc/cfm/templates/lsf/default.lsf.shared

cp -p /opt/lsf/conf/lsbatch/headnode_cluster1/configdir/lsb.applications /opt/lsf/conf/lsbatch/headnode_cluster1/configdir/lsb.applications.${d}
cat lsb.applications >> /opt/lsf/conf/lsbatch/headnode_cluster1/configdir/lsb.applications
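
The three merges above share one pattern: take a timestamped backup, then append the fragment. This is a minimal sketch of that pattern, assuming a POSIX shell; `append_with_backup` is a hypothetical helper name. It refuses to run when the fragment or the target is missing, which guards against appending to a mistyped path and silently creating a new file.

```shell
#!/bin/sh
# Hypothetical helper: back up the target as <target>.<YYYYmmddHHMM>,
# then append the fragment to it.
append_with_backup() {
    fragment=$1
    target=$2
    stamp=$(date +%Y%m%d%H%M)
    [ -r "$fragment" ] || { echo "missing fragment: $fragment" >&2; return 1; }
    [ -w "$target" ] || { echo "target missing or not writable: $target" >&2; return 1; }
    cp -p "$target" "$target.$stamp" || return 1
    cat "$fragment" >> "$target"
}

# Usage on the headnode, with the file names from the step above:
# append_with_backup lsf.cluster /etc/cfm/templates/lsf/default.lsf.cluster
# append_with_backup lsf.shared  /etc/cfm/templates/lsf/default.lsf.shared
# append_with_backup lsb.applications \
#     /opt/lsf/conf/lsbatch/headnode_cluster1/configdir/lsb.applications
```

Appending is only a starting point: the merged files may still need a manual check for duplicated sections before LSF is reconfigured.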
  • Headnode: Add the new CUDA paths to /etc/cfm/<gpu_ng>/etc/profile.d/cuda.sh (PATH copied from the GPU node itself)
export PATH=/usr/bin:/usr/sbin:/sbin:.:/pcc/cnbuild/build/bldcluster/7.0/linux2.6-glibc2.3-x86_64/etc:\
/pcc/cnbuild/build/bldcluster/7.0/linux2.6-glibc2.3-x86_64/bin:\
/home/cnbuild/bin:/pcc/saqa/build/scripts:/pcc/saqa/build/scripts/utils:\
/pcc/lsfqa-trusted/3rdparty/ant/apache-ant-1.6.0/bin:\
/pcc/lsfqa-trusted/3rdparty/jdk/1.5.0_08/linux-x86/bin:/usr/bin:/usr/local/bin:\
/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/bin:/bin:/usr/local/bin:.:\
/pcc/lsfqa-trusted/symphony_ext/ant/1.5.1/bin:/pcc/saqa/build/scripts:\
/pcc/saqa/build/scripts/utils:/local/jdk1.4/bin:/usr/local/oracle/8.0.4/bin:\
/usr/bin:/usr/local/bin:/local/share/bin:/opt/local/bin:/bin:/sbin:/usr/sbin:\
/usr/ccs/bin:/etc:/usr/etc:/pcc/qa/ddts/bin:/opt/SUNWhpc/bin:/opt/SUNWspro/bin:\
/usr/openwin/bin:/opt/gnu/bin:/opt/ansic/bin:/opt/langtools/bin:/usr/softbench/bin:\
/usr/bin/X11:/usr/bsd:/local/bin/X11:/usr/hosts:/usr/vue/bin:/usr/ucb:/usr/sfw/bin:/sbin:\
/opt/cuda/4.0/toolkit/bin:/opt/cuda/4.0/sdk/C/bin/linux/release
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda/4.0/toolkit/lib64:/opt/cuda/4.0/toolkit/lib
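
The PATH above is a fixed copy of one node's PATH with the two CUDA directories added at the end. An alternative is to append the CUDA directories to whatever PATH the login shell already has. This is a minimal sketch, assuming a POSIX shell and the /opt/cuda/4.0 layout used above; `path_append` is a hypothetical helper name. It also avoids the empty leading entry that `$LD_LIBRARY_PATH:...` produces when the variable is unset.

```shell
#!/bin/sh
# Hypothetical helper: print the colon-separated list $1 with directory $2
# appended, unless $2 is already in the list.
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;       # already present, unchanged
        "::")     printf '%s\n' "$2" ;;       # empty list, no leading colon
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

# CUDA 4.0 directories as used in the step above.
PATH=$(path_append "$PATH" /opt/cuda/4.0/toolkit/bin)
PATH=$(path_append "$PATH" /opt/cuda/4.0/sdk/C/bin/linux/release)
export PATH

LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" /opt/cuda/4.0/toolkit/lib64)
LD_LIBRARY_PATH=$(path_append "$LD_LIBRARY_PATH" /opt/cuda/4.0/toolkit/lib)
export LD_LIBRARY_PATH
```

Because entries are only added when missing, sourcing the file more than once does not grow either variable.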
    • Create CUDA 4.x rpm
Note to come ...