CCM: Cluster Compatibility Mode

Introduction

Cluster Compatibility Mode (CCM) is a component of the Cray environment that provides full Linux compatibility. With the help of CCM, an XE/XK compute node, which normally runs a stripped-down operating system, can be turned into a typical node of a standard Linux cluster. This mode is useful in several scenarios, two of which are described below: interactive work on compute nodes and running OpenMPI applications.
CCM can be used in both interactive and standard PBS batch jobs. During a CCM job, ssh uses port 203 by default. If a connection to an external ssh server, for example login.xsede.org, is required, you will have to use ssh -p 22 to explicitly select the standard port.

CCM in an interactive job

First start an interactive batch job:

> qsub -I -l gres=ccm -l nodes=4:ppn=16:xk -l walltime=01:00:00

You can add -X for X11 tunneling. You will see output similar to the following:

qsub: waiting for job 1134727.nid11293 to start
qsub: job 1134727.nid11293 ready
In CCM JOB: 1134727.nid11293 JID 1134727 USER kot GROUP bw_staff
Initializing CCM environment, Please Wait
waiting for jid....
CCM Start success, 4 of 4 responses
...
>

The interactive session places the job on a PBS MOM node (not a compute node). Do not run any computations on the MOM node, as this is against the usage policy. Resource usage is monitored, and violations will not be tolerated.

While in CCM mode, you can find the list of nodes assigned to the job:

> cat $HOME/.crayccm/ccm_nodelist.$PBS_JOBID | sort -u
nid06822
nid06823
nid06904
nid06905

Use ccmrun to start an application on the compute nodes. If the purpose of the session is to run an interactive job, we can migrate from the MOM node to the first compute node. Add the ccm module and execute ccmlogin to move the session to a compute node:

> module add ccm
> ccmlogin
nid06822>

You are now on compute node nid06822 as if it were a regular Linux node. This is the right place to run compute-intensive applications. You can add modules, configure software, and so on, as usual. The ccmlogin command supports X11 tunneling, so if you used -X with qsub, you should be able to launch a GUI from the compute node.

To access other nodes in the node list, use the ssh command. For example:

nid06822> ssh nid06823
nid06823> module swap PrgEnv-cray PrgEnv-pgi
nid06823> pgaccelinfo
nid06823> ssh nid06904
nid06904>

When you are done, simply exit the compute nodes and the batch job:

nid06904> exit
Connection to nid06904 closed.
nid06823> exit
Connection to nid06823 closed.
nid06822> exit
Connection to nid06822 closed.
> exit
qsub: job 1134727.nid11293 completed

OpenMPI support in Cluster Compatibility Mode

CCM does not include support for Cray MPICH. However, it supports the OpenMPI parallelization interface. The OpenMPI software stack is not included in the programming environment, so users should compile the OpenMPI libraries in their home directories. Step-by-step instructions follow ($ denotes the command line):

$ module swap PrgEnv-cray PrgEnv-gnu
$ cd $HOME
$ mkdir openmpi
$ cd openmpi
$ wget http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.4.tar.gz
$ tar zxvf openmpi-1.8.4.tar.gz
$ cd openmpi-1.8.4
$ ./configure --prefix=$HOME/openmpi --enable-orterun-prefix-by-default --enable-mca-no-build=plm-tm,ras-tm
$ make install

After compilation is completed, add

export PATH=$PATH:$HOME/openmpi/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/openmpi/lib

to your ~/.bashrc. Add

module swap PrgEnv-cray PrgEnv-gnu
module add ccm

to your ~/.modules. Use mpicc, mpiCC, and mpif90 to compile your OpenMPI applications.
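As a quick sanity check that the home-built OpenMPI is picked up ahead of any system copy, you can verify which compiler wrapper is found and do a test build. This is a hypothetical session, not part of the official instructions; hello.c stands for any MPI source file of yours:

$ which mpicc
# should resolve to $HOME/openmpi/bin/mpicc; if not, re-check the PATH line in ~/.bashrc
$ mpicc -o hello hello.c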
Sample PBS script to launch an OpenMPI application:

#!/bin/bash
#PBS -j oe
#PBS -l nodes=3:ppn=1:xe
#PBS -l walltime=00:02:00
#PBS -l gres=ccm
#PBS -l flags=commlocal:commtolerant

source /opt/modules/default/init/bash
module list

TPN=16
NNODES=3
HOSTLIST=znodelist
LAUNCH=zstart.sh

cd $PBS_O_WORKDIR

cat $HOME/.crayccm/ccm_nodelist.$PBS_JOBID | sort -u | awk -v n=$TPN '{for(i=0;i<n;i++) print $0}' > $HOSTLIST

let NTASKS=$NNODES*$TPN

echo "#!/bin/bash
cd $PBS_O_WORKDIR
# choosing a more appropriate bind-to option can speed things up
$HOME/openmpi/bin/mpirun -v -np $NTASKS --mca btl tcp,self --mca btl_tcp_if_include ipogif0 --hostfile $HOSTLIST -npernode $TPN --bind-to none ./a.out > job.out" > $LAUNCH

chmod 755 $LAUNCH

ccmrun $PBS_O_WORKDIR/$LAUNCH
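The awk step above writes each unique node name to the hostfile TPN times, so the file ends up with one line per MPI task: with NNODES=3 and TPN=16 that is 3 x 16 = 48 lines. A hypothetical check, not part of the original script, that can be run in the job's working directory after the script has executed:

$ wc -l < znodelist
48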
For more information