Launching MPMD Jobs

The aprun command can be used in Multiple Program, Multiple Data (MPMD) mode by providing several combinations of job placement options and binaries, separated by colons:

    $> aprun -n pes_1 [placement_options_1] BINARY_0 : \
             -n pes_2 [placement_options_2] BINARY_1 : \
             -n pes_3 [placement_options_3] BINARY_2

Include a space on both sides of each colon so that your shell parses it as a separate token. All other options can be used in the usual fashion.

The example below shows an MPI application that uses all the cores in 8 XE nodes running $MY_XE_BINARY, and all the cores in 16 XK nodes running $MY_XK_BINARY.

    #! /bin/bash
    #
    #PBS -l nodes=8:ppn=32:xe+16:ppn=16:xk
    #PBS -l walltime=00:30:00
    #PBS -N aprun-mpmd-example-1

    aprun -n $((8*32)) -N 32 $MY_XE_BINARY : \
          -n $((16*16)) -N 16 $MY_XK_BINARY

The following example demonstrates the use of MPMD mode for a hybrid MPI and OpenMP application. The first binary runs 32 PEs per node on 4 nodes with one CPU per PE (-d 1); the second runs 8 PEs per node on 12 nodes with four CPUs per PE (-d 4). Together they fill the 16 requested XE nodes.

    #! /bin/bash
    #
    #PBS -l nodes=16:ppn=32:xe
    #PBS -l walltime=00:30:00
    #PBS -N aprun-mpmd-example-2

    export OMP_NUM_THREADS=4
    #setenv OMP_NUM_THREADS 4

    aprun -n $((32*4)) -N 32 -d 1 $MY_XE_BINARY_1 : \
          -n $((8*12)) -N 8 -d 4 $MY_XE_BINARY_2

Note: The binaries in an MPMD job can be the same or different, but they run together under a single, unified MPI_COMM_WORLD. The first example requests two different resource specifications joined with the '+' syntax, but this is not required: an MPMD job can also use all XK or all XE nodes. The important concept is the aprun syntax, with the ' : ' separator between the executables.

Restrictions:
See the aprun man page for more information.
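The PE arithmetic in the hybrid example above is easy to get wrong by hand, so it can help to print the command line before submitting a job. The following is a minimal sketch, not part of aprun or PBS: the function name build_mpmd_cmd, the "nodes:pes_per_node:depth:binary" spec format, and the binary paths are all hypothetical and invented for illustration. It only prints the command and never launches anything.

```shell
#!/bin/bash
# Sketch: assemble an MPMD aprun command line from per-segment specs.
# Each spec is "nodes:pes_per_node:depth:binary". This format is a
# hypothetical convention for this example, not an aprun feature.
build_mpmd_cmd() {
    local cmd="aprun" sep="" spec nodes ppn depth binary
    for spec in "$@"; do
        # Split the spec on colons into its four fields.
        IFS=: read -r nodes ppn depth binary <<< "$spec"
        # Total PEs for this segment = nodes * PEs per node.
        cmd+="${sep} -n $((nodes * ppn)) -N ${ppn} -d ${depth} ${binary}"
        # Later segments are joined by a colon with a space on both sides.
        sep=" :"
    done
    printf '%s\n' "$cmd"
}

# Dry run: print the command for 4 nodes of 32 PEs plus 12 nodes of 8 PEs.
# The binary paths are placeholders.
build_mpmd_cmd "4:32:1:./xe_binary_1" "12:8:4:./xe_binary_2"
```

With the two specs shown, the function prints a single aprun line requesting 128 PEs for the first binary and 96 for the second, matching the node split in the hybrid example. Adding up the node counts in the specs (4 + 12 here) gives a quick check against the nodes= value in the #PBS resource request.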