Interactive Jobs

The Blue Waters batch system provides an interactive mode for debugging and optimization work. Long-running jobs and production runs should not be run from interactive sessions. To start an interactive session, use the following command:

$> qsub -I -l nodes=1:ppn=32:xe

The -I option tells the batch system to start an interactive job. You must also specify the number of nodes (one in this example), the node type (xe for CPU nodes, xk for GPU nodes), and the default number of processes per node (ppn), which is limited to 32 on an XE node and 16 on an XK node. Other options accepted by qsub can also be used with an interactive job.
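The limits above can be checked before submitting. The following is a minimal sketch, not a Blue Waters tool: qsub_resources is a hypothetical helper that builds the -l resource string and rejects a ppn value above the limit for the chosen node type. It only prints the qsub command and does not submit anything.

```shell
# Hypothetical helper (not part of Blue Waters or PBS): build the -l resource
# string for qsub and reject ppn values above the per-node limit.
qsub_resources() {
    nodes=$1; ppn=$2; type=$3
    case "$type" in
        xe) max_ppn=32 ;;   # XE (CPU) nodes: at most 32 processes per node
        xk) max_ppn=16 ;;   # XK (GPU) nodes: at most 16 processes per node
        *)  echo "unknown node type: $type" >&2; return 1 ;;
    esac
    if [ "$ppn" -lt 1 ] || [ "$ppn" -gt "$max_ppn" ]; then
        echo "ppn=$ppn is outside 1..$max_ppn for $type nodes" >&2
        return 1
    fi
    echo "nodes=${nodes}:ppn=${ppn}:${type}"
}

# Print the command for a 1-node XK (GPU) session instead of running it.
echo "qsub -I -l $(qsub_resources 1 16 xk)"
```

Asking for ppn=32 on an xk node makes the helper return a non-zero status, so a mistake shows up before the job reaches the queue.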

For example, suppose you want access to 2 nodes for testing. Run the following command on a login node:

$> qsub -I -l nodes=2:ppn=32:xe -l walltime=01:00:00

where -I means interactive, and the other parameters define how many nodes you want and how long you want to run.

The job will go into the queue. When the command returns, you will be in a shell running on a "MOM" node. A MOM node is a shared service resource that manages job execution. Use the aprun command to launch your application on the compute nodes.

The following example tests application performance by trying different aprun options. Once the interactive session begins, start your application a.out by typing

$> aprun -n32 -N16 -d2 ./a.out job.inp > job16.out

where job.inp is the application input data, and job16.out is the application output. Note that the value of -N can be smaller than the value requested in the qsub ppn parameter, but it cannot exceed the requested ppn.

After this job finishes, type in

$> aprun -n64 -N32 -d1 ./a.out job.inp > job32.out
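Both runs above fit inside the 2-node, ppn=32 request. As a rough sanity check before trying other combinations, the sketch below computes how many nodes a placement needs and refuses a -N value larger than the requested ppn. check_placement is a hypothetical helper, not a Blue Waters tool, and it assumes that -d is the number of CPUs reserved for each process (see "man aprun").

```shell
# Hypothetical helper: check an aprun placement against the qsub request.
#   $1 = total processes (-n)     $2 = processes per node (-N)
#   $3 = CPUs per process (-d)    $4 = ppn requested in qsub
check_placement() {
    n=$1; per_node=$2; depth=$3; ppn=$4
    if [ "$per_node" -gt "$ppn" ]; then
        echo "-N $per_node exceeds the requested ppn=$ppn" >&2
        return 1
    fi
    # Round up: a partly filled node still occupies a whole node.
    nodes=$(( (n + per_node - 1) / per_node ))
    cpus=$(( per_node * depth ))
    echo "nodes=$nodes cpus_per_node=$cpus"
}

check_placement 32 16 2 32   # prints: nodes=2 cpus_per_node=32
check_placement 64 32 1 32   # prints: nodes=2 cpus_per_node=32
```

The two example runs use the same 2 nodes and the same 32 CPUs per node; they differ only in how those CPUs are split between processes and threads.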

Also try other aprun options. See "man aprun" for details. Determine which options give the best performance and then use those in the production run. The benefit of doing several runs from the same interactive job is that each run executes on the same set of nodes, which minimizes the performance variability caused by non-reproducible job placement. Note that such a sequence of aprun commands can also be programmed into a batch script. The advantage of an interactive session is the flexibility to experiment without waiting for a new session to start.
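For comparison, the same sequence of runs could be written as a batch script along these lines. This is a sketch under stated assumptions rather than a recommended production script: the run_sweep function and the "N:d" list are illustrative, and a.out and job.inp are the placeholder names used in the example above. The guard at the end lets the script exit cleanly on a machine where aprun is not installed.

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=32:xe
#PBS -l walltime=01:00:00

# PBS_O_WORKDIR is set by the batch system to the submission directory;
# fall back to the current directory when run outside a job.
cd "${PBS_O_WORKDIR:-.}" || exit 1

nodes=2   # must match the nodes= value in the #PBS request above

# Illustrative helper: run a.out once per "processes-per-node:depth" pair.
run_sweep() {
    for cfg in 16:2 32:1; do
        per_node=${cfg%%:*}             # part before the colon (-N)
        depth=${cfg##*:}                # part after the colon (-d)
        total=$(( nodes * per_node ))   # total processes (-n)
        aprun -n "$total" -N "$per_node" -d "$depth" ./a.out job.inp > "job${per_node}.out"
    done
}

if command -v aprun >/dev/null 2>&1; then
    run_sweep
else
    echo "aprun not found; this script is meant to run inside a batch job" >&2
fi
```

All runs in the loop share one job allocation, so they execute on the same set of nodes, just as they would in a single interactive session.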

After the requested time has passed (1 hour in this example), the connection will be closed. Logging out of the MOM node will terminate the job early.

If you intend to work directly on a compute node (for example, to access the GPU in an XK node), consider using CCM mode, which lets you do computations interactively on a dedicated compute node rather than on a shared MOM node.

Scheduling of interactive jobs is subject to the standard job queue policies. If the wait time in the normal queue becomes excessive due to high machine load, submit the interactive job to the debug or high queue to boost its priority above that of the normal queue.

When using an interactive session, be aware that the same MOM node is involved in the execution of many other jobs running concurrently on the machine. Any command executed on the service node without aprun may interfere with other jobs. The uninterrupted availability of this service resource is critical for other users to be able to submit their batch jobs and get their results back. Running CPU- or I/O-intensive work on MOM nodes outside of an aprun command is a violation of the Terms of Use.

If your PBS script performs elaborate computation or I/O operations outside of aprun, it may need to be modified before being used on Blue Waters. Seek help at help+bw@ncsa.illinois.edu if needed.