Accessing Blue Waters compute nodes


Reverse tunneling a port to a Blue Waters compute node


If your application requires a connection from a client running on your local machine, this configuration lets you forward a port to the compute node where the application is running.

Requirements:

  1. The host you are tunneling to must be on a public network or on a network that is accessible from Blue Waters (a VPN connection is one way to obtain a public address).
  2. Either an ssh key is set up for logging in from Blue Waters to your local machine, or the job you are running is interactive.
  3. This example assumes the listening daemon will start on the first node in the $PBS_NODEFILE.

There are three steps to tunnel a port to a Blue Waters compute node:

  1. Choose a random high port (greater than 1024).
  2. From the mom node, SSH to your local machine with a remote (reverse) forward from LOCALPORT to <bluewaterscomputenode>:TARGETPORT.
  3. On your local machine, you can now connect to LOCALPORT as if you were connecting to TARGETPORT on the compute node.
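Step 1 can be scripted. The following is a minimal sketch, assuming the randomly chosen port is the one used as LOCALPORT; the function name pick_port is hypothetical. It does not check whether the port is already in use on your local machine, so a collision is still possible.

```shell
# Pick a pseudo-random unprivileged port in the range 1025..65535.
# awk is used because it is available in any POSIX environment;
# srand() seeds the generator from the current time.
pick_port() {
    awk 'BEGIN { srand(); print int(rand() * 64511) + 1025 }'
}

LOCALPORT=$(pick_port)
echo "LOCALPORT=$LOCALPORT"
```

Because the seed is the current time in seconds, two calls within the same second return the same port.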

Example:

LOCALPORT is 8443, TARGETPORT is 1443

The following forwards localhost:8443 -> mom node -> bluewaterscomputenode:1443

In your job script, before you start the daemon on the compute node:

FIRSTNID=$(printf "nid%05d" $(head -n1 $PBS_NODEFILE))
# do not use ssh -f option since it detaches ssh,
# which means it is not killed when the job ends
ssh -nN -R 127.0.0.1:8443:$FIRSTNID:1443 <username>@<local machine> &
sleep 5
aprun <application with daemon running on port 1443>

Once the tunnel is up, open your client on your local machine and connect to 127.0.0.1:8443.


Forward tunneling a port to a Blue Waters compute node

You can establish a tunnel to compute nodes in a similar manner. First you should add

printf "nid%05d\n" $(head -n1 $PBS_NODEFILE) >hostname

to your job script, which writes the host name of the (first) compute node in your job to a file named hostname. Alternatively, you can get the nid number from the exec_host line of qstat -f JOBID. We prepend "nid" because it is not present in the nodefile for regular jobs. For CCM jobs it is already present, so you should not add "nid" in that case.
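If the same job script is used for both regular and CCM jobs, the prefix can be added conditionally. This is a minimal sketch under the assumption that nodefile entries are either bare numbers or already start with "nid"; the function name first_node_name is hypothetical.

```shell
# Print the host name of the first node listed in a node file.
# Regular jobs list bare numbers (e.g. 12345); CCM jobs list nid12345.
first_node_name() {
    entry=$(head -n1 "$1")
    case "$entry" in
        nid*) printf '%s\n' "$entry" ;;        # CCM job: use the entry as-is
        *)    printf 'nid%05d\n' "$entry" ;;   # regular job: prepend "nid"
    esac
}

# Inside a job, write the name to the file "hostname" as described above.
if [ -n "${PBS_NODEFILE:-}" ]; then
    first_node_name "$PBS_NODEFILE" > hostname
fi
```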

Once you know the internal address of the compute node (the host name written above), execute

ssh -L 127.0.0.1:8889:<compute-node>:8888 h2ologin.ncsa.illinois.edu

on your laptop or workstation to forward port 8889 on your laptop or workstation to port 8888 on the compute node.

VNC

To prepare for running a vncserver, you will need to initialize your VNC password. First run:

vncpasswd

Follow the prompts to enter a password for VNC access. This password should be unique and should not be your NCSA PIN. Anyone with this password can connect to your VNC server.

When using CCM you can start a VNC server on a compute node as part of a job script via

#PBS -l gres=ccm
module load ccm
ccmrun -n1 bash -c 'rm /tmp/.X-* ; vncserver -geometry 1024x768 -depth 24 ; sleep 172800'

which first removes any leftover VNC control files in /tmp.

After that you have to forward a port from your laptop to the VNC port (5900 + <display-number>) on the compute node. VNC reports the compute node it runs on at startup, and you can forward a port, for example 5955, from your laptop to it like so:

ssh -L 127.0.0.1:5955:<compute-node>:<vnc-port> h2ologin.ncsa.illinois.edu

For example, if vncserver reported "New 'X' desktop is nid12345:0", run

ssh -L 127.0.0.1:5955:nid12345:5900 h2ologin.ncsa.illinois.edu

and then connect to the VNC display through the forwarded port on your local machine:

vncviewer 127.0.0.1:55
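The port arithmetic in this example can be written out explicitly. The sketch below reuses the values from the text (nid12345:0 and local port 5955); the helper name vnc_port is hypothetical. It only prints the two commands and does not run them.

```shell
# VNC display N listens on TCP port 5900 + N.
vnc_port() {
    echo $((5900 + $1))
}

desktop="nid12345:0"      # "host:display" as reported by vncserver
node=${desktop%%:*}       # part before the colon: the compute node
display=${desktop##*:}    # part after the colon: the display number
local_port=5955           # local end of the tunnel, as in the example

echo "ssh -L 127.0.0.1:${local_port}:${node}:$(vnc_port "$display") h2ologin.ncsa.illinois.edu"
# vncviewer takes a display number, i.e. the local port minus 5900.
echo "vncviewer 127.0.0.1:$((local_port - 5900))"
```

The viewer argument 55 is a display number, so vncviewer connects to local port 5900 + 55 = 5955, which is the forwarded port.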

A complete job script could look like this:

#!/bin/bash
#PBS -l gres=ccm
#PBS -l nodes=1:xe:ppn=32
#PBS -l walltime=0:30:0
cd $PBS_O_WORKDIR
module load ccm
ccmrun -n1 bash -c 'vncserver -geometry 1024x768 -depth 24; sleep 172800' 2>&1 | tee vnc.job.out

which creates a file vnc.job.out that contains information on how to connect to the node.
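The node and display can be pulled out of vnc.job.out with a short filter. This is a minimal sketch that assumes the log contains a line of the form "New 'X' desktop is nid12345:0", as quoted earlier; the exact wording may differ between VNC versions. The function name vnc_desktop_from_log is hypothetical.

```shell
# Print the "host:display" string from the first line of a vncserver log
# that contains "desktop is <host>:<display>".
vnc_desktop_from_log() {
    sed -n 's/.*desktop is \([^ ]*:[0-9][0-9]*\).*/\1/p' "$1" | head -n1
}

log=vnc.job.out
if [ -f "$log" ]; then
    echo "VNC server is at $(vnc_desktop_from_log "$log")"
fi
```

If the file does not contain such a line, the function prints nothing, so check for an empty result before using it.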

Tunneling for training accounts using bwbay

Once you have obtained the internal IP address of the compute node to which you would like to connect (see above), you can create an ssh tunnel from your laptop to it using:

ssh -tL 127.0.0.1:8889:localhost:40XXX traXXX@bwbay.ncsa.illinois.edu ssh -L 40XXX:<compute-node>:8888 h2ologin.ncsa.illinois.edu

(all on one line), which connects port 8889 on your laptop to port 8888 on the compute node. To avoid conflicts over port numbers already in use, we recommend that you replace XXX with the numerical part of your user account; i.e., if your user account is tra123, replace XXX with 123.
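The substitution of XXX can be automated. The sketch below assumes the account name has the form tra followed by digits; the function name bwbay_tunnel_cmd is hypothetical. It prints the command rather than running it, so you can inspect it first.

```shell
# Build the two-hop tunnel command for a training account.
# $1: training account name (e.g. tra123)
# $2: compute node host name (e.g. nid12345)
bwbay_tunnel_cmd() {
    user=$1
    node=$2
    num=${user#tra}        # numerical part of the account, e.g. 123
    mid_port="40${num}"    # intermediate port on bwbay, e.g. 40123
    echo "ssh -tL 127.0.0.1:8889:localhost:${mid_port} ${user}@bwbay.ncsa.illinois.edu ssh -L ${mid_port}:${node}:8888 h2ologin.ncsa.illinois.edu"
}

bwbay_tunnel_cmd tra123 nid12345
```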