Accessing Blue Waters compute nodes
Reverse tunneling a port to a Blue Waters compute node
If you have an application that requires a local client connection to use, this configuration will allow you to forward a port to the compute node where the application is running.
- The host you are tunneling to must be on a public network or a network that is accessible from Blue Waters (a VPN connection is one way to obtain a public address).
- There is an SSH key from Blue Waters to your local machine, or the job you are running is interactive.
- This example assumes the listening daemon will start on the first node in the $PBS_NODEFILE.
There are three steps to tunnel a port through a Blue Waters compute node:
- Choose a random high port greater than 1024.
- SSH to your local machine from the mom node with a remote forward (ssh -R) from LOCALPORT to <bluewaterscomputenode>:TARGETPORT
- On your local machine, you can now connect to LOCALPORT as if you were connecting to the TARGETPORT on the compute node
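As a minimal sketch of the first step, a port above 1024 can be picked at random in the shell. The function name pick_port is hypothetical and not part of any Blue Waters tooling. The sketch does not check whether the port is already in use on your local machine.

```shell
#!/bin/bash
# Hypothetical helper: pick a random unprivileged port.
# In bash, $RANDOM is in the range 0..32767, so the result is 1025..33792.
pick_port() {
    echo $(( RANDOM + 1025 ))
}

LOCALPORT=$(pick_port)
echo "Using LOCALPORT=$LOCALPORT"
```

If ssh later reports that the port is already in use, simply pick another one.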
In the following example, LOCALPORT is 8443 and TARGETPORT is 1443.
The following forwards localhost:8443 → momnode → bluewaterscomputenode:1443
In your job script, before you start the daemon on the compute node:
FIRSTNID=$(printf "nid%05d" $(head -n1 $PBS_NODEFILE))
# do not use ssh -f option since it detaches ssh,
# which means it is not killed when the job ends
ssh -nN -R 127.0.0.1:8443:$FIRSTNID:1443 <username>@<local machine> &
aprun <application with daemon running on port 1443>
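If you prefer to stop the tunnel explicitly once the application exits, rather than relying on the job ending, the pattern above can be wrapped as in the following sketch. The helper name with_tunnel is hypothetical, and the sketch assumes the tunnel command contains no arguments with embedded spaces.

```shell
#!/bin/bash
# Hypothetical helper: start a tunnel command in the background, run the
# application in the foreground, then stop the tunnel and return the
# application's exit status.
with_tunnel() {
    local tunnel_cmd=$1 status=0
    shift
    $tunnel_cmd &                      # unquoted on purpose: split into words
    local tunnel_pid=$!
    "$@" || status=$?                  # run the application, remember its status
    kill "$tunnel_pid" 2>/dev/null || true
    wait "$tunnel_pid" 2>/dev/null || true
    return "$status"
}

# Usage in a job script (placeholders as in the example above):
#   with_tunnel "ssh -nN -R 127.0.0.1:8443:$FIRSTNID:1443 <username>@<local machine>" \
#       aprun <application with daemon running on port 1443>
```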
On your local machine, open your client and connect to 127.0.0.1:8443.
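Before starting the client, it can help to check that something is listening on the forwarded port. The sketch below uses bash's built-in /dev/tcp pseudo-device. The function name port_open is hypothetical. A successful check only shows that ssh is listening on your local machine, not that the daemon on the compute node is already up.

```shell
#!/bin/bash
# Hypothetical helper: return success if a TCP connection to host:port
# can be opened. Requires bash (/dev/tcp is a bash feature, not a real file).
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open 127.0.0.1 8443; then
    echo "tunnel endpoint 127.0.0.1:8443 is accepting connections"
else
    echo "nothing is listening on 127.0.0.1:8443 yet"
fi
```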
Forward tunneling a port to a Blue Waters compute node
You can establish a tunnel to compute nodes in a similar manner. First you should add
printf "nid%05d\n" $(head -n1 $PBS_NODEFILE) > hostname
to your job script, which writes the host name of the (first) compute node in your job into a file hostname. An alternative is to get the nid number from the exec_host line of qstat -f JOBID. We prepend "nid" since for regular jobs "nid" is not present in the nodefile. For CCM jobs it is present so you should not add "nid" in that case.
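The difference between regular and CCM jobs can be handled in one place, as in the sketch below. The function name first_node_name is hypothetical, and the sketch assumes the first line of the nodefile is either a bare node number or a name that already starts with "nid".

```shell
#!/bin/bash
# Hypothetical helper: print the host name of the first node in a nodefile.
first_node_name() {
    local first
    first=$(head -n1 "$1")
    case $first in
        nid*)
            # CCM job: the nodefile already contains the "nid" prefix
            printf '%s\n' "$first"
            ;;
        *)
            # Regular job: strip leading zeros so printf does not read the
            # number as octal, then zero-pad to five digits
            first=$(printf '%s' "$first" | sed 's/^0*//')
            printf 'nid%05d\n' "${first:-0}"
            ;;
    esac
}

# Inside a job, write the name to the file "hostname"
if [ -n "${PBS_NODEFILE:-}" ]; then
    first_node_name "$PBS_NODEFILE" > hostname
fi
```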
Once you know the compute node's internal host name or IP address, you can execute
ssh -L 127.0.0.1:8889:<compute-node>:8888 h2ologin.ncsa.illinois.edu
on your laptop or workstation to forward port 8889 from your laptop or workstation to port 8888 on the compute node.
To prepare for running a vncserver, you will need to initialize your VNC password. First run:
vncpasswd
Follow the prompts to enter a password to access VNC. This password should be unique and should not be your NCSA pin. Anyone with this password can connect to your VNC server.
When using CCM you can start a VNC server on a compute node as part of a job script via
module load ccm
ccmrun -n1 bash -c 'rm /tmp/.X-* ; vncserver -geometry 1024x768 -depth 24 ; sleep 172800'
which first removes any leftover VNC control files in /tmp.
After that you have to forward a port from your laptop to the VNC port (5900 + <display-number>) on the compute node. VNC reports the compute node it runs on at startup, and you can forward a port, for example 5955, from your laptop to it like so:
ssh -L 127.0.0.1:5955:<compute-node>:<vnc-port> h2ologin.ncsa.illinois.edu
For example, if vncserver reported "New 'X' desktop is nid12345:0", you would run
ssh -L 127.0.0.1:5955:nid12345:5900 h2ologin.ncsa.illinois.edu
and then connect your VNC viewer to port 5955 on 127.0.0.1, the local end of the tunnel.
A complete job script could look like this:
#PBS -l nodes=1:xe:ppn=32
#PBS -l walltime=0:30:0
module load ccm
ccmrun -n1 bash -c 'vncserver -geometry 1024x768 -depth 24; sleep 172800' 2>&1 | tee vnc.job.out
which creates a file vnc.job.out that contains information on how to connect to the node.
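The connection details can be pulled out of vnc.job.out with a short script. The sketch below assumes the file contains a line of the form "... desktop is <node>:<display>", as in the vncserver message quoted earlier. The function name vnc_tunnel_cmd is hypothetical. It only prints the ssh command so that you can inspect it before running it on your laptop.

```shell
#!/bin/bash
# Hypothetical helper: read the vncserver output file and print the ssh
# command that forwards a local port to the VNC port (5900 + display).
vnc_tunnel_cmd() {
    local outfile=$1 local_port=$2
    local line node display
    # Extract "<node> <display>" from the first "desktop is <node>:<display>" line
    line=$(sed -n 's/.*desktop is \([^: ]*\):\([0-9][0-9]*\).*/\1 \2/p' "$outfile" | head -n1)
    [ -n "$line" ] || return 1
    node=${line%% *}
    display=${line##* }
    echo "ssh -L 127.0.0.1:${local_port}:${node}:$((5900 + display)) h2ologin.ncsa.illinois.edu"
}

# Usage: vnc_tunnel_cmd vnc.job.out 5955
```

The function returns a non-zero status if no matching line is found, for example because the VNC server has not started yet.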
Tunneling for training accounts using bwbay
Once you have obtained the internal IP address of the compute node to which you would like to connect (see above), you can create an ssh tunnel from your laptop to it using:
ssh -tL 127.0.0.1:8889:localhost:40XXX traXXX@bwbay.ncsa.illinois.edu ssh -L 40XXX:<compute-node>:8888 h2ologin.ncsa.illinois.edu
(all on one line) which will connect port 8889 on your laptop to port 8888 on the compute node. To avoid conflicts over port numbers already being in use, we recommend that you replace XXX with the numerical part of your user account, i.e., if your user account is tra123 then XXX is replaced with 123.
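The substitution of XXX can be scripted, as in the following sketch. The function name bwbay_tunnel_cmd is hypothetical, and the sketch assumes the account name has the form tra followed by digits. It prints the command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: build the two-hop tunnel command for a training
# account. The intermediate port on bwbay is 40 followed by the numerical
# part of the account name.
bwbay_tunnel_cmd() {
    local account=$1 node=$2
    local digits=${account#tra}        # tra123 -> 123
    local hop_port="40${digits}"       # 123 -> 40123
    echo "ssh -tL 127.0.0.1:8889:localhost:${hop_port} ${account}@bwbay.ncsa.illinois.edu ssh -L ${hop_port}:${node}:8888 h2ologin.ncsa.illinois.edu"
}

# Usage: bwbay_tunnel_cmd tra123 nid12345
```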