Connecting to a remote HPC system
Overview
Teaching: 25 min
Exercises: 10 min
Questions
How do I log in to a remote HPC system?
Objectives
Configure secure access to a remote HPC system.
Connect to a remote HPC system.
Secure Connections
The first step in using a cluster is to establish a connection from our laptop to the cluster. When we are sitting at a computer (or standing, or holding it in our hands or on our wrists), we have come to expect a visual display with icons, widgets, and perhaps some windows or applications: a graphical user interface, or GUI. Since computer clusters are remote resources that we connect to over slow or intermittent interfaces (WiFi and VPNs especially), it is more practical to use a command-line interface, or CLI, to send commands as plain text. If a command returns output, it is printed as plain text as well. The commands we run today will not open a window to show graphical results.
If you have ever opened the Windows Command Prompt or macOS Terminal, you have seen a CLI. If you have already taken The Carpentries’ courses on the UNIX Shell or Version Control, you have used the CLI on your local machine extensively. The only leap to be made here is to open a CLI on a remote machine, while taking some precautions so that other folks on the network can’t see (or change) the commands you’re running or the results the remote machine sends back. We will use the Secure SHell protocol (or SSH) to open an encrypted network connection between two machines, allowing you to send & receive text and data without having to worry about prying eyes.
SSH clients are usually command-line tools, where you provide the remote
machine address as the only required argument. If your username on the remote
system differs from what you use locally, you must provide that as well. If
your SSH client has a graphical front-end, such as PuTTY or MobaXterm, you will
set these arguments before clicking “connect.” From the terminal, you’ll write
something like ssh userName@hostname, where the argument is just like an
email address: the “@” symbol is used to separate the personal ID from the
address of the remote machine.
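To make the userName@hostname form concrete, here is a minimal sketch that splits such an argument into its two parts using ordinary shell parameter expansion. The variable names (target, user, host) are illustrative assumptions and are not something ssh itself defines.

```shell
# Hypothetical login target, written in the same form that ssh accepts.
target="yourUsername@login.cirrus.ac.uk"

# Everything before the first "@" is the personal ID (the username).
user="${target%%@*}"
# Everything after the first "@" is the address of the remote machine.
host="${target#*@}"

echo "user: $user"
echo "host: $host"
```

With OpenSSH, the same two pieces can also be given separately, as in ssh -l userName hostname.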
When logging in to a laptop, tablet, or other personal device, a username, password, or pattern are normally required to prevent unauthorized access. In these situations, the likelihood of somebody else intercepting your password is low, since logging your keystrokes requires a malicious exploit or physical access. For systems like cirrus-login1 running an SSH server, anybody on the network can log in, or try to. Since usernames are often public or easy to guess, your password is often the weakest link in the security chain. Many clusters therefore forbid password-based login, requiring instead that you generate and configure a public-private key pair with a much stronger password. Even if your cluster does not require it, the next section will guide you through the use of SSH keys and an SSH agent to both strengthen your security and make it more convenient to log in to remote systems.
Log In to the Cluster
Go ahead and open your terminal or graphical SSH client, then log in to the
cluster. Replace yourUsername with your username or the one
supplied by the instructors.
[user@laptop ~]$ ssh yourUsername@login.cirrus.ac.uk
You will be prompted first for your SSH key passphrase and then for your Cirrus login password. Watch out: the characters you type after the password prompt are not displayed on the screen. Normal output will resume once you press Enter.
You may have noticed that the prompt changed when you logged into the remote
system using the terminal. This change is important because it helps you tell
which system will run the commands you type into the terminal. This change is
also a small complication that we will need to navigate throughout the
workshop. Exactly what is displayed as the prompt (which conventionally ends
in $) in the terminal when it is connected to the local system and the remote
system will typically be different for every user. We still need to indicate
which system we are entering commands on, though, so we will adopt the
following convention:
- [user@laptop ~]$ when the command is to be entered on a terminal connected to your local computer
- [yourUsername@cirrus-login1 ~]$ when the command is to be entered on a terminal connected to the remote system
- $ when it really doesn’t matter which system the terminal is connected to.
Creating an alias for quicker login
We can create an alias on our local machine to use as a shortcut to log in to Cirrus.
Instead of typing ssh yourUsername@login.cirrus.ac.uk every time we want to log in,
we can reduce it to a much shorter command, for example ssh cirrus.
Create the file ~/.ssh/config if it does not exist on your local machine. Add the following lines:
Host cirrus
Hostname login.cirrus.ac.uk
User yourUsername
IdentityFile ~/.ssh/mykey
You should now be able to connect to Cirrus from your local machine with the following shell command:
[user@laptop ~]$ ssh cirrus
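If you would like to try the alias without touching your real ~/.ssh/config, the sketch below writes the same four settings into a throwaway directory. The names demo_dir and config_file are assumptions made for this illustration, and the username and key path are the placeholders used above.

```shell
# Work in a throwaway directory so the real ~/.ssh is left untouched.
demo_dir="$(mktemp -d)"
config_file="$demo_dir/config"

# Write the same four settings shown above (placeholders included).
cat > "$config_file" <<'EOF'
Host cirrus
    Hostname login.cirrus.ac.uk
    User yourUsername
    IdentityFile ~/.ssh/mykey
EOF

# OpenSSH expects its config file to be writable only by the owner.
chmod 600 "$config_file"

# Show what was written.
cat "$config_file"
```

An OpenSSH client can be pointed at a file like this with ssh -F "$config_file" cirrus. Once the settings are in ~/.ssh/config itself, plain ssh cirrus is enough.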
Looking Around Your Remote Home
Many users are tempted to think of a high-performance computing
installation as one giant, magical machine. Sometimes, people will assume that
the computer they’ve logged onto is the entire computing cluster. So what’s
really happening? What computer have we logged on to? The name of the current
computer we are logged onto can be checked with the hostname command. (You
may also notice that the current hostname is also part of our prompt!)
[yourUsername@cirrus-login1 ~]$ hostname
cirrus-login1
So, we’re definitely on the remote machine. Note that since there are two login nodes on Cirrus,
the hostname command may also return cirrus-login2.
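As a small sketch of how the hostname can be used to tell systems apart, the function below classifies a name by pattern. The function name describe_host is hypothetical, and the cirrus-login* pattern is assumed from the example output in this lesson, so it may need adjusting for other clusters.

```shell
# Classify a hostname. The "cirrus-login*" pattern is an assumption based
# on the example output in this lesson; other clusters use other names.
describe_host() {
    case "$1" in
        cirrus-login*) echo "remote: Cirrus login node" ;;
        *)             echo "not a Cirrus login node" ;;
    esac
}

# Ask about the machine this terminal is currently connected to.
describe_host "$(hostname)"
```

Run on your laptop, this should report that you are not on a Cirrus login node. Run after logging in, it should match either cirrus-login1 or cirrus-login2.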
Next, let’s find out where we are by running pwd to print the working directory.
[yourUsername@cirrus-login1 ~]$ pwd
/home/tc036/tc036/yourUsername
Great, we know where we are!
Let’s see what’s in our current directory. The system administrators may have configured your home directory with some helpful files, folders, and links (shortcuts) to space reserved for you on other filesystems. If they did not, your home directory may appear empty. To double-check, include hidden files in your directory listing:
[yourUsername@cirrus-login1 ~]$ ls -a
. .. .bash_history .cache .config .local .python_history .ssh
The first two entries are special: . is a reference to the current directory
and .. a reference to its parent (/home/tc036/tc036). You may or may not see
the other files, or files like them: .bashrc is a shell configuration file,
which you can edit with your preferences; and .ssh is a directory storing SSH
keys and a record of authorized connections.
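To see why a home directory can look empty, the sketch below builds a throwaway directory that holds only hidden entries and lists it both ways. The directory demo_home and its contents are made up for the illustration; nothing in your real home directory is changed.

```shell
# Build a throwaway "home" directory containing only hidden entries.
demo_home="$(mktemp -d)"
touch "$demo_home/.bashrc"
mkdir "$demo_home/.ssh"

# A plain listing prints nothing: names starting with "." are hidden.
ls "$demo_home"

# Adding -a reveals them, along with "." and "..".
ls -a "$demo_home"
```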
SSH Keys
SSH keys are an alternative method for authentication to obtain access to remote computing systems. They can also be used for authentication when transferring files or for accessing remote version control systems (such as GitHub).
During setup you will have created a pair of SSH keys:
- a private key which you keep on your own computer, and
- a public key which can be placed on any remote system you will access.
Private keys are your secure digital passport
A private key that is visible to anyone but you should be considered compromised, and must be destroyed. This includes having improper permissions on the directory it (or a copy) is stored in, sending it across any network that is not secure (encrypted), attaching it to an unencrypted email, and even displaying the key on your terminal window.
Protect this key as if it unlocks your front door. In many ways, it does.
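As an illustration of what proper permissions look like, the sketch below restricts a stand-in directory and key file to their owner. The path demo_ssh is a temporary directory created for the example. On a real system the same chmod commands would be applied to ~/.ssh and the private key inside it, but check your cluster's guidance first.

```shell
# Throwaway stand-ins for ~/.ssh and a private key; no real key is touched.
demo_ssh="$(mktemp -d)"
touch "$demo_ssh/id_rsa"

# Only the owner may enter the directory or read and write the key.
chmod 700 "$demo_ssh"
chmod 600 "$demo_ssh/id_rsa"

# The mode strings should begin with drwx------ and -rw-------.
ls -ld "$demo_ssh" "$demo_ssh/id_rsa"
```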
Regardless of the software or operating system you use, please choose a strong password or passphrase to act as another layer of protection for your private SSH key.
Considerations for SSH Key Passwords
When prompted, enter a strong password that you will remember. There are two common approaches to this:
- Create a memorable passphrase with some punctuation and number-for-letter substitutions, 32 characters or longer. Street addresses work well; just be careful of social engineering or public records attacks.
- Use a password manager and its built-in password generator with all character classes, 25 characters or longer. KeePass and BitWarden are two good options.
Nothing is less secure than a private key with no password. If you skipped password entry by accident, go back and generate a new key pair with a strong password.
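For reference, generating a new key pair with OpenSSH might look like the sketch below. It is a demonstration under stated assumptions, not the setup procedure for Cirrus: the key is written to a temporary directory, the file name id_rsa_demo is made up, and the passphrase is passed on the command line only so that the example runs without prompting.

```shell
# Demonstration only: write a throwaway key pair into a temporary directory.
key_dir="$(mktemp -d)"
key_file="$key_dir/id_rsa_demo"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t: key type, -b: key length in bits, -f: output file, -N: passphrase.
    # Passing -N on the command line can expose the passphrase (for example
    # in shell history); for a real key, omit -N and type it when prompted.
    ssh-keygen -q -t rsa -b 4096 -f "$key_file" -N "demo passphrase, not for real use"
    # Two files appear: the private key and the matching .pub public key.
    ls "$key_dir"
else
    echo "ssh-keygen not found; install an OpenSSH client first"
fi
```

For a real key you would normally point -f at a file under ~/.ssh and let ssh-keygen prompt for the passphrase.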
On your local machine, take a look in ~/.ssh (use ls ~/.ssh). You should see two new files:
- your private key (~/.ssh/id_rsa): do not share with anyone!
- the shareable public key (~/.ssh/id_rsa.pub): if a system administrator asks for a key, this is the one to send. It is also safe to upload to websites such as GitHub: it is meant to be seen.
The public key you uploaded to SAFE can be found in the .ssh folder:
[yourUsername@cirrus-login1 ~]$ ls .ssh/
authorized_keys id_rsa id_rsa.pub
[yourUsername@cirrus-login1 ~]$ cat .ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCk44JLYQ4DCAcalNNJqtLsZAVSUvkbSt0OdPYycqo/2hvgvrs+8HsSyys+V6gKBA2zVL7rnLpMprJx8aN8bJwFfIBxzBsGZ7HFyL5Gs1cz1olbbouzBkS10TJu/9SAN6XyG7BVxAQC75Kz91Vb3sYQmFZC6pUZw4fShUAUVbXCXKbcIS+RjR9iaUBiTmpRoYoc6bdMiGHFLuHz4scCfHCGpjNI6OSpIbF6L99GhftmwZxlb9TaId8SBnOkBzjsYSFui0x06rFFdy7rrqwsYx0XKMmLwDY7U21z1DVx1/SCWll704b5BO111N/89SyEr3O4QtqDP4FKkSCFFayelNlvmQB4+QDGdvJHs0YBYMQ372fskItIUNOp5q2ioCt88mD15JPsxtEAUqbXcfSoZZE5y1FLVngAT5sUDqK+kX9sxhIf3E16gQOcMG3AxMMmVHuSFcqfoCLgU1jcT2x9hacc8QlPX7LQPPm8SzYCeVr3MavnNP+JiA1vhxKMlKbRThc= yourUsername@cirrus-login1
There May Be a Better Way
Policies and practices for handling SSH keys vary between HPC clusters: follow any guidance provided by the cluster administrators or documentation. Other systems may not have an online portal for managing SSH keys, and you may need to upload your public key to the HPC system explicitly.
Key Points
An HPC system is a set of networked machines.
HPC systems typically provide login nodes and a set of worker nodes.
The resources found on independent (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network mounted filesystems, etc.).
Files saved on one node are available on all nodes.