Introduction to HPC
High Performance Computing (HPC) typically involves connecting to
very large computing systems elsewhere in the world.
These other systems can be used to do work that would either be
impossible or much slower on smaller systems.
An HPC system is a set of networked machines.
HPC systems typically provide login nodes and a set of worker
nodes.
The standard way to interact with such systems is through a
command-line shell, most commonly Bash.
The resources found on independent (worker) nodes can vary in volume
and type (amount of RAM, processor architecture, availability of network
mounted filesystems, etc.).
Files saved to a shared (network-mounted) filesystem from one node are
available on every node that mounts it.
Avoid running jobs on the login node.
The scheduler handles how compute resources are shared between
users.
Everything you do should be run through the scheduler.
A job is just a shell script.
If in doubt, request more resources than you will need.
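To make the point that a job is just a shell script concrete, here is a minimal sketch. It assumes the scheduler is Slurm, which the text above does not specify; the file name example-job.sh and the requested resources are illustrative placeholders, and other schedulers use different directive prefixes and submission commands.

```shell
#!/usr/bin/env bash
# Write a minimal job script. The #SBATCH lines are ordinary shell comments,
# so the file is still a plain shell script; only a Slurm scheduler
# (an assumption here) would read them as resource requests.
cat > example-job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=example-job
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem=1G

echo "This job is running on $(uname -n)"
EOF

# On a cluster you would hand the script to the scheduler rather than run it:
#   sbatch example-job.sh
# Running it with bash here only shows that the job file is a normal script.
bash example-job.sh
```

Run directly, the script prints the name of the machine you are on. Submitted through the scheduler, the same line would report the worker node the job was assigned to.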
Load software with module load softwareName.
Unload software with module unload or module purge.
The module system handles software versioning and package conflicts
for you automatically.
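The module commands above might be combined into a session like the following sketch. The function name show_module_workflow is hypothetical, and softwareName is the placeholder from the text rather than a real package. The guard is there because the module command usually exists only on systems that provide environment modules.

```shell
#!/usr/bin/env bash
# Sketch of a typical module session. "softwareName" is the placeholder used
# in the text, not a real package; pass the name of a module on your system.
show_module_workflow() {
    name="${1:-softwareName}"
    # `module` is usually a shell function set up by the cluster, so check first.
    if ! type module >/dev/null 2>&1; then
        echo "module command not found: this machine does not provide environment modules"
        return 0
    fi
    module avail            # list the software the site provides
    module load "$name"     # add the software to the current environment
    module list             # show what is currently loaded
    module unload "$name"   # remove that one module again
    module purge            # or unload every loaded module at once
}

show_module_workflow softwareName
```

In practice you would replace softwareName with a name reported by module avail on your own system.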
wget and curl -O download a file from the internet.
scp transfers files to and from your computer.
rsync is good for large transfers because it only transfers changed
files.
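The claim that rsync only transfers changed files can be tried without a remote machine. The sketch below syncs two local directories created just for the demonstration; the directory and file names are illustrative, and the remote address in the final comment is a made-up placeholder.

```shell
#!/usr/bin/env bash
# Build a small source directory and an empty destination in a scratch area.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/dest"
echo "first" > "$workdir/src/a.txt"
echo "second" > "$workdir/src/b.txt"

if command -v rsync >/dev/null 2>&1; then
    # First pass: everything is new, so both files are copied.
    rsync -a "$workdir/src/" "$workdir/dest/"

    # Change one file only.
    echo "changed" > "$workdir/src/a.txt"

    # Second pass: -i (--itemize-changes) lists what rsync actually updates;
    # the untouched b.txt is skipped.
    rsync -ai "$workdir/src/" "$workdir/dest/"
else
    echo "rsync is not installed on this machine"
fi

# The same idea against a remote system (placeholder host and user):
#   rsync -avP local_dir/ yourUsername@cluster.example.org:dest_dir/
```

The trailing slash on the source path tells rsync to copy the directory's contents rather than the directory itself.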
Parallel programming allows applications to take advantage of
parallel hardware; serial code will not ‘just work.’
Distributed memory parallelism is a common case, using the Message
Passing Interface (MPI).
The queuing system facilitates executing parallel tasks.
Performance improvements from parallel execution do not scale
linearly.
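One common way to see why speedup is not linear is Amdahl's law. The text above does not name it, so it is brought in here as an assumed model: if a fraction p of a program's run time can be parallelised over n cores, the speedup is at most 1 / ((1 - p) + p / n). The function name amdahl_speedup and the value p = 0.9 are illustrative choices.

```shell
#!/usr/bin/env bash
# Amdahl's law: upper bound on speedup when a fraction p of the run time
# is parallelisable and n cores are used. p = 0.9 below is an assumption
# chosen for illustration, not a measured value.
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

for n in 1 2 4 8 16; do
    echo "$n cores: $(amdahl_speedup 0.9 "$n")x"
done
```

With p = 0.9, 16 cores give a speedup of about 6.4 rather than 16, and under this model no number of cores can exceed 10, because the serial 10% of the work always remains.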
The smaller your job, the faster it will schedule.
Be careful how you use the login node.
Your data on the system is your responsibility.
Plan and test large data transfers.
It is often best to convert many files to a single archive file
before transferring.
Again, don’t run stuff on the login node.
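The advice above about converting many files into a single archive before transferring could look like this with tar. This is a minimal sketch; the directory name results and the file names are made up for the example.

```shell
#!/usr/bin/env bash
# Create a scratch directory holding several small files.
workdir=$(mktemp -d)
mkdir "$workdir/results"
for i in 1 2 3 4 5; do
    echo "sample $i" > "$workdir/results/output_$i.txt"
done

# Bundle and compress the whole directory into one file.
# -C changes into $workdir first so the archive stores the relative path results/.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents to check it before transferring it.
tar -tzf "$workdir/results.tar.gz"
```

The single results.tar.gz file can then be moved with scp or rsync and unpacked at the destination with tar -xzf results.tar.gz.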