Introduction to HPC
“High Performance Computing (HPC) typically involves connecting to
very large computing systems elsewhere in the world.”
“These other systems can be used to do work that would either be
impossible or much slower on smaller systems.”
“The standard method of interacting with such systems is via a
command line interface called Bash.”
“An HPC system is a set of networked machines.”
“HPC systems typically provide login nodes and a set of worker
nodes.”
“The resources found on independent (worker) nodes can vary in
volume and type (amount of RAM, processor architecture, availability of
network mounted filesystems, etc.).”
“Files saved on a shared (network-mounted) filesystem from one node are available on all nodes.”
“The scheduler handles how compute resources are shared between
users.”
“All computational work you do should be run through the scheduler.”
“A job is just a shell script.”
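To illustrate that a job is an ordinary shell script, here is a minimal sketch. It assumes the scheduler is Slurm, which the text does not specify; the `#SBATCH` lines and the file name `example-job.sh` are illustrative assumptions, and other schedulers use different directive prefixes and submission commands.

```shell
# Write a minimal job script to disk.
# The "#SBATCH" lines are Slurm directives (an assumption: your site may use
# another scheduler). To Bash they are plain comments, so the file remains a
# normal shell script.
cat > example-job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=example-job
#SBATCH --time=00:01:00
#SBATCH --ntasks=1

echo "This script is running on: $(uname -n)"
EOF

# Because it is just a shell script, a trivial job like this one can be run
# directly as a quick check that it has no syntax errors.
job_output=$(bash example-job.sh)
echo "$job_output"

# On a Slurm system, the same file would be handed to the scheduler with:
#   sbatch example-job.sh
```

Running the script directly is only appropriate for a trivial check like this one; real work belongs in a submitted job.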
“If in doubt, request more resources than you will need.”
“Load software with module load softwareName.”
“Unload all loaded software with module purge.”
“The module system handles software versioning and package conflicts
for you automatically.”
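The module commands above might be combined as in the following sketch. `softwareName` is the placeholder from the text, not a real package, and the check for the `module` command is an added assumption so that the snippet also behaves sensibly on a machine without a module system.

```shell
# Placeholder package name taken from the text; replace with a real module.
software="softwareName"

if command -v module >/dev/null 2>&1; then
    module purge              # start from a clean environment
    module load "$software"   # load the requested software
    module list               # show what is currently loaded
    module_status="available"
else
    # Typical on a laptop, or in a shell where the module system
    # has not been initialised.
    echo "No 'module' command found in this shell"
    module_status="unavailable"
fi
```

Starting with `module purge` is a common habit in job scripts because it avoids inheriting whatever happened to be loaded in the submitting shell.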
“wget and curl -O download a file from the internet.”
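Since not every system has both tools installed, one possible approach is a small wrapper that uses whichever is present. The function names `pick_downloader` and `download` and the URL in the usage comment are hypothetical, introduced only for illustration.

```shell
# Report which download tool is installed: "wget", "curl", or "none".
pick_downloader() {
    if command -v wget >/dev/null 2>&1; then
        echo "wget"
    elif command -v curl >/dev/null 2>&1; then
        echo "curl"
    else
        echo "none"
    fi
}

# Download the URL given as the first argument into the current directory.
download() {
    case "$(pick_downloader)" in
        wget) wget "$1" ;;
        curl) curl -O "$1" ;;   # -O saves under the remote file name
        *)    echo "Neither wget nor curl is installed" >&2
              return 1 ;;
    esac
}

echo "Download tool found: $(pick_downloader)"

# Usage (hypothetical URL):
#   download https://example.org/data/archive.tar.gz
```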
“scp transfers files to and from your computer.”
“The smaller your job, the sooner it will typically be scheduled.”
“Be careful how you use the login node.”
“Your data on the system is your responsibility.”
“Plan and test large data transfers.”
“It is often best to convert many files to a single archive file
before transferring.”
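The archive-before-transfer advice can be sketched as follows. The directory and file names are made up for the example, and the user name and host in the final comment are hypothetical.

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a directory of small files as a stand-in for real results.
mkdir results
for i in 1 2 3 4 5; do
    echo "sample $i" > "results/file_$i.txt"
done

# Bundle the whole directory into a single compressed archive.
tar -czf results.tar.gz results

# List the archive contents to verify it before transferring.
file_count=$(tar -tzf results.tar.gz | grep -c '\.txt$')
echo "Archived $file_count files"

# The single archive can then be transferred in one operation
# (hypothetical user name and host):
#   scp results.tar.gz yourUsername@cluster.example.org:~/
```

Transferring one archive avoids paying the per-file overhead of the transfer tool and the filesystem for every small file.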
“Again, don’t run stuff on the login node.”
Use .md files for episodes when you want static content
Use .Rmd files for episodes when you need to generate output
Run sandpaper::check_lesson() to identify any issues with your lesson
Run sandpaper::build_lesson() to preview your lesson locally