Why use an HPC System?


  • High Performance Computing (HPC) typically involves connecting to very large computing systems elsewhere in the world.
  • These other systems can be used to do work that would either be impossible or much slower on smaller systems.

Working on an HPC system


  • An HPC system is a set of networked machines.
  • HPC systems typically provide login nodes and a set of worker nodes.
  • The standard method of interacting with such systems is via a command-line shell, most commonly Bash.
  • The resources found on individual (worker) nodes can vary in volume and type (amount of RAM, processor architecture, availability of network-mounted filesystems, etc.).
  • Files saved on a shared (network-mounted) filesystem are available on every node that mounts it.
  • Avoid running jobs on the login node.
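
The variation between nodes is easy to check for yourself. The following is a minimal sketch, using only standard Unix commands, that reports what the current node offers. The variable names are illustrative; on most systems, running it on a login node and again inside a job should show different values.

```shell
# Report the name, core count, and filesystem usage of the current node.
node=$(uname -n)                                          # network name of this machine
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)   # number of CPU cores available

echo "Node:  $node"
echo "Cores: $cores"

# Size and usage of the filesystem that holds the current directory.
df -h .
```

The command is lightweight enough to run on a login node, which makes it a safe first comparison between login and worker nodes.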

Working with the scheduler


  • The scheduler handles how compute resources are shared between users.
  • All computationally intensive work should be run through the scheduler.
  • A job is just a shell script.
  • If in doubt, request slightly more resources than you expect to need; heavily oversized requests tend to wait longer in the queue.
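
To illustrate that a job is just a shell script, here is a minimal sketch of one. It assumes the Slurm scheduler, whose directives are ordinary comments beginning with `#SBATCH`; other schedulers use different prefixes and option names. The job name and resource values are placeholders, not recommendations.

```shell
#!/bin/bash
# The #SBATCH lines are comments to Bash but are read by Slurm (assumed scheduler).
#SBATCH --job-name=example-job   # hypothetical job name
#SBATCH --ntasks=1               # one task (process)
#SBATCH --time=00:05:00          # wall-time limit, HH:MM:SS
#SBATCH --mem=1G                 # memory for the job

echo "Running on $(uname -n)"

# Stand-in for real work: sum the integers from 1 to 100.
total=$(awk 'BEGIN { for (i = 1; i <= 100; i++) s += i; print s }')
echo "Sum of 1..100 is $total"
```

Because the directives are comments, the script also runs unchanged with `bash example-job.sh` for a quick test. Under Slurm it would be submitted with `sbatch example-job.sh` and monitored with `squeue -u $USER`; the file name here is hypothetical.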

Accessing software via Modules


  • Load software with module load softwareName.
  • Unload software with module unload softwareName or module purge.
  • The module system handles software versioning and package conflicts for you automatically.
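
A common pattern in job scripts is to start from a clean environment before loading anything. The sketch below wraps that pattern in a helper; the function name `load_clean` is hypothetical, and the code assumes a module system (such as Environment Modules or Lmod) that provides the `module purge`, `module load`, and `module list` subcommands.

```shell
# Purge all loaded modules, load the requested ones, then show the result.
# load_clean is an illustrative helper, not part of any module system.
load_clean() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; is this an HPC system?" >&2
        return 1
    fi
    module purge && module load "$@" && module list
}
```

A call such as `load_clean softwareName` would then replace whatever was loaded before. Module names differ between sites, so `module avail` is the place to check what is actually installed.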

Transferring files with remote computers


  • wget and curl -O download a file from the internet.
  • scp transfers files to and from your computer.
  • rsync is good for large transfers because it transfers only the files that have changed.
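
The incremental behaviour of rsync can be demonstrated without a remote machine. This sketch syncs between two temporary local directories; all file and directory names are made up for the example, and the block is skipped if rsync is not installed.

```shell
# Set up a source directory with two small files and an empty destination.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "alpha" > "$work/src/a.txt"
echo "beta"  > "$work/src/b.txt"

if command -v rsync >/dev/null 2>&1; then
    # First pass: both files are copied.
    rsync -a "$work/src/" "$work/dst/"

    # Change one file, then ask rsync what a second pass would send.
    echo "alpha v2" > "$work/src/a.txt"
    changed=$(rsync -a --dry-run --itemize-changes "$work/src/" "$work/dst/")
    echo "$changed"

    # Second pass: only the changed file is transferred.
    rsync -a "$work/src/" "$work/dst/"
fi
```

The dry run should list `a.txt` but not `b.txt`. For a real transfer the destination takes the form `user@host:path`, for example `rsync -a results/ yourUsername@cluster.example.org:results/`, where the user name and host are placeholders.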

Running a parallel job


  • Parallel programming allows applications to take advantage of parallel hardware; serial code will not “just work.”
  • Distributed memory parallelism is a common case, using the Message Passing Interface (MPI).
  • The queuing system facilitates executing parallel tasks.
  • Performance improvements from parallel execution do not scale linearly.
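
One standard way to see why scaling is not linear is Amdahl's law, which is brought in here as an illustration and is not taken from the points above: if a fraction p of the run time can be parallelised, the speedup on n cores is 1 / ((1 - p) + p / n). The sketch below evaluates it with awk; the function name and the value p = 0.95 are assumptions chosen for the example.

```shell
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n)
#   $1 = p, the fraction of the work that can run in parallel
#   $2 = n, the number of cores
amdahl_speedup() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

# Predicted speedup for a program that is 95% parallel.
for n in 1 2 4 8 16 32 64; do
    echo "$n cores: speedup $(amdahl_speedup 0.95 "$n")"
done
```

With 95% of the work parallelised, 64 cores give a speedup of roughly 15 rather than 64, and no number of cores can push it past 1 / (1 - p) = 20. Real programs also pay communication costs that this simple model ignores.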

Using resources effectively


  • The fewer resources your job requests, the sooner it will typically be scheduled.
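
One way to keep requests small is to time a short test run and ask for only a modest margin on top. The helper below is a hypothetical sketch of that arithmetic: it takes a measured run time in seconds and a safety margin in percent, and prints a wall-time limit in the HH:MM:SS form that many schedulers accept.

```shell
# Convert a measured run time plus a safety margin into HH:MM:SS.
#   $1 = measured run time in seconds
#   $2 = safety margin in percent
# padded_request is an illustrative name, not a scheduler command.
padded_request() {
    pr_total=$(( ($1 * (100 + $2) + 99) / 100 ))   # add margin, rounding up
    printf '%02d:%02d:%02d\n' \
        $((pr_total / 3600)) $((pr_total % 3600 / 60)) $((pr_total % 60))
}

# A test run that took 50 minutes, padded by 20%.
padded_request 3000 20
```

Here a 3000-second run with a 20% margin becomes a one-hour request. The right margin depends on how much your run times vary, so treat 20% as a starting point only.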

Using shared resources responsibly


  • Be careful how you use the login node.
  • Your data on the system is your responsibility.
  • Plan and test large data transfers.
  • It is often best to convert many files to a single archive file before transferring.
  • Again, don’t run stuff on the login node.
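
The advice about archives can be sketched with tar. The example below creates twenty small files in a temporary directory, bundles them into one compressed archive, and counts the entries without extracting them. All paths and file names are made up for the illustration.

```shell
# Create a directory containing many small files.
work=$(mktemp -d)
mkdir "$work/results"
i=1
while [ "$i" -le 20 ]; do
    echo "sample $i" > "$work/results/run_$i.txt"
    i=$((i + 1))
done

# Bundle the whole directory into a single gzip-compressed archive.
tar -czf "$work/results.tar.gz" -C "$work" results

# Count the .txt files in the archive without extracting it.
count=$(tar -tzf "$work/results.tar.gz" | grep -c '\.txt$')
echo "Archive holds $count files"
```

The single `results.tar.gz` file can then be sent with scp or rsync in one transfer and unpacked on the other side with `tar -xzf results.tar.gz`.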