10  The Slurm Scheduler

Author

Sean Taylor, Marc Carlson, Glenn Morton, Lindsay Clark, Neerja Katiyar

Published

May 7, 2026

10.1 Access

See Chapter 1 for instructions on getting access to the HPC. You will also want to request an association in order to get access to storage space and compute resources.

  • The node you connect to when you first log in to the cluster is only a login node; it is not where your compute work should run.

  • When running compute-intensive tasks on the Sasquatch HPC, it’s crucial to use the worker nodes instead of the login node. You should always launch your jobs using srun for interactive sessions or sbatch for batch scripts.

  • The login node is a shared resource for all users to manage files, compile code, and submit jobs. If you run resource-intensive work on this node, you can slow down or freeze the system, preventing others from performing essential tasks. In such cases, your processes may be terminated by an administrator to restore normal operations.
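The sbatch workflow described above can be sketched as follows. This is a minimal illustration, not Sasquatch's recommended configuration: the file name `hello_job.sh`, the job name, and every resource value are assumptions chosen for the example. Partition and account options are left out because they are site-specific. The script is only submitted if `sbatch` is available on the current machine.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal batch script, then submit it only if sbatch is on PATH.
# The file name, job name, and resource values are illustrative assumptions.
set -euo pipefail

job_script="hello_job.sh"   # hypothetical file name

cat > "$job_script" <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=hello          # hypothetical job name
#SBATCH --cpus-per-task=1         # one CPU core
#SBATCH --mem=1G                  # 1 GB of RAM
#SBATCH --time=00:05:00           # five-minute wall-clock limit
#SBATCH --output=hello_%j.out     # %j expands to the job ID

# This command runs on a worker node, not on the login node.
hostname
EOF

if command -v sbatch >/dev/null 2>&1; then
    # On the cluster: hand the script to the scheduler.
    sbatch "$job_script"
else
    # Off the cluster: just report that the script was written.
    echo "sbatch not found; wrote $job_script without submitting"
fi
```

For interactive work, a comparable request would be `srun --cpus-per-task=1 --mem=1G --time=00:30:00 --pty bash`, which opens a shell on a worker node with the same kind of limits (again, the values are placeholders).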

10.2 Using Slurm on Sasquatch

10.3 Common Slurm Terms

| Term | Description |
| --- | --- |
| partition | Collection of computers (nodes), usually grouped by similar architectural properties (CPUs/GPUs). |
| account | Collection of users. Used for permitting access to parts of the system. |
| node | A computer in the cluster. Physical hardware at the data center. |
| job | Collection of steps that executes on the cluster, often just a configuration step and an execution step. |
| step | A subdivision within a job. A set of tasks that are executed together, in parallel or in series. |
| task | The smallest unit of execution in Slurm. Tasks are typically associated with a specific number of CPU cores. |
| cpu | A physical processor capable of doing a task. Within some tools, may be referenced as a core or thread. Be sure to read the documentation of your tool to determine specifics. |
| gpu | A powerful processor optimized for floating point operations that is typically useful for machine learning and other bespoke pipelines/algorithms. Be sure to read the documentation of the tools you are using before you ask for one of these limited resources. |
| mem | Memory (RAM). Used to allocate working resources for your task. |
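To connect these terms to what you actually type, the sketch below writes a batch script in which each `#SBATCH` option corresponds to one of the terms above. `PARTITION_NAME` and `ACCOUNT_NAME` are placeholders, not real Sasquatch names, and the resource numbers are assumptions for illustration only. The script is written to a hypothetical file `terms_job.sh` and is not submitted.

```shell
#!/usr/bin/env bash
# Sketch: how the glossary terms map onto sbatch options.
# PARTITION_NAME and ACCOUNT_NAME are placeholders; substitute your site's values.
set -euo pipefail

cat > terms_job.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --partition=PARTITION_NAME   # partition: which group of nodes to use
#SBATCH --account=ACCOUNT_NAME       # account: the group your usage is charged to
#SBATCH --nodes=1                    # node: number of machines
#SBATCH --ntasks=2                   # task: number of tasks in the job
#SBATCH --cpus-per-task=2            # cpu: cores given to each task
#SBATCH --mem=4G                     # mem: RAM per node
#SBATCH --time=00:10:00              # wall-clock limit for the whole job

# Each srun call inside a job launches one step.
srun --ntasks=1 echo "step 0: setup"   # a single-task step
srun hostname                          # step 1: runs once per task
EOF

# Count the scheduler directives that were written.
grep -c '^#SBATCH' terms_job.sh
```

A GPU request is typically expressed with an additional directive such as `#SBATCH --gres=gpu:1`; whether and how GPUs are exposed on a given partition is site-specific, so check before requesting one.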

10.4 HPC Architecture

An overall architecture diagram of Sasquatch can be found in Section 5.2.

Information on the different clusters for Posit can be found in Chapter 22.

10.5 Learning resources: Slurm