11  Submitting Jobs

Author

Sean Taylor, Marc Carlson, Glenn Morton, Lindsay Clark, Neerja Katiyar

Published

May 7, 2026

11.1 Slurm jobs

11.1.1 Choosing a job type

To get work done, a user submits a “job” to the scheduler. Slurm offers different commands for running jobs, each designed for a specific purpose.

11.1.1.1 sbatch (Batch jobs)

  • This is the standard method for submitting jobs.
  • You create a text file called a batch script that contains all the Slurm directives and shell commands for your job.
  • sbatch submits this script to the job queue, where it will wait for the requested resources to become available before running.
  • This method is ideal for long-running, non-interactive tasks, as you can disconnect from the cluster after submitting the job.

The basic syntax for sbatch is as follows:

sbatch --partition=<partition_name> --account=<account_name> \
  --nodes=1 --ntasks=1 --cpus-per-task=<num_cpus> --mem-per-cpu=<memory> \
  --gres=gpu:<num_gpus> --time=<wall_time> myscript

11.1.1.2 srun (Interactive jobs)

  • srun runs interactively. It is used to execute a single command or launch a parallel program directly on a compute node.

  • You can use srun to launch an interactive shell on a worker node.

  • This is also a great way to test a specific command or launch a smaller-scale task without needing a full shell session.

The basic syntax for an interactive job using srun is as follows:

srun --partition=<partition_name> --account=<account_name> \
  --nodes=1 --ntasks=1 --cpus-per-task=<num_cpus> --mem-per-cpu=<memory> \
  --time=<wall_time> --pty /bin/bash
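
For example, a short interactive session on the free test partition might look like this (the memory value is just illustrative; adjust it to your needs):

# Start a 15-minute interactive shell on the free test partition
srun --partition=cpu-test-sponsored --account=cpu-test-sponsored \
  --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=4G \
  --time=15:00 --pty /bin/bash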

For more information on intermediate and advanced uses of these and other Slurm commands, please refer to Additional SLURM Documentation.

11.2 Mandatory directives

These directives are mandatory and must be included with every job on our system:

  • --partition: Specify the partition. See Section 11.2.1 for more info.
  • --account: Specify the account code. See Section 11.2.2 for more info.
  • --time: Maximum amount of time the job can run. See Section 11.2.3 for more info.

11.2.1 --partition

Each partition represents a collection of worker nodes that are set up in a certain way. More abstractly, you can think of a partition as just a pool of resources that your job is ‘lined up’ to get access to. Once that pool of resources has a spot for your job to run, the scheduler will process your job using that pool of resources.
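
If you want to check which partitions exist and what their limits are directly from Slurm, sinfo can report this (the output format string below is just one possible choice):

# List each partition with its availability, time limit, and node count
sinfo -o "%P %a %l %D"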

These are the “general” partitions that anyone can use:

Partitions on the cluster

Partition            Description                                                               Max time
cpu-test-sponsored   Limited partition that costs nothing to use. For testing CPU jobs only.   15 minutes
gpu-test-sponsored   Limited partition that costs nothing to use. For testing GPU jobs only.   15 minutes
cpu-core-sponsored   Used for most jobs. Can request the maximum cores and memory available.   Up to 14 days
gpu-core-sponsored   Used for GPU jobs. Can request the maximum cores and memory available.    Up to 14 days

11.2.2 --account

The account is used to link your HPC jobs to a specific association or group, which helps track and govern resource usage. Every user is a member of the core association and has access to these public accounts:

  • cpu-core-sponsored for CPU-based jobs

  • gpu-core-sponsored for GPU-based jobs.

When you become a member of a private association, you gain access to additional accounts that unlock more resources.

The easiest way to list accounts and partitions available to you is through this command:

sshare -o "Account%40,Partition%40"

Your output of sshare probably looks something like this, where “mylab” is the name of an association that you belong to:

Account                         Partition
-------------------- --------------------
cpu-core-sponsored     cpu-core-sponsored
cpu-mylab-sponsored    cpu-core-sponsored
gpu-core-sponsored     gpu-core-sponsored
gpu-mylab-sponsored    gpu-core-sponsored
cpu-test-sponsored     cpu-test-sponsored
gpu-test-sponsored     gpu-test-sponsored

11.2.3 --time

Sets the maximum wall-clock time your job is allowed to run. Accepted formats include MM, MM:SS, HH:MM:SS, DD-HH, DD-HH:MM, and DD-HH:MM:SS. If your job exceeds this limit, Slurm will terminate it automatically. The maximum job time on Sasquatch is 14 days.
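
A few quick examples (the values are illustrative):

--time=30:00        # 30 minutes
--time=12:00:00     # 12 hours
--time=2-00:00:00   # 2 days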

See Chapter 3 for examples on formatting time.

11.3 Resource Directives

Properly requesting resources is a key best practice: it ensures your job runs efficiently and doesn’t tie up resources that other users need. CPUs, GPUs, and memory can be requested either statically or dynamically.

11.3.1 CPUs, GPUs, and Mem

Command Example Description
--cpus --cpus=1 Requests a specific number of CPUs. (Static)
--cpus-per-task --cpus-per-task=1 Specifies the number of CPU cores required for each task in your job. (Dynamic)
--cpus-per-gpu --cpus-per-gpu=32 Specifies the number of CPU cores required for each GPU in your job. (Dynamic)
--gpus --gpus=1 Requests a specific number of GPUs. (Static)
--gpus-per-task --gpus-per-task=1 Specifies the number of GPUs required for each task in your job. (Dynamic)
--mem --mem=15G Specifies the amount of memory required. (Static)
--mem-per-cpu --mem-per-cpu=7.5G Specifies the amount of memory per CPU core. (Dynamic)
--mem-per-gpu --mem-per-gpu=480G Specifies the amount of memory per GPU. (Dynamic)
--ntasks --ntasks=1 Specifies the total number of tasks. For most jobs, a single task is sufficient. (Dynamic)
--nodes --nodes=1 Specifies the number of nodes you need. For most jobs, a single node is sufficient. (Static)
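
To see how the “dynamic” options scale, consider this hypothetical request (the script name and values are illustrative): 4 tasks with 2 CPUs each yields 4 × 2 = 8 CPUs, and 8 CPUs at 4G per CPU yields 32G of memory in total.

# Hypothetical request: 4 tasks x 2 CPUs each = 8 CPUs total,
# and 8 CPUs x 4G per CPU = 32G of memory total.
sbatch --partition=cpu-core-sponsored --account=cpu-core-sponsored \
  --nodes=1 --ntasks=4 --cpus-per-task=2 --mem-per-cpu=4G \
  --time=1:00:00 myscript.sh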

11.3.2 Default resources

If you do not specify hardware resources, your job will default to the following:

  • --nodes=1

  • --ntasks=1

  • --cpus=1

  • --mem=unlimited

The memory request is the equivalent of an entire node, so unless the cluster has some idle nodes, this job is unlikely to run in a timely manner.

Always request a reasonable amount of resources for your job. Requesting too little may cause your job to fail (e.g., with an OUT_OF_MEMORY error), while requesting too much can cause other users’ jobs to wait unnecessarily. You should monitor your job’s resource usage to help you tune your requests. The seff command is great for this!
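
For example, after a job completes (the job ID below is a placeholder):

# Summarize CPU and memory efficiency for a completed job
seff <jobid>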

11.4 Other useful and important directives

While none of these are mandatory, they are highly recommended for managing your jobs effectively.

--job-name: A descriptive name for your job, which makes it easier to identify in the job queue.

--output (-o): Specifies the file where your job’s standard output (e.g., print statements) will be written.

--error (-e): Specifies the file where your job’s standard error will be written. This is critical for debugging.

--mail-type: Specifies which events should trigger an email notification. Common values include BEGIN, END, FAIL, or ALL; for example, --mail-type=BEGIN,END,FAIL.

--mail-user: The email address to send notifications to.

For additional details on all parameters and directives, see the SLURM Manual Pages on Commands.

Option Value/Format Description
-p, --partition <string> Name of the queue on which to run the job.
-A, --account <string> Project code
-N, --nodes <integer> Number of compute nodes to reserve. Set greater than 1 only for tools that can use multiple nodes.
-n, --ntasks <integer> Number of tasks to run. The default is one task per node.
-c, --cpus-per-task <integer> Number of CPUs to use per task. Without this option, the controller will use one CPU per task.
--cpus-per-gpu <integer> Number of CPUs to use per GPU. This is useful for maintaining the ideal GPU/CPU ratio.
--mem-per-cpu <int>G, <int>M Memory to reserve per CPU per node. Ideal for scaling your job while maintaining the proper CPU/memory ratio.
--mem-per-gpu <int>G, <int>M Memory to reserve per GPU per node. Ideal for scaling your job while maintaining the proper GPU/CPU/memory ratio.
--mem <int>G, <int>M Memory to reserve per node. We recommend against this option; the per-CPU and per-GPU options scale more easily.
-t, --time HH:MM:SS, MM, MM:SS, D-HH, D-HH:MM, D-HH:MM:SS Time limit, after which the job will be terminated if it hasn't finished.
-J, --job-name <string> Arbitrary name for job, to make it more identifiable to you.
-D, --chdir /path/to/dir/ Path to working directory for script, and where log files will be written.
--mail-type NONE, BEGIN, END, FAIL, REQUEUE, ALL, INVALID_DEPEND, STAGE_OUT, TIME_LIMIT, TIME_LIMIT_90, TIME_LIMIT_80, TIME_LIMIT_50, ARRAY_TASKS Types of email to send. ALL is a good value to use.
--mail-user Mickey.Mouse@company.org Your email address.
-d, --dependency afterok:<jobnum> Do not start job until job <jobnum> has completed successfully.
-a, --array 1-12%4 Run an array job. The example shows twelve array jobs being run, with a maximum of four running at once.
-e, --error myjobdescription-%j.err Send errors to a file pattern specified (with %j for job number or %A_%a for array jobs) instead of the standard output file.
-o, --output myjobdescription-%j.out Sends standard output to the file pattern specified.
--gpus-per-task <integer> Number of GPUs to use per task (only for tools that use GPUs!)
-G, --gpus <integer> Number of GPUs required for the job (only for tools that use GPUs!)
--pty /bin/bash Informs Slurm to launch an interactive pseudo-terminal using the specified shell. This option is necessary when running an interactive job. /bin/bash should go at the very end of the command. This option is only available with srun.

An example submission with custom resources:

sbatch --partition=cpu-core-sponsored --account=cpu-mylab-sponsored \
  --nodes=1 --ntasks=1 --cpus-per-task=4 --mem-per-cpu=7.5G \
  --time=0-0:10:00 \
  --job-name=my_rna_script \
  --chdir={{< var path.home >}}/mmouse/logs/ \
  --mail-user={{< var path.exampleemail >}} \
  --mail-type=ALL \
  run_rnaseq_master.sh
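
The scheduling options from the table above compose with the rest of these directives. For instance, here is a hypothetical follow-up job (the job number and script name are placeholders) that starts only after job 123456 completes successfully:

# Run postprocess.sh only if job 123456 finishes successfully
sbatch --partition=cpu-core-sponsored --account=cpu-core-sponsored \
  --time=1:00:00 --dependency=afterok:123456 postprocess.sh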

11.5 Slurm Batch Scripts

As an alternative to command-line flags, Slurm directives can be specified inside the script itself, turning it into a Slurm batch script.

11.5.1 Example batch script

A Slurm batch script is a Bash script that contains directives for the Slurm system. Here is an example of a basic batch script, which we’ll save as my_cpu_job.sh. This script runs a Python script called my_script.py.

#!/bin/bash
#SBATCH --job-name=my_cpu_job
#SBATCH --partition=cpu-core-sponsored
#SBATCH --account=cpu-core-sponsored
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=7.5G
#SBATCH --time=0-04:00:00
#SBATCH --output=my_cpu_job.out
#SBATCH --error=my_cpu_job.err
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user={{< var path.exampleemail >}}

# ^ Resource request
#------------------------------------------------------------------------------
# v Execution script


# Activate your conda environment
source ~/.bashrc
conda activate my_env

# Run your Python script
python my_script.py

To execute this, run:

sbatch my_cpu_job.sh
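
Once submitted, sbatch prints the new job’s ID, and you can watch its progress through the queue with squeue:

# List only your own jobs in the queue
squeue -u $USER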

More example scripts can be found on Sasquatch in the /data/hps/assoc/public/core/slurm-batch-examples/ directory.

11.6 Scheduled jobs

11.6.1 Learning about what compute resources are present

  • To list the nodes available to Slurm for computing:
sinfo -o "%30N %6c %15m %30G"
  • To get a graphical window with information about available resources and running jobs (note that you will need X11 forwarding, either by using ssh -X to log in to Sasquatch, or by enabling X11 forwarding in MobaXterm):
sview

11.6.2 Crontabs

Sometimes you may want a job to run at a regular time interval. If you need to do this on the cluster, it’s worth asking yourself a key question first:

Do I really want to do this on a shared resource?

You should ask this because, while Slurm does have facilities for scheduling jobs, the cluster is definitely a shared resource. It can and will be busy at times, the level of system load varies from day to day, and we have to allow users fairly long walltimes. All of these things together can interfere with whether your job kicks off every time as specified.

If after considering these things you still want to schedule a job, you can do so, but you should not use the classic crontab. Crontab records may get wiped out whenever the login nodes need to be reset, which will lead to frustration if you are depending on them. Instead, use scrontab, which is described in the official docs here and given some context here. scrontab is quite similar to crontab, except that it uses the Slurm scheduler to schedule your jobs. This is both more appropriate for a shared resource and avoids the problem of losing all your crontab records whenever the login nodes get reset.
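
As a sketch of what this looks like (the schedule, directives, and script path below are hypothetical), you edit your scrontab with scrontab -e and pair #SCRON directive lines with standard cron schedule lines:

# Open your scrontab for editing (analogous to crontab -e)
scrontab -e

# Example entry: run a script every day at 02:00 with these job options
#SCRON --partition=cpu-core-sponsored --account=cpu-core-sponsored --time=1:00:00
0 2 * * * /path/to/my_nightly_script.sh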

11.7 A note on tasks

One of the most confusing new concepts introduced with Slurm is the idea of “tasks”. Now that we know what scripts look like, we can look at some concrete examples of what this means in practice. For these examples, I will leave out a lot of details so you can see just the relevant parts.

Here is an example where we say that we want ONE task, but then tell the computer to do TWO things.

#!/bin/bash
#SBATCH --ntasks=1

srun sleep 10 & 
srun sleep 12 &
wait

In this case the computer will do these two things as a series of events.

In the above example we have forced this inefficiency by telling Slurm that we only want it to do one task, and then actually giving it two things to do. Slurm responds in this case by doing those two things one after the other.

And here is an example where we say that we have TWO tasks, and then (again) ask the computer to do TWO things.

#!/bin/bash
#SBATCH --ntasks=2

srun --ntasks=1 sleep 10 & 
srun --ntasks=1 sleep 12 &
wait

In this case, the computer will also do these two tasks, but this time they are both run at the SAME time.

Please also notice that for EACH individual task in our 2nd example (each srun call), we ALSO need to tell the computer that we want those steps to be processed as just one task each. When specifying our jobs, it helps to be unambiguous like this.

Hopefully you can now see how the task is a central concept for describing your jobs to Slurm. A task here just represents a ‘unit’ of computing effort, and many of the other Slurm arguments ultimately exist to describe details related to these tasks (--cpus-per-task, --mem-per-cpu, etc.).
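
If you want to verify the difference yourself, compare the elapsed times of the two versions after they finish (the job ID is a placeholder): the one-task version should take roughly the sum of the two sleeps (about 22 seconds), while the two-task version should take about as long as the longer sleep (about 12 seconds).

# Show the elapsed time of a job and each of its steps
sacct -j <jobid> --format=JobID,JobName,Elapsed,State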