Managing Jobs
Once jobs are submitted, researchers can monitor their status using Slurm commands. Researchers can also retrieve the CPU and memory usage of completed jobs, which helps when planning future jobs. Both practices should be a regular part of using Cheaha.
Jobs that were submitted by accident, or that run incorrectly written code, can also be cancelled.
Monitoring Queued Jobs with `squeue`
Queued and currently running jobs can be monitored using the `squeue` command. The basic command to list all jobs for a specific researcher is `squeue -u $USER`.
By default, the output of `squeue` displays the following fields for each job a researcher has submitted: `jobid`, `partition`, `jobname` as `name`, BlazerID as `user`, job state as `st`, total run time as `time`, number of nodes as `nodes`, and the list of nodes as `nodelist`.
For array jobs, the JobID will be formatted as `jobid_arrayid`.
More information is available at the Official Documentation.
Cancelling Jobs with `scancel`
Cancelling queued and currently running jobs can be done using the `scancel` command. Importantly, this will only cancel jobs that were initiated by the researcher running the command. `scancel` is very flexible in how it behaves:
# cancel a single job or an entire job array
scancel <jobid>
# cancel specific job array IDs, specified as single number or a range
scancel <jobid_arrayid>
# cancel all jobs on a partition for the user
scancel -p <partition>
# cancel all jobs for a researcher
scancel -u $USER
Warning
Cancelling all jobs will also cancel the interactive jobs created on the Open OnDemand portal.
More information is available at the Official Documentation.
Reviewing Past Jobs with `sacct`
If you are planning a new set of jobs and are estimating resource requests, it is useful to review similar jobs that have already completed. To list past jobs for a researcher, use the `sacct` command. Common use cases and information are detailed below. Full details are available at the Official Documentation.
Tip
To minimize queue wait times and make best use of resources, please review job efficiency using `seff`. See our Job Efficiency page for more information.
Review Jobs by JobID
The basic form is `sacct -j <jobid>`, which lists information about that job.
You can also review multiple jobs using a comma-separated list of JobIDs, as in `sacct -j <jobid1>,<jobid2>`.
These commands output basic information such as the ID, Name, Partition, Allocated CPUs, and State for each given JobID.
Jobs can have matching `extern` and/or `batch` job entries as well. These are not especially helpful for most researchers. You can remove these entries using the `-X` flag.
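The forms described above can be wrapped in a small shell helper. This is only a sketch: the function name `job_summary` and the JobIDs in the usage comments are hypothetical placeholders, and only the `-j` and `-X` flags come from the text.

```shell
# Hypothetical helper: show accounting information for one or more jobs.
# $1 is a single JobID or a comma-separated list of JobIDs.
# -X hides the .batch and .extern entries so only the job itself is listed.
job_summary() {
    sacct -X -j "$1"
}

# Usage (placeholder JobIDs):
#   job_summary 12345678
#   job_summary 12345678,12345679
```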
Review Jobs Submitted Between Specific Timepoints
If you do not remember the JobID, you can use the `-S` and `-E` flags to retrieve jobs submitted between the given start datetime and end datetime.
For example, to retrieve jobs submitted during the month of July 2021, the command could be:
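One possible way to write the July 2021 query is sketched here. The helper name `jobs_between` is hypothetical, and the end point is set to the first of August on the assumption that a date given without a time is read as midnight at the start of that day.

```shell
# Hypothetical helper: list jobs between a start and an end time point.
# $1 and $2 use the YYYY-MM-DD[THH:MM[:SS]] time point format.
jobs_between() {
    sacct -S "$1" -E "$2"
}

# Usage for July 2021 (assumes a bare date means 00:00:00 on that day):
#   jobs_between 2021-07-01 2021-08-01
```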
Customizing the Output
You can add `-o` with a list of output fields to customize the information you see.
You may also use the format `<field>%<width>` to make columns be `<width>` characters wide. This is sometimes necessary for TRES fields and `nodelist`, among others. An example might be `alloctres%40` to make the field 40 characters wide.
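As a sketch, a command selecting the JobID, start time, end time, state, allocated CPUs, and requested memory could be wrapped as follows. The function name `job_details` and the JobID in the usage comments are hypothetical placeholders; the field names are taken from the fields table on this page.

```shell
# Hypothetical helper: timing, state, and resource information for one job.
# $1 is the JobID to review.
job_details() {
    sacct -j "$1" -o jobid,start,end,state,alloccpus,reqmem
}

# Usage (placeholder JobID):
#   job_details 12345678
# A wider column can be requested with <field>%<width>, for example:
#   sacct -j 12345678 -o jobid,alloctres%40
```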
This command will output the JobID, the start time, end time, the state, the number of allocated CPUs, and the requested memory for the specified job. All potential output fields can be seen using `sacct --helpformat`. Their descriptions can be found on the `sacct` documentation under Job Accounting Fields.
Formatting the Output
You can format the output of `sacct` using a delimiter with the flags `--parsable2` and `--delimiter=<delim>`. Any number of characters may be used as a delimiter. The default is `|`. It is not recommended to use `,` as that is used in comma-separated lists throughout `sacct` fields.
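To illustrate why delimited output is convenient for scripting, the sketch below splits one line of `--parsable2` output on the default `|` delimiter. The sample line is made up for illustration and is not real accounting data; the field order assumes `-o jobid,state,alloccpus,reqmem`.

```shell
# A real line would come from something like:
#   sacct -X -j <jobid> --parsable2 --noheader -o jobid,state,alloccpus,reqmem
line='12345678|COMPLETED|4|16Gn'   # made-up sample line

# Split the line on the delimiter into one variable per field.
IFS='|' read -r jobid state alloccpus reqmem <<EOF
$line
EOF

echo "job=$jobid state=$state cpus=$alloccpus mem=$reqmem"
```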
`sacct` Flags
Flag | Short | Description | Docs |
---|---|---|---|
FILTERING | | | |
`--user` | `-u` | Jobs from a specific user. Please only use your own BlazerID. | sacct |
`--allocations` | `-X` | Show jobs only, not steps. | sacct |
`--starttime` | `-S` | Jobs starting at a given time. See Time Formatting. | sacct |
`--endtime` | `-E` | Jobs ending at a given time. See Time Formatting. | sacct |
`--state` | `-s` | Jobs with a given state. See States. | sacct |
`--jobs` | `-j` | Show only the jobids supplied in a comma-separated list. | sacct |
FORMATTING | | | |
`--format` | `-o` | Show only the Fields supplied in a comma-separated list. | sacct |
`--helpformat` | `-e` | Show a list of available Fields. | sacct |
`--parsable2` | `-P` | Output as delimited data with `--delimiter` if supplied, default is `\|`. | sacct |
`--delimiter` | n/a | Characters to delimit field values. | sacct |
`--json` | n/a | Output as JSON. (Not yet available on Cheaha). | sacct |
`--yaml` | n/a | Output as YAML. (Not yet available on Cheaha). | sacct |
`--noconvert` | n/a | Keep uniform units, e.g. all M instead of M and G. See Units. | sacct |
A complete list of flags is available at the Official Documentation.
`sacct` Fields
Field | Description | Same As... | Job | Step | Docs |
---|---|---|---|---|---|
METADATA | | | | | |
`jobid` | Slurm assigned job ID number. | jobid format | yes | yes | sacct |
`jobname` | User assigned job name. | `--job-name` | yes | yes | sacct |
`state` | Current state of the job. | states | yes | yes | sacct |
`partition` | Partition job was submitted to. | `--partition` | yes | yes | sacct |
`ntasks` | Number of requested tasks. | `--ntasks` | yes | yes | sacct |
`nodelist` | List of nodes used. | `--nodelist` if supplied | yes | yes | sacct |
TIME | | | | | |
`submit` | Submit time as `YYYY-MM-DDTHH:MM:SS` | n/a | yes | yes | sacct |
`start` | Start time as `YYYY-MM-DDTHH:MM:SS` | n/a | yes | yes | sacct |
`end` | End time as `YYYY-MM-DDTHH:MM:SS` | n/a | yes | yes | sacct |
`elapsed` | Elapsed time as `DD-HH:MM:SS` | n/a | yes | yes | sacct |
RESOURCES REQUESTED | | | | | |
`reqcpus` | CPUs requested. | cpu calculation | yes | yes | sacct |
`reqmem` | Memory requested. Uses `10Gc` for per core, `10Gn` for per node. | `--mem-per-cpu` or `--mem` | yes | no | sacct |
`reqnodes` | Nodes requested. | `--nodes` | yes | yes | sacct |
`reqtres` | All requested resources. May be used to review GPUs. | tres explanation | yes | yes | sacct |
RESOURCES ALLOCATED | | | | | |
`alloccpus` | CPUs allocated. | cpu calculation | yes | yes | sacct |
`allocnodes` | Nodes allocated. | `--nodes` | yes | yes | sacct |
`alloctres` | All allocated resources. May be used to review GPUs. | tres explanation | yes | yes | sacct |
`averss` | Average resident set size (memory) in bytes across tasks. | resident set size | no | yes | sacct |
`maxrss` | Maximum resident set size (memory) in bytes across tasks. | resident set size | no | yes | sacct |
A complete list of fields is available at the Official Documentation.
Slurm Common Reference
Slurm JobID Formatting
JobID numbers are assigned automatically by the scheduler in the order submissions are received. All jobs have a single, unique JobID number associated with them. Some features will cause JobID numbers to be reported differently than their actual value.
- For non-array jobs submitted with `sbatch`, `salloc`, or with `srun` outside of a job context, the unique JobID number is reported directly.
- For array jobs submitted with `sbatch`, the array is assigned a master ID like `12345678`, and each task is reported as `<master-job-id>_<task-id>`. An example might be `12345678_987`. Each task still has a unique JobID number.
- For job steps submitted with `srun` inside of a job context, the JobID is reported as `<job-id>.<task-name>`. All jobs submitted generate a `.batch` step and a `.extern` step. An example might be `12345678.batch`.
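The three reporting formats above can be told apart with simple pattern matching. The following is a minimal sketch; the function name `describe_jobid` is hypothetical, and it assumes IDs follow exactly the `<id>`, `<id>_<task>`, and `<id>.<step>` shapes described in the list.

```shell
# Hypothetical helper: explain how a reported JobID is structured.
describe_jobid() {
    id="$1"
    step=""
    # A dot separates the job from a step name such as batch or extern.
    case "$id" in
        *.*) step="${id#*.}"; id="${id%%.*}" ;;
    esac
    # An underscore separates the array master ID from the task ID.
    case "$id" in
        *_*) echo "array job ${id%%_*}, task ${id#*_}${step:+, step $step}" ;;
        *)   echo "job $id${step:+, step $step}" ;;
    esac
}

describe_jobid 12345678         # prints: job 12345678
describe_jobid 12345678_987     # prints: array job 12345678, task 987
describe_jobid 12345678.batch   # prints: job 12345678, step batch
```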
Slurm Time Formatting
Slurm formats time in two different ways: (1) time points and (2) durations. Time points are used whenever a single point in time is needed, such as the start or end of a job. Durations are needed for job requests and reported for elapsed times.
Units are given shorthand designations:
- `YYYY` four-digit year.
- `MM` two-digit month or two-digit minutes, depending on placement.
- `DD` two-digit day.
- `HH` two-digit hour.
- `SS` two-digit seconds.
- `AM|PM` literally AM or PM.
Square brackets `[]` indicate the contents are optional.
Time points may be formatted like any of the following.
HH:MM[:SS][AM|PM]
MMDD[YY][-HH:MM[:SS]]
MM.DD[.YY][-HH:MM[:SS]]
MM/DD[/YY][-HH:MM[:SS]]
YYYY-MM-DD[THH:MM[:SS]]
Duration requests are made like any of the following.
Durations are reported like the following.
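For reference, the Slurm documentation lists the accepted duration request forms as `minutes`, `minutes:seconds`, `hours:minutes:seconds`, `days-hours`, `days-hours:minutes`, and `days-hours:minutes:seconds`. Reported durations, such as the `elapsed` field, are assumed below to look like `[DD-]HH:MM:SS`. Under that assumption, a reported duration can be converted to seconds, which makes it easier to compare against a request. The helper names `dec` and `elapsed_to_seconds` are hypothetical.

```shell
# Strip one leading zero so two-digit fields such as 08 are not parsed as octal.
dec() {
    v="${1#0}"
    echo "${v:-0}"
}

# Convert a reported duration of the assumed form [DD-]HH:MM:SS to seconds.
elapsed_to_seconds() {
    t="$1"
    days=0
    # Split off the day count when a dash is present.
    case "$t" in
        *-*) days="${t%%-*}"; t="${t#*-}" ;;
    esac
    IFS=: read -r h m s <<EOF
$t
EOF
    echo $(( $(dec "$days") * 86400 + $(dec "$h") * 3600 + $(dec "$m") * 60 + $(dec "$s") ))
}

elapsed_to_seconds 1-02:03:04   # prints 93784
```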
Slurm States
Job states report on where the job is in the overall Slurm process. If all goes well, you will see jobs move through the following states:
- `PENDING`
- `RUNNING`
- A terminal state depending on what happens
    - `COMPLETED` if the job finished normally and returns exit code zero
    - `CANCELLED` if the researcher cancels the job
    - `FAILED` if there is a software error or non-zero exit code
    - `TIMEOUT` if the job had insufficient time
Other states are possible. A complete list of job states is available at the Official Documentation.
Slurm Units
Slurm uses flexible units for memory to keep reports compact. It always prefers the shortest possible representation, and will choose the largest units by default. Other units may be used, and there are flags to allow reporting in uniform units.
The memory units are `KMGT` for `kilo`, `mega`, `giga`, `tera` respectively. All are in bytes. Slurm uses the convention that each unit is 1024 times the previous one, e.g. 1 M is 1024 K.
TRES Explained
The abbreviation `TRES` stands for "trackable resources". Any resource made available by Slurm that is trackable is recorded in the Slurm database and can be recovered using `sacct`. The fields `reqtres` and `alloctres` can be used to review CPUs, memory, nodes and GPUs. The data is stored as a comma-separated list of `<resource>=<quantity>` pairs, and all values are totals across the entire job, not per node or per task. An example might look like:
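The sketch below uses a made-up TRES string to show how the `<resource>=<quantity>` pairs can be picked apart in a script. The string is illustrative only, not output from a real job, and the function name `tres_get` is hypothetical.

```shell
# Made-up example of a TRES value; real values come from reqtres or alloctres.
tres="billing=4,cpu=4,gres/gpu=1,mem=16G,node=1"

# Hypothetical helper: print the quantity recorded for one resource.
# $1 is the TRES string, $2 is the resource name to look up.
tres_get() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS='=' read -r name qty; do
        if [ "$name" = "$2" ]; then
            echo "$qty"
        fi
    done
}

tres_get "$tres" gres/gpu   # prints 1
tres_get "$tres" mem        # prints 16G
```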
RSS Explained
The abbreviation `RSS` stands for "resident set size", and is related to memory usage by jobs in Slurm. Memory usage is challenging to record accurately. Recording memory means a request must be made to the operating system to obtain memory usage at a single point in time, which uses computational resources. There is a balance made between resolution in time, and computational overhead.
The difficulty with recording memory usage contributes to difficulty diagnosing root causes of out-of-memory errors, bus errors, and segmentation faults.
RSS is recorded by Slurm in the `sacct` fields `averss` and `maxrss`. These values are both reported in bytes, rather than the usual compact memory units.
Slurm Resource Calculations
Calculating CPUs
Example:
For a job with `--cpus-per-task=16 --ntasks=2 --nodes=3`:
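A minimal sketch of the arithmetic, assuming the total CPU count is `--cpus-per-task` multiplied by `--ntasks`, and that `--nodes` only affects how those CPUs are spread out. The variable names are illustrative.

```shell
cpus_per_task=16
ntasks=2
nodes=3   # assumed not to change the total CPU count

total_cpus=$(( cpus_per_task * ntasks ))
echo "$total_cpus"   # prints 32
```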
Calculating Memory
Examples:
For a job with `--mem=40G --nodes=2`:
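Since `--mem` is a per-node request, the job total can be sketched as the request multiplied by the node count. The variable names are illustrative.

```shell
mem_per_node_gb=40
nodes=2

total_mem_gb=$(( mem_per_node_gb * nodes ))
echo "${total_mem_gb}G"   # prints 80G
```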
For a job with `--mem-per-cpu=10G --cpus-per-task=8 --ntasks=2 --nodes=2`:
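With `--mem-per-cpu`, the total is assumed to scale with the number of CPUs rather than the number of nodes, so the sketch multiplies the per-CPU request by `--cpus-per-task` and `--ntasks`. The variable names are illustrative.

```shell
mem_per_cpu_gb=10
cpus_per_task=8
ntasks=2

total_cpus=$(( cpus_per_task * ntasks ))
total_mem_gb=$(( mem_per_cpu_gb * total_cpus ))
echo "${total_mem_gb}G"   # prints 160G
```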