sacct

Report job accounting information

TLDR

Display job ID, job name, partition, account, number of allocated CPUs, job state, and job exit code for recent jobs (by default, those since midnight of the current day)

$ sacct

Display job ID, job state, and job exit code for recent jobs
$ sacct --brief

Display the allocations of a job
$ sacct --jobs [job_id] --allocations

Display elapsed time, job name, number of requested CPUs, and memory requested of a job
$ sacct --jobs [job_id] --format Elapsed,JobName,ReqCPUS,ReqMem

Display recent jobs that occurred from one week ago up to the present day
$ sacct --starttime $(date --date "1 week ago" +'%F')

Output a larger number of characters for an attribute
$ sacct --format JobID,JobName%100

SYNOPSIS

sacct [OPTIONS]

PARAMETERS

-S <time>, --starttime=<time>
    Select jobs or steps that started or ended after the specified time. Accepted formats include YYYY-MM-DD[THH:MM[:SS]], keywords such as 'now', 'today', and 'midnight', and offsets from now such as 'now-1hour'.

-E <time>, --endtime=<time>
    Select jobs or steps that started or ended before the specified time.

-j <job_list>, --jobs=<job_list>
    Display information for a comma-separated list of specific job IDs or job ID ranges.

-u <user_list>, --users=<user_list>
    Display information for a comma-separated list of user names or IDs.

-s <state_list>, --state=<state_list>
    Display jobs or steps matching the specified job states (e.g., COMPLETED, FAILED, RUNNING).

-o <format>, --format=<format>
    Specify a comma-separated list of fields to display in the output (e.g., JobID,JobName,State,CPUTime).

-D, --duplicates
    If job IDs have been reset and reused, return every record that matches the selection criteria instead of only the most recent job with a given ID.

-X, --allocations
    Only display statistics for the job allocation itself, suppressing the individual job steps.

-l, --long
    Display more fields than the default output format.

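-p, --parsable
    Output in a machine-readable format with fields delimited by '|', including a trailing '|' at the end of each line.

-P, --parsable2
    Same as --parsable, but without the trailing '|'.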

-a, --allusers
    Display accounting records for all users. Depending on the site configuration (for example the PrivateData setting), unprivileged users may still see only their own jobs.

-M <cluster_list>, --clusters=<cluster_list>
    Specify a comma-separated list of cluster names to query.
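
The '|'-delimited output produced by -P is convenient to post-process with standard tools. The following is a minimal sketch, not part of sacct itself: the function name count_by_state and the sample records are hypothetical, and it assumes the first line of input is a header containing a column named State.

```shell
#!/bin/sh
# Count jobs per state from '|'-delimited sacct output read on stdin.
# Assumption: line 1 is a header and one of its columns is called "State".
count_by_state() {
    awk -F'|' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "State") col = i; next }
        col     { count[$col]++ }
        END     { for (s in count) print s, count[s] }
    ' | sort
}

# Hypothetical sample standing in for real output. On a cluster, the
# input would come from something like:
#   sacct -X -P -o JobID,JobName,State | count_by_state
sample='JobID|JobName|State
101|train|COMPLETED
102|eval|FAILED
103|train|COMPLETED'

printf '%s\n' "$sample" | count_by_state
# prints:
#   COMPLETED 2
#   FAILED 1
```

Locating the State column by header name rather than by position keeps the sketch working when the field list passed to -o changes.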

DESCRIPTION

sacct is a command-line utility in the Slurm Workload Manager suite. It queries and displays accounting information for jobs and job steps, including those that have completed, failed, or been canceled. Unlike squeue, which shows only jobs currently known to the scheduler, sacct retrieves records from the Slurm accounting storage (typically a database managed by slurmdbd), so it can report on jobs long after they finish. Results can be filtered by job ID, user, time range, job state, and cluster. The records cover resource usage (CPU, memory, GPU), exit codes, and submission, start, and end times, which makes sacct useful for auditing, performance analysis, and resource management in a Slurm-managed HPC environment. Output fields are chosen with the -o, --format option, so the report can be tailored to a specific analysis.

CAVEATS

sacct relies on Slurm accounting (slurmdbd) being configured and running to store historical data. Data retention policies, set by administrators, dictate how long job records are kept, meaning older data might not be available. Queries over very large datasets or long time ranges can be slow. Users typically only have permissions to view their own job records unless granted administrative privileges.

COMMON OUTPUT FIELDS AND USAGE EXAMPLES

When using the -o, --format option, sacct allows specifying a wide range of fields to display. Common fields include JobID, JobName, State, ExitCode, User, Start, End, Elapsed, CPUTime, MaxRSS (maximum resident set size), AllocNodes, and AllocCPUS.

For example, to display the Job ID, Job Name, State, Exit Code, CPU Time, and Max RSS for a specific job:
sacct -j 12345 -o JobID,JobName,State,ExitCode,CPUTime,MaxRSS

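To see all jobs for user 'jdoe' that were in the COMPLETED state yesterday (the window is built with GNU date, as in the TLDR examples):
sacct -u jdoe -s COMPLETED -S $(date -d yesterday +%F) -E $(date -d today +%F)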
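
Time fields such as Elapsed are printed as clock-style strings, which are awkward to sum or compare. The sketch below converts such a value to seconds. It is an illustration only: the helper name elapsed_to_seconds is hypothetical, and it assumes the value has the form [DD-]HH:MM:SS or MM:SS with whole seconds.

```shell
#!/bin/sh
# Convert an Elapsed-style value to a number of seconds.
# Assumption: input looks like [DD-]HH:MM:SS or MM:SS (no fractions).
elapsed_to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        # Split off an optional leading day count ("2-03:04:05").
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3)      secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else             secs = p[1]
        print days * 86400 + secs
    }'
}

elapsed_to_seconds "01:30:00"     # prints 5400
elapsed_to_seconds "2-00:00:10"   # prints 172810
```

On a cluster, the argument would typically come from a query such as sacct -j 12345 -X -n -P -o Elapsed, where -n suppresses the header line.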

HISTORY

Slurm (originally the Simple Linux Utility for Resource Management) began at Lawrence Livermore National Laboratory. sacct is an integral part of its accounting system, introduced to provide the detailed historical job data that squeue does not cover. Its functionality has evolved alongside Slurm itself, with improvements to query capabilities, performance, and output customization, and it has become a standard tool for HPC administrators and users doing post-job analysis and chargeback.

SEE ALSO

sbatch(1), squeue(1), scancel(1), sinfo(1), sacctmgr(8), slurmctld(8), slurmdbd(8)
