LinuxCommandLibrary

sdiag

Display Slurm controller diagnostic information

TLDR

Show all performance counters
$ sdiag -a
Reset performance counters
$ sdiag -r
Output as JSON (use --yaml for YAML)
$ sdiag -a --json
Specify cluster
$ sdiag -a -M cluster_name

SYNOPSIS

sdiag [options]

DESCRIPTION

sdiag displays diagnostic information about slurmctld, the Slurm controller daemon. It reports job counters, main and backfill scheduler cycle statistics, and counts of remote procedure calls (RPCs) by message type and by user.
This is useful for monitoring cluster health and for troubleshooting slow scheduling or a controller overloaded with requests.
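
For monitoring, a single counter can be pulled out of the plain-text output. The following is a minimal sketch, assuming the output contains `Label: value` lines such as `Server thread count:`; the helper name `sdiag_field` is hypothetical (it is not part of Slurm), and the exact labels may differ between Slurm versions.

```shell
#!/bin/sh
# sdiag_field LABEL
# Read sdiag-style text on stdin and print the value that follows the
# first line starting with "LABEL:". Hypothetical helper, not part of Slurm.
sdiag_field() {
    awk -v label="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)                # drop leading indentation
            prefix = label ":"
            if (index(line, prefix) == 1) {         # line starts with "LABEL:"
                value = substr(line, length(prefix) + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)  # trim surrounding whitespace
                print value
                exit
            }
        }'
}

# Query the live controller only when sdiag is actually installed.
if command -v sdiag >/dev/null 2>&1; then
    sdiag | sdiag_field "Server thread count"
fi
```

The label is matched as a literal prefix rather than as a regular expression, so labels that contain parentheses or dots need no escaping.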

PARAMETERS

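-a, --all
Show all performance counters (this is the default behavior)
-r, --reset
Reset the scheduler and RPC counters to zero (Slurm operators and administrators only)
--json, --yaml
Print the output as JSON or YAML instead of plain text
-M, --cluster=name
Query the controller of the named cluster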

CAVEATS

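Requires appropriate permissions to access Slurm controller data. The counters are kept by slurmctld, so resetting them with -r zeroes them for every user, and the reset itself is limited to Slurm operators and administrators.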
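
Because a reset discards the accumulated counters for everyone, it can help to save them first. The following is a minimal sketch, assuming a writable target directory; the function name `snapshot_then_reset` and the `SDIAG_CMD` override variable are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# snapshot_then_reset [DIR]
# Save the current counters to a timestamped file in DIR (default: the
# current directory), then reset them. Prints the snapshot path on success.
# SDIAG_CMD is a hypothetical override so a stand-in command can be used
# for testing; it defaults to the real sdiag.
snapshot_then_reset() {
    dir=${1:-.}
    cmd=${SDIAG_CMD:-sdiag}
    file="$dir/sdiag-$(date +%Y%m%dT%H%M%S).txt"

    # Reset only if the snapshot was written successfully.
    if "$cmd" -a > "$file"; then
        "$cmd" -r >/dev/null && printf '%s\n' "$file"
    else
        rm -f "$file"   # do not leave an empty or partial snapshot behind
        return 1
    fi
}
```

An invocation such as `snapshot_then_reset /var/tmp` (the path is only an example) would leave one text file per reset, which can later be compared to see how the counters developed between resets.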

HISTORY

Part of the Slurm workload manager, which ships it as a diagnostic tool for cluster administrators.

SEE ALSO

scontrol(1), sinfo(1), squeue(1)

Kai