LinuxCommandLibrary

hyperfine

Benchmark command-line programs

TLDR

Run a basic benchmark, performing at least 10 runs

$ hyperfine '[make]'

Run a comparative benchmark
$ hyperfine '[make target1]' '[make target2]'

Change minimum number of benchmarking runs
$ hyperfine [[-m|--min-runs]] [7] '[make]'

Perform benchmark with warmup
$ hyperfine [[-w|--warmup]] [5] '[make]'

Run a command before each benchmark run (to clear caches, etc.)
$ hyperfine [[-p|--prepare]] '[make clean]' '[make]'

Run a benchmark where a single parameter changes for each run
$ hyperfine [[-p|--prepare]] '[make clean]' [[-P|--parameter-scan]] [num_threads] [1] [10] '[make --jobs {num_threads}]'
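The parameter scan in the last example can be pictured as a simple text substitution: the placeholder in the command template is replaced by each integer from the lower to the upper bound, and every resulting command is benchmarked on its own. The sketch below only illustrates that expansion; the function name expand_parameter_scan is hypothetical and this is not hyperfine's actual implementation.

```python
def expand_parameter_scan(template, name, start, stop):
    """Return one command string per integer value in [start, stop] (inclusive)."""
    placeholder = "{" + name + "}"
    return [template.replace(placeholder, str(value)) for value in range(start, stop + 1)]


# Mirrors: --parameter-scan num_threads 1 10 'make --jobs {num_threads}'
commands = expand_parameter_scan("make --jobs {num_threads}", "num_threads", 1, 10)
for command in commands:
    print(command)
```

Each printed line corresponds to one benchmark that hyperfine would run, with the prepare command executed before every timing run.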

SYNOPSIS

hyperfine [OPTIONS] COMMAND...
hyperfine [OPTIONS] [--runs N] [--warmup W] [--prepare PREP_CMD] [--export-json FILE] COMMAND...

PARAMETERS

--runs , -r
    Exact number of benchmark runs for each command.
If not given, hyperfine determines the number of runs automatically (at least 10).

--warmup , -w
    Number of warm-up runs for each command.
These runs are not measured and help mitigate caching effects.

--prepare , -p
    Command(s) to execute before each benchmark run.
Useful for resetting the environment or generating temporary files.

--setup , -s
    Command(s) to execute once per benchmarked command, before its series of runs starts.

--cleanup , -c
    Command(s) to execute once per benchmarked command, after all of its runs have finished.

--export-json
    Exports the benchmark results to the specified JSON file.

--export-markdown
    Exports the benchmark results to the specified Markdown file.

--export-csv
    Exports the benchmark results to the specified CSV file.

--min-runs , -m
    Minimum number of benchmark runs (default: 10).
Hyperfine performs at least this many runs and may perform more, up to the limit set by --max-runs.

--max-runs
    Maximum number of benchmark runs.

--time-limit
    Maximum total time to spend benchmarking each command.

--show-output
    Prints the stdout/stderr of the benchmarked commands.

--ignore-failure , -i
    Ignores non-zero exit codes of the benchmarked commands instead of aborting the benchmark.

--command-name
    Provides a custom name for the following command in the output.

--shell
    Specifies the shell to use for executing commands (e.g., bash, zsh).
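The file written by --export-json can be post-processed with a few lines of Python. The sketch below assumes the export has a top-level "results" array whose entries carry "command", "mean" and "stddev" fields (times in seconds); the sample data is made up for illustration, and the helper name summarize is hypothetical.

```python
import json

# Made-up sample in the assumed export layout. A real file would come from
# something like: hyperfine --export-json results.json 'make target1' 'make target2'
SAMPLE_EXPORT = """
{
  "results": [
    {"command": "make target1", "mean": 2.0, "stddev": 0.1, "min": 1.9, "max": 2.2},
    {"command": "make target2", "mean": 1.0, "stddev": 0.05, "min": 0.9, "max": 1.1}
  ]
}
"""


def summarize(export_text):
    """Return (command, mean, stddev) tuples sorted from fastest to slowest."""
    results = json.loads(export_text)["results"]
    rows = [(r["command"], r["mean"], r["stddev"]) for r in results]
    return sorted(rows, key=lambda row: row[1])


summary = summarize(SAMPLE_EXPORT)
for command, mean, stddev in summary:
    print(f"{command}: {mean:.3f} s +/- {stddev:.3f} s")
```

To use it on a real export, read the file contents and pass them to summarize in place of SAMPLE_EXPORT.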

DESCRIPTION

Hyperfine is a powerful command-line benchmarking tool designed for accurately measuring the execution time of shell commands. Unlike simpler tools like `time`, Hyperfine performs multiple runs of each command, detects outliers, and provides robust statistical analysis of the results. It includes features like warm-up runs, a mechanism for preparing the environment before each run, and cleaning up afterward, ensuring consistent and fair comparisons. The tool can export detailed results in various formats, including JSON, Markdown, and CSV, making it easy to integrate into automated workflows or for generating reports. Its emphasis on statistical rigor helps users obtain reliable performance metrics, making it ideal for optimizing scripts, comparing different implementations, or tracking performance regressions.

CAVEATS

While hyperfine offers superior accuracy compared to time, it's important to consider:

  • Measurement Overhead: For extremely short-duration commands (e.g., sub-millisecond), the overhead of hyperfine itself can become a significant portion of the measured time, potentially skewing results.
  • External Factors: Benchmark results can still be influenced by external system load, CPU frequency scaling, I/O contention, and other background processes. For highly precise measurements, consider running in an isolated environment.
  • Reproducibility: Achieving perfectly identical results across different machines or even different runs on the same machine can be challenging due to system variability. Focus on trends and statistical significance rather than absolute single-run values.

STATISTICAL ANALYSIS AND OUTPUT

For each command, hyperfine reports the mean and standard deviation of the wall-clock time, the user and system CPU time, the fastest and slowest run, and the number of runs performed. The JSON export additionally contains the median and the individual run times. When several commands are benchmarked, the summary states how many times faster the fastest command is than each of the others, together with an uncertainty estimate, which helps judge whether a difference is larger than the measurement noise.
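A relative speed with an uncertainty can be derived from the means and standard deviations of two commands. The sketch below uses first-order error propagation for a ratio of two independent quantities; that hyperfine computes its summary in exactly this way is an assumption, and the function name relative_speed is hypothetical.

```python
import math


def relative_speed(mean_slow, stddev_slow, mean_fast, stddev_fast):
    """Return (ratio, uncertainty) for how many times faster the fast command is.

    The uncertainty combines the relative errors of both means in quadrature,
    which is the usual first-order approximation for a ratio.
    """
    ratio = mean_slow / mean_fast
    relative_error = math.sqrt(
        (stddev_slow / mean_slow) ** 2 + (stddev_fast / mean_fast) ** 2
    )
    return ratio, ratio * relative_error


# Illustrative numbers: 2.0 s +/- 0.1 s versus 1.0 s +/- 0.05 s.
ratio, uncertainty = relative_speed(2.0, 0.1, 1.0, 0.05)
print(f"{ratio:.2f} +/- {uncertainty:.2f} times faster")
```

If the uncertainty is of the same order as the distance between the ratio and 1, the two commands should be treated as equally fast until more runs are collected.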

OUTLIER DETECTION

The tool automatically detects outliers in the measured execution times, which can result from transient system disturbances such as background processes or cold caches. When outliers are found, hyperfine prints a warning and suggests countermeasures such as --warmup or --prepare; the affected runs remain part of the reported statistics.
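One common way to flag outliers in a small set of timings is the modified Z-score, which is based on the median and the median absolute deviation (MAD) and is therefore not distorted by the outliers themselves. The sketch below shows that general technique; the threshold of 3.5 and the scaling constant 0.6745 follow the usual textbook convention and are assumptions, not hyperfine's exact parameters.

```python
import statistics


def find_outliers(times, threshold=3.5):
    """Return the run times whose modified Z-score exceeds the threshold."""
    median = statistics.median(times)
    # Median absolute deviation: a robust measure of spread.
    mad = statistics.median(abs(t - median) for t in times)
    if mad == 0:
        # All runs (or at least half of them) are identical; nothing to flag.
        return []
    return [t for t in times if abs(0.6745 * (t - median) / mad) > threshold]


# Illustrative run times in seconds; the last run was disturbed.
times = [1.00, 1.02, 0.99, 1.01, 1.00, 1.03, 2.50]
print(find_outliers(times))
```

Applied to the "times" array of a JSON export, this gives a quick check of which individual runs deserve a second look.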

HISTORY

hyperfine was created by David Peter and is written in Rust, leveraging its performance and safety features. It emerged as a more robust and statistically sound alternative to traditional shell time commands, addressing common pitfalls like single-run measurements, lack of warm-up, and absence of outlier detection. Its development focused on providing a user-friendly interface combined with scientific rigor, quickly gaining popularity in the developer community for reproducible and reliable benchmarking of shell scripts and executables. The first public release (0.1.0) was around September 2017.

SEE ALSO

time(1), perf(1), strace(1)
