hyperfine
Benchmark command-line programs
TLDR
Run a basic benchmark, performing at least 10 runs
Run a comparative benchmark
Change minimum number of benchmarking runs
Perform benchmark with warmup
Run a command before each benchmark run (to clear caches, etc.)
Run a benchmark where a single parameter changes for each run
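The items above map onto invocations like the following sketch; the flags are hyperfine's own, while the benchmarked commands (sleep, make, rg, ...) are placeholders to adapt to your workload:

```shell
# Basic benchmark (at least 10 runs by default):
hyperfine 'sleep 0.3'

# Comparative benchmark of two commands:
hyperfine 'grep -r TODO .' 'rg TODO'

# Change the minimum number of benchmarking runs:
hyperfine --min-runs 7 'sleep 0.3'

# Perform warm-up runs before measuring (mitigates cold-cache effects):
hyperfine --warmup 5 'make'

# Run a command before each timed run (clear caches, reset state, ...):
hyperfine --prepare 'make clean' 'make'

# Benchmark where a single parameter changes for each run:
hyperfine --parameter-scan threads 1 8 'make -j {threads}'
```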
SYNOPSIS
hyperfine [OPTIONS] COMMAND...
hyperfine [OPTIONS] [--runs N] [--warmup W] [--prepare PREP_CMD] [--export-json FILE] COMMAND...
PARAMETERS
--runs
Perform exactly this many benchmark runs for each command.
If omitted, hyperfine chooses the run count automatically (at least 10 by default; see --min-runs).
--warmup
Number of warm-up runs for each command.
These runs are not measured and help mitigate caching effects.
--prepare
Command(s) to execute before each benchmark run.
Useful for resetting the environment or generating temporary files.
--setup
Command(s) to execute once before all benchmark runs.
--cleanup
Command(s) to execute once after all benchmark runs.
--export-json
Exports the benchmark results to the specified JSON file.
--export-markdown
Exports the benchmark results to the specified Markdown file.
--export-csv
Exports the benchmark results to the specified CSV file.
--min-runs
Minimum number of benchmark runs.
Hyperfine will perform at least this many runs, potentially more, until statistical significance is reached or --max-runs / --time-limit is hit.
--max-runs
Maximum number of benchmark runs.
--time-limit
Maximum total time to spend benchmarking each command.
--show-output
Prints the stdout/stderr of the benchmarked commands.
--ignore-output
Ignores stdout/stderr of the benchmarked commands when checking for success.
--command-name
Provides a custom name for the following command in the output.
--shell
Specifies the shell to use for executing commands (e.g., bash, zsh).
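Several of these options compose naturally; a sketch of a typical combined invocation (the make targets and file names are placeholders):

```shell
# Warm up, reset build state before each timed run, give the command a
# readable label, and export machine-readable results for later analysis:
hyperfine --warmup 3 \
          --prepare 'make clean' \
          --command-name 'full build' \
          --export-json build-bench.json \
          'make'
```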
DESCRIPTION
Hyperfine is a powerful command-line benchmarking tool designed for accurately measuring the execution time of shell commands. Unlike simpler tools like `time`, Hyperfine performs multiple runs of each command, detects outliers, and provides robust statistical analysis of the results. It includes warm-up runs and mechanisms for preparing the environment before each run and cleaning up afterward, helping ensure consistent and fair comparisons. The tool can export detailed results in various formats, including JSON, Markdown, and CSV, making it easy to integrate into automated workflows or generate reports. Its emphasis on statistical rigor helps users obtain reliable performance metrics, making it ideal for optimizing scripts, comparing different implementations, or tracking performance regressions.
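For example, results exported with --export-json can be post-processed in a script. The sketch below fabricates a results file in the shape hyperfine emits (a top-level "results" array of per-command statistics; the field names are hyperfine's, the numbers are made up) and extracts summary figures with python3:

```shell
# Fake export file in hyperfine's JSON shape (values are illustrative):
cat > results.json <<'EOF'
{"results": [{"command": "sleep 0.3", "mean": 0.3042,
              "stddev": 0.0021, "median": 0.3039,
              "times": [0.302, 0.304, 0.306]}]}
EOF

# Pull out the headline statistics (jq would work equally well):
python3 - <<'EOF'
import json

with open("results.json") as f:
    data = json.load(f)

for r in data["results"]:
    print(f"{r['command']}: mean={r['mean']:.4f}s stddev={r['stddev']:.4f}s")
EOF
```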
CAVEATS
While hyperfine offers superior accuracy compared to time, it's important to consider:
- Measurement Overhead: For extremely short-duration commands (e.g., sub-millisecond), the overhead of hyperfine itself can become a significant portion of the measured time, potentially skewing results.
- External Factors: Benchmark results can still be influenced by external system load, CPU frequency scaling, I/O contention, and other background processes. For highly precise measurements, consider running in an isolated environment.
- Reproducibility: Achieving perfectly identical results across different machines or even different runs on the same machine can be challenging due to system variability. Focus on trends and statistical significance rather than absolute single-run values.
STATISTICAL ANALYSIS AND OUTPUT
hyperfine provides detailed statistical output including mean, standard deviation, and median execution times. It visually represents the distribution of run times and highlights any detected outliers. For multiple commands, it calculates and displays the speed-up factor between them, complete with confidence intervals, which is crucial for determining if performance differences are statistically significant.
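As a rough illustration of the arithmetic behind that output, the mean runtimes and the speed-up factor between two commands can be reproduced with awk (the timing samples are invented; hyperfine's confidence-interval calculation is omitted):

```shell
# Invented per-run timings for two commands, in seconds:
times_a="0.52 0.50 0.51 0.53 0.50"
times_b="0.26 0.25 0.27 0.25 0.26"

# Arithmetic mean of a whitespace-separated list:
mean() { echo "$1" | tr ' ' '\n' | awk '{s += $1} END {printf "%.4f", s / NR}'; }

mean_a=$(mean "$times_a")   # 0.5120
mean_b=$(mean "$times_b")   # 0.2580

# Speed-up factor = ratio of the means (hyperfine additionally attaches
# a confidence interval to this ratio):
awk -v a="$mean_a" -v b="$mean_b" \
    'BEGIN {printf "B ran %.2fx faster than A\n", a / b}'
```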
OUTLIER DETECTION
The tool automatically detects and flags outliers in the measured execution times, which could be due to transient system disturbances. These outliers are shown in the output and can be optionally excluded from statistical calculations for a more robust average.
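The flagging idea can be sketched with a median/MAD-based modified Z-score; note this is an illustration of the general technique, and hyperfine's exact formula and threshold may differ:

```shell
python3 - <<'EOF'
import statistics

# Invented samples: one run hit a transient system disturbance.
times = [0.50, 0.51, 0.52, 0.50, 1.90]

med = statistics.median(times)
# Median absolute deviation; real code must handle MAD == 0.
mad = statistics.median(abs(t - med) for t in times)

for t in times:
    z = 0.6745 * (t - med) / mad  # modified Z-score (Iglewicz & Hoaglin)
    flag = "  <- outlier" if abs(z) > 3.5 else ""
    print(f"{t:.2f}s  z={z:+.2f}{flag}")
EOF
```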
HISTORY
hyperfine was created by David Peter and is written in Rust, leveraging its performance and safety features. It emerged as a more robust and statistically sound alternative to the traditional shell time command, addressing common pitfalls like single-run measurements, lack of warm-up, and absence of outlier detection. Its development focused on providing a user-friendly interface combined with scientific rigor, and it quickly gained popularity in the developer community for reproducible and reliable benchmarking of shell scripts and executables. The first public release (0.1.0) was around September 2017.