LinuxCommandLibrary

stressapptest

Stress-test hardware stability and reliability

TLDR

Test the given amount of memory (in megabytes)

$ stressapptest -M [memory]

Test memory as well as I/O for the given file

$ stressapptest -M [memory] -f [path/to/file]

Test with a given verbosity level, where 0 is the lowest, 20 is the highest, and 8 is the default

$ stressapptest -M [memory] -v [level]

SYNOPSIS

stressapptest [OPTIONS]

PARAMETERS

-s
    Run for a specified duration in seconds.

-M
    Megabytes of RAM to test. If omitted, the amount of available memory is auto-detected.

-m
    Number of memory copy threads to run concurrently.

-C
    Number of memory CPU stress threads to run.

-i
    Number of memory invert threads to run.

-d
    Add a direct-write disk thread that uses the specified block device.

-f
    Add a disk I/O thread that uses the specified temporary file. May be given more than once to test several files.

-p
    Size in bytes of the memory chunks (pages) used for testing.

-W
    Use a more CPU-stressful memory copy.

-H
    Minimum megabytes of hugepages to require.

--stop_on_errors
    Stop after finding the first error.

--max_errors
    Exit early after finding the specified number of errors.

-v
    Set the verbosity level for output, from 0 (lowest) to 20 (highest). The default is 8.

-l
    Log output to the specified file.
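
As a rough illustration of how the basic options combine, the sketch below assembles a command line from a run time, a memory size, and a copy-thread count. The helper name sat_args is hypothetical and not part of stressapptest, and the values are placeholders; the snippet only prints the command so it can be reviewed before anything is run.

```shell
# sat_args is a hypothetical helper name, not part of stressapptest.
# Usage: sat_args SECONDS MEGABYTES COPY_THREADS
sat_args() {
    # -s: run time in seconds
    # -M: megabytes of memory to test
    # -m: number of memory copy threads
    printf '%s\n' "-s $1 -M $2 -m $3"
}

# Print the full command for review; nothing is executed here.
echo "stressapptest $(sat_args 20 256 8)"
```

To actually start the test, the same helper could be used as `stressapptest $(sat_args 20 256 8)`, left unquoted so the shell splits the options into separate arguments.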

DESCRIPTION

stressapptest (Stressful Application Test) is an open-source diagnostic tool developed by Google to test the stability and reliability of system hardware. It runs in userspace and generates heavy, randomized traffic to memory from the CPU and I/O devices, loading the CPU, cache, memory, and disk I/O subsystems at the same time. By allocating large amounts of memory and driving intensive copy and disk operations, stressapptest can uncover defects such as faulty RAM modules, overheating CPUs, unstable power supplies, or problematic disk drives that might not appear under normal usage.

The tool checks data integrity after its operations and reports any discrepancies. This makes it useful for "burn-in" testing of new hardware, diagnosing intermittent system crashes, and validating system stability after upgrades.

CAVEATS

Using stressapptest can push your system to its absolute limits, potentially leading to system instability, crashes, or even data loss if underlying hardware is faulty.

Run this command only on systems where you can afford downtime and data loss, ideally in a controlled testing environment. Monitor system temperatures and performance closely during testing to prevent overheating or other damage. Interpreting the results requires an understanding of hardware health and of the error messages the tool reports.

ERROR DETECTION MECHANISMS

stressapptest detects errors by writing known data patterns to memory or disk and then verifying them after operations such as copies. This allows it to detect bit flips, corruption, and other anomalies that indicate hardware failures. It logs each error with details such as the address and the expected and actual values, which helps with diagnosis.
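
The write-then-verify idea can be sketched with ordinary shell tools. This is only an illustration of the principle, not how stressapptest is implemented: the pattern, file names, and the simulated one-byte corruption are all assumptions made for the example.

```shell
# Work in a scratch directory that is removed at the end.
workdir=$(mktemp -d)

# 1. Write a known, repeating pattern (1024 copies of "DEADBEEF", 8 KiB).
i=0
while [ "$i" -lt 1024 ]; do
    printf 'DEADBEEF'
    i=$((i + 1))
done > "$workdir/expected"

# 2. Copy it, standing in for a trip through memory or disk.
cp "$workdir/expected" "$workdir/actual"

# 3. Verify: cmp -s is silent and exits 0 only if the files are identical.
if cmp -s "$workdir/expected" "$workdir/actual"; then
    clean_result="match"
else
    clean_result="mismatch"
fi

# 4. Simulate one corrupted byte at offset 100, then verify again.
printf 'X' | dd of="$workdir/actual" bs=1 seek=100 conv=notrunc 2>/dev/null

# cmp -l lists every differing byte: its position and both values in octal.
cmp -l "$workdir/expected" "$workdir/actual" > "$workdir/diff" || true
bad_bytes=$(wc -l < "$workdir/diff" | tr -d ' ')

echo "clean copy: $clean_result, differing bytes after corruption: $bad_bytes"
rm -rf "$workdir"
```

The per-byte listing from `cmp -l` plays the same role as the address and expected/actual values in an error log: it says where the data went wrong and what it should have been.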

TYPICAL USAGE SCENARIOS

1. New Hardware Burn-in: Validate the stability of new servers or components before deployment.
2. Debugging Intermittent Issues: Uncover elusive crashes or performance problems often caused by unstable hardware.
3. Overclocking Stability Testing: Verify system stability after changes to CPU or memory clock speeds.
4. System Upgrade Validation: Ensure new RAM or storage devices are functioning correctly and stably with existing hardware.
5. Thermal Stress Testing: Identify overheating issues under prolonged heavy load.
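
For scenarios like burn-in, a small wrapper can run the test for a fixed time, keep the output, and report a verdict. The sketch below is one possible shape, assuming that stressapptest exits with a nonzero status when the run fails; the names burn_in and SAT_BIN are hypothetical and exist only for this example.

```shell
# SAT_BIN and burn_in are hypothetical names used only in this sketch.
# SAT_BIN can be overridden to point at a specific stressapptest binary.
SAT_BIN=${SAT_BIN:-stressapptest}

# Usage: burn_in SECONDS MEGABYTES LOGFILE
burn_in() {
    secs=$1
    mem_mb=$2
    log=$3
    # -s: run time in seconds, -M: megabytes of memory to test.
    # Output is captured with shell redirection so it can be reviewed later.
    if "$SAT_BIN" -s "$secs" -M "$mem_mb" > "$log" 2>&1; then
        echo "burn-in passed"
    else
        # Assumption: a nonzero exit status means the run failed.
        echo "burn-in FAILED, see $log" >&2
        return 1
    fi
}
```

A one-hour run on 1 GB of memory might then look like `burn_in 3600 1024 /var/tmp/sat.log`. Even after a passing run it is worth reading the log, since it may mention corrected errors.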

HISTORY

stressapptest was originally developed by Google engineers as an internal tool to rigorously validate the stability and robustness of server hardware in their data centers. Its primary purpose was to identify subtle hardware defects, particularly in memory and I/O subsystems, that might evade detection during standard quality assurance procedures.

Recognizing its utility beyond their internal needs, Google subsequently open-sourced stressapptest, making it available to the wider Linux community. System administrators, hardware developers, and enthusiasts can now use the same approach to validate their own systems.

SEE ALSO

stress(1), badblocks(8), smartctl(8), mcelog(8)
