LinuxCommandLibrary

zstdmt

Compress or decompress files using multiple threads

TLDR

View documentation for the original command

$ tldr zstd

SYNOPSIS

zstdmt [OPTIONS] [FILES...]
zstdmt -d [OPTIONS] [FILES...]
zstdmt -r [OPTIONS] DIRECTORY...

PARAMETERS

-d, --decompress, --uncompress
    Decompress. This option overrides the default compression behavior.

-z, --compress
    Compress. This is the default operation if no other mode is specified.

-f, --force
    Overwrite existing output files without prompting.

-k, --keep
    Do not delete source files after successful compression or decompression. This is the default behavior.

-o FILE
    Write the result to FILE. If not specified, a default name is used (the .zst extension is added when compressing and removed when decompressing).

-c, --stdout
    Write output to standard output, even if it is a console. Source files are kept when using this option.

-r, --recursive
    Operate on directories recursively. Processes files within specified directories and subdirectories.

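-#
    Select compression level #, for example -3 or -19. The default is 3 and the normal range is 1-19. Higher levels give better compression but are slower.

--ultra
    Unlock compression levels 20-22. These levels need considerably more memory for both compression and decompression.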

-T#, --threads=#
    Specify the number of worker threads. zstdmt behaves like zstd -T0: 0 means detect the number of physical CPU cores and use that many threads. Set to 1 for single-threaded operation.

--rm
    Remove source files after successful compression or decompression. This is not the default; source files are kept unless --rm is given.

-q, --quiet
    Suppress warnings, interactive prompts, and notifications. Specify twice to suppress error messages as well.

-v, --verbose
    Display more information and statistics during operation, such as progress updates.

--fast[=#]
    Switch to the ultra-fast compression levels. Higher numbers (e.g., --fast=10) give faster compression but lower compression ratios. If =# is omitted, it defaults to 1. This option selects a level on its own and is used instead of -#, not combined with it.

-C, --[no-]check
    Add an integrity checksum, computed from the uncompressed data, to the compressed output. It is enabled by default and is verified automatically during decompression. Use --no-check to omit it.

--list
    Display information about Zstandard files (e.g., compressed size, original size, compression ratio) without decompressing them.

-h, --help
    Display a help message and exit.

--version
    Display version information and exit.
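
To show how the level and thread options above fit together, here is a minimal sketch that compresses one throwaway file with several settings and then tests the results. The file names and the sample data are made up for illustration, and the script assumes zstdmt or zstd is on the PATH (it does nothing otherwise).

```shell
#!/bin/sh
# Sketch: compare a few compression settings on a generated sample file.
set -eu

# Prefer zstdmt; fall back to zstd; empty if neither is installed.
ZSTD=$(command -v zstdmt || command -v zstd || true)

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Repetitive sample data (hypothetical content) so the levels have work to do.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$workdir/sample.txt"

if [ -n "$ZSTD" ]; then
    # -q silences notifications, -T2 uses two worker threads, -o names the output.
    "$ZSTD" -q -T2 --fast=5    -o "$workdir/fast.zst"    "$workdir/sample.txt"
    "$ZSTD" -q -T2 -3          -o "$workdir/default.zst" "$workdir/sample.txt"
    "$ZSTD" -q -T2 -19         -o "$workdir/high.zst"    "$workdir/sample.txt"
    "$ZSTD" -q -T2 --ultra -22 -o "$workdir/ultra.zst"   "$workdir/sample.txt"

    # Compare the resulting sizes, then check the integrity of every output.
    ls -l "$workdir"
    "$ZSTD" -q -t "$workdir"/*.zst
else
    echo "zstd not found; nothing to do" >&2
fi
```

On data this small the size differences between levels are modest; the point is only the option syntax.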

DESCRIPTION

zstdmt is the multi-threaded front end to the Zstandard compressor. It is not a separate implementation: invoking zstdmt is equivalent to running zstd -T0, which asks zstd to detect the number of CPU cores and start that many worker threads. The file format is the same as with zstd and compression ratios are comparable; the gain is throughput when compressing large inputs on multi-core machines.

It supports the full range of compression levels, from the --fast levels (fastest, lowest ratio) to the --ultra levels (slowest, highest ratio), and can operate on single files, multiple files, or recursively on directories. Like zstd, it can read standard input and write standard output, which makes it easy to use in scripts and data pipelines. The speedup applies mainly to compression; decompression in current zstd releases is largely single-threaded, so -T has little effect with -d.

CAVEATS

Each worker thread allocates its own compression buffers, so memory use grows with the thread count (-T#) and with the compression level. For small files the input is too short to split across threads, so zstdmt is no faster than single-threaded zstd. The zstdmt binary may not be installed on every Linux distribution; where it is missing, zstd -T0 gives the same behavior, provided zstd was built with multi-threading support.

DEFAULT BEHAVIOR

By default, zstdmt writes each compressed file next to its source with a .zst extension and leaves the source file in place. Decompression likewise writes the original data under the name without .zst and keeps the .zst file. This differs from gzip, which deletes its input. Use --rm to delete source files after a successful operation, or --stdout to write to standard output.
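
The handling of source files is easy to check on a given system. The following sketch compresses two throwaway files, one with default options and one with --rm, and lists what is left. The file names a.txt and b.txt are hypothetical, and the script assumes zstdmt or zstd is on the PATH.

```shell
#!/bin/sh
# Sketch: observe what happens to source files with and without --rm.
set -eu

# Prefer zstdmt; fall back to zstd; empty if neither is installed.
ZSTD=$(command -v zstdmt || command -v zstd || true)

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

echo "example data" > "$workdir/a.txt"
echo "example data" > "$workdir/b.txt"

if [ -n "$ZSTD" ]; then
    "$ZSTD" -q      "$workdir/a.txt"   # default options: a.txt stays, a.txt.zst is added
    "$ZSTD" -q --rm "$workdir/b.txt"   # --rm: b.txt is deleted once b.txt.zst is written
    ls "$workdir"
else
    echo "zstd not found; nothing to do" >&2
fi
```

After the run, the directory should hold a.txt, a.txt.zst, and b.txt.zst, but not b.txt.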

OPTIMAL THREAD COUNT

The ideal number of threads (-T#) depends on the CPU, the compression level, and the nature of the data. zstdmt defaults to -T0, which detects the number of physical cores and uses one worker per core. More threads do not always help: on I/O-bound workloads or systems with limited memory, extra threads can give diminishing returns or even degrade performance through contention and memory pressure.
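
One way to find a reasonable thread count is simply to measure. The sketch below times the same compression job with 1, 2, and 4 threads. The sample size, the level (-10), and the thread counts are arbitrary choices for illustration, and timing uses whole seconds from date, so meaningful numbers need a much larger input than the one generated here. It assumes zstdmt or zstd is on the PATH.

```shell
#!/bin/sh
# Sketch: coarse timing of one compression job at several thread counts.
set -eu

# Prefer zstdmt; fall back to zstd; empty if neither is installed.
ZSTD=$(command -v zstdmt || command -v zstd || true)

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Roughly 3 MB of hex text; substitute a real, large file for useful results.
head -c 1000000 /dev/urandom | od -An -tx1 > "$workdir/sample.txt"

if [ -n "$ZSTD" ]; then
    for threads in 1 2 4; do
        start=$(date +%s)
        "$ZSTD" -q -10 -T"$threads" -o "$workdir/t$threads.zst" "$workdir/sample.txt"
        end=$(date +%s)
        echo "threads=$threads elapsed=$((end - start))s"
    done
else
    echo "zstd not found; nothing to do" >&2
fi
```

If the elapsed time stops improving as threads are added, the job is probably limited by I/O, memory, or input size rather than by CPU.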

PIPING AND STREAMS

zstdmt can read from standard input (stdin) and write to standard output (stdout), which makes it easy to use in shell pipelines. For example, cat large_file.txt | zstdmt -T0 > large_file.txt.zst compresses data from a pipe (the -T0 is already implied by zstdmt), and zstdmt -d large_file.txt.zst -c | less decompresses to standard output.
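
A common pipeline pairs the compressor with tar. The sketch below packs a small directory tree into a .tar.zst archive through a pipe, unpacks it through another pipe, and compares the result with the original. The directory layout and file names are made up for illustration, and the script assumes zstdmt or zstd is on the PATH.

```shell
#!/bin/sh
# Sketch: round-trip a directory through tar and zstd using pipes only.
set -eu

# Prefer zstdmt; fall back to zstd; empty if neither is installed.
ZSTD=$(command -v zstdmt || command -v zstd || true)

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# A tiny hypothetical project tree to archive.
mkdir -p "$workdir/project/src"
echo 'int main(void) { return 0; }' > "$workdir/project/src/main.c"
echo 'notes' > "$workdir/project/README"

if [ -n "$ZSTD" ]; then
    # Pack: tar writes the archive to stdout, the compressor reads it from stdin.
    tar -C "$workdir" -cf - project | "$ZSTD" -q -T0 -c > "$workdir/project.tar.zst"

    # Unpack: the compressor decompresses to stdout, tar extracts from stdin.
    mkdir "$workdir/restore"
    "$ZSTD" -q -d -c "$workdir/project.tar.zst" | tar -C "$workdir/restore" -xf -

    # The restored tree should match the original.
    diff -r "$workdir/project" "$workdir/restore/project" && echo "round trip OK"
else
    echo "zstd not found; nothing to do" >&2
fi
```

Because nothing but the final archive touches the disk, the same pattern works when the data comes from a network stream or another command.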

HISTORY

Zstandard (zstd) was developed by Yann Collet at Facebook (now Meta) and publicly released in 2016. It was designed to combine high compression ratios with very fast compression and decompression, and it often outperforms older algorithms such as gzip and bzip2. Multi-threaded compression was added to the command-line tool after the initial release, and zstdmt exposes it as a shortcut for zstd -T0.

SEE ALSO

zstd(1), gzip(1), bzip2(1), xz(1), tar(1)
