zstd

Compress or decompress files using the Zstandard algorithm

TLDR

Compress a file into a new file with the .zst suffix

$ zstd [path/to/file]

Decompress a file

$ zstd [[-d|--decompress]] [path/to/file.zst]

Decompress to stdout

$ zstd [[-d|--decompress]] [[-c|--stdout]] [path/to/file.zst]

Compress a file specifying the compression level, where 1=fastest, 19=slowest, and 3=default

$ zstd -[level] [path/to/file]

Compress a file using an ultra-fast compression level, where 1=default

$ zstd --fast=[level] [path/to/file]

Unlock higher compression levels (up to 22) using more memory (both for compression and decompression)

$ zstd --ultra -[level] [path/to/file]

Set the number of working threads to the number of physical CPU cores

$ zstd -T0 [path/to/file]

SYNOPSIS

zstd [OPTIONS] [FILES...]
zstd -d [OPTIONS] [FILES...]
zstd -D DICTIONARY [FILES...]

When no FILES are specified, zstd reads from standard input and writes to standard output.

PARAMETERS

-d, --decompress, --uncompress
    Decompress specified FILES. If no files are given, reads from stdin.

-z, --compress
    Compress specified FILES (this is the default behavior).

-f, --force
    Overwrite existing output files without prompting.

-k, --keep
    Keep original input files after a successful operation. This is the default behavior; use --rm to remove them.

-c, --stdout
    Write to standard output. Original files are kept. Useful for piping.

-o <file>
    Specify the output file name instead of the default.

-l, --list
    List information about zstd compressed files (e.g., compressed size, original size, ratio).

-t, --test
    Test the integrity of compressed files. The data is decompressed and checksummed for errors, but the output is discarded and no files are created or removed.

-r, --recursive
    Operate recursively on directories, processing all files in the named directories and their subdirectories. Each file is compressed or decompressed individually; source files are kept unless --rm is given.

-T<num>, --threads=<num>
    Set the number of worker threads used for compression. 0 attempts to detect and use the number of physical CPU cores; 1 uses a single worker thread.

-D <file>
    Use a specified dictionary file for compression or decompression. Improves compression for small, similar data blocks.

-v, --verbose
    Display verbose messages and progress information during operation.

-q, --quiet
    Suppress all messages, except errors.

--rm
    Remove source files after successful compression or decompression. Ignored when the output goes to stdout.

-<level> (e.g., -1 to -19)
    Set the compression level. -1 is fastest/least compression, -19 is slowest/most compression. Default is -3.

--ultra
    Unlock compression levels 20 to 22. These levels can improve the ratio further but are much slower and need considerably more memory, for decompression as well as compression.

--fast[=<level>]
    Switch to ultra-fast compression levels. If no level is given, it defaults to 1. Higher values compress faster at the cost of some compression ratio.
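
As a rough illustration of how these options combine, the following shell sketch compresses the same input at two levels, tests the archives, and checks a round trip. The temporary directory, file names, and generated sample data are assumptions made for the example, and it presumes zstd and awk are available on the PATH.

```shell
set -eu

# Hypothetical scratch directory and sample data for this example.
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 20000; i++) print "log line number " i }' \
    > "$workdir/sample.txt"

# Level 1: fastest regular level; -o names the output file explicitly.
zstd -q -1 "$workdir/sample.txt" -o "$workdir/fast.zst"

# Level 19 with -T0, which asks zstd to detect the number of cores.
zstd -q -19 -T0 "$workdir/sample.txt" -o "$workdir/small.zst"

# -t checks integrity without writing any output files.
zstd -q -t "$workdir/fast.zst" "$workdir/small.zst"

# Collect sizes in bytes (tr strips padding that some wc versions add).
orig_size=$(wc -c < "$workdir/sample.txt" | tr -d ' ')
fast_size=$(wc -c < "$workdir/fast.zst" | tr -d ' ')
small_size=$(wc -c < "$workdir/small.zst" | tr -d ' ')
echo "original: $orig_size, level 1: $fast_size, level 19: $small_size"

# Round trip: decompress and compare with the original input.
zstd -q -d "$workdir/small.zst" -o "$workdir/roundtrip.txt"
cmp "$workdir/sample.txt" "$workdir/roundtrip.txt"
```

Exact sizes depend on the zstd version and the input, so the script prints them rather than assuming particular values.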

DESCRIPTION

Zstandard, commonly referred to as zstd, is a fast lossless compression algorithm and tool developed by Yann Collet at Facebook (now Meta), with version 1.0 released in 2016.

It is designed to cover a wide range of trade-offs between compression ratio and speed. At high settings its compression ratios approach those of xz (LZMA), while decompression stays much faster than xz and is typically faster than gzip.

Its key features include high-speed compression and decompression, a configurable compression ratio, support for dictionary compression (which dramatically improves compression of small, similar files), and multi-threading capabilities. zstd is widely adopted in various applications, from real-time data processing and databases to Linux kernel file systems, due to its versatility and robust performance characteristics.

CAVEATS

zstd is not a drop-in replacement for gzip or xz when a consumer expects those specific formats: .zst files can only be read by tools with Zstandard support. Dictionary compression only pays off when the dictionary was trained on data similar to what is being compressed, and the same dictionary is required to decompress. High compression levels, especially those unlocked by --ultra, are CPU-intensive and use considerably more memory, despite zstd's overall speed.

INPUT/OUTPUT HANDLING

When no input FILES are specified, zstd reads from standard input and writes to standard output. With -c (--stdout), it still reads the named FILES but writes the result to standard output instead of creating files. Otherwise, it writes one output file per input: compression appends the .zst suffix, and decompression removes it. Source files are kept unless --rm is given.
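
The three output modes can be sketched in shell as follows. The scratch directory and file names are hypothetical, chosen only for this illustration.

```shell
set -eu

# Hypothetical scratch directory and input file.
workdir=$(mktemp -d)
printf 'hello zstd\n' > "$workdir/note.txt"

# No FILES given: read standard input, write standard output.
zstd -q < "$workdir/note.txt" > "$workdir/from_stdin.zst"

# -c: read the named file, send the compressed stream to standard output.
zstd -q -c "$workdir/note.txt" > "$workdir/from_file.zst"

# Neither -c nor -o: create note.txt.zst next to the source file.
zstd -q "$workdir/note.txt"

# Decompress to standard output and capture the text.
roundtrip=$(zstd -q -d -c "$workdir/from_stdin.zst")
echo "$roundtrip"
```

The -c form is the one to reach for in pipelines, since it never creates or removes files on its own.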

DICTIONARY COMPRESSION

zstd's dictionary mode is particularly effective for compressing small, highly redundant data blocks (e.g., log entries, network packets). A dictionary can be generated from a sample dataset using the zstd --train command, and then applied during compression and decompression to significantly boost the compression ratio for similar subsequent data.
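
A minimal sketch of that workflow, assuming a set of small JSON-like records: the records, paths, and dictionary name are invented for the example, and --maxdict=8192 is an arbitrary choice to keep the dictionary small relative to the sample set.

```shell
set -eu

# Hypothetical scratch directory holding many small, similar samples.
workdir=$(mktemp -d)
mkdir "$workdir/samples"
i=1
while [ "$i" -le 2000 ]; do
    printf '{"id": %d, "user": "user%d", "status": "active", "region": "eu-west-1", "score": %d}\n' \
        "$i" "$((i % 37))" "$((i * 7919 % 1000))" > "$workdir/samples/$i.json"
    i=$((i + 1))
done

# Train a dictionary from the samples; -o names the dictionary file.
zstd -q --train "$workdir"/samples/*.json --maxdict=8192 -o "$workdir/records.dict"

# A new record with the same shape as the training data.
printf '{"id": 999999, "user": "user5", "status": "active", "region": "eu-west-1", "score": 42}\n' \
    > "$workdir/new.json"

# Compress it without and with the dictionary to compare sizes.
zstd -q "$workdir/new.json" -o "$workdir/plain.zst"
zstd -q -D "$workdir/records.dict" "$workdir/new.json" -o "$workdir/with_dict.zst"
echo "plain: $(wc -c < "$workdir/plain.zst" | tr -d ' ') bytes," \
     "with dictionary: $(wc -c < "$workdir/with_dict.zst" | tr -d ' ') bytes"

# Decompression needs the same dictionary.
zstd -q -d -D "$workdir/records.dict" "$workdir/with_dict.zst" -o "$workdir/restored.json"
cmp "$workdir/new.json" "$workdir/restored.json"
```

How much the dictionary helps depends on how closely new data resembles the training samples, so the script prints both sizes instead of assuming a result.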

HISTORY

Zstandard was developed by Yann Collet (then at Facebook, now Meta) with the first stable release in August 2016. Its primary motivation was to create a modern compression algorithm that could outperform existing standards like zlib in terms of both speed and compression ratio, especially for real-time applications and large datasets.

It quickly gained traction due to its performance characteristics and has been integrated into various systems, including the Linux kernel (for filesystems such as Btrfs, F2FS, and SquashFS), databases, and network protocols, establishing itself as a popular choice for high-performance data compression.