LinuxCommandLibrary

gzip

Compress files to reduce their size

TLDR

Compress a file, replacing it with a gzip archive
$ gzip [path/to/file]

Decompress a file, replacing it with the original uncompressed version
$ gzip [[-d|--decompress]] [path/to/file.gz]

Compress a file, keeping the original file
$ gzip [[-k|--keep]] [path/to/file]

Compress a file, specifying the output filename
$ gzip [[-c|--stdout]] [path/to/file] > [path/to/compressed_file.gz]

Decompress a gzip archive, specifying the output filename
$ gzip [[-c|--stdout]] [[-d|--decompress]] [path/to/file.gz] > [path/to/uncompressed_file]

Specify the compression level. 1 is the fastest (low compression), 9 is the slowest (high compression), 6 is the default
$ gzip -[1..9] [[-c|--stdout]] [path/to/file] > [path/to/compressed_file.gz]

Display the name and reduction percentage for each file compressed or decompressed
$ gzip [[-v|--verbose]] [[-d|--decompress]] [path/to/file.gz]

SYNOPSIS

gzip [OPTION]... [FILE]...
gunzip [OPTION]... [FILE]...
zcat [OPTION]... [FILE]...

PARAMETERS

-c, --stdout, --to-stdout
    Write output to standard output and leave the original files unchanged. Useful for piping output.

-d, --decompress, --uncompress
    Decompress a compressed file. Equivalent to using the gunzip command.

-f, --force
    Force compression or decompression even if the file has multiple links, the output file already exists, or the compressed data is read from or written to a terminal.

-h, --help
    Display a help message and exit.

-l, --list
    List the uncompressed file name, uncompressed size, compressed size, and compression ratio for each compressed file.

-n, --no-name
    When compressing, do not save the original file name and time stamp. When decompressing, do not restore them even if present. This is the default when decompressing.

-N, --name
    When compressing, always save the original file name and time stamp; this is the default when compressing. When decompressing, restore the original file name and time stamp if present.

-q, --quiet
    Suppress all warning messages.

-r, --recursive
    Traverse the directory structure recursively. If a file name given on the command line is a directory, gzip descends into it and compresses (or, with -d, decompresses) every file it finds there.

-S .suf, --suffix .suf
    Use .suf as the suffix for compressed files instead of .gz.

-t, --test
    Test the integrity of a compressed file without writing any decompressed output.

-v, --verbose
    Display the name and percentage reduction for each file compressed or decompressed.

-V, --version
    Display the version number and exit.

-num
    Set the compression level, where num is a digit from 1 to 9. -1 (or --fast) is the fastest compression (less compression), and -9 (or --best) is the slowest but provides the best compression. The default is -6.
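
As a rough illustration of how several of these options combine, the following sketch compresses a small sample file, checks it, and restores it. The file names and the temporary directory are hypothetical, and the -k option assumes a reasonably recent GNU gzip (1.6 or later).

```shell
# Work in a throwaway directory; all file names here are illustrative.
workdir=$(mktemp -d)
printf 'line %s\n' 1 2 3 4 5 > "$workdir/notes.txt"

gzip -k "$workdir/notes.txt"      # -k keeps notes.txt and writes notes.txt.gz
gzip -t "$workdir/notes.txt.gz"   # -t exits non-zero if the archive is corrupt
gzip -l "$workdir/notes.txt.gz"   # -l prints sizes, ratio, and the stored name

# -d with -c decompresses to standard output, leaving the .gz file in place.
gzip -dc "$workdir/notes.txt.gz" > "$workdir/copy.txt"

# cmp is silent and exits 0 when both files are byte-for-byte identical.
cmp "$workdir/notes.txt" "$workdir/copy.txt" && echo "round trip OK"
```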

DESCRIPTION

gzip (GNU zip) is a widely used command-line utility for compressing and decompressing files. It uses the DEFLATE algorithm, a combination of LZ77 and Huffman coding, which generally trades some compression ratio for speed compared with tools such as bzip2 or xz. By default, gzip replaces the original file with its compressed version, appending a .gz suffix (e.g., filename.txt becomes filename.txt.gz).

While gzip excels at compressing individual files, it does not archive directories or multiple files into a single archive; for that, it's commonly used in conjunction with archiving tools like tar. gzip is stream-oriented, meaning it can compress or decompress data piped to and from standard input/output, making it versatile for scripting and pipeline operations. Its widespread adoption makes it a de facto standard for single-file compression in Unix-like systems.
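
The stream-oriented behavior can be sketched with a pipeline that never touches the disk. The variable names below are made up for the example; the only assumption is that gzip is on the PATH.

```shell
# Round-trip a string through gzip using only pipes; no files are created.
original='hello from a pipeline'

# First gzip compresses stdin to stdout; the second decompresses it again.
restored=$(printf '%s\n' "$original" | gzip -c | gzip -dc)

echo "$restored"
```

With no file arguments gzip reads standard input, so the -c on the first command is redundant here; it is spelled out only to make the direction of the data explicit.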

CAVEATS

gzip compresses one file at a time; it does not combine multiple files into a single archive or record directory structure, so it is typically paired with an archiving tool such as tar. By default, gzip replaces the input file with its compressed or decompressed version; use -k or -c to keep the original. It is generally not effective on data that is already compressed (e.g., JPEG images, MP3 audio, or zip archives): re-compressing such files yields little or no size reduction and can slightly increase file size.
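
The effect on already-compressed input can be checked with a small experiment. This sketch uses random bytes as a stand-in for compressed data, on the assumption that well-compressed data is statistically close to random; the file names are hypothetical, and head -c and /dev/urandom are assumed to be available (as on typical Linux systems).

```shell
workdir=$(mktemp -d)

# 100,000 random bytes: a stand-in for data that is already compressed.
head -c 100000 /dev/urandom > "$workdir/random.bin"

# Compress to a separate file so the input stays available for comparison.
gzip -c "$workdir/random.bin" > "$workdir/random.bin.gz"

# wc -c may pad its output with spaces on some systems; tr strips them.
before=$(wc -c < "$workdir/random.bin" | tr -d ' ')
after=$(wc -c < "$workdir/random.bin.gz" | tr -d ' ')

echo "input: $before bytes, gzip output: $after bytes"
```

The gzip header, trailer, and block framing are added on top of data that cannot be shrunk, which is why the output is expected to come out slightly larger than the input.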

COMPRESSION LEVELS

gzip offers various compression levels, controlled by the -num option, where num ranges from 1 to 9. -1 provides the fastest compression speed but results in a larger file size, while -9 offers the best compression ratio but takes longer to process. The default compression level is -6, which strikes a good balance between speed and compression effectiveness. Users can choose a level based on their specific needs for speed versus disk space.
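
One way to see the trade-off on a given data set is to compress the same file at several levels and compare the sizes. The sketch below generates a hypothetical sample file; actual sizes and timings depend on the input and the gzip version, so no figures are shown.

```shell
workdir=$(mktemp -d)

# Build a repetitive text sample so each level has something to compress.
i=0
while [ "$i" -lt 2000 ]; do
  echo "entry $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$workdir/sample.txt"

# Compress the same input at the fastest, default, and best levels.
for level in 1 6 9; do
  gzip -"$level" -c "$workdir/sample.txt" > "$workdir/sample.$level.gz"
  size=$(wc -c < "$workdir/sample.$level.gz" | tr -d ' ')
  echo "level $level: $size bytes"
done
```

Prefixing each gzip call with time gives a rough view of the speed side of the trade-off.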

USAGE WITH TAR

A very common pattern in Linux is to combine gzip with the tar command to create compressed archives of directories or multiple files. For example, tar -czf archive.tar.gz directory/ archives directory/ and passes the archive through gzip as it is written, producing a single .tar.gz file without creating an intermediate archive.tar. This is the standard way to package and distribute file collections while ensuring a smaller footprint.
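
The tar and gzip combination can be sketched in two equivalent forms. The directory and file names below are hypothetical, and the -z option assumes a tar implementation that supports it, such as GNU tar or bsdtar.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo 'alpha' > "$workdir/project/a.txt"
echo 'beta'  > "$workdir/project/b.txt"

# One step: tar's -z option filters the archive through gzip.
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# Explicit pipeline: tar writes the archive to stdout, gzip compresses it.
tar -cf - -C "$workdir" project | gzip -c > "$workdir/project-piped.tar.gz"

# List the contents of the compressed archive without extracting it.
tar -tzf "$workdir/project.tar.gz"
```

The pipeline form is useful when a different compression level is wanted, since a flag such as -9 can be added to the gzip command.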

HISTORY

gzip, short for GNU zip, was created by Jean-loup Gailly and Mark Adler. It was first released in 1992 as part of the GNU project. Its primary motivation was to replace the proprietary compress program, which used the LZW algorithm that was subject to patent restrictions. gzip's adoption of the patent-free DEFLATE algorithm (also used in the PKZIP file format) quickly made it the standard compression utility in the Unix/Linux world, ensuring free and unencumbered usage and distribution.

SEE ALSO

gunzip(1), zcat(1), zless(1), zmore(1), compress(1), uncompress(1), zip(1), unzip(1), bzip2(1), bunzip2(1), xz(1), unxz(1), tar(1)
