gzip
Compress files to reduce their size
TLDR
Compress a file, replacing it with a gzip archive
Decompress a file, replacing it with the original uncompressed version
Compress a file, keeping the original file
Compress a file, specifying the output filename
Decompress a gzip archive, specifying the output filename
Specify the compression level. 1 is the fastest (low compression), 9 is the slowest (high compression), 6 is the default
Display the name and reduction percentage for each file compressed or decompressed
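The tasks listed above can be sketched as the following commands. This is a minimal illustration, not the page's original examples: the file names (notes.txt, copy.gz, and so on) are hypothetical, and the -k (--keep) flag, which is not described in the parameter list below, is assumed to be available (GNU gzip 1.6 or later).

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small, repetitive sample file (hypothetical name).
i=0
while [ "$i" -lt 200 ]; do
  echo "hello gzip"
  i=$((i + 1))
done > notes.txt

gzip notes.txt                    # notes.txt -> notes.txt.gz, original removed
gzip -d notes.txt.gz              # notes.txt.gz -> notes.txt, archive removed
gzip -k notes.txt                 # keep notes.txt next to notes.txt.gz
gzip -c notes.txt > copy.gz       # choose the output name by redirecting stdout
gzip -dc copy.gz > restored.txt   # decompress to a chosen output name
gzip -9 -c notes.txt > best.gz    # explicit compression level (1 to 9)
gzip -v -c notes.txt > verbose.gz # name and reduction percentage go to stderr
```

Redirecting the output of -c is used here for the "specify the output filename" cases because gzip itself has no option that names the output file.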
SYNOPSIS
gzip [OPTION]... [FILE]...
gunzip [OPTION]... [FILE]...
zcat [OPTION]... [FILE]...
PARAMETERS
-c, --stdout, --to-stdout
Write output to standard output and keep the original files unchanged. Useful for piping output to other commands.
-d, --decompress, --uncompress
Decompress a compressed file. Equivalent to using the gunzip command.
-f, --force
Force compression or decompression even if the file has multiple links, the output file already exists, or the input file already carries the compressed suffix.
-h, --help
Display a help message and exit.
-l, --list
List the uncompressed file name, uncompressed size, compressed size, and compression ratio for each compressed file.
-n, --no-name
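When compressing, do not save the original file name and time stamp. When decompressing, do not restore them even if they are present. This is the default when decompressing.
-N, --name
When compressing, always save the original file name and time stamp; this is the default when compressing. When decompressing, restore the original file name and time stamp if they are present.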
-q, --quiet
Suppress all warning messages.
-r, --recursive
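Traverse the directory structure recursively. If any file name given on the command line is a directory, gzip descends into it and compresses (or decompresses) each file it finds there individually.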
-S .suf, --suffix .suf
Use .suf as the suffix for compressed files instead of .gz.
-t, --test
Test the integrity of a compressed file without writing any decompressed output.
-v, --verbose
Display the name and percentage reduction for each file compressed or decompressed.
-V, --version
Display the version number and exit.
-num
Set the compression level, where num is a digit from 1 to 9. -1 (or --fast) is the fastest compression (less compression), and -9 (or --best) is the slowest but provides the best compression. The default is -6.
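A few of the options above can be combined as in the sketch below. The directory layout and file names are hypothetical and exist only for illustration.

```shell
# Scratch directory with a small tree of sample files (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/tree/sub"
echo "first file"  > "$workdir/tree/a.txt"
echo "second file" > "$workdir/tree/sub/b.txt"

# -r: descend into the directory and compress each file individually.
gzip -r "$workdir/tree"

# -t: verify integrity without writing decompressed output.
gzip -t "$workdir/tree/a.txt.gz" && echo "a.txt.gz is intact"

# -l: show compressed size, uncompressed size, ratio, and original name.
gzip -l "$workdir/tree/sub/b.txt.gz"

# -d with -c: decompress to standard output, leaving the .gz file in place.
gzip -dc "$workdir/tree/a.txt.gz"
```

Note that -r produces one .gz file per input file; it does not bundle the directory into a single archive.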
DESCRIPTION
gzip (GNU zip) is a popular command-line utility used for compressing and decompressing files. It employs the DEFLATE algorithm, a combination of LZ77 and Huffman coding, which is known for its good compression ratio and speed. By default, gzip replaces the original file with its compressed version, appending a .gz suffix (e.g., filename.txt becomes filename.txt.gz).
While gzip excels at compressing individual files, it does not archive directories or multiple files into a single archive; for that, it's commonly used in conjunction with archiving tools like tar. gzip is stream-oriented, meaning it can compress or decompress data piped to and from standard input/output, making it versatile for scripting and pipeline operations. Its widespread adoption makes it a de facto standard for single-file compression in Unix-like systems.
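The stream-oriented behavior described above can be illustrated with a short pipeline. This is a minimal sketch; the variable and file names are hypothetical.

```shell
# Compress data arriving on standard input, then decompress it again,
# without any intermediate file.
roundtrip=$(printf 'streamed through a pipe\n' | gzip -c | gzip -dc)
echo "$roundtrip"

# The same idea with a file at the end of the pipeline.
workdir=$(mktemp -d)
printf 'log line one\nlog line two\n' | gzip > "$workdir/stream.gz"
gzip -dc "$workdir/stream.gz"
```

When gzip is given no file arguments, it reads standard input and writes standard output, which is what makes the second pipeline work without -c.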
CAVEATS
gzip compresses one file at a time: it does not preserve directory structures or combine multiple files into a single archive, so it is typically paired with an archiving tool such as tar. By default, gzip replaces the original file with its compressed or decompressed version. It is generally ineffective on data that is already compressed, whether lossily or losslessly (e.g., JPEG images, MP3 audio, or zip archives): recompressing such files yields little or no size reduction and can even increase the file size slightly.
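The last caveat can be checked with a quick experiment. The sketch below uses random bytes from /dev/urandom as a stand-in for already-compressed data (an assumption: both are high-entropy inputs with little redundancy for gzip to exploit); the file names are hypothetical.

```shell
# Create 64 KiB of random, effectively incompressible data.
workdir=$(mktemp -d)
dd if=/dev/urandom of="$workdir/random.bin" bs=1024 count=64 2>/dev/null

# Compare the size before and after compression.
raw_size=$(wc -c < "$workdir/random.bin" | tr -d ' ')
packed_size=$(gzip -c "$workdir/random.bin" | wc -c | tr -d ' ')

echo "original: $raw_size bytes, gzipped: $packed_size bytes"
```

On such input the gzipped size is expected to be slightly larger than the original, because the gzip header and trailer are added while the payload barely shrinks.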
COMPRESSION LEVELS
gzip offers various compression levels, controlled by the -num option, where num ranges from 1 to 9. -1 provides the fastest compression speed but results in a larger file size, while -9 offers the best compression ratio but takes longer to process. The default compression level is -6, which strikes a good balance between speed and compression effectiveness. Users can choose a level based on their specific needs for speed versus disk space.
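To see the trade-off on your own data, the levels can be compared directly. The following is a rough sketch using a generated sample file (hypothetical name and contents); real results depend heavily on the input.

```shell
# Generate a moderately repetitive text sample.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
i=0
while [ "$i" -lt 2000 ]; do
  echo "line $i: the quick brown fox jumps over the lazy dog"
  i=$((i + 1))
done > "$sample"

# Compress to stdout at the fastest and the best level and count the bytes.
original_size=$(wc -c < "$sample" | tr -d ' ')
fast_size=$(gzip -1 -c "$sample" | wc -c | tr -d ' ')
best_size=$(gzip -9 -c "$sample" | wc -c | tr -d ' ')

echo "original: $original_size bytes"
echo "level 1:  $fast_size bytes"
echo "level 9:  $best_size bytes"
```

Prefixing each gzip call with the shell's time keyword gives a rough idea of the speed side of the trade-off.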
USAGE WITH TAR
A very common pattern on Linux is to combine gzip with the tar command to create compressed archives of directories or multiple files. For example, tar -czf archive.tar.gz directory/ archives directory/ and passes the resulting tar stream through gzip, producing a single .tar.gz file without leaving an intermediate archive.tar on disk. This is the standard way to package and distribute collections of files at a reduced size.
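The pattern can be sketched as follows, together with an equivalent pipeline that makes the gzip step explicit. The directory and archive names are hypothetical.

```shell
# Build a small directory tree to archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/project/docs"
echo "readme" > "$workdir/project/README"
echo "guide"  > "$workdir/project/docs/guide.txt"

# Archive and compress in one step (-z asks tar to filter through gzip).
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# Roughly equivalent: write the tar stream to stdout and pipe it to gzip.
tar -cf - -C "$workdir" project | gzip -c > "$workdir/project-piped.tar.gz"

# Verify the gzip layer, then list the archive contents.
gzip -t "$workdir/project.tar.gz"
listing=$(tar -tzf "$workdir/project.tar.gz")
echo "$listing"
```

The -C option makes tar change into the given directory first, so the archive stores paths relative to it (project/...) rather than the full scratch path.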
HISTORY
gzip, short for GNU zip, was created by Jean-loup Gailly and Mark Adler and first released in 1992 as part of the GNU project. Its primary motivation was to replace the compress program, whose LZW algorithm was subject to patent restrictions. gzip's adoption of the patent-free DEFLATE algorithm (also used in the ZIP file format popularized by PKZIP) quickly made it the standard compression utility in the Unix/Linux world, ensuring free and unencumbered usage and distribution.