tar

Archive multiple files into a single file

TLDR

[c]reate an archive and write it to a [f]ile

$ tar cf [path/to/target.tar] [path/to/file1 path/to/file2 ...]

[c]reate a g[z]ipped archive and write it to a [f]ile

$ tar czf [path/to/target.tar.gz] [path/to/file1 path/to/file2 ...]

[c]reate a g[z]ipped (compressed) archive from a directory using relative paths

$ tar czf [path/to/target.tar.gz] [[-C|--directory]] [path/to/directory] .

E[x]tract a (compressed) archive [f]ile into the current directory [v]erbosely

$ tar xvf [path/to/source.tar[.gz|.bz2|.xz]]

E[x]tract a (compressed) archive [f]ile into the target directory

$ tar xf [path/to/source.tar[.gz|.bz2|.xz]] [[-C|--directory]] [path/to/directory]

[c]reate a compressed archive and write it to a [f]ile, using the file extension to [a]utomatically determine the compression program

$ tar caf [path/to/target.tar.xz] [path/to/file1 path/to/file2 ...]

Lis[t] the contents of a tar [f]ile [v]erbosely

$ tar tvf [path/to/source.tar]

E[x]tract files matching a pattern from an archive [f]ile

$ tar xf [path/to/source.tar] --wildcards "[*.html]"

SYNOPSIS

tar [MODE] [OPTIONS] [ARCHIVE_FILE] [FILE_OR_DIRECTORY...]

Common usage modes and their primary options:

Create an archive:
tar -c[zjJv] -f ARCHIVE_FILE [FILE_OR_DIRECTORY...]

Extract from an archive:
tar -x[zjJv] -f ARCHIVE_FILE [FILE_OR_DIRECTORY...]

List contents of an archive:
tar -t[v] -f ARCHIVE_FILE
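
The three modes above can be tried end to end with a small round trip. This is a minimal sketch, not a recommended backup procedure: the scratch directory, the file names, and the variable names (work, listing) are made up for illustration.

```shell
# Work in a throwaway directory so nothing outside it is touched.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
printf 'hello\n' > "$work/src/a.txt"
printf 'world\n' > "$work/src/b.txt"

# Create: -c makes a new archive, -z filters it through gzip, -f names the file.
tar -czf "$work/demo.tar.gz" -C "$work/src" a.txt b.txt

# List: -t prints the member names without extracting anything.
listing=$(tar -tzf "$work/demo.tar.gz")
printf '%s\n' "$listing"

# Extract: -x unpacks the members; -C selects the destination directory.
tar -xzf "$work/demo.tar.gz" -C "$work/out"
```

Adding -v to any of the three commands prints each member as it is processed.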

PARAMETERS

-c, --create
    Creates a new archive.

-x, --extract, --get
    Extracts files from an archive.

-t, --list
    Lists the contents of an archive.

-u, --update
    Appends files to an archive only if they are newer than their copy in the archive or not already in the archive.

-r, --append
    Appends files to the end of an archive.

-f ARCHIVE, --file ARCHIVE
    Specifies the archive file to work with (e.g., archive.tar, /dev/tape).

-v, --verbose
    Lists each file as it is processed.

-z, --gzip, --gunzip, --ungzip
    Filters the archive through gzip for compression or decompression.

-j, --bzip2
    Filters the archive through bzip2 for compression or decompression.

-J, --xz
    Filters the archive through xz for compression or decompression.

-C DIRECTORY, --directory DIRECTORY
    Changes to the specified DIRECTORY before performing the operation.

--exclude PATTERN
    Skips files or directories whose names match the specified PATTERN.

--strip-components=NUMBER
    Removes the specified NUMBER of leading directory components from file names during extraction.

-p, --preserve-permissions
    Restores file permissions exactly as recorded in the archive instead of applying the current umask (the default when running as the superuser).

-k, --keep-old-files
    Does not overwrite existing files when extracting.
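
Two of the options above, --exclude and --strip-components, are easier to follow with a concrete run. The following is a sketch under assumed names: the project layout, the *.log pattern, and the variables (work, members) are hypothetical.

```shell
work=$(mktemp -d)
mkdir -p "$work/project/src" "$work/project/logs" "$work/out"
printf 'int main(void) { return 0; }\n' > "$work/project/src/main.c"
printf 'noise\n' > "$work/project/logs/debug.log"

# --exclude skips members whose names match the pattern while the archive is created.
tar -cf "$work/project.tar" --exclude='*.log' -C "$work" project

# Every member name starts with "project/"; the log file should be absent.
members=$(tar -tf "$work/project.tar")
printf '%s\n' "$members"

# --strip-components=1 drops that leading "project/" component during extraction.
tar -xf "$work/project.tar" --strip-components=1 -C "$work/out"
```

After the last command, main.c should sit at out/src/main.c rather than out/project/src/main.c.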

DESCRIPTION

The tar (tape archive) command is a command-line utility available on Linux and other Unix-like operating systems. It collects multiple files and/or directories into a single archive file, often referred to as a 'tarball'. Although the name comes from its historical use with tape drives, tar is now used mainly to create and manipulate archives on disk.

Beyond bundling files, tar can preserve file attributes such as permissions, ownership, timestamps, and symbolic links. It does not compress data itself, but it is often combined with compression utilities such as gzip, bzip2, or xz, producing files named .tar.gz, .tgz, .tar.bz2, or .tar.xz. This combination is commonly used for backups, software distribution, and transferring collections of files.

CAVEATS

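When using tar, be mindful of absolute versus relative paths. An archive whose member names are absolute (e.g., /etc/fstab) can overwrite system files if it is extracted as-is on another machine. GNU tar strips the leading / from member names by default unless -P (--absolute-names) is given, but it is generally safer to archive with relative paths, either by changing into the parent directory first or by using -C.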

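While tar records permissions and ownership, a regular user cannot restore ownership: extracted files end up owned by the extracting user, and extracting into a directory where that user lacks write permission fails with permission errors. When extracting as root, ownership is restored from the archive and may not correspond to the intended accounts on a system whose users and groups differ.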
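
One cautious habit that follows from these caveats is to archive with relative names, list an archive before unpacking it, and extract into a dedicated directory. The sketch below illustrates that habit only; the etc-copy directory, the archive name, and the variables (work, names, absolute) are invented for the example.

```shell
work=$(mktemp -d)
mkdir -p "$work/etc-copy" "$work/restore"
printf 'sample entry\n' > "$work/etc-copy/fstab"

# Archive with names relative to $work instead of absolute paths.
tar -cf "$work/config.tar" -C "$work" etc-copy/fstab

# List the archive first and check whether any member name is absolute.
names=$(tar -tf "$work/config.tar")
if printf '%s\n' "$names" | grep -q '^/'; then
    absolute=yes
else
    absolute=no
fi
echo "absolute member names present: $absolute"

# Extract into a dedicated directory rather than / or the current directory.
tar -xf "$work/config.tar" -C "$work/restore"
```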

COMMON COMPRESSION INTEGRATION

tar does not compress data itself; instead, it filters the archive through a compression program. The usual way to create or read a compressed tarball is to add -z (for gzip), -j (for bzip2), or -J (for xz) to the create, list, or extract operation. For example, tar -czvf archive.tar.gz files/ creates a gzipped tarball, and tar -xjvf archive.tar.bz2 extracts a bzip2-compressed tarball.
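
The flags can be compared side by side with a short script. This is a sketch that assumes gzip is installed; bzip2 and xz are only attempted when the corresponding program is found, and the paths and variable names (work, gz_ok, gz_members) are illustrative.

```shell
work=$(mktemp -d)
mkdir -p "$work/files"
printf 'some text to compress\n' > "$work/files/notes.txt"

# -z filters the archive through gzip while it is being created.
tar -czf "$work/archive.tar.gz" -C "$work" files

# gzip -t checks that the result is a valid gzip stream.
gz_ok=no
gzip -t "$work/archive.tar.gz" && gz_ok=yes

# The same flag is used when listing or extracting.
gz_members=$(tar -tzf "$work/archive.tar.gz")
printf '%s\n' "$gz_members"

# -j (bzip2) and -J (xz) follow the same pattern; try them only if available.
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf "$work/archive.tar.bz2" -C "$work" files
    tar -tjf "$work/archive.tar.bz2"
fi
if command -v xz >/dev/null 2>&1; then
    tar -cJf "$work/archive.tar.xz" -C "$work" files
    tar -tJf "$work/archive.tar.xz"
fi
```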

DIRECTORY STRUCTURE AND PATHS

When archiving a directory, tar by default includes the directory itself and its entire contents. For instance, tar -cvf backup.tar mydir/ creates an archive in which mydir/ is the top-level entry. If you want only the contents of mydir/ at the top level of the archive, use the -C option: tar -cvf backup.tar -C mydir . changes into mydir first and then archives the current directory (.), so member names are recorded relative to mydir.
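
The difference between the two forms shows up when the resulting archives are listed. A small sketch, using a hypothetical mydir with a single file and made-up archive and variable names:

```shell
work=$(mktemp -d)
mkdir -p "$work/mydir"
printf 'data\n' > "$work/mydir/report.txt"

# Naming the directory stores it as the top-level entry.
tar -cf "$work/with-dir.tar" -C "$work" mydir/
with_dir=$(tar -tf "$work/with-dir.tar")

# -C into the directory and archiving "." stores names relative to mydir.
tar -cf "$work/contents-only.tar" -C "$work/mydir" .
contents_only=$(tar -tf "$work/contents-only.tar")

printf 'with-dir.tar:\n%s\n\ncontents-only.tar:\n%s\n' "$with_dir" "$contents_only"
```

The first listing contains mydir/report.txt, while the second contains ./report.txt with no mydir component.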

HISTORY

The tar command, short for "tape archive", first appeared in Version 7 Unix in 1979. Its initial purpose was to provide a simple and reliable way to store multiple files on sequential-access storage devices such as magnetic tape drives for backup and restoration.

Over the decades, tar moved from primarily tape use to disk-based archiving, and features such as integration with compression tools (gzip, bzip2, xz) were added. It remains a standard archiving utility across Unix-like operating systems today.
