tar
Archive multiple files into a single file
TLDR
[c]reate an archive and write it to a [f]ile
[c]reate a g[z]ipped archive and write it to a [f]ile
[c]reate a g[z]ipped (compressed) archive from a directory using relative paths
E[x]tract a (compressed) archive [f]ile into the current directory [v]erbosely
E[x]tract a (compressed) archive [f]ile into the target directory
[c]reate a compressed archive and write it to a [f]ile, using the file extension to [a]utomatically determine the compression program
Lis[t] the contents of a tar [f]ile [v]erbosely
E[x]tract files matching a pattern from an archive [f]ile
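The one-line summaries above can be tried out with a short sketch. It assumes GNU tar and gzip are installed; every file and directory name (project, plain.tar, target, and so on) is a hypothetical demo name, and the work happens in a temporary directory.

```shell
# Assumption: GNU tar and gzip. All names below are made up for the demo.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p project/docs
echo "hello" > project/a.txt
echo "<p>hi</p>" > project/docs/index.html

# [c]reate an archive and write it to a [f]ile
tar -cf plain.tar project/a.txt project/docs/index.html

# [c]reate a g[z]ipped archive and write it to a [f]ile
tar -czf files.tar.gz project/a.txt project/docs/index.html

# [c]reate a g[z]ipped archive from a directory using relative paths
tar -czf relative.tar.gz -C project .

# E[x]tract an archive [f]ile into the current directory [v]erbosely
mkdir here && (cd here && tar -xvf ../files.tar.gz)

# E[x]tract an archive [f]ile into the target directory
mkdir target
tar -xf relative.tar.gz -C target

# [c]reate a compressed archive; the suffix picks the compressor ([a]uto)
tar -caf auto.tar.gz project

# Lis[t] the contents of a tar [f]ile [v]erbosely
tar -tvf plain.tar

# E[x]tract only members matching a pattern (--wildcards is a GNU tar option)
mkdir html
tar -xf plain.tar -C html --wildcards 'project/docs/*.html'
```

The pattern in the last command spells out the directory so that it matches the same members whether or not wildcards are allowed to match "/".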
SYNOPSIS
tar MODE [OPTIONS] [-f ARCHIVE_FILE] [FILE_OR_DIRECTORY...]
Common usage modes and their primary options:
Create an archive:
tar -c[zjJv] -f ARCHIVE_FILE [FILE_OR_DIRECTORY...]
Extract from an archive:
tar -x[zjJv] -f ARCHIVE_FILE [FILE_OR_DIRECTORY...]
List contents of an archive:
tar -t[v] -f ARCHIVE_FILE
PARAMETERS
-c, --create
Creates a new archive.
-x, --extract, --get
Extracts files from an archive.
-t, --list
Lists the contents of an archive.
-u, --update
Appends files to an archive only if they are newer than their copy in the archive or not already in the archive.
-r, --append
Appends files to the end of an archive.
-f ARCHIVE, --file ARCHIVE
Specifies the archive file to work with (e.g., archive.tar, /dev/tape).
-v, --verbose
Lists each file as it is processed. With -t, prints a detailed listing (permissions, owner, size, and modification time).
-z, --gzip, --gunzip, --ungzip
Filters the archive through gzip for compression or decompression.
-j, --bzip2
Filters the archive through bzip2 for compression or decompression.
-J, --xz
Filters the archive through xz for compression or decompression.
-C DIRECTORY, --directory DIRECTORY
Changes to the specified DIRECTORY before performing the operation.
--exclude PATTERN
Excludes files or directories that match the specified PATTERN from the archive.
--strip-components=NUMBER
Removes the specified NUMBER of leading directory components from file names during extraction.
-p, --preserve-permissions
Restores the permissions recorded in the archive during extraction instead of applying the current umask. This is the default when tar runs as root.
-k, --keep-old-files
Does not overwrite existing files when extracting.
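A few of the options above are easier to follow in combination. The following is a minimal sketch, assuming GNU tar; the names src, notes.txt, and out are hypothetical, and everything happens in a temporary directory.

```shell
# Assumption: GNU tar. File and directory names are made up for the demo.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p src/pkg/lib out
echo "code" > src/pkg/lib/main.c
echo "junk" > src/pkg/lib/main.o
echo "notes" > notes.txt

# --exclude: leave object files out of the archive
tar -cf src.tar --exclude='*.o' src

# -r: append another file to the existing, uncompressed archive
tar -rf src.tar notes.txt

# -t: confirm what was stored
tar -tf src.tar

# --strip-components: drop the leading "src/pkg/" while extracting into out/
tar -xf src.tar -C out --strip-components=2 src/pkg/lib/main.c

# -k: do not replace a file that already exists on disk
echo "local edit" > out/lib/main.c
tar -xkf src.tar -C out --strip-components=2 src/pkg/lib/main.c \
  || echo "existing file was kept"
cat out/lib/main.c
```

Depending on the tar version, -k either skips the existing file silently or reports an error for it, which is why the last extraction tolerates a non-zero exit status.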
DESCRIPTION
The tar (tape archive) command is a command-line utility available on Linux and other Unix-like operating systems. It collects multiple files and directories into a single archive file, often called a 'tarball'. Although the name comes from its original use with tape drives, tar is now used mostly to create and manipulate archives on disk.
Beyond bundling files, tar records file metadata such as permissions, ownership, and timestamps, and it stores symbolic links as links rather than copies of their targets. It does not compress data itself, but it is usually combined with a compression utility such as gzip, bzip2, or xz, producing files named .tar.gz, .tgz, .tar.bz2, or .tar.xz. This makes tar a common choice for backups, software distribution, and transferring collections of files.
CAVEATS
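When using tar, be mindful of absolute versus relative paths. An archive whose member names are absolute (e.g., /etc/fstab) can overwrite system files if it is extracted with those names preserved. GNU tar strips the leading / from member names by default and keeps it only with -P (--absolute-names), but other implementations and older archives may behave differently. It is generally safer to change into the parent directory (or use -C) and archive relative paths, and to list an archive with -t before extracting it.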
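tar records permissions and ownership, but what is restored depends on who extracts. When run by a regular user, tar typically assigns extracted files to that user rather than to the recorded owner and applies the user's umask unless -p is given; restoring the original ownership generally requires root. Extracting into a directory where you lack write permission fails with permission errors, and when an archive moves between systems the recorded UIDs/GIDs may map to different accounts.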
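To see how paths end up inside an archive, it helps to list the member names before extracting. The sketch below assumes GNU tar (bsdtar behaves similarly); settings.conf is a hypothetical stand-in for a system file, created in a temporary directory.

```shell
# Assumption: GNU tar. settings.conf is a made-up stand-in for a system file.
set -e
work=$(mktemp -d)
echo "data" > "$work/settings.conf"

# Archive by absolute path; tar warns on stderr that it removes the leading "/".
tar -cf "$work/abs.tar" "$work/settings.conf"

# Inspect the stored member names before extracting anything.
names=$(tar -tf "$work/abs.tar")
echo "$names"

# Safer habit: change directory with -C so only a relative name is stored.
tar -cf "$work/rel.tar" -C "$work" settings.conf
rel_names=$(tar -tf "$work/rel.tar")
echo "$rel_names"
```

The first listing shows the full directory chain without the leading slash, so extraction recreates that chain under the current directory. The second archive holds only the bare file name.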
COMMON COMPRESSION INTEGRATION
tar does not compress files by itself, but it can run a compression program as a filter. The most common way to create or extract a compressed tarball is to add -z (gzip), -j (bzip2), or -J (xz) to the create or extract operation. For example, tar -czvf archive.tar.gz files/ creates a gzipped tarball, and tar -xjvf archive.tar.bz2 extracts a bzip2-compressed tarball.
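Because compression is only a filter, the -z flag and an explicit pipe through gzip should produce equivalent results. The following sketch illustrates this under the assumption that tar and gzip are installed; the files directory and archive names are hypothetical.

```shell
# Assumption: tar and gzip are installed. Names are made up for the demo.
set -e
work=$(mktemp -d)
cd "$work"
mkdir files
echo "one" > files/one.txt
echo "two" > files/two.txt

# Built-in filter: tar runs gzip for you.
tar -czf builtin.tar.gz files/

# Equivalent pipeline: tar writes the archive to stdout (-f -), gzip compresses it.
tar -cf - files/ | gzip > piped.tar.gz

# Both files are valid gzip streams...
gzip -t builtin.tar.gz
gzip -t piped.tar.gz

# ...and both contain the same member names.
tar -tzf builtin.tar.gz > builtin.list
gzip -dc piped.tar.gz | tar -tf - > piped.list
cmp builtin.list piped.list
```

The pipe form is useful when the desired compressor has no dedicated tar flag, since any program that reads stdin and writes stdout can take gzip's place.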
DIRECTORY STRUCTURE AND PATHS
When archiving a directory, tar by default includes the directory itself and its entire contents. For instance, tar -cvf backup.tar mydir/ will create an archive where mydir/ is the top-level entry. If you only want the contents of mydir/ to be at the top level of the archive, you should use the -C option: tar -cvf backup.tar -C mydir . (archiving the current directory . after changing into mydir).
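The difference between the two forms shows up when the archives are listed. A minimal sketch, with mydir and the archive names as hypothetical examples in a temporary directory:

```shell
# Names are made up for the demo; everything lives in a temporary directory.
set -e
work=$(mktemp -d)
cd "$work"
mkdir mydir
echo "x" > mydir/file.txt

# The directory itself becomes the top-level entry.
tar -cf with_dir.tar mydir/
tar -tf with_dir.tar

# Only the contents: change into mydir first, then archive ".".
tar -cf contents.tar -C mydir .
tar -tf contents.tar
```

In the first listing every member name starts with mydir/. In the second the names are relative to the directory's interior (./file.txt), so extraction places the files directly in the destination.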
HISTORY
The tar command, short for "tape archive", originated in the early days of Unix, specifically around Version 7 Unix in 1979. Its initial purpose was to provide a simple and reliable way to store multiple files on sequential access storage devices like magnetic tape drives for backup and restoration.
Over the following decades, tar moved from tape to disk-based archiving and gained features such as integration with compression tools (gzip, bzip2, xz). It remains a standard archiving utility across Unix-like operating systems.