borg

Deduplicating, authenticated, and compressed backup archiving

TLDR

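Initialize a (local) repository, encrypted with a passphrase-protected key stored in the repository (Borg 1.1 and later require an explicit --encryption mode)

$ borg init --encryption repokey [path/to/repo_directory]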

Back up a directory into the repository, creating an archive called "Monday"

$ borg create --progress [path/to/repo_directory]::[Monday] [path/to/source_directory]

List all archives in a repository

$ borg list [path/to/repo_directory]

Extract a specific directory from the "Monday" archive in a remote repository into the current working directory, excluding all *.ext files

$ borg extract [user]@[host]:[path/to/repo_directory]::[Monday] [path/to/directory_in_archive] --exclude '[*.ext]'

Prune a repository by deleting all archives older than 7 days, listing changes

$ borg prune --keep-within [7d] --list [path/to/repo_directory]

Mount the "Monday" archive of a repository as a FUSE filesystem

$ borg mount [path/to/repo_directory]::[Monday] [path/to/mountpoint]

Display help on creating archives

$ borg create --help

borg [OPTIONS] COMMAND [ARGS...]

Common commands and examples:
borg init --encryption=repokey-blake2 /path/to/repo
borg create /path/to/repo::archive_name /source/path
borg extract /path/to/repo::archive_name
borg list /path/to/repo
borg prune /path/to/repo --keep-daily=7
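
The --keep-daily=7 rule in the prune example above can be pictured with a short Python sketch. It assumes the documented behaviour that the newest archive of each calendar day is kept, for the 7 most recent days that have archives. The function name keep_daily and the sample archive names are made up for illustration; real Borg also takes time zones into account and combines this rule with the other --keep-* options.

```python
from datetime import datetime, timedelta


def keep_daily(archives, n):
    """Return the names kept by a keep-daily-style rule.

    archives: iterable of (name, datetime) pairs (hypothetical input format).
    The newest archive of each day is kept until n days are covered.
    """
    kept = []
    seen_days = set()
    # Walk the archives from newest to oldest.
    for name, timestamp in sorted(archives, key=lambda a: a[1], reverse=True):
        day = timestamp.date()
        if day in seen_days:
            continue  # a newer archive from this day is already kept
        if len(kept) == n:
            break  # n days are covered; everything older is pruned
        seen_days.add(day)
        kept.append(name)
    return kept


# Two archives per day for ten days.
start = datetime(2024, 1, 1)
archives = []
for d in range(10):
    archives.append((f"day{d}-am", start + timedelta(days=d, hours=6)))
    archives.append((f"day{d}-pm", start + timedelta(days=d, hours=18)))

kept = keep_daily(archives, 7)
print(kept)
```

Days without any archive do not count towards the limit, which is why the rule is expressed in terms of days seen rather than a cutoff date.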

PARAMETERS

--help
    Show a help message for the command or subcommand and exit.

--version
    Show the program's version number and exit.

-v, --verbose, --info
    Log at level INFO, which prints informational messages in addition to warnings and errors. Repeating the option has no further effect; use --debug for more detail.

--debug
    Enable debug messages, providing very detailed output for troubleshooting.

--json
    Output machine-readable JSON. This is an option of individual commands such as create, info, and list rather than a global option.

--show-version
    Show or log the Borg version while running a command.

--lock-wait TIMEOUT
    Wait for the repository lock to become available for up to TIMEOUT seconds.

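--no-files-cache
    Disable the local files cache, which normally speeds up subsequent backups by detecting unchanged files. This is an option of borg create; in Borg 1.1 and later it is deprecated in favor of --files-cache=disabled.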

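--file-cache-dir DIR
    Not a documented option in Borg 1.x. To keep the cache (including the files cache) in a different directory, set the BORG_CACHE_DIR environment variable instead.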

--rsh COMMAND
    Specify the remote shell command to use for SSH connections (e.g., 'ssh -i ~/.ssh/id_rsa').

--bypass-lock
    Bypass repository lock. Use with extreme caution, as it can lead to repository corruption.

DESCRIPTION

borg (BorgBackup) is a deduplicating backup program for local and remote backups. It splits files into variable-size chunks and stores each unique chunk only once, so backing up several versions of a file, or many similar files, adds only the data that has not been stored before.

Security is a core feature: authenticated encryption protects both the confidentiality and the integrity of your data. Borg 1.x encrypts with AES-256 in counter mode and authenticates with HMAC-SHA256 or keyed BLAKE2b; Borg 2 moves to AEAD modes (AES-OCB and ChaCha20-Poly1305). Compression is built in, with lz4, zstd, zlib, and lzma available to reduce storage requirements and network transfer sizes.

Borg supports remote backups over SSH, and a remote repository is addressed in the same way as a local one. Archives can be mounted through FUSE and browsed as regular directories, which makes it easy to inspect a backup and copy out individual files. Borg handles large datasets efficiently and can verify repository consistency with borg check.

CAVEATS

Resource Usage: Borg can be CPU-intensive during backup and restore operations due to encryption, decryption, compression, and decompression. It also requires sufficient RAM for its chunking and caching mechanisms, especially for large datasets.

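Repository Integrity: Borg commits repository changes transactionally, so an interrupted operation (e.g., network failure, power loss) is normally rolled back on the next access rather than corrupting the repository. Faulty hardware or storage can still damage data, so run borg check periodically; adding --verify-data also verifies the stored data cryptographically, at the cost of a much longer run.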

Key Management: Loss of the encryption passphrase or key file means permanent data loss. Securely managing your encryption keys is paramount.

Performance: While generally fast, performance can vary greatly depending on CPU speed, I/O throughput, network latency (for remote repos), and the chosen encryption/compression algorithms.

REPOSITORY CONCEPT

A borg repository is the central storage location for all your backup archives. It is an ordinary directory, either on a local filesystem or on a remote host reached over SSH, that holds the deduplicated, compressed, and (unless encryption is disabled) encrypted data. All operations (creating, listing, extracting, checking, pruning) are performed against a specified repository.

DEDUPLICATION MECHANISM

Borg splits files into variable-size chunks with a rolling hash (Buzhash), which places chunk boundaries according to the content itself, so an insertion near the start of a file does not shift every later boundary. Each chunk is identified by a cryptographic hash of its content (keyed HMAC-SHA256 or BLAKE2b in the encrypted modes), and a chunk is stored only if its ID is not already in the repository; otherwise the archive just references the existing chunk. This content-addressed storage is what keeps repeated backups of similar or identical data small.
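
The idea can be illustrated with a minimal Python sketch of content-defined chunking. It is not Borg's implementation: it uses a simple polynomial rolling hash instead of Buzhash, plain SHA-256 as the chunk ID, and toy chunk sizes, and the names chunks and backup are hypothetical.

```python
import hashlib
import random

WINDOW = 16            # bytes covered by the rolling hash (toy value)
MASK = (1 << 6) - 1    # cut where the low 6 hash bits are zero
MIN_SIZE = 16          # never cut before the window is full
MAX_SIZE = 256         # force a cut if no boundary shows up
BASE = 257
MOD = (1 << 31) - 1
BASE_POW = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def chunks(data: bytes):
    """Yield variable-size chunks; hash-triggered cuts depend on the last WINDOW bytes."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        size = i - start + 1
        if size > WINDOW:
            # Slide the window: remove the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * BASE_POW) % MOD
        h = (h * BASE + byte) % MOD
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


def backup(store: dict, data: bytes) -> int:
    """Add data to the chunk store and return how many chunks were new."""
    new = 0
    for chunk in chunks(data):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in store:
            store[chunk_id] = chunk
            new += 1
    return new


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(8192))
edited = original[:4000] + b"a few inserted bytes" + original[4000:]

store = {}
first = backup(store, original)
repeat = backup(store, original)
after_edit = backup(store, edited)
print(f"first backup: {first} new chunks")
print(f"identical backup: {repeat} new chunks")
print(f"backup after a small edit: {after_edit} new chunks")
```

Because boundaries follow the content, the chunker falls back into step with the original data shortly after the inserted bytes, so only the chunks around the edit are new.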

SECURITY FEATURES

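Borg provides authenticated encryption: data is encrypted (confidentiality), and every stored object carries a MAC that is verified on read, so unauthorized modification is detected (integrity). The encryption mode is chosen when the repository is initialized. With `repokey`, a randomly generated key is stored inside the repository, encrypted with a key derived from your passphrase. With `keyfile`, the same kind of passphrase-protected key is stored in a local file outside the repository (by default under ~/.config/borg/keys). The `repokey-blake2` and `keyfile-blake2` variants authenticate with keyed BLAKE2b instead of HMAC-SHA256, which is often faster on CPUs without SHA hardware acceleration; they do not change how the key is derived from the passphrase.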
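
The integrity half of authenticated encryption can be sketched in Python with the standard library alone. This is an illustration under stated assumptions, not Borg's on-disk format: the functions seal and open_sealed are hypothetical, the payload is left unencrypted because the standard library has no AES, and the tag is simply prepended to the data.

```python
import hashlib
import hmac
import os

TAG_SIZE = 32  # both MACs below are configured to produce 32-byte tags


def mac_hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def mac_blake2b(key: bytes, data: bytes) -> bytes:
    # BLAKE2b has a built-in keyed mode (keys of up to 64 bytes).
    return hashlib.blake2b(data, key=key, digest_size=TAG_SIZE).digest()


def seal(key: bytes, data: bytes, mac=mac_hmac_sha256) -> bytes:
    """Prepend an authentication tag to the data."""
    return mac(key, data) + data


def open_sealed(key: bytes, blob: bytes, mac=mac_hmac_sha256) -> bytes:
    """Verify the tag and return the data, or raise ValueError if it was modified."""
    tag, data = blob[:TAG_SIZE], blob[TAG_SIZE:]
    if not hmac.compare_digest(tag, mac(key, data)):
        raise ValueError("integrity check failed")
    return data


key = os.urandom(32)
blob = seal(key, b"chunk payload")
tampered = blob[:-1] + bytes([blob[-1] ^ 1])  # flip one bit of the payload

print(open_sealed(key, blob))
try:
    open_sealed(key, tampered)
except ValueError as err:
    print("rejected:", err)
```

Passing mac=mac_blake2b to both functions swaps the MAC, which mirrors the difference between the plain and the -blake2 modes at the level of this sketch.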

REMOTE OPERATION

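Borg supports backing up to and restoring from remote repositories over SSH. Chunking, deduplication lookups, compression, and encryption all happen on the client, so the remote host only ever receives encrypted data and only the chunks it does not already hold. Borg must also be installed on the remote host: the client starts a borg serve process there over SSH, and that process stores and retrieves repository objects on the client's behalf.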
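
The division of labour between client and remote side can be sketched in Python. This is a toy model, not Borg's protocol: FakeRepository and Client are hypothetical classes, the chunks are fixed-size rather than content-defined, zlib stands in for the configured compressor, and the encryption step is only marked by a comment.

```python
import hashlib
import zlib


class FakeRepository:
    """Stands in for the remote side: a key-value store of opaque blobs."""

    def __init__(self):
        self.objects = {}
        self.bytes_received = 0

    def put(self, chunk_id: bytes, blob: bytes) -> None:
        self.objects[chunk_id] = blob
        self.bytes_received += len(blob)

    def get(self, chunk_id: bytes) -> bytes:
        return self.objects[chunk_id]


class Client:
    """Does the chunking, ID computation, and compression locally."""

    def __init__(self, repo: FakeRepository, chunk_size: int = 1024):
        self.repo = repo
        self.chunk_size = chunk_size
        self.known_ids = set()  # local index of chunks already in the repository

    def backup(self, data: bytes) -> list:
        archive = []  # an archive is just an ordered list of chunk IDs
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).digest()
            if chunk_id not in self.known_ids:
                blob = zlib.compress(chunk)  # a real client would also encrypt here
                self.repo.put(chunk_id, blob)
                self.known_ids.add(chunk_id)
            archive.append(chunk_id)
        return archive

    def restore(self, archive: list) -> bytes:
        return b"".join(zlib.decompress(self.repo.get(cid)) for cid in archive)


data = b"".join(f"line {i}\n".encode() for i in range(2000))
extra = b"one more line\n"

repo = FakeRepository()
client = Client(repo)
monday = client.backup(data)
sent_first = repo.bytes_received
tuesday = client.backup(data + extra)
sent_second = repo.bytes_received - sent_first

print(f"first backup sent {sent_first} bytes")
print(f"second backup sent {sent_second} bytes")
```

The repository never inspects the blobs it stores, which is why the remote host does not need the encryption key.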

HISTORY

BorgBackup originated in 2015 as a fork of the Attic backup program, after disagreements within the Attic community about the project's direction and pace of development. The fork aimed for more active development, new features, and performance improvements. It has since become a widely used deduplicating backup tool, particularly among Linux users.