LinuxCommandLibrary

pg_combinebackup

Combine a PostgreSQL full backup with its incremental backups

TLDR

Combine a full and incremental backup into one synthetic full backup

$ pg_combinebackup [path/to/full_backup] [path/to/incremental_backup] [[-o|--output]] [path/to/output_directory]

Perform a dry run to show what would be done, without creating files
$ pg_combinebackup [[-n|--dry-run]] [path/to/full_backup] [path/to/incremental_backup] [[-o|--output]] [path/to/output_directory]

Use hard links instead of copying files (PostgreSQL 18 and later; faster, but input and output must be on the same filesystem)
$ pg_combinebackup [[-k|--link]] [path/to/full_backup] [path/to/incremental_backup] [[-o|--output]] [path/to/output_directory]

Use file cloning (reflinks) for efficient copy if supported
$ pg_combinebackup --clone [path/to/full_backup] [path/to/incremental_backup] [[-o|--output]] [path/to/output_directory]

Use the copy_file_range system call for efficient copying
$ pg_combinebackup --copy-file-range [path/to/full_backup] [path/to/incremental_backup] [[-o|--output]] [path/to/output_directory]

Relocate a tablespace during reconstruction
$ pg_combinebackup [path/to/backup1 path/to/backup2 ...] [[-T|--tablespace-mapping]] /[path/to/old_tablespace]=/[path/to/new_tablespace] [[-o|--output]] [path/to/output_directory]

Disable fsync for faster but unsafe writes (testing only)
$ pg_combinebackup [[-N|--no-sync]] [path/to/backup1 path/to/backup2 ...] [[-o|--output]] [path/to/output_directory]

Show detailed debug output
$ pg_combinebackup [[-d|--debug]] [path/to/backup1 path/to/backup2 ...] [[-o|--output]] [path/to/output_directory]

SYNOPSIS

pg_combinebackup [option...] -o <new_base_backup_directory> <old_base_backup_directory> <incremental_backup_directory...>

PARAMETERS

-d, --debug
    Enables debug output for detailed information about the operation.

-o new_base_backup_directory, --output=new_base_backup_directory
    Specifies the directory to which the synthetic full backup is written. This option is required.

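-N, --no-sync
    By default, pg_combinebackup waits for all files to be written safely to disk. This option makes it return without waiting, which is faster, but an operating system crash shortly afterwards can leave the output backup corrupt. Useful for testing; not recommended for production.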

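-T olddir=newdir, --tablespace-mapping=olddir=newdir
    Relocates the tablespace in directory olddir to newdir in the reconstructed backup. Both must be absolute paths. Can be given multiple times for multiple tablespaces.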

-n, --dry-run
    Works out what would be done without creating the output directory or any output files. Particularly useful in combination with --debug.

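-k, --link
    Uses hard links instead of copying files (PostgreSQL 18 and later). Input and output directories must be on the same file system, and later changes to the output directory can affect the input backups.

--clone
    Uses file cloning (reflinks) instead of copying, where the file system supports it.

--copy-file-range
    Uses the copy_file_range system call for copying, where the platform supports it.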

-V, --version
    Prints the pg_combinebackup version and exits.

-?, --help
    Shows help about pg_combinebackup command line arguments and exits.

old_base_backup_directory
    The path to the initial, full base backup directory that serves as the starting point for the combination.

incremental_backup_directory...
    One or more paths to incremental backup directories, which must be specified in chronological order. These contain the changes to be applied.

new_base_backup_directory
    The directory where the synthetic full backup will be written, given with -o/--output rather than as a positional argument. It must either not exist or be empty.
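
The ordering rule for the input backups can be sketched with a small shell helper. This is only an illustration under an assumed layout: every backup is a subdirectory of one parent directory, and the directory names sort chronologically (for example 2024-01-01_full, 2024-01-02_incr). The function name build_combine_command and that layout are hypothetical, not part of pg_combinebackup. The helper prints the command instead of running it.

```shell
# Hypothetical layout (an assumption, not something pg_combinebackup requires):
#   <backup_root>/2024-01-01_full, <backup_root>/2024-01-02_incr, ...
# Directory names must sort in chronological order.
build_combine_command() {
    backup_root=$1
    output_dir=$2
    # Start the argument list with the program and the required -o option.
    set -- pg_combinebackup -o "$output_dir"
    # Glob results are sorted, so the oldest (full) backup comes first.
    for dir in "$backup_root"/*/; do
        [ -d "$dir" ] || continue        # skip the literal pattern if nothing matched
        set -- "$@" "${dir%/}"           # append the path without the trailing slash
    done
    # Print the assembled command line rather than executing it.
    printf '%s\n' "$*"
}
```

Calling build_combine_command /backups/node1 /restore/node1 prints one pg_combinebackup command line with the backups listed oldest first, which can be reviewed before running it.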

DESCRIPTION

pg_combinebackup reconstructs a synthetic full backup from a full base backup and the chain of incremental backups that depend on it. The result can be restored like any ordinary base backup, so a restore no longer has to work through the whole chain, which helps the Recovery Time Objective (RTO). The tool operates on physical, file-level backups, not logical dumps. It does not rewrite the input backups; it creates a new backup at the specified destination. It is the companion to PostgreSQL's native incremental backup support and keeps long backup chains manageable.

CAVEATS

Requires a full base backup as the initial input.
All intermediate incremental backups in the chain must be provided in chronological order to successfully create the new base backup.
The target directory (new_base_backup_directory) must either not exist or be empty; pg_combinebackup will not write into a non-empty directory.
This utility is designed for physical file-level backups of a PostgreSQL data directory, not for logical backups generated by pg_dump.
The operation can be I/O intensive, especially with large backups or many incremental steps.
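
A pre-flight check on the target directory could look like the sketch below. It assumes a conservative rule: the target is acceptable only if it does not exist yet or is an empty directory. The function name output_dir_is_usable is hypothetical.

```shell
# Returns 0 when the target is usable under the assumed rule
# (missing, or an existing empty directory), 1 otherwise.
output_dir_is_usable() {
    target=$1
    # A path that does not exist yet is fine.
    [ -e "$target" ] || return 0
    # An existing path that is not a directory is never usable.
    [ -d "$target" ] || return 1
    # An existing directory is usable only if it has no entries, dotfiles included.
    [ -z "$(ls -A "$target")" ]
}
```

It can guard the real command, for example: output_dir_is_usable "$out" && pg_combinebackup -o "$out" "$full" "$incr".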

BACKUP STRATEGY INTEGRATION

pg_combinebackup is a key component in an advanced PostgreSQL backup strategy involving incremental backups. Instead of constantly performing full base backups, which can be resource-intensive, an organization can take a full base backup once, then periodically take incremental backups. Over time, the chain of incremental backups can become long, increasing restore times. pg_combinebackup allows for 'rolling up' these increments into a fresh base backup, shortening the backup chain and ensuring faster restores without needing to take a brand new full backup from a running server.
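
One roll-up cycle can be sketched as three commands. The helper below only prints them so the plan can be reviewed; the function name print_rollup_plan and all paths are hypothetical placeholders. It assumes backups are taken with pg_basebackup, that the server has summarize_wal enabled (needed for incremental backups), and that connection options are supplied separately.

```shell
# Prints the commands for one roll-up cycle without running them.
# $1 = full backup dir, $2 = incremental backup dir, $3 = output dir
print_rollup_plan() {
    full=$1
    incr=$2
    out=$3
    # 1. Take the full base backup.
    echo "pg_basebackup -D $full"
    # 2. Take an incremental backup relative to the full backup's manifest.
    echo "pg_basebackup --incremental=$full/backup_manifest -D $incr"
    # 3. Combine both into a synthetic full backup, oldest input first.
    echo "pg_combinebackup -o $out $full $incr"
}
```

With more than one incremental backup, each later pg_basebackup run would reference the manifest of the backup before it, and every directory in the chain would be passed to pg_combinebackup in order.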

HISTORY

pg_combinebackup was introduced in PostgreSQL 17, together with incremental backup support in pg_basebackup. It addressed a long-standing need for native incremental backup management in PostgreSQL. Before that, users relied on file-system level tools (such as rsync or block-level snapshots) or third-party backup solutions to get incremental backups and consolidation. The command extends PostgreSQL's built-in backup and recovery capabilities and makes it easier to manage backup chains and reduce recovery times.

SEE ALSO

pg_basebackup(1), pg_verifybackup(1), pg_waldump(1)
