pg_dump

Back up a PostgreSQL database

TLDR

Dump database into an SQL-script file

$ pg_dump [db_name] > [output_file.sql]

Same as above, customize username

$ pg_dump [[-U|--username]] [username] [db_name] > [output_file.sql]

Same as above, customize host and port

$ pg_dump [[-h|--host]] [host] [[-p|--port]] [port] [db_name] > [output_file.sql]

Dump a database into a custom-format archive file

$ pg_dump [[-F|--format]] [[c|custom]] [db_name] > [output_file.dump]

Dump only database data into an SQL-script file

$ pg_dump [[-a|--data-only]] [db_name] > [path/to/output_file.sql]

Dump only schema (data definitions) into an SQL-script file

$ pg_dump [[-s|--schema-only]] [db_name] > [path/to/output_file.sql]

pg_dump [connection-options] [option...] [dbname]

Common connection-options:
-h hostname (Server host)
-p port (Server port)
-U username (Connect as user)
-W (Force password prompt)

PARAMETERS

-f FILENAME, --file=FILENAME
    Specifies the output file or directory name. If omitted, the dump is written to standard output.

-F FORMAT, --format=FORMAT
    Selects the output format: p (plain SQL script), c (custom archive), t (tar archive), or d (directory archive).

-s, --schema-only
    Dump only the database schema (DDL statements), without any data.

-a, --data-only
    Dump only the data, without the schema (DDL statements).

-Z LEVEL, --compress=LEVEL
    Compresses the output file with the specified compression LEVEL (0-9). This option is only applicable for custom and tar archive formats.

-v, --verbose
    Enables verbose mode, displaying detailed progress messages and object information during the dump process.

-j NUM, --jobs=NUM
    Runs NUM jobs simultaneously during the dump. This option significantly speeds up the dump process and is only supported with the directory output format (-F d).

--clean
    Includes SQL commands to clean (DROP) objects before recreating them. This is useful for ensuring a clean slate during restoration.

--create
    Adds a CREATE DATABASE command to the dump, which is useful when restoring the database to a new instance.

--inserts
    Dumps data as INSERT commands (instead of COPY FROM). This is only supported for the plain SQL format and can be much slower.

--if-exists
    Uses IF EXISTS clauses with DROP commands, preventing errors if an object doesn't exist during restoration with --clean.

-x, --no-owner
    Prevents outputting commands to set object ownership, making the dump more portable across different PostgreSQL users.

DESCRIPTION

pg_dump is a utility for backing up a PostgreSQL database. It creates a database dump file, which can be an SQL script or an archive file in a custom format. The SQL script contains SQL commands to recreate the database objects (tables, indexes, functions, etc.) and data. The archive formats (custom, tar, directory) allow for more flexible restoration using pg_restore, including reordering of items, and parallel restoration.

pg_dump produces a consistent snapshot of the database, even if the database is being actively used during the dump process. It does this by starting a transaction and dumping the data as it exists at the moment the transaction began. This makes it an essential tool for logical backups, migrations between different PostgreSQL versions, and creating development/testing environments.

CAVEATS

pg_dump performs a logical backup, which recreates database contents via SQL commands. It does not capture the physical state of data files or WAL segments. For point-in-time recovery or physical backups of very large databases, pg_basebackup or file system level backups are typically more appropriate. While pg_dump ensures consistency via a snapshot, dumping extremely large databases can be time-consuming and resource-intensive on the server.

OUTPUT FORMATS

pg_dump supports several output formats, each suited for different use cases:
Plain (p): Produces a simple SQL script file containing commands to recreate the database. It's human-readable and can be directly executed by psql.
Custom (c): A proprietary, flexible archive format that can be compressed and selectively restored using pg_restore. This is the most recommended format for general use.
Tar (t): Generates a standard Unix tar archive, with each table and blob as a separate file. While it can be manipulated with tar, it offers less flexibility than the custom format for restoration.
Directory (d): Creates a directory containing a file for each table and blob. This format is crucial for enabling parallel dumps using the -j (jobs) option.

DATA CONSISTENCY

pg_dump ensures data consistency by initiating a single transaction and performing the entire dump within that transaction. This means the dump reflects the state of the database exactly as it was at the moment the pg_dump command began, even if concurrent updates or other transactions are occurring. This snapshot isolation is a key feature, guaranteeing a logically consistent backup or migration source.

HISTORY

pg_dump has been a fundamental utility in the PostgreSQL ecosystem since its early days, evolving alongside the database server itself. Initially, it primarily produced plain SQL scripts for simple migrations. Over time, more sophisticated archive formats (custom, tar, directory) were introduced to provide greater flexibility for selective restoration, support for parallel dumps/restores, and better handling of large datasets, making it an indispensable tool for database administration and migration tasks.