LinuxCommandLibrary

pg_amcheck

Check PostgreSQL database for corruption

TLDR

Check a specific database

$ pg_amcheck [dbname]
copy

Check all databases
$ pg_amcheck [[-a|--all]]
copy

Check databases matching the specified pattern
$ pg_amcheck [[-d|--database]] [pattern]
copy

Check specific tables within a database
$ pg_amcheck [[-t|--table]] [pattern] [dbname]
copy

Check specific schemas within a database
$ pg_amcheck [[-s|--schema]] [pattern] [dbname]
copy

Display help
$ pg_amcheck [[-?|--help]]
copy

SYNOPSIS

pg_amcheck [OPTION]... [DBNAME]

PARAMETERS

-d DBNAME, --dbname=DBNAME
    Specifies the name of the database to check. If not specified, it defaults to the user's name.

-t TABLE, --table=TABLE
    Checks a specific table. This option can be specified multiple times to check several tables. If this is used, -i cannot be used.

-i INDEX, --index=INDEX
    Checks a specific index. This option can be specified multiple times to check several indexes. If this is used, -t cannot be used.

-a, --all
    Checks all relations in the database. Cannot be used with -t or -i.

-c CHECK, --check=CHECK
    Specifies which amcheck function to use for validation. Common options include bt_index_check (default) and bt_amcheck_next_block, with the latter being more thorough.

-j NUM_JOBS, --jobs=NUM_JOBS
    Uses this many concurrent jobs to check relations, speeding up the process on multi-core systems. The default is 1.

-v, --verbose
    Prints verbose messages, including details for successfully checked relations.

-q, --quiet
    Suppresses messages for successfully checked relations, only reporting issues or errors.

-h HOSTNAME, --host=HOSTNAME
    Specifies the host name of the machine on which the database server is running. Defaults to a local Unix domain socket.

-p PORT, --port=PORT
    Specifies the TCP port or local Unix domain socket file extension on which the database server is listening for connections.

-U USERNAME, --username=USERNAME
    Specifies the user name to connect as.

-W, --password
    Forces pg_amcheck to prompt for a password before connecting to a database.

--no-password
    Prevents pg_amcheck from prompting for a password. If the server requires a password and it's not available, the connection attempt will fail.

--no-sync
    Disables syncing of output files to disk. This is primarily for development and testing; not recommended for production.

-V, --version
    Prints the pg_amcheck version and exits.

-?, --help
    Shows help about pg_amcheck command line arguments, then exits.

DESCRIPTION

pg_amcheck is a command-line utility for PostgreSQL that allows users to verify the physical and logical consistency of relations (tables and indexes) within a database. It utilizes the amcheck extension's functions, specifically bt_amcheck_next_block, to traverse and validate B-tree indexes, ensuring that their structure adheres to the expected B-tree invariants. This includes checking sibling pointers, parent pointers, page headers, item pointers, and the ordering of keys.

The utility connects to a running PostgreSQL instance and is invaluable for database administrators to proactively identify data corruption, which could arise from hardware failures, software bugs, or unexpected system crashes, before it leads to data loss or incorrect query results. By running pg_amcheck regularly, DBAs can ensure the health of their indexes, improving reliability and preventing potential downtime associated with corrupted database objects. It can target specific tables or indexes, or check all relations in a database.

CAVEATS

The pg_amcheck utility requires the amcheck extension to be installed and enabled in the target PostgreSQL database (e.g., CREATE EXTENSION amcheck;).

While highly effective for B-tree indexes, it primarily checks B-tree structures and may not cover all types of corruption or other access methods (e.g., hash, GIN, GiST, BRIN indexes).

Running pg_amcheck, especially with the more thorough bt_amcheck_next_block function or against very large databases, can be I/O intensive and might temporarily impact database performance. It only detects corruption; it does not fix it. Repair usually involves rebuilding the affected index with REINDEX or restoring from a backup.

INSTALLATION

To use pg_amcheck, the amcheck extension must first be installed in the target PostgreSQL database. This is typically done by connecting to the database via psql and executing CREATE EXTENSION amcheck;. The extension itself is usually part of the postgresql-contrib package provided by most Linux distributions.

EXIT STATUS

pg_amcheck returns 0 upon successful completion with no corruption found. It returns 1 if corruption is detected in any of the checked relations. An exit status of 2 indicates a server connection issue or another fatal error. This exit status is crucial for scripting and automated monitoring.

HISTORY

The underlying amcheck extension, which provides the core integrity checking functions, was introduced in PostgreSQL 10. The pg_amcheck utility itself was introduced in PostgreSQL 13. Its creation provided a convenient command-line interface for database administrators to easily invoke and script these integrity checks, making proactive corruption detection more accessible than calling amcheck functions directly via SQL queries.

SEE ALSO

reindexdb(1), pg_dump(1), psql(1), VACUUM(7) (SQL command), ANALYZE(7) (SQL command)

Copied to clipboard