pgbench
Benchmark PostgreSQL database performance
TLDR
Initialize a database with a scale factor of 50 times the default size
Benchmark a database with 10 clients, 2 worker threads, and 10,000 transactions per client
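As a sketch, the two tasks above map onto the following command lines. `mydb` is a placeholder database name (an assumption, not taken from this page), and the commands are stored in variables and printed rather than executed, so the snippet can be run without a server.

```shell
# "mydb" is a placeholder database name (an assumption, not from this page).
DB="mydb"

# Initialize the pgbench tables at scale factor 50.
INIT_CMD="pgbench --initialize --scale=50 $DB"

# Benchmark with 10 clients, 2 worker threads, 10,000 transactions per client.
RUN_CMD="pgbench --client=10 --jobs=2 --transactions=10000 $DB"

# Print the commands; to execute them: eval "$INIT_CMD" && eval "$RUN_CMD"
printf '%s\n' "$INIT_CMD" "$RUN_CMD"
```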
SYNOPSIS
pgbench [option...] [dbname]
PARAMETERS
-i, --initialize
Initializes the pgbench tables (pgbench_accounts, pgbench_branches, pgbench_history, pgbench_tellers) in the specified database. This must be run before executing any benchmark.
-s factor, --scale=factor
Multiplies the number of rows generated during initialization (`-i`). A scale factor of 1 creates 100,000 accounts. Larger factors create larger datasets.
-c clients, --client=clients
Sets the number of concurrent benchmark clients (sessions) that will connect to the database. Each client simulates a user running transactions.
-j threads, --jobs=threads
Sets the number of worker threads to use within pgbench. If multiple threads are used, clients are distributed among them. Default is 1.
-t transactions, --transactions=transactions
Sets the number of transactions each client will execute. The benchmark runs until all clients complete their specified transactions.
-T seconds, --time=seconds
Runs the benchmark for the given number of seconds instead of a fixed number of transactions per client. `-t` and `-T` are mutually exclusive.
-f filename, --file=filename
Reads transaction scripts from the specified file(s) instead of using the built-in benchmark. Allows for custom workload simulation.
-D var=value, --define=var=value
Defines a variable for use in custom scripts (see `-f`). Inside a script the variable is referenced as `:var`. The option can be given multiple times to set several variables.
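-b scriptname, --builtin=scriptname
Selects a built-in benchmark script. The available scripts are `tpcb-like` (default), `simple-update`, and `select-only`; `-b list` prints them. Not to be confused with `-M`, which selects the query protocol (`--protocol`).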
-R rate, --rate=rate
Sets a target transaction rate in transactions per second. pgbench will attempt to maintain this rate, introducing delays if necessary.
-L ms, --latency-limit=ms
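Sets a latency limit in milliseconds. Transactions that take longer than this limit are counted as late and reported separately. When combined with `-R`, transactions that fall behind schedule by more than the limit are not sent to the server at all and are reported as skipped.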
-l, --log
Writes one line per completed transaction to a log file. Each line records the client ID, the transaction number, the elapsed transaction time in microseconds, the script number, and the completion timestamp.
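-n, --no-vacuum
Skips vacuuming. With `-i`, the tables are not vacuumed after they are loaded; in benchmark mode, the standard tables are not vacuumed before the run. This is needed when a custom script does not use the standard pgbench tables, and is useful for testing the impact of un-vacuumed tables.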
-P seconds, --progress=seconds
Shows a progress report every `seconds` seconds, including the elapsed time and the TPS and latency since the last report.
-U username, --username=username
Connects to the database as the specified user.
-h hostname, --host=hostname
Specifies the database server host name or socket directory.
-p port, --port=port
Specifies the database server port number.
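The per-transaction log written by `-l` (by default a file named `pgbench_log.` followed by the pgbench process ID) can be summarized with standard tools. The sketch below assumes the column layout described in the PostgreSQL documentation, where the third column is the elapsed time in microseconds; the sample lines and the file name are illustrative, not real measurements.

```shell
# Illustrative sample of a per-transaction log (not real measurements).
# Columns: client_id transaction_no time(us) script_no time_epoch time_us
cat > sample_pgbench_log.txt <<'EOF'
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
EOF

# Column 3 is the elapsed transaction time in microseconds; report the mean in ms.
AVG_MS=$(LC_ALL=C awk '{ sum += $3 } END { printf "%.3f", sum / NR / 1000 }' sample_pgbench_log.txt)
echo "average latency: $AVG_MS ms"
```

Options such as `-R` append further columns to each line, so it is safer to address columns by position from the left, as done here.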
DESCRIPTION
pgbench is a simple program for running benchmark tests on PostgreSQL. It executes a predefined or custom sequence of SQL commands repeatedly, often in multiple concurrent sessions, to simulate real-world database load. The primary goal is to compute the average transaction rate (transactions per second, TPS) and measure transaction latency.
It comes with a built-in script that mimics a TPC-B-like workload: each transaction issues three `UPDATE` statements, one `SELECT`, and one `INSERT`. Users can also provide their own SQL scripts to simulate a specific application workload. pgbench is useful for performance tuning, capacity planning, and regression testing of PostgreSQL systems, helping to identify bottlenecks and evaluate configuration changes.
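A custom script is a plain text file of SQL statements and pgbench meta-commands, in which variables are referenced as `:name`. The following is a minimal sketch, assuming the standard tables already exist; the file name `select_balance.sql`, the variable `max_aid`, and the database name `mydb` are hypothetical. The command is printed rather than executed.

```shell
# Write a hypothetical custom script; the file name and the max_aid
# variable are illustrative and not part of pgbench itself.
cat > select_balance.sql <<'EOF'
\set aid random(1, :max_aid)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
EOF

# --define supplies max_aid; --file selects the script instead of the built-in one.
CUSTOM_CMD="pgbench --file=select_balance.sql --define=max_aid=100000 --client=4 --time=30 mydb"
printf '%s\n' "$CUSTOM_CMD"
```

Setting `max_aid` to 100,000 times the scale factor used at initialization keeps the random account IDs within the rows that actually exist.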
CAVEATS
pgbench measures performance from the client's perspective, so the results include network latency and client-side overhead and may not isolate server-side bottlenecks. When pgbench runs on the same host as the server it also competes with it for CPU; with many client sessions, running pgbench on a separate machine avoids this, at the cost of adding network round trips to every measurement. The built-in scripts are generic; custom scripts are needed to simulate a specific application workload realistically.
TYPICAL WORKFLOW
A common pgbench workflow involves first initializing the test database tables using `pgbench -i dbname`, often accompanied by a scaling factor (`-s`) to control the dataset size. After initialization, the benchmark is executed using `pgbench [options] dbname`, where key parameters like concurrent clients (`-c`), worker threads (`-j`), and total duration (`-T`) or transactions per client (`-t`) are specified. The output provides essential performance metrics.
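The workflow above can be sketched as a small script. It runs in dry-run mode by default, printing each command instead of executing it; the `run` helper, the `DRY_RUN` variable, the database name `mydb`, and the chosen option values are all illustrative assumptions.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them against a reachable PostgreSQL server.
DRY_RUN="${DRY_RUN:-1}"
DB="mydb"    # placeholder database name
SCALE=10

# Each scale unit adds 100,000 rows to pgbench_accounts.
ACCOUNT_ROWS=$((SCALE * 100000))
echo "scale $SCALE -> $ACCOUNT_ROWS account rows"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: create and populate the pgbench tables.
run pgbench -i -s "$SCALE" "$DB"

# Step 2: run for 60 seconds with 8 clients on 4 threads, reporting every 5 seconds.
run pgbench -c 8 -j 4 -T 60 -P 5 "$DB"
```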
KEY METRICS
The summary printed by pgbench reports the transactions per second (TPS), which indicates throughput under the simulated load, and the average transaction latency. The latency standard deviation is added when progress reporting (`-P`), rate limiting (`-R`), or a latency limit (`-L`) is in use. With `-L`, the number of late transactions is reported, and with `-R` also the number of skipped ones. These metrics are the basis for comparing database performance under different configurations and loads.
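For scripted comparisons, the headline figures can be pulled out of a saved summary. The sketch below assumes the `tps = ...` and `latency average = ...` line layout printed by recent pgbench versions (the exact wording differs between versions); the sample values and the file name are made up for the example.

```shell
# Illustrative pgbench summary (values are made up for the example).
cat > sample_pgbench_output.txt <<'EOF'
number of clients: 10
number of threads: 2
number of transactions actually processed: 100000/100000
latency average = 10.000 ms
tps = 1000.000000 (without initial connection time)
EOF

# Both lines use a "name = value" layout; take the first tps line only,
# since older versions print two of them.
TPS=$(awk '/^tps = / { print $3; exit }' sample_pgbench_output.txt)
LATENCY_MS=$(awk '/^latency average = / { print $4; exit }' sample_pgbench_output.txt)
echo "tps=$TPS latency_ms=$LATENCY_MS"
```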
HISTORY
pgbench predates PostgreSQL 8.0: it was written by Tatsuo Ishii and shipped for many years as a contrib module before becoming a standard client application in PostgreSQL 9.5. It addressed the need for a standardized, repeatable, and customizable way to measure PostgreSQL performance. Over time it has gained features such as custom SQL scripts, detailed transaction logging, progress reporting, and rate and latency controls.