LinuxCommandLibrary

dbt

Transform data in your data warehouse

TLDR

Debug the dbt project and the connection to the database

$ dbt debug

Run all models in the project
$ dbt run

Run all tests for example_model
$ dbt test --select example_model

Build example_model and its downstream dependents (load the associated seeds, run the models and snapshots, and execute their tests)
$ dbt build --select example_model+

Build all models except those with the tag not_now
$ dbt build --exclude "tag:not_now"

Build all models with tags one and two
$ dbt build --select "tag:one,tag:two"

Build all models with tags one or two
$ dbt build --select "tag:one tag:two"

SYNOPSIS

dbt [global flags...] <command> [<args>...]
Common commands: build, debug, deps, docs generate, docs serve, ls, parse, run, seed, snapshot, test
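
Global flags come before the subcommand; command-specific flags follow it. For example (the model name is an assumption for illustration):

$ dbt --log-level debug run --select my_model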

PARAMETERS

--version
    Display current dbt version

--help
    Show help message

--log-level debug|info|warn|error
    Set logging verbosity

--project-dir PATH
    Specify dbt project directory (default: current)

--profiles-dir PATH
    Specify profiles.yml directory (default: ~/.dbt)

--target TARGET
    Specify target config in profiles.yml

--threads INTEGER
    Number of worker threads to use (overrides the threads setting in profiles.yml)

--vars '{key: value}'
    Pass runtime variables to models as a YAML/JSON mapping (see the examples after this list)

--select PATH|TAG|MODEL
    Select specific resources to run

--exclude PATH|TAG|MODEL
    Exclude specific resources

--models MODEL
    DEPRECATED; use --select

--full-refresh
    Run full refresh on incremental models

--state ARTIFACT_PATH
    Compare against prior state for change detection
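
Flag usage examples (the model name, variable, and artifact path are assumptions for illustration):

$ dbt run --target prod --threads 8
$ dbt run --vars '{"start_date": "2024-01-01"}'
$ dbt run --select my_model --full-refresh
$ dbt build --select state:modified+ --state ./prod-artifacts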

DESCRIPTION

dbt (data build tool) is a command-line tool that enables analytics engineers to transform data in warehouses more effectively. It treats data transformation as code, allowing SQL models to be version-controlled, tested, and documented.

dbt compiles and executes SQL SELECT statements defined in modular .sql files, building a dependency graph (DAG) of models, sources, seeds, snapshots, and tests. It supports incremental models for efficiency, generic testing, singular testing, and auto-generated documentation.

It integrates with warehouses such as Snowflake, BigQuery, Postgres, and Redshift via adapters, requires Python and installation via pip, and is well suited to production data pipelines with CI/CD integration.

Typical usage involves creating a dbt project with dbt init, defining models in the models/ directory, and running commands such as dbt run or dbt test (see the example below).
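
A minimal sketch of that workflow, assuming an upstream model named stg_orders already exists; the {{ ref() }} Jinja function is how dbt wires models into the dependency graph:

$ echo "select * from {{ ref('stg_orders') }} where amount > 0" > models/orders_filtered.sql
$ dbt run --select orders_filtered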

CAVEATS

Not a native Linux command; requires Python 3.8+ and pip installation (e.g., pip install dbt-core dbt-postgres). Needs warehouse adapter and profiles.yml config. Resource-intensive on large projects.

INSTALLATION

$ pip install dbt-core
$ pip install dbt-<adapter>    (e.g., dbt-snowflake, dbt-bigquery)
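
To keep dbt isolated from system Python packages, it can be installed in a virtual environment (the environment name is arbitrary, and dbt-postgres is just one adapter choice):

$ python3 -m venv dbt-env
$ source dbt-env/bin/activate
$ pip install dbt-core dbt-postgres
$ dbt --version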

PROJECT SETUP

dbt init my_project creates a project scaffold with dbt_project.yml and an example models/ directory; the connection profile is configured in ~/.dbt/profiles.yml (see the sketch below)
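
A profile can also be written by hand. A minimal sketch for the Postgres adapter, assuming a local database and placeholder credentials; the top-level key must match the profile name set in dbt_project.yml:

$ cat > ~/.dbt/profiles.yml <<'EOF'
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: analyst
      password: secret
      dbname: analytics
      schema: dbt_dev
      threads: 4
EOF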

KEY SUBCOMMANDS

run: compile and execute models
test: run tests
docs generate: build the documentation site
seed: load CSV files from seeds/ into the warehouse
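
A typical local workflow chains these subcommands; the steps below assume the project declares packages in packages.yml and keeps CSVs in seeds/:

$ dbt deps           # install packages from packages.yml
$ dbt seed           # load CSVs from seeds/ into the warehouse
$ dbt run            # build all models
$ dbt test           # run all tests
$ dbt docs generate  # compile the documentation site
$ dbt docs serve     # serve it locally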

HISTORY

Created by Tristan Handy and colleagues at Fishtown Analytics in 2016 and released as open source; the company rebranded as dbt Labs in 2021. dbt Core reached v1.0 in December 2021. Widely adopted by data teams; later releases added Python models (v1.3) and semantic-layer features.

SEE ALSO

pip(1), python3(1), make(1)
