dbt
data transformation workflow tool for analytics engineering
TLDR
Initialize a new dbt project
dbt init [project_name]
SYNOPSIS
dbt command [options]
DESCRIPTION
dbt (data build tool) is a transformation workflow tool that lets data analysts and engineers transform data in their warehouse using SQL. It applies software engineering practices such as version control, testing, and documentation to data transformations.
dbt works with your existing data warehouse (Snowflake, BigQuery, Redshift, PostgreSQL, etc.) and manages the T in ELT (Extract, Load, Transform). Models are defined as SQL SELECT statements that dbt materializes as tables or views.
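The idea of materializing a SELECT statement as a view or table can be sketched as follows. This is a simplified illustration using Python's built-in sqlite3 module, not dbt's actual implementation. The model name, the SQL, the customers table, and the materialize helper are all hypothetical.

```python
import sqlite3

# Hypothetical model: the name and SQL are made up for illustration.
MODEL_NAME = "active_customers"
MODEL_SQL = "SELECT id, name FROM customers WHERE active = 1"


def materialize(conn, name, select_sql, materialization="view"):
    """Wrap a SELECT statement in DDL so it becomes a view or a table."""
    if materialization not in ("view", "table"):
        raise ValueError(f"unsupported materialization: {materialization}")
    kind = materialization.upper()
    # Drop any earlier object of the same kind, then recreate it from the SELECT.
    conn.execute(f"DROP {kind} IF EXISTS {name}")
    conn.execute(f"CREATE {kind} {name} AS {select_sql}")


# An in-memory database stands in for the warehouse (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", 1), (2, "Bob", 0), (3, "Cy", 1)],
)

materialize(conn, MODEL_NAME, MODEL_SQL, materialization="view")
rows = conn.execute(f"SELECT id, name FROM {MODEL_NAME} ORDER BY id").fetchall()
print(rows)
```

The model author only writes the SELECT. The surrounding CREATE statement is generated, which is why the same model can be switched between a view and a table without editing its SQL.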
The tool provides dependency management between models, automated testing with schema tests and custom tests, documentation generation, and incremental processing for efficient updates of large datasets.
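Dependency management between models amounts to ordering a directed acyclic graph so that each model is built after the models it reads from. A minimal sketch of that ordering, using Python's standard graphlib module (Python 3.9 or later), is shown here. The model names are hypothetical, and dbt's own scheduler is more involved than this.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each model maps to the set of models it depends on.
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "revenue_report": {"orders_enriched"},
}


def build_order(deps):
    """Return the models in an order where every dependency comes first."""
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(deps).static_order())


order = build_order(dependencies)
print(order)
```

The relative order of the two staging models is not fixed, since neither depends on the other. Only the constraint that upstream models precede downstream ones is guaranteed.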
PARAMETERS
COMMAND
dbt command to execute (run, test, build, compile, etc.).
--select, -s MODEL
Select specific models to run.
--exclude MODEL
Exclude specific models from the run.
--target, -t TARGET
Target within the connection profile to use (for example, dev or prod).
--profiles-dir DIR
Directory containing profiles.yml.
--project-dir DIR
Directory containing dbt_project.yml.
--full-refresh
Rebuild incremental models from scratch.
--vars JSON
Pass variables to the project as a JSON dictionary string.
--help
Display help information.
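When dbt is invoked from a script, the options above can be assembled into an argument list. The helper below is a hypothetical sketch, not part of dbt. It uses only the flags described in this section, and the model name, target name, and variable are made up for illustration.

```python
import json
import shlex


def build_dbt_args(command, select=None, exclude=None, target=None,
                   full_refresh=False, variables=None):
    """Assemble an argument list for the dbt CLI from the options listed above."""
    args = ["dbt", command]
    if select:
        args += ["--select", select]
    if exclude:
        args += ["--exclude", exclude]
    if target:
        args += ["--target", target]
    if full_refresh:
        args.append("--full-refresh")
    if variables:
        # --vars takes a single string, so the dictionary is serialized to JSON.
        args += ["--vars", json.dumps(variables)]
    return args


args = build_dbt_args(
    "run",
    select="orders_enriched",
    target="dev",
    full_refresh=True,
    variables={"start_date": "2024-01-01"},
)
# shlex.join quotes the JSON so the line can be pasted into a shell.
print(shlex.join(args))
```

On a machine where dbt is installed, the resulting list could be passed to subprocess.run(args, check=True). Keeping the arguments as a list avoids shell quoting problems with the JSON value.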
CONFIGURATION
~/.dbt/profiles.yml
Connection profiles for data warehouses, including credentials and connection parameters.
dbt_project.yml
Project configuration defining models, tests, sources, and project-level settings.
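The relationship between these two files and the --profiles-dir and --project-dir options can be sketched as a small path lookup. The function names are hypothetical, and the sketch covers only the locations described here. It assumes the project directory defaults to the current working directory, and dbt itself may consult additional locations.

```python
from pathlib import Path


def profiles_path(profiles_dir=None):
    """Return the expected profiles.yml location, defaulting to ~/.dbt."""
    base = Path(profiles_dir) if profiles_dir else Path.home() / ".dbt"
    return base / "profiles.yml"


def project_path(project_dir=None):
    """Return the expected dbt_project.yml location.

    Defaulting to the current directory is an assumption of this sketch.
    """
    base = Path(project_dir) if project_dir else Path.cwd()
    return base / "dbt_project.yml"


print(profiles_path())
print(project_path("analytics"))
```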
CAVEATS
Requires Python and a connection to a supported data warehouse. Complex dependencies may lead to long DAG resolution times. Resource usage scales with project size. Breaking changes occasionally occur between major versions.
HISTORY
dbt was created by Fishtown Analytics (now dbt Labs) and released in 2016. It pioneered the "analytics engineering" approach, bringing software development practices to data transformation and helping establish the modern data stack paradigm.
