dagster
data orchestration platform for software-defined assets
TLDR
Start development server
dagster dev
SYNOPSIS
dagster command [options]
dagster-daemon run [options]
dagster-webserver [options]
DESCRIPTION
dagster is the CLI for Dagster, a data orchestration platform built around software-defined assets. It manages the development environment, job execution, and infrastructure.
dagster dev starts both the webserver (UI) and daemon (schedules, sensors) for local development. In production, run dagster-webserver and dagster-daemon separately.
Software-defined assets are the core abstraction: functions that produce data assets and declare dependencies on other assets. Together the assets form a DAG that Dagster materializes. Jobs group assets for execution.
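As a conceptual illustration only (plain Python, not Dagster's API), the sketch below models assets as functions with named upstream dependencies and materializes them in dependency order. The asset names and the materialize_all helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical assets: each maps to (upstream asset names, compute function).
# The compute function receives the materialized upstream values by name.
ASSETS = {
    "raw_orders": ((), lambda: [10, 20, 30]),
    "order_total": (("raw_orders",), lambda raw_orders: sum(raw_orders)),
    "order_report": (("order_total",), lambda order_total: f"total={order_total}"),
}


def materialize_all(assets):
    """Run every asset after its upstream dependencies; return results and order."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (deps, _) in assets.items()}
    results, order = {}, []
    for name in TopologicalSorter(graph).static_order():
        deps, compute = assets[name]
        results[name] = compute(**{dep: results[dep] for dep in deps})
        order.append(name)
    return results, order


results, order = materialize_all(ASSETS)
print(order)
print(results["order_report"])
```

Because each asset names its inputs, the execution order falls out of the graph rather than being written by hand, which is the main contrast with task-oriented workflows.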
Schedules trigger jobs on cron patterns; sensors trigger based on external events. Both require the daemon process to run.
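The event-driven side can be sketched in plain Python (again, not Dagster's API): a sensor is evaluated repeatedly and decides from external state whether to request runs, remembering its progress in a cursor. The new_file_sensor function, the ingest_file job name, and the cursor convention are all hypothetical.

```python
def new_file_sensor(seen_files, cursor):
    """Hypothetical sensor tick: request one run per file not yet processed.

    `seen_files` stands in for external state (e.g. a directory listing);
    `cursor` is the last file name handled on a previous tick, or None.
    Returns (run_requests, new_cursor).
    """
    pending = sorted(f for f in seen_files if cursor is None or f > cursor)
    run_requests = [{"job": "ingest_file", "file": f} for f in pending]
    new_cursor = pending[-1] if pending else cursor
    return run_requests, new_cursor


# First tick: two files exist, so two runs are requested.
requests, cursor = new_file_sensor({"a.csv", "b.csv"}, cursor=None)
# Second tick: only c.csv is new, so only one run is requested.
more_requests, cursor = new_file_sensor({"a.csv", "b.csv", "c.csv"}, cursor)
```

Something has to call the tick function over and over, which is why both schedules and sensors depend on a long-running daemon process.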
The webserver provides a UI showing asset lineage, run history, logs, and metrics. The asset graph visualizes data dependencies.
PARAMETERS
-m, --module name
Python module containing definitions.
-f, --file path
Python file containing definitions.
-j, --job name
Job name.
-p, --port port
Webserver port. Default: 3000.
-h, --host host
Webserver host. Default: localhost.
-w, --workspace file
Workspace YAML file.
-d, --working-directory path
Working directory for code.
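A minimal sketch of how these options combine on a dagster dev command line, assuming the flags behave as listed above. The dagster_dev_argv helper and the definitions.py file name are hypothetical; the code only builds the argument list and does not launch anything.

```python
import shlex


def dagster_dev_argv(file=None, module=None, port=3000, host="localhost"):
    """Build an argv list for `dagster dev` from the options listed above."""
    argv = ["dagster", "dev"]
    if file:
        argv += ["-f", file]
    if module:
        argv += ["-m", module]
    argv += ["-p", str(port), "-h", host]
    return argv


argv = dagster_dev_argv(file="definitions.py", port=3001)
print(shlex.join(argv))
# To actually start the server, pass argv to subprocess.run(argv, check=True).
```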
CONFIGURATION
dagster.yaml
Instance configuration file controlling storage, compute, and run settings.
workspace.yaml
Workspace configuration defining code locations and repositories.
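As a rough illustration, the snippet below writes a minimal workspace.yaml with two code locations. It assumes the load_from, python_file, and python_module keys used by recent Dagster releases; the file and module names are hypothetical placeholders, and the file is written to a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical code locations: one loaded from a file, one from a module.
WORKSPACE_YAML = """\
load_from:
  - python_file: definitions.py
  - python_module: my_project.definitions
"""

workspace_dir = Path(tempfile.mkdtemp())
workspace_path = workspace_dir / "workspace.yaml"
workspace_path.write_text(WORKSPACE_YAML)
print(workspace_path.read_text())
```

A file like this is what the -w, --workspace option points at.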
COMMANDS
dev
Start development server (webserver + daemon).
job execute|list|print
Manage and run jobs.
asset materialize|list|wipe
Manage software-defined assets.
schedule list|start|stop|preview
Manage schedules.
sensor list|start|stop|preview
Manage sensors.
project scaffold|from-example
Create new projects.
definitions validate
Validate code definitions.
instance info|migrate
Manage Dagster instance.
run list|delete|terminate
Manage pipeline runs.
CAVEATS
The daemon is required for schedules and sensors. Asset materialization state is tracked in instance storage. Production deployments typically use PostgreSQL for run storage instead of the default SQLite. Some features require Dagster+ (cloud).
HISTORY
Dagster was created by Elementl, a company founded in 2018 by Nick Schrock (co-creator of GraphQL). The project introduced the concept of software-defined assets as an improvement over task-oriented workflows. Version 1.0 was released in 2022. Dagster emphasizes developer experience with type checking, testing utilities, and local development. The company offers Dagster+ for managed cloud orchestration.
