LinuxCommandLibrary

dvc-commit

Save changes to tracked data and pipelines

TLDR

Commit changes to all DVC-tracked files and directories

$ dvc commit

Commit changes to a specified DVC-tracked target
$ dvc commit [target]

Recursively commit all DVC-tracked files in a directory
$ dvc commit [[-R|--recursive]] [path/to/directory]

SYNOPSIS

dvc commit [-h] [-q | -v] [-f] [-d] [-R] [targets [targets ...]]

PARAMETERS

-f, --force
    Commit data even if hash values for dependencies or outputs did not change

-d, --with-deps
    Also commit the stages that the given targets depend on (only meaningful when targets are specified)

-R, --recursive
    Search each target directory and its subdirectories for stages (dvc.yaml and .dvc files) to commit

-q, --quiet
    Suppress non-error output

-v, --verbose
    Display detailed process information

-h, --help
    Show help message and exit

DESCRIPTION

The dvc commit command records changes to files and directories tracked by DVC (Data Version Control). After tracked data or pipeline outputs change, run dvc commit to update the checksums (MD5 hashes) stored in the corresponding .dvc files or in dvc.lock and to save the current data in DVC's cache. The new version lives in the cache, so large data is never stored in Git.

It detects changes by comparing current checksums against the recorded ones. With no targets, it checks every stage and .dvc file in the workspace; for pipeline stages defined in dvc.yaml, the recorded checksums live in dvc.lock. Committed data can then be versioned through Git, reproduced, and shared via dvc push to remote storage.

Supports recursive operation on directories (-R) and committing the dependencies of a target (-d). Helps keep ML workflows reproducible by linking code, data, and models. Note that dvc add and dvc repro already commit by default, so dvc commit is mainly needed after running them with --no-commit or after modifying data that is already tracked.
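
The checksum comparison described above can be illustrated with a minimal Python sketch. It is not DVC's implementation: it handles a single file only, reads the recorded hash with a simplified regular expression instead of a YAML parser, and the helper names (file_md5, recorded_md5, has_changed, demo) are hypothetical. The .dvc layout written in demo() is a reduced assumption of what dvc add produces.

```python
import hashlib
import re
import tempfile
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def recorded_md5(dvc_file):
    """Extract the first 'md5:' value from a .dvc file (simplified parse)."""
    text = Path(dvc_file).read_text()
    match = re.search(r"^\s*-?\s*md5:\s*([0-9a-f]{32})", text, re.MULTILINE)
    return match.group(1) if match else None


def has_changed(data_path, dvc_file):
    """True when the current file hash differs from the recorded one."""
    return file_md5(data_path) != recorded_md5(dvc_file)


def demo():
    """Write a file plus a mock .dvc file, then modify the data."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data.csv"
        dvc_file = Path(tmp) / "data.csv.dvc"
        data.write_text("id,value\n1,10\n")
        # Assumed, reduced .dvc layout: one output entry with its hash.
        dvc_file.write_text(
            f"outs:\n- md5: {file_md5(data)}\n  hash: md5\n  path: data.csv\n"
        )
        before = has_changed(data, dvc_file)  # hash matches the record
        data.write_text("id,value\n1,10\n2,20\n")
        after = has_changed(data, dvc_file)   # hash no longer matches
        return before, after


if __name__ == "__main__":
    print(demo())
```

When has_changed reports a difference, that is the situation in which running dvc commit would refresh the recorded checksum.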

CAVEATS

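Requires an initialized DVC repository (dvc init). Data must already be tracked, either via dvc add or as a pipeline stage output. Does not upload to remotes; use dvc push afterward. Does not create a Git commit; add and commit the updated .dvc and dvc.lock files with Git separately.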

EXAMPLE

dvc commit
dvc commit data.dvc model.dvc
dvc commit -R data/

WORKFLOW

Typical flow: dvc add data.csv → modify data → dvc commit → git add data.csv.dvc → git commit → dvc push
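
For scripting, the steps that follow a data modification in the flow above could be wrapped roughly as follows. This is a sketch under assumptions, not an official DVC interface: workflow_commands and run_workflow are hypothetical helper names, the .dvc file is assumed to sit next to the data file, and --force is added on the assumption that a script should not stop at an interactive confirmation. With dry_run=True nothing is executed; the commands are only printed.

```python
import shlex
import subprocess


def workflow_commands(data_file, message):
    """Return the argv lists for the post-modification steps, in order."""
    dvc_file = f"{data_file}.dvc"  # assumes the .dvc file sits beside the data
    return [
        # --force is an assumption here, meant to avoid an interactive prompt.
        ["dvc", "commit", "--force", dvc_file],
        ["git", "add", dvc_file],
        ["git", "commit", "-m", message],
        ["dvc", "push"],
    ]


def run_workflow(data_file, message, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on failure."""
    commands = workflow_commands(data_file, message)
    for argv in commands:
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    run_workflow("data.csv", "Update dataset")
```

Keeping the command list separate from its execution makes the sequence easy to inspect before anything touches the repository.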

HISTORY

Introduced in DVC v0.1 (2017) by Iterative.ai. Evolved to support pipelines in v0.6 (2018). Core to DVC's Git-like data versioning; active development continues with enhanced pipeline support.

SEE ALSO

dvc(1), dvc-add(1), dvc-push(1), dvc-repro(1), git-commit(1)
