dvc-diff
Show changes between DVC tracked data versions
TLDR
Compare DVC-tracked files in a Git commit, tag, or branch against the current workspace: dvc diff <revision>
Compare the changes in DVC-tracked files from one Git commit to another: dvc diff <revision_1> <revision_2>
Compare DVC-tracked files, along with their latest hash: dvc diff --show-hash <revision>
Compare DVC-tracked files, displaying the output as JSON: dvc diff --json <revision>
Compare DVC-tracked files, displaying the output as Markdown: dvc diff --md <revision>
SYNOPSIS
dvc diff [options] [a_rev] [b_rev]
PARAMETERS
--show-hash
Display the hash value of each entry.
--json
Print the output in JSON format.
--md
Print the output as a Markdown table.
--hide-missing
Hide the status of files missing from the cache.
-q, --quiet
Suppress any output.
-h, --help
Show help message and exit.
-v, --verbose
Increase verbosity level.
--targets <paths>
Limit the comparison to these DVC-tracked files or directories.
a_rev
Base revision to compare against (commit, tag, branch). Defaults to HEAD.
b_rev
Head revision to compare (commit, tag, branch). Defaults to the current workspace.
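The way the options and the two positional revisions combine can be sketched in Python. This is a minimal sketch, not part of DVC: build_dvc_diff_argv is a hypothetical helper, and it assumes the flag names --json, --show-hash, and --targets and the positional names a_rev and b_rev from DVC's usage text. It only builds the argument list and does not run anything.

```python
from typing import List, Optional, Sequence


def build_dvc_diff_argv(
    a_rev: Optional[str] = None,
    b_rev: Optional[str] = None,
    targets: Sequence[str] = (),
    as_json: bool = False,
    show_hash: bool = False,
) -> List[str]:
    """Assemble an argument list for `dvc diff` (hypothetical helper)."""
    if b_rev is not None and a_rev is None:
        # The head revision is the second positional, so it needs a base.
        raise ValueError("b_rev requires a_rev")

    argv = ["dvc", "diff"]
    if as_json:
        argv.append("--json")
    if show_hash:
        argv.append("--show-hash")
    if a_rev is not None:
        argv.append(a_rev)
    if b_rev is not None:
        argv.append(b_rev)
    # --targets accepts several paths, so it goes last to keep the
    # revisions from being read as targets.
    if targets:
        argv.append("--targets")
        argv.extend(targets)
    return argv


if __name__ == "__main__":
    print(build_dvc_diff_argv("HEAD~1", "HEAD", targets=["data/"], as_json=True))
```

The resulting list can be passed to subprocess.run inside a DVC project.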
DESCRIPTION
dvc diff compares DVC-tracked data between two Git revisions (commits, tags, or branches) of a project, or between a revision and the current workspace. With no arguments it compares HEAD against the workspace. It reports which tracked files and directories were added, deleted, modified, or renamed, covering both data tracked directly and outputs declared in dvc.yaml files. Unlike `git diff`, it does not show line-by-line content changes; it lists the affected paths and, on request, their hashes. This helps when debugging, reviewing changes, and tracing how a dataset or model evolved between versions.
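To make the shape of the report concrete, the sketch below counts entries per change category in JSON output. It is illustrative only: the sample text is hypothetical (the paths are invented), and the key names added, deleted, modified, and renamed are an assumption about the layout printed by dvc diff --json.

```python
import json
from typing import Dict

# Hypothetical sample of `dvc diff --json` output. The paths are invented
# and the key names are an assumption about the JSON layout.
SAMPLE = """
{
  "added": [{"path": "data/new.csv"}],
  "deleted": [],
  "modified": [{"path": "data/train.csv"}, {"path": "model.pkl"}],
  "renamed": [{"path": {"old": "raw.csv", "new": "data/raw.csv"}}]
}
"""


def count_changes(json_text: str) -> Dict[str, int]:
    """Return the number of entries in each change category."""
    report = json.loads(json_text)
    # Only list-valued keys are treated as change categories.
    return {
        kind: len(entries)
        for kind, entries in report.items()
        if isinstance(entries, list)
    }


if __name__ == "__main__":
    for kind, count in count_changes(SAMPLE).items():
        print(f"{kind}: {count}")
```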
CAVEATS
Must be run inside an initialized DVC project that is also a Git repository, because revisions are resolved through Git. Only data tracked by DVC is compared, so the result is only as complete as the .dvc files and the dependencies and outputs declared in dvc.yaml files.
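A script that wraps the command may want to check the first requirement before calling it. The sketch below looks for the .dvc directory that dvc init creates at the project root, walking up from a starting directory. find_dvc_root is a hypothetical helper, not a DVC function, and it does not verify that the configuration inside .dvc is valid.

```python
from pathlib import Path
from typing import Optional


def find_dvc_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` that contains a .dvc directory."""
    current = start.resolve()
    # Check the starting directory first, then each parent up to the
    # filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / ".dvc").is_dir():
            return candidate
    return None


if __name__ == "__main__":
    root = find_dvc_root(Path.cwd())
    print(root if root is not None else "not inside a DVC project")
```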
EXAMPLE USAGE
To compare the current state with the latest committed version: dvc diff
To compare between two specific commits: dvc diff commit1 commit2
To print the changes as JSON, for example to pick out only the added files in a script: dvc diff --json
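Restricting the result to one kind of change can be done by post-processing JSON output. The following is a minimal sketch under stated assumptions: run_dvc_diff and paths_for are hypothetical helpers, dvc is assumed to be on PATH, and the report is assumed to map each change type to a list of entries with a "path" field (a string, or an old/new pair for renames).

```python
import json
import subprocess
from typing import Any, Dict, List


def run_dvc_diff(*revisions: str) -> Dict[str, Any]:
    """Run `dvc diff --json` and decode its output (needs dvc on PATH)."""
    result = subprocess.run(
        ["dvc", "diff", "--json", *revisions],
        capture_output=True,
        text=True,
        check=True,
    )
    # Treat empty output as an empty report.
    return json.loads(result.stdout or "{}")


def paths_for(report: Dict[str, Any], kind: str) -> List[str]:
    """List the paths recorded under one change type, e.g. "added"."""
    paths = []
    for entry in report.get(kind, []):
        path = entry["path"]
        # Assumed layout: renamed entries carry {"old": ..., "new": ...};
        # the new path is reported for those.
        paths.append(path["new"] if isinstance(path, dict) else path)
    return paths
```

Inside a DVC project, paths_for(run_dvc_diff("HEAD~1"), "added") would then list only the files added since the previous commit.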
SEE ALSO
git-diff(1), dvc-status(1), dvc-dag(1)