dvc-gc
Remove unused files and directories from the DVC cache or remote storage
TLDR
Garbage collect from the cache, keeping only versions referenced by the current workspace:
dvc gc --workspace
Garbage collect from the cache, keeping only versions referenced by branches, tags, and commits:
dvc gc --all-branches --all-tags --all-commits
Garbage collect from the cache, including the default cloud remote storage (if set):
dvc gc --all-commits --cloud
Garbage collect from the cache, including a specific cloud remote storage:
dvc gc --all-commits --cloud --remote <remote_name>
SYNOPSIS
dvc gc [OPTION]...
PARAMETERS
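-a, --all-branches
Keep cache files referenced by the tips of all Git branches, as well as by the workspace.
-T, --all-tags
Keep cache files referenced by all Git tags, as well as by the workspace.
-A, --all-commits
Keep cache files referenced by any Git commit, as well as by the workspace.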
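-c, --cloud
Also remove unused files from remote storage, in addition to the local cache.
-r, --remote NAME
Name of the remote storage to collect from when --cloud is given (defaults to the project's default remote).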
--all-experiments
Keep cache files referenced by all experiments, as well as by the workspace.
-f, --force
Force removal without confirmation prompt.
-q, --quiet
Suppress progress bars and non-error messages.
-w, --workspace
Keep only cache files referenced by the current workspace; all other cache files are removed.
DESCRIPTION
The dvc gc command performs garbage collection for Data Version Control (DVC) projects. It removes unused files from the DVC cache directory (typically .dvc/cache), which stores tracked data files, models, and other outputs under names derived from their content hashes.
The DVC cache grows over time as new versions of data accumulate across commits, branches, and experiments. Running dvc gc reclaims disk space by deleting cached objects that nothing in the selected scope references.
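The scope options decide what is kept, not what is deleted. At least one of --workspace, --all-branches, --all-tags, --all-commits, or --all-experiments must be given; in current DVC releases, running dvc gc with no scope option exits with an error instead of deleting anything. The narrower the scope, the more is removed: --workspace alone keeps only what the current checkout references, while --all-commits keeps everything referenced anywhere in Git history. Collecting the local cache is recoverable as long as the data was pushed to a remote; adding --cloud deletes from the remote as well and calls for much more caution.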
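Typical uses are freeing space on CI/CD runners and cleaning up after experiments have been removed (for example with dvc exp remove). Commit your .dvc files and dvc.lock first, so that the data you want to keep is referenced by Git and falls inside the chosen scope.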
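To see how much space a collection actually reclaims, the cache size can be measured before and after the run. The following is a minimal sketch, assuming a POSIX shell and the default cache location .dvc/cache; cache_size_kb is a hypothetical helper written for illustration, not a DVC command.

```shell
# cache_size_kb [DIR]: print the disk usage of DIR in kilobytes (0 if DIR is missing).
# Hypothetical helper for illustration; DVC does not ship it.
# DIR defaults to .dvc/cache, the typical cache location.
cache_size_kb() {
    dir="${1:-.dvc/cache}"
    if [ -d "$dir" ]; then
        # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
        du -sk "$dir" | cut -f1
    else
        echo 0
    fi
}

# Usage, run from the root of a DVC project:
#   before=$(cache_size_kb)
#   dvc gc --workspace
#   after=$(cache_size_kb)
#   echo "Reclaimed $((before - after)) KB"
```

If the cache lives elsewhere (a shared or external cache directory), pass that path as the first argument.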
CAVEATS
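Deletion from the local cache cannot be undone locally; removed data can only be restored with dvc fetch or dvc pull if it was previously pushed to a remote. Remote storage is not touched unless -c/--cloud is given, and data removed from a remote is gone for good unless another copy exists. Run in a clean Git state, with all .dvc files and dvc.lock committed, so the chosen scope covers everything you intend to keep.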
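Because deletion is hard to reverse, it can help to preview a collection before running it for real. The sketch below assumes a DVC release whose gc supports the --dry option (print what would be removed without removing it) and a POSIX shell. gc_preview and git_is_clean are hypothetical helper names, and DVC_BIN is a hypothetical override that lets the function be tried against a stub instead of the real dvc binary.

```shell
# git_is_clean: succeed only inside a Git repository with no uncommitted changes.
git_is_clean() {
    status=$(git status --porcelain 2>/dev/null) || return 1
    [ -z "$status" ]
}

# gc_preview SCOPE_FLAG...: show what dvc gc would delete for the given scope,
# without deleting anything.
# Assumption: the installed DVC supports "dvc gc --dry".
# DVC_BIN is a hypothetical override; it defaults to "dvc".
gc_preview() {
    if [ "$#" -eq 0 ]; then
        echo "usage: gc_preview SCOPE_FLAG... (e.g. --workspace, --all-commits)" >&2
        return 2
    fi
    "${DVC_BIN:-dvc}" gc "$@" --dry
}

# Usage, run from the root of a DVC project:
#   git_is_clean || { echo "commit or stash changes first" >&2; exit 1; }
#   gc_preview --all-branches --all-tags
#   dvc gc --all-branches --all-tags      # only after reviewing the preview
```

The helper refuses to run without a scope flag on purpose, mirroring the way dvc gc itself requires the scope to be stated explicitly.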
EXAMPLES
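dvc gc --workspace # Keep only data referenced by the current workspace
dvc gc --all-branches --force # Keep data referenced by all branch tips, no prompt
dvc gc --all-commits --cloud # Keep data referenced by any commit; also clean the default remote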
HISTORY
Introduced in DVC v0.21 (2019) by Iterative (iterative.ai). Since DVC 1.0 the command requires an explicit scope option, and later releases added experiment-aware collection; it is now a routine part of large-scale ML workflows.


