LinuxCommandLibrary

dvc-gc

Clean up unused data in DVC repository

TLDR

Garbage collect from the cache, keeping only versions referenced by the current workspace

$ dvc gc [[-w|--workspace]]

Garbage collect from the cache, keeping only versions referenced by branches, tags, and commits
$ dvc gc [[-a|--all-branches]] [[-T|--all-tags]] [[-A|--all-commits]]

Garbage collect from the cache, including the default cloud remote storage (if set)
$ dvc gc [[-A|--all-commits]] [[-c|--cloud]]

Garbage collect from the cache, including a specific cloud remote storage
$ dvc gc [[-A|--all-commits]] [[-c|--cloud]] [[-r|--remote]] [remote_name]

SYNOPSIS

dvc gc [OPTION]...

PARAMETERS

-w, --workspace
    Keep only cache files referenced by the current workspace.

-a, --all-branches
    Also keep cache files referenced by the tips of all Git branches.

-T, --all-tags
    Also keep cache files referenced by all Git tags.

-A, --all-commits
    Also keep cache files referenced by any Git commit.

-c, --cloud
    Also remove unused files from remote storage, not only from the local cache.

-r, --remote remote_name
    Name of the remote storage to clean when --cloud is used (the default remote if omitted).

-f, --force
    Skip the confirmation prompt.

-q, --quiet
    Suppress progress bars and non-error messages.

DESCRIPTION

The dvc gc command performs garbage collection for Data Version Control (DVC) projects. It removes unused files from the DVC cache directory (typically .dvc/cache), which stores large data files, models, and metrics hashed for versioning.

The DVC cache grows over time as new versions, experiments, and branches accumulate. Running dvc gc reclaims disk space by deleting cached objects that nothing in the selected scope references.
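
To see how much space a run reclaims, the cache size can be compared before and after. The following is a minimal sketch, assuming a POSIX shell and the default cache location .dvc/cache; cache_size_kb is a hypothetical helper name, not part of DVC.

```shell
#!/bin/sh
# Hypothetical helper: print the size of a cache directory in kilobytes.
# Falls back to the default DVC cache path (.dvc/cache) when no path is given.
cache_size_kb() {
    dir="${1:-.dvc/cache}"
    if [ -d "$dir" ]; then
        # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
        du -sk "$dir" | cut -f1
    else
        # No cache directory means nothing is stored yet.
        echo 0
    fi
}

# Typical use inside a DVC project (left commented out so the sketch
# deletes nothing when it is sourced):
# before=$(cache_size_kb)
# dvc gc --workspace --force
# after=$(cache_size_kb)
# echo "Reclaimed $((before - after)) KB"
```

The reported figure is disk usage as seen by du, so it may differ slightly from the sum of file sizes depending on the filesystem.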

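Recent DVC versions require at least one scope option (such as --workspace, --all-branches, --all-tags, or --all-commits); running dvc gc with none of them exits with an error instead of deleting anything. The scope decides what is kept: --workspace keeps only what the current workspace references, while --all-commits keeps everything referenced anywhere in the Git history. Only the local cache is touched unless --cloud is given, which also deletes from remote storage and calls for extra caution.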

It is useful in CI/CD pipelines and after pruning experiments with dvc exp gc. Commit changes first so that the data you want to keep is referenced before anything is deleted.

CAVEATS

Deletion is irreversible: back up the cache and commit to Git before running. Remote storage is not affected unless -c/--cloud is given. Run with a clean Git state to avoid surprises.
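
One way to follow the backup advice is to copy the cache directory aside before collecting. This is a rough sketch, assuming a POSIX shell, the default cache path .dvc/cache, and enough free disk space for a full copy; backup_cache and the backup directory name are hypothetical, not DVC features.

```shell
#!/bin/sh
# Hypothetical helper: copy a cache directory to a timestamped backup
# and print the backup path. Arguments (both optional):
#   $1  cache directory to copy (default: .dvc/cache)
#   $2  existing directory in which to create the backup (default: .)
backup_cache() {
    src="${1:-.dvc/cache}"
    dest_root="${2:-.}"
    if [ ! -d "$src" ]; then
        echo "no cache directory at $src" >&2
        return 1
    fi
    dest="$dest_root/dvc-cache-backup-$(date +%Y%m%d-%H%M%S)"
    # -R copies recursively, -p preserves modes and timestamps.
    cp -Rp "$src" "$dest" && printf '%s\n' "$dest"
}

# Typical use (commented out so sourcing the sketch has no side effects):
# backup_cache && dvc gc --workspace
```

Because the gc step runs only if the copy succeeds, a failed backup leaves the cache untouched.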

EXAMPLES

dvc gc -w # Keep only data used by the current workspace
dvc gc -a -T -f # Keep data used by any branch or tag, no prompt
dvc gc -A -c # Keep data used by any commit; also clean the default remote

HISTORY

Introduced in DVC v0.21 (2019) by Iterative.ai. Evolved with experiment support in v1.x; now integral for large-scale ML workflows.

SEE ALSO

git gc(1), dvc exp gc(1)
