dvc-fetch
Download data or models tracked by DVC
TLDR
Fetch the latest changes from the default remote upstream repository (if set)
Fetch changes from a specific remote upstream repository
Fetch the latest changes for one or more specific targets
Fetch changes for all branches and tags
Fetch changes for all commits
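As a sketch, the five cases above map onto the invocations below. The wrapper function names are hypothetical and exist only to keep each case separate; the remote name and targets are whatever the caller passes in. Only the dvc fetch options themselves (--remote, --all-branches, --all-tags, --all-commits) are real.

```shell
# Hypothetical wrappers, one per TLDR case; each body is a plain dvc fetch call.

# Default remote (if one is set)
fetch_default() { dvc fetch; }

# A specific remote, e.g. fetch_from_remote myremote
fetch_from_remote() { dvc fetch --remote "$1"; }

# One or more targets, e.g. fetch_targets data.dvc train
fetch_targets() { dvc fetch "$@"; }

# Every branch and tag
fetch_branches_and_tags() { dvc fetch --all-branches --all-tags; }

# Every commit
fetch_all_commits() { dvc fetch --all-commits; }
```

For example, fetch_from_remote myremote runs dvc fetch --remote myremote, where myremote is a placeholder for a name shown by dvc remote list.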
SYNOPSIS
dvc fetch [-h] [-q | -v] [-j JOBS] [-r REMOTE] [-a] [-T] [-A] [-d] [-R] [--run-cache] [TARGETS ...]
PARAMETERS
--help (-h)
Show help message and exit.
--quiet (-q)
Suppress non-essential output.
--verbose (-v)
Display more detailed output.
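--remote REMOTE (-r)
Name of the DVC remote to fetch from (default: the remote configured as default).
--all-branches (-a)
Fetch data for all Git branches, not just the current workspace.
--all-tags (-T)
Fetch data for all Git tags.
--all-commits (-A)
Fetch data for all Git commits.
--with-deps (-d)
Also fetch data for the stages that the given targets depend on.
--recursive (-R)
Search directory targets recursively for stages and .dvc files to fetch.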
--jobs JOBS (-j)
Number of parallel download jobs (default: 4 * CPU count; can be changed per remote with the jobs option of dvc remote modify).
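--force (-f)
Not an option of dvc fetch. The -f flag belongs to dvc pull and dvc checkout, where it suppresses the prompt before removing or overwriting workspace files.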
--run-cache
Also fetch the run cache (the history of stage runs) from the remote.
--with-stats
Not among the documented dvc fetch options. On completion the command reports the number of fetched files as plain text.
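TARGETS
Optional. Limits what to fetch: .dvc files, stage names, dvc.yaml files, directories, or individual tracked outputs. Without targets, everything referenced by the dvc.yaml and .dvc files in the current workspace is fetched.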
DESCRIPTION
The dvc fetch command downloads data from the configured DVC remote storage into the local DVC cache directory (.dvc/cache by default). It updates the cache without modifying workspace files or .dvc metadata files. This is useful for prefetching large datasets separately from checkout, for example to warm a CI/CD cache before the workspace is populated.
Specify targets such as pipeline stages, .dvc files, directories, or outputs to fetch specific data. Without targets, it fetches all data referenced in the current workspace; add -a, -T, or -A to extend this to all branches, all tags, or all commits. Downloaded files are identified by their hashes, and -r selects which remote to use when several are configured.
Unlike dvc pull, it skips the workspace update, making it faster for cache-only syncs; dvc pull is equivalent to dvc fetch followed by dvc checkout. This makes it a good fit for shared environments where data is versioned but not always checked out.
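The split between cache and workspace can be sketched as a two-step equivalent of dvc pull. The function name fetch_then_checkout is hypothetical; dvc fetch and dvc checkout are the real commands, and the targets are whatever the caller passes.

```shell
# Hypothetical helper: two-step equivalent of `dvc pull` for the given targets.
fetch_then_checkout() {
    # Step 1: download into the local cache only; the workspace is untouched.
    dvc fetch "$@" || return 1
    # Step 2: place the cached files into the workspace.
    dvc checkout "$@"
}
```

Running the two steps separately lets a CI job fetch early (for instance in a step whose result is cached) and check out only when the data is actually needed.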
CAVEATS
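Requires an initialized DVC repository and a configured remote (dvc remote add). Does not update the workspace or .dvc files; use dvc checkout or dvc pull for that. Files already present in the local cache are skipped, so a repeated fetch only downloads what is missing. Large datasets may require significant disk space, since fetched data stays in the cache until it is removed with dvc gc.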
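A minimal sketch of guarding against the missing-remote case, assuming that empty output from dvc remote list means no remote is configured. The function name fetch_if_remote_configured is hypothetical.

```shell
# Hypothetical helper: refuse to fetch when no remote is configured.
fetch_if_remote_configured() {
    # `dvc remote list` prints one line per configured remote.
    if [ -z "$(dvc remote list)" ]; then
        echo "no DVC remote configured; add one with 'dvc remote add'" >&2
        return 1
    fi
    dvc fetch "$@"
}
```

Called as fetch_if_remote_configured data.dvc, it either runs the fetch or returns a non-zero status with a message on stderr.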
TARGETS EXAMPLES
dvc fetch model.dvc fetches the data tracked by a specific .dvc file.
dvc fetch -A fetches the data referenced in every Git commit.
dvc fetch train fetches the outputs of the pipeline stage named train.
CACHE LOCATION
Data is stored in .dvc/cache by default, in files named after the MD5 hash of their content, which provides deduplication and integrity verification.
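To check where the cache actually lives and how much space fetched data occupies, something like the following can be used. It assumes that dvc cache dir, called without a value, prints the current cache directory; the function name cache_usage is hypothetical.

```shell
# Hypothetical helper: print the disk usage of the DVC cache directory.
cache_usage() {
    # Ask DVC for the cache path rather than hard-coding .dvc/cache,
    # since the location can be changed in the config.
    cache_dir="$(dvc cache dir)" || return 1
    du -sh "$cache_dir"
}
```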
HISTORY
Introduced in DVC v0.21 (2018) by Iterative.ai to separate cache fetching from checkout. It later gained multi-remote support (v1.0, 2020) and parallel jobs (v1.10, 2020), and is now integral to reproducible ML pipelines.
SEE ALSO
dvc pull(1), dvc push(1), dvc checkout(1), dvc remote(1)


