dolt-gc
Garbage collect dolt repository data
TLDR
Clean up unreferenced data from the repository
dolt gc
Initiate a faster but less thorough garbage collection process
dolt gc --shallow
SYNOPSIS
dolt gc [--shallow]
PARAMETERS
--shallow
Perform a faster but less thorough garbage collection
DESCRIPTION
dolt gc performs garbage collection on a Dolt database repository: it searches the repository for data that is no longer referenced and removes it.
This reduces the repository's size on disk and reclaims space after operations that leave data unreferenced, such as branch deletions.
It plays the same role as git gc, but operates on Dolt's own storage format rather than on Git packfiles. Run it periodically or when disk usage grows unexpectedly.
Warning: This is a destructive, non-atomic operation that rewrites the object store. Interruptions may corrupt the repo; backups are essential for production use. It scans all objects, so large repos take time proportional to size.
CAVEATS
Destructive; backups required. Unsafe to interrupt. Incompatible with concurrent Dolt operations. No-op if no garbage found.
USAGE EXAMPLE
dolt gc --shallow
Runs the faster, less thorough collection.
WHEN TO RUN
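After deleting branches with dolt branch -D, or when the repository's disk usage has grown noticeably. As a scheduled job, a weekly run on non-production copies is a reasonable starting point.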
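To judge whether a run was worthwhile, one option is to compare the size of the repository's .dolt directory before and after collection. The following is a minimal sketch, not part of Dolt itself: the helper names dir_size_kb and gc_report are hypothetical, and it assumes the repository data lives in a .dolt directory under the repository root.

```shell
#!/bin/sh
# Sketch: report how much disk space a garbage collection pass reclaims.
# Assumption: repository data is stored in <repo>/.dolt.
# dir_size_kb and gc_report are illustrative helpers, not Dolt commands.

# Print the size of a directory in kilobytes (POSIX du -sk).
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Run a command inside a repo and print the before/after size of .dolt.
# Usage: gc_report <repo-dir> <command> [args...]
gc_report() {
    repo=$1
    shift
    if [ ! -d "$repo/.dolt" ]; then
        echo "no .dolt directory in $repo" >&2
        return 1
    fi
    before=$(dir_size_kb "$repo/.dolt")
    # Run the given command from the repository root; stop if it fails.
    (cd "$repo" && "$@") || return 1
    after=$(dir_size_kb "$repo/.dolt")
    echo "before=${before}K after=${after}K reclaimed=$((before - after))K"
}

# Example (requires dolt on PATH):
#   gc_report /path/to/repo dolt gc
```

Passing the command as arguments keeps the helper independent of any particular flag, so the same call works with dolt gc --shallow.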
HISTORY
Added in Dolt v0.20.0 (2020) by Liquidata (now DoltHub) to optimize Git-like data repos, evolving with Dolt's multi-table versioning.
SEE ALSO
git-gc(1)


