LinuxCommandLibrary

dolt-gc

Garbage collect dolt repository data

TLDR

Clean up unreferenced data from the repository

$ dolt gc

Initiate a faster but less thorough garbage collection process

$ dolt gc [-s|--shallow]

SYNOPSIS

dolt gc [-s|--shallow]

PARAMETERS

-s, --shallow
    Perform a faster but less thorough garbage collection pass

DESCRIPTION

dolt gc performs garbage collection on a Dolt database repository: it searches for data that is no longer referenced and no longer needed, and removes it.

This reduces repository size, improves clone/fetch performance, and reclaims disk space after operations such as branch deletions, merges, or reflog expiration.

It plays a role similar to git gc, but works on Dolt's own storage format rather than Git packfiles. Run it periodically or when disk usage grows unexpectedly.

Warning: This is a destructive, non-atomic operation that rewrites the repository's stored data. Interrupting it may corrupt the repository, so backups are essential for production use. It scans the stored data, so run time grows with repository size.
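
One way to see how much space a collection pass reclaims is to compare the size of the repository's .dolt directory before and after the run. The following is a minimal sketch, not an official tool: the helper names dir_size_kb and gc_report are hypothetical, and it assumes dolt is on the PATH.

```shell
#!/bin/sh
# Sketch: report how much disk space a garbage collection pass reclaims.
# dir_size_kb and gc_report are hypothetical helper names.

# dir_size_kb DIR: print the size of DIR in kilobytes (0 if it does not exist).
dir_size_kb() {
    if [ -d "$1" ]; then
        du -sk "$1" | awk '{print $1}'
    else
        echo 0
    fi
}

# gc_report REPO: run "dolt gc" inside REPO and print before/after sizes
# of its .dolt directory.
gc_report() {
    repo=$1
    before=$(dir_size_kb "$repo/.dolt")
    (cd "$repo" && dolt gc) || return 1
    after=$(dir_size_kb "$repo/.dolt")
    echo "before=${before}KB after=${after}KB reclaimed=$((before - after))KB"
}

# Example (hypothetical path):
#   gc_report /path/to/repo
```

If the reclaimed figure is close to zero, the repository likely had little unreferenced data to begin with.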

CAVEATS

Destructive; backups required. Unsafe to interrupt. Incompatible with concurrent Dolt operations. No-op if no garbage found.
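
Since a backup is advised before collecting, a plain directory copy is one simple way to take it. This is a sketch under assumptions: snapshot_repo is a hypothetical helper, the paths are placeholders, and it assumes no other process is writing to the repository while the copy runs.

```shell
#!/bin/sh
# Sketch: copy a repository to a timestamped backup directory before running gc.
# snapshot_repo is a hypothetical helper name.

# snapshot_repo REPO BACKUP_ROOT: copy REPO (including its .dolt directory)
# into BACKUP_ROOT/<name>-<timestamp> and print the destination path.
snapshot_repo() {
    repo=$1
    dest="$2/$(basename "$repo")-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$2" || return 1
    cp -R "$repo" "$dest" || return 1
    echo "$dest"
}

# Example (hypothetical paths): collect only if the copy succeeded.
#   snapshot_repo /srv/dolt/myrepo /srv/backups && (cd /srv/dolt/myrepo && dolt gc)
```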

USAGE EXAMPLE

dolt gc --shallow
Runs a faster, less thorough collection pass.

WHEN TO RUN

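Run after deleting branches (for example with dolt branch -D) or when the repository's on-disk size grows unexpectedly. For routine maintenance, a weekly cron job on non-production repositories is a reasonable starting point.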
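
A scheduled run could be wrapped in a small script that refuses to start while a previous run is still going. The sketch below is illustrative only: run_gc_locked, the script path, and the repository path are hypothetical, and the mkdir-based lock only guards against overlapping runs of this wrapper, not against other Dolt clients.

```shell
#!/bin/sh
# Sketch of a weekly maintenance wrapper (all paths are hypothetical).
# Example crontab entry, Sundays at 03:00:
#   0 3 * * 0 /usr/local/bin/dolt-gc-weekly.sh /srv/dolt/myrepo

# run_gc_locked REPO [CMD...]: run CMD (default: dolt gc) inside REPO while
# holding a simple mkdir-based lock, so two scheduled runs never overlap.
run_gc_locked() {
    repo=$1
    shift
    [ $# -gt 0 ] || set -- dolt gc
    lock="$repo/.gc.lock"
    if ! mkdir "$lock" 2>/dev/null; then
        echo "gc already running in $repo, skipping" >&2
        return 75
    fi
    status=0
    (cd "$repo" && "$@") || status=$?
    rmdir "$lock"
    return $status
}

# When invoked with a repository path, run the collection.
if [ $# -ge 1 ]; then
    run_gc_locked "$1"
fi
```

The optional CMD argument exists mainly so the locking logic can be exercised with a harmless command before pointing it at a real repository.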

HISTORY

Added in Dolt v0.20.0 (2020) by Liquidata (now DoltHub) to optimize Git-like data repos, evolving with Dolt's multi-table versioning.

SEE ALSO

git-gc(1), dolt-prune(1)
