LinuxCommandLibrary

git-gc

Optimize Git repository by garbage collecting

TLDR

Optimize the repository

$ git gc

Aggressively optimize (takes more time)
$ git gc --aggressive

Do not prune unreachable loose objects (pruning is on by default)
$ git gc --no-prune

Suppress all output
$ git gc --quiet

Display help
$ git gc --help

SYNOPSIS

git gc [--aggressive] [--auto] [--prune=<date> | --no-prune] [--force] [--quiet]

PARAMETERS

--aggressive
    Optimize the repository more aggressively to save disk space, potentially taking longer to complete.

--auto
    Run garbage collection only if certain internal thresholds (e.g., number of loose objects or packfiles) are exceeded. This is the default behavior when git gc is triggered by other commands.

--prune=<date>
    Prune unreachable loose objects that are older than the specified <date> (default: 2 weeks ago). Use --prune=now to prune all unreachable objects immediately, regardless of age.

--no-prune
    Do not prune any objects. Useful when you only want to repack without deleting.

--force
    Force git gc to run even if another git gc instance may already be running on this repository.

--quiet
    Suppress all progress and completion messages, running silently.

--shared
    Not an option documented for git gc in upstream Git. Read permissions for repositories shared among multiple users are governed by the core.sharedRepository setting instead (see git config).

DESCRIPTION

git gc performs garbage collection on your Git repository to reduce disk usage and improve performance. It packs loose objects into efficient packfiles and prunes unreachable objects from the object database. Along the way it cleans up other redundant data, including superseded packfiles, dangling objects, and expired reflog entries. It can be run manually, but other Git commands (such as git commit or git merge) also invoke it automatically in --auto mode, so housekeeping happens whenever certain thresholds are exceeded.

CAVEATS

Running git gc --aggressive can be time-consuming for large repositories.
With --auto, nothing is repacked unless a threshold is exceeded, so some loose objects may remain afterwards.
git gc never removes objects that are still reachable from a branch, a tag, the index, or a reflog entry. Unreachable objects are deleted only once they are older than the prune cutoff (two weeks by default, or the date given with --prune), so seemingly 'lost' commits usually remain recoverable for a while.
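
The effect of pruning on an unreachable object can be sketched in a throwaway repository. This is an illustrative example only: the temporary directory, the blob content, and the variable names are assumptions, and it uses git hash-object to create an object that no reference, reflog, or index entry points to.

```shell
set -eu

# Scratch repository in a temporary directory (hypothetical location)
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Write a blob that nothing references, so it is unreachable
oid=$(echo "orphaned content" | git hash-object -w --stdin)

# --no-prune repacks but leaves unreachable objects alone
git gc --quiet --no-prune
if git cat-file -e "$oid" 2>/dev/null; then
  kept_after_no_prune=yes
else
  kept_after_no_prune=no
fi

# --prune=now deletes unreachable objects regardless of age
git gc --quiet --prune=now
if git cat-file -e "$oid" 2>/dev/null; then
  kept_after_prune_now=yes
else
  kept_after_prune_now=no
fi

echo "present after --no-prune:  $kept_after_no_prune"
echo "present after --prune=now: $kept_after_prune_now"
```

In a real repository, --prune=now is best reserved for cases where no other process is writing to the repository, since it removes the grace period that protects objects still being written.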

AUTOMATIC GARBAGE COLLECTION (--auto)

When invoked with --auto, git gc first checks whether any housekeeping is needed and exits without doing any work if it is not. Two thresholds drive the check, both configurable via git config: gc.auto (default 6700 loose objects) and gc.autoPackLimit (default 50 packfiles). If the number of loose objects exceeds gc.auto, the loose objects are combined into a single pack; if the number of packfiles exceeds gc.autoPackLimit, the existing packs are consolidated into one. Setting gc.auto to 0 disables automatic garbage collection. This automatic behavior keeps the repository from accumulating excessive loose objects or packfiles without constant manual intervention.
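
As a rough sketch of how these thresholds can be inspected and overridden per repository, the following uses a scratch repository. The directory and the values 1000 and 20 are arbitrary assumptions chosen for illustration; gc.autoDetach is set to false here only so that the command stays in the foreground.

```shell
set -eu

# Scratch repository in a temporary directory (hypothetical location)
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Repository-local overrides of the two thresholds (example values)
git config gc.auto 1000
git config gc.autoPackLimit 20

auto_limit=$(git config --get gc.auto)
pack_limit=$(git config --get gc.autoPackLimit)
echo "gc.auto=$auto_limit gc.autoPackLimit=$pack_limit"

# Current loose-object and pack counts, to compare against the thresholds
git count-objects -v

# A no-op here: the scratch repository is far below both thresholds
git -c gc.autoDetach=false gc --auto
```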

OBJECT STORAGE CONCEPTS

Git stores objects (blobs, trees, commits, tags) either as loose objects or within packfiles. Loose objects are individual files for each object, convenient for quick writes and additions. Packfiles are compressed archives of multiple objects, highly efficient for storage and network transfer. git gc's primary role is to consolidate these loose objects into packfiles and remove obsolete or unreachable objects, thereby optimizing storage and retrieval performance within the repository.
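
The move from loose objects to a packfile can be observed with git count-objects. The sketch below assumes a throwaway repository with a single commit; the file name, commit message, and identity passed with -c are placeholders.

```shell
set -eu

# Scratch repository in a temporary directory (hypothetical location)
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# One commit creates a blob, a tree, and a commit object, all stored loose
echo "hello" > file.txt
git add file.txt
git -c user.name="Example" -c user.email="example@example.com" \
  commit --quiet -m "Initial commit"

# "count" is the number of loose objects; "packs" is the number of packfiles
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

git gc --quiet

loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs_after=$(git count-objects -v | awk '/^packs:/ {print $2}')

echo "loose objects before gc: $loose_before"
echo "loose objects after gc:  $loose_after"
echo "packfiles after gc:      $packs_after"
```

After the run, the reachable objects live in a packfile under .git/objects/pack and the loose copies are gone.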

HISTORY

git gc has been a fundamental command in Git since its early days, designed to maintain repository health and efficiency. As Git repositories grow in size and complexity, the importance of efficient garbage collection has led to continuous improvements in its automatic behaviors and configuration options, making it a robust maintenance tool.

SEE ALSO

git repack(1), git prune(1), git config(1), git fsck(1), git reflog(1)
