git-repack
Pack Git objects to reduce repository size
TLDR
Pack unpacked objects in the current repository
Also remove redundant objects after packing
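The two entries above correspond to git repack and git repack -d. A minimal sketch in a throwaway repository follows; the temporary directory, file name, and commit identity are placeholders chosen for illustration.

```shell
set -e

# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3; do
  echo "version $i" > file.txt
  git add file.txt
  git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "commit $i"
done

# Each commit adds a blob, a tree, and a commit object, all stored loose.
loose_before=$(git count-objects | cut -d' ' -f1)

# Pack the unpacked objects; the loose copies stay on disk.
git repack -q
pack_count=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')

# Repack with -d to also remove the now-redundant loose objects.
git repack -d -q
loose_after=$(git count-objects | cut -d' ' -f1)

echo "loose before: $loose_before, packs: $pack_count, loose after: $loose_after"
```

Plain git repack leaves the loose copies in place, which is why -d is usually added when the goal is to reclaim disk space.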
SYNOPSIS
git repack [-a] [-d] [-f] [-l] [-n] [-q] [-b] [-m] [--window=<n>] [--depth=<n>] [--unpack-unreachable=<when>]
PARAMETERS
-a
Instead of incrementally packing only the unpacked objects, pack everything referenced into a single pack. Usually combined with -d to remove the packs this makes redundant.
-d
After packing, remove existing packs that the new pack makes redundant, and run git prune-packed to remove redundant loose object files.
-f
Pass --no-reuse-delta to git pack-objects, so deltas are recomputed instead of being reused from existing packs.
-l, --local
Pass --local to git pack-objects, so objects borrowed from an alternate object store are not packed.
-n
Do not update the server information with git update-server-info.
-q, --quiet
Be quiet; suppress progress output.
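-b, --write-bitmap-index
Write a reachability bitmap index as part of the repack. This only makes sense together with -a, -A, or -m, because the bitmap must be able to refer to all reachable objects.
-m, --write-midx
Write a multi-pack index (MIDX) covering the non-redundant packs.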
--window=<n>
Set the size of the window of candidate objects that each object is compared against for delta compression (default 10). Larger values can improve compression but make packing slower and use more memory.
--depth=<n>
Set the maximum delta chain depth (default 50, maximum 4095). Deeper chains can save space but slow down reading objects, because more deltas must be applied to reconstruct them.
--unpack-unreachable=<when>
When loosening unreachable objects, do not bother loosening any objects older than <when>. This avoids writing out objects that a follow-up git prune would delete immediately.
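The options are independent and are normally combined. As a hedged illustration, the sketch below builds two incremental packs in a throwaway repository and then consolidates them with -a -d -f; the directory, file name, commit identity, and the --window/--depth values are arbitrary assumptions, not recommendations.

```shell
set -e

# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical helper: append a line to notes.txt and commit it.
make_commit() {
  echo "$1" >> notes.txt
  git add notes.txt
  git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "$1"
}

# Two incremental repacks leave two separate packfiles behind.
make_commit "first"
git repack -q
make_commit "second"
git repack -q
packs_before=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')

# -a packs everything referenced into one pack, -d removes the packs and
# loose objects that become redundant, -f recomputes deltas, and
# --window/--depth tune the delta search (values here are arbitrary).
git repack -a -d -f -q --window=50 --depth=20
packs_after=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')
loose_after=$(git count-objects | cut -d' ' -f1)

echo "packs before: $packs_before, after: $packs_after, loose: $loose_after"
```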
DESCRIPTION
git-repack is a maintenance command that optimizes a Git repository's storage. It consolidates loose objects (individual files representing Git objects like commits, trees, or blobs) into efficient, compressed "packfiles." Packfiles improve performance by reducing the number of files on disk and enabling delta compression, which stores objects as differences from other objects.
This process significantly reduces disk space usage and speeds up Git operations such as cloning and object lookup. While often invoked automatically by git gc (garbage collection), git-repack can be run manually for fine-grained control over the packing process, allowing users to aggressively compress or rearrange objects within the repository to further optimize performance and disk footprint.
CAVEATS
Can be computationally intensive and time-consuming for very large repositories.
Typically not needed manually, as git gc handles repacking automatically as part of its garbage collection process.
Aggressive compression settings (e.g., very high --window or --depth) significantly increase processing time and memory usage.
PACKFILE MECHANICS
Git stores its objects (blobs, trees, commits, tags) efficiently in packfiles. A packfile is a single, often highly compressed file that contains multiple Git objects. Objects are typically stored as deltas (differences) against other objects, minimizing space. Each packfile has an accompanying index (.idx) file that allows Git to quickly locate any object within the pack. This structure is key to Git's performance and low disk usage for large repositories.
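To make the pack and index pairing concrete, the following sketch packs a small throwaway repository and inspects the result with git verify-pack. The directory, file name, and commit identity are placeholders, and the exact sizes and delta chains shown will vary by Git version.

```shell
set -e

# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
seq 1 500 > data.txt
for i in 1 2 3; do
  echo "change $i" >> data.txt
  git add data.txt
  git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "commit $i"
done
git repack -a -d -q

# Each pack-*.pack has a matching pack-*.idx next to it.
pack=$(ls .git/objects/pack/*.pack)
idx="${pack%.pack}.idx"
ls -l "$pack" "$idx"

# verify-pack -v prints one line per object: name, type, size, size in the
# pack, offset, and for deltified objects the chain depth and base object.
listing=$(git verify-pack -v "$idx")
echo "$listing"

# Count only the per-object lines (they start with a hex object name).
object_lines=$(echo "$listing" | grep -cE '^[0-9a-f]{40,64} ')
echo "objects in pack: $object_lines"
```

Lines that end with a depth and a base object name are stored as deltas; the others are stored whole.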
TYPICAL USAGE SCENARIOS
Although git gc usually handles repacking, manual git repack can be beneficial in specific scenarios:
- After a large import or a series of operations that generated many loose objects.
- When disk space is at a premium and a more aggressive optimization is desired than what git gc might provide by default.
- For performance tuning, such as before pushing a repository to a server: Git reuses deltas from existing packs when it builds the pack to transfer, so a well-packed repository can make that pack cheaper to produce.
- When experimenting with compression parameters like --window and --depth to find an optimal balance between compression and packing time.
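The last scenario can be sketched as a loop that repacks with several --window values and records the resulting pack size. This is only an illustration under assumptions: the repository is a generated throwaway, the window values are arbitrary, and on a repository this small the reported sizes will likely be identical.

```shell
set -e

# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
seq 1 2000 > data.txt
for i in 1 2 3 4 5; do
  echo "revision $i" >> data.txt
  git add data.txt
  git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "revision $i"
done

results=""
for window in 5 10 50; do
  # -f recomputes deltas so each setting is measured from scratch.
  git repack -a -d -f -q --window="$window" --depth=50
  # size-pack is the disk space used by packs, reported in KiB.
  size=$(git count-objects -v | awk '/^size-pack:/ {print $2}')
  results="${results}window=${window} size-pack=${size}KiB
"
done
printf '%s' "$results"
```

On a real repository, wrapping the git repack call in time gives the other half of the trade-off: how much longer each setting takes to pack.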
HISTORY
Git initially stored all objects as individual loose files. As repositories grew, this became inefficient in terms of disk space and performance. To address this, Git introduced "packfiles" to store multiple objects in a single, compressed file, leveraging delta compression. git-repack was developed as the primary utility to create and manage these packfiles, consolidating loose objects and optimizing existing packs. Its evolution is fundamental to Git's efficient storage and performance over time.
SEE ALSO
git-gc(1), git-prune(1), git-verify-pack(1), git-count-objects(1)