LinuxCommandLibrary

git-hash-object

Calculate object ID and optionally store object

TLDR

Compute the object ID without storing it

$ git hash-object [path/to/file]

Compute the object ID and store it in the Git database
$ git hash-object -w [path/to/file]

Compute the object ID specifying the object type
$ git hash-object -t [blob|commit|tag|tree] [path/to/file]

Compute the object ID from stdin
$ cat [path/to/file] | git hash-object --stdin
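To see what storing an object actually does, the following minimal sketch writes a blob and reads it back with git cat-file. The throwaway repository and the sample content are assumptions made for illustration only.

```shell
# Work in a throwaway repository so no real project is touched.
cd "$(mktemp -d)" && git init -q

# Store the content as a blob; the object ID is printed on stdout.
id=$(printf 'test content\n' | git hash-object -w --stdin)

# Read the object back by its ID.
type=$(git cat-file -t "$id")      # object type
content=$(git cat-file -p "$id")   # object content

echo "$id $type $content"
```

Nothing references the new blob yet: it is not in the index and no commit points to it.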

SYNOPSIS

git hash-object [-t <type>] [-w] [--path=<path> | --no-filters] [--stdin [--literally]] [--] <file>...
git hash-object [-t <type>] [-w] --stdin-paths [--no-filters]

PARAMETERS

-w
    Write the object into the object database. Without this option, the command only outputs the hash to standard output.

--stdin
    Read the object's content from standard input instead of from a file specified on the command line.

--stdin-paths
    Read file names from standard input, one per line, instead of from the command line, and compute (and with -w, write) the object ID of each file. One ID is printed per input path.

--literally
    Allow --stdin to hash content that would not otherwise pass Git's standard object parsing or git fsck checks. This is intended for stress-testing Git itself or reproducing corrupt objects, not for normal use. It does not change whether the input is treated as content or as a path.

-t <type>
    Specify the type of object. Valid types are blob (default), tree, commit, and tag.

--path=<path>
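    Hash the content as if it were a file at the specified <path>. The path does not enter the hash directly; it only selects which input filters (e.g., end-of-line conversion or a custom clean filter) defined in .gitattributes are applied before hashing. This is useful for content read from standard input or for temporary files outside the working tree.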

--no-filters
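    Hash the content exactly as-is, ignoring any input filter that the attributes mechanism would have selected, including end-of-line conversion. Cannot be combined with --path. When content is read from standard input, this is implied unless --path is given.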

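--no-replace-object
    Not an option of git hash-object. The similarly named top-level option git --no-replace-objects controls whether replacement refs are honored when objects are read; it does not affect how an ID is computed. Writing content whose object already exists changes nothing, because identical content always yields the same ID.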

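<file>...
    One or more files whose content should be hashed. Cannot be given together with --stdin-paths. May be combined with --stdin, in which case the standard input content is hashed in addition to the named files.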
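As an illustration of two of the options above, the sketch below first feeds paths through --stdin-paths and then shows how --path changes the result when a .gitattributes rule applies. The file names and the `*.txt text` rule are hypothetical examples, not defaults.

```shell
# Work in a throwaway repository.
cd "$(mktemp -d)" && git init -q

# Hypothetical sample files.
printf 'one\n' > a.txt
printf 'two\n' > b.txt

# --stdin-paths: one path per line in, one object ID per line out.
ids=$(printf '%s\n' a.txt b.txt | git hash-object --stdin-paths)
echo "$ids"

# Mark *.txt as text so CRLF is normalized to LF when content is hashed.
printf '*.txt text\n' > .gitattributes

lf=$(printf 'hello\n' | git hash-object --stdin)
# No path given: the CRLF bytes are hashed as-is.
raw=$(printf 'hello\r\n' | git hash-object --stdin)
# Treated as c.txt: the text rule applies, so CRLF becomes LF first.
filtered=$(printf 'hello\r\n' | git hash-object --stdin --path=c.txt)

echo "lf=$lf raw=$raw filtered=$filtered"
```

The file named by --path does not need to exist; only its name is used to look up attributes.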

DESCRIPTION

git hash-object is a low-level "plumbing" command that computes the object ID of a file or of content read from standard input, and optionally writes the resulting object to the Git object database. The ID is a SHA-1 hash by default, or SHA-256 in repositories created with that object format. The command is useful for understanding how Git stores objects and appears mostly in scripts or manual object manipulation rather than in day-to-day work. It accepts an object type (blob, tree, commit, tag) and applies the input filters selected by .gitattributes.
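For a blob, the object ID is the hash of a short header (the type, a space, the content length in bytes, and a NUL byte) followed by the content. The sketch below reproduces the ID by hand. It assumes the SHA-1 object format and that the sha1sum utility is installed; the sample content is arbitrary.

```shell
# Work in a throwaway repository.
cd "$(mktemp -d)" && git init -q

content='hello'

# ID as Git computes it for "hello" plus a trailing newline.
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

# The same digest by hand: "blob 6", a NUL byte, then the 6 content bytes
# (5 letters plus the newline).
manual_id=$(printf 'blob 6\0%s\n' "$content" | sha1sum | cut -d' ' -f1)

echo "$git_id"
echo "$manual_id"
```

Both lines should print the same ID, which is why identical content always maps to the same object.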

CAVEATS

This is a "plumbing" command, designed for scripts and Git internals rather than everyday use. Writing with -w only adds an object to the database; it does not update the index, any ref, or the working tree, so the stored object stays unreferenced (and eligible for eventual garbage collection) until something points to it. With --literally the usual format checks are skipped, which makes it possible to store malformed objects.
With --stdin, Git reads until end of input and hashes every byte it receives, including any trailing newline. Make sure the producing process closes the stream and emits exactly the bytes intended.
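Because every byte read from standard input is hashed, a trailing newline is enough to change the ID. A small sketch, with arbitrary sample content:

```shell
# Work in a throwaway repository.
cd "$(mktemp -d)" && git init -q

with_newline=$(printf 'hello\n' | git hash-object --stdin)
without_newline=$(printf 'hello' | git hash-object --stdin)

# echo appends a newline, so it matches the first form, not the second.
via_echo=$(echo 'hello' | git hash-object --stdin)

echo "$with_newline"
echo "$without_newline"
echo "$via_echo"
```

When an ID does not match what a script expects, a stray or missing newline is a common cause worth checking first.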

OBJECT TYPES IN GIT

Git stores four types of objects: blobs (file contents), trees (directory listings that map names to blobs and other trees), commits (a snapshot of the repository at a point in time, linking to one tree and to parent commits), and tags (annotated tag objects that name another object, usually a commit, and are often used for releases). git hash-object creates blobs by default. With -t it hashes the content as another type, but the content must already be a well-formed object of that type unless --literally is given.
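The object type is part of what gets hashed, so the same bytes produce different IDs under different types. The sketch below hashes empty content as a blob and as a tree, then shows that content which is not a valid tree is rejected. The ID values in the comments assume the SHA-1 object format.

```shell
# Work in a throwaway repository.
cd "$(mktemp -d)" && git init -q

# Empty content hashed as two different types.
empty_blob=$(git hash-object -t blob /dev/null)   # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 with SHA-1
empty_tree=$(git hash-object -t tree /dev/null)   # 4b825dc642cb6eb9a060e54bf8d69288fbee4904 with SHA-1

echo "$empty_blob"
echo "$empty_tree"

# Arbitrary text is not a well-formed tree, so Git refuses to hash it.
if printf 'junk' | git hash-object -t tree --stdin 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi
echo "malformed tree rejected: $rejected"
```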

PLUMBING VS. PORCELAIN COMMANDS

git hash-object is considered a 'plumbing' command. Git commands are broadly categorized into 'porcelain' (user-friendly, high-level commands like git commit, git branch) and 'plumbing' (low-level, internal commands like git hash-object, git cat-file). Plumbing commands are the fundamental building blocks upon which the more convenient porcelain commands are constructed, making them crucial for understanding Git's internals and for advanced scripting.
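One way to see the relationship is to compare the blob ID that git hash-object reports with the one the porcelain command git add records in the index. This is a minimal sketch; hello.txt is a hypothetical file name.

```shell
# Work in a throwaway repository.
cd "$(mktemp -d)" && git init -q
printf 'hello\n' > hello.txt

# Plumbing: compute the blob ID directly, without storing anything.
plumbing_id=$(git hash-object hello.txt)

# Porcelain: stage the file, then read the blob ID recorded in the index.
git add hello.txt
porcelain_id=$(git rev-parse :hello.txt)

echo "$plumbing_id"
echo "$porcelain_id"
```

The two IDs match because git add performs the same hashing step before it updates the index.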

HISTORY

The git hash-object command has been a core part of Git since its early days, serving as one of the fundamental "plumbing" commands that operate directly on the object database. It was essential for Linus Torvalds' initial design, allowing Git to store and address content by its hash. Its existence highlights Git's content-addressable storage model, where data integrity is inherently linked to content hashing. It has remained relatively stable over time, reflecting its foundational role.

SEE ALSO
