hashdeep

Compute and compare file hashes

SYNOPSIS

hashdeep [OPTIONS] [FILE_OR_DIRECTORY...]
hashdeep -k hashfile [OPTIONS] [FILE_OR_DIRECTORY...]

PARAMETERS

-r
    Recursively process all files in the specified directories.

-k hashfile
    Load known hashes from hashfile. Required by audit mode (-a) and the matching modes (-m, -x).

-x
    Negative matching mode. Requires -k. Display only the input files whose hashes do not appear in the list of known hashes.

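-a
    Audit mode. Requires -k. Compare every input file against the known hashes and report whether the audit passed or failed. New files, missing files, and hash collisions cause the audit to fail; add -v for details.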

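-c algo[,algo...]
    Compute mode. Hash each file with the listed algorithms, given as a comma-separated list (e.g., -c md5,sha256). Supported values are md5, sha1, sha256, tiger, and whirlpool; the default is md5 and sha256.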

-j count
    Use count threads for hashing, which can speed up processing on multi-core systems. A count of 0 disables multithreading.

-l
    Print relative file paths, as given on the command line, instead of absolute paths.

-m
    Positive matching mode. Requires -k. Display only the input files whose hashes appear in the list of known hashes.

-M
    Same as -m, but also display the hashes of each matching file. (-X does the same for -x.)

-s
    Silent mode. Suppress all error messages.

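-v
    Verbose mode. In audit mode, one -v prints the number of files in each category, a second lists every discrepancy, and a third lists the result for every file examined.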

-d
    Output in DFXML (Digital Forensics XML) format. The default output is already comma-separated, so no separate CSV switch is needed.

-p size
    Piecewise mode. Break each file into chunks of size bytes and hash each chunk separately. Size accepts the multipliers b, k, m, g, t, p, and e.

-b
    Bare mode. Strip leading directory information from the displayed file names.

-o types
    Expert mode. Process only the listed file types, each given as one letter (e.g., f for regular files, l for symbolic links). Directory traversal is still controlled by -r.

-W filename
    Write output to filename instead of standard output.

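-i size
    Size threshold mode. Hash only files smaller than size; larger files are skipped. Size accepts the multipliers b, k, m, g, t, p, and e (e.g., 10m).

-f filename
    Read the list of files to hash from filename, one per line.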

md5, sha1, sha256, tiger, whirlpool
    Algorithm names accepted by -c. hashdeep selects algorithms only through -c and has no per-algorithm long options. To hash with a single algorithm, the companion programs md5deep, sha1deep, sha256deep, tigerdeep, and whirlpooldeep are also available.

-h
    Display a help screen and exit.

-V
    Display the version number and exit.

DESCRIPTION

hashdeep is a powerful command-line utility for recursively computing and verifying cryptographic checksums (hashes) of files and directories. It is part of the hashdeep package, which also includes specialized tools like md5deep and sha1deep.

Capable of generating hash lists for entire directory structures, hashdeep supports multiple algorithms: MD5, SHA-1, SHA-256, Tiger, and Whirlpool, with MD5 and SHA-256 computed by default. This makes it useful for digital forensic analysis, data integrity verification, and identifying duplicate files.

It can output hash lists in various formats and, critically, can take a previously generated list of hashes as input to verify the integrity of the data at a later time, reporting any discrepancies or missing files. Its recursive nature and ability to process large datasets efficiently make it a go-to tool for large-scale data integrity checks.
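The generate-then-verify workflow described above can be driven from a script. The following is a minimal sketch, not a reference implementation: it assumes hashdeep is installed and on PATH, and the function names (baseline_cmd, audit_cmd, run_workflow) are hypothetical helpers introduced only for illustration. It uses the default algorithms and relies on -r, -l, -k, -a, and -v as documented by hashdeep.

```python
import shutil
import subprocess
from typing import List, Tuple


def baseline_cmd(directory: str) -> List[str]:
    """Arguments that hash a directory tree recursively (-r) with relative paths (-l)."""
    return ["hashdeep", "-r", "-l", directory]


def audit_cmd(directory: str, known_file: str) -> List[str]:
    """Arguments that audit (-a) a tree against known hashes (-k), listing discrepancies (-vv)."""
    return ["hashdeep", "-r", "-l", "-a", "-vv", "-k", known_file, directory]


def run_workflow(directory: str, known_file: str) -> Tuple[int, str]:
    """Write a baseline hash list, then audit the same tree against it.

    Assumptions: known_file lives outside `directory` (otherwise it would show
    up as a new file), and both steps run from the same working directory so
    that the relative paths line up.
    """
    if shutil.which("hashdeep") is None:
        raise RuntimeError("hashdeep was not found on PATH")
    with open(known_file, "w") as out:
        subprocess.run(baseline_cmd(directory), stdout=out, check=True)
    result = subprocess.run(
        audit_cmd(directory, known_file), capture_output=True, text=True
    )
    return result.returncode, result.stdout + result.stderr
```

In practice the two steps are usually separated in time: the baseline is written once and stored somewhere trusted, and the audit is run later against it.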

CAVEATS

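hashdeep has no dedicated option for controlling symbolic links during recursion. To restrict processing to particular file types, use expert mode: for example, -o f processes only regular files. Check the scope of a recursive run before relying on its output, since links into other directory trees can significantly change the set of files processed.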

Audit and matching modes load the entire list of known hashes (-k) into memory, so very large hash sets can consume substantial memory.

The accuracy of integrity checks relies on the cryptographic strength of the chosen hash algorithms. For robust security, SHA-256 is recommended over MD5 or SHA-1 for new applications.

OUTPUT FORMATS

By default, hashdeep prints a header followed by one comma-separated entry per file: the file size, one hash per selected algorithm, and the file name. The header consists of lines beginning with %%%% that identify the format version and name the columns, plus comment lines beginning with ## that record the working directory and the command line. This default output is machine-parseable and is the format that -k reads back. The -d option switches the output to DFXML. The -b option strips directory information from file names, and -l prints relative rather than absolute paths. Choosing among these depends on whether the output will be audited later with -k or consumed by other tools.
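For scripts that consume hashdeep's default output, a small parser is often enough. The sketch below assumes the default layout: header lines starting with %%%%, one of which lists the column names, comment lines starting with ##, and comma-separated data lines with the file name last. The function name parse_hashdeep and the sample text are illustrative only.

```python
from typing import Dict, Iterable, Iterator, List


def parse_hashdeep(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per file entry, keyed by the column names in the header."""
    columns: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # blank line or comment
        if line.startswith("%%%%"):
            body = line[4:].strip()
            if "," in body:  # the column-name line, e.g. size,md5,sha256,filename
                columns = body.split(",")
            continue
        if not columns:
            raise ValueError("data line seen before the column header")
        # Split only len(columns) - 1 times so commas inside file names survive.
        fields = line.split(",", len(columns) - 1)
        if len(fields) != len(columns):
            raise ValueError(f"malformed line: {line!r}")
        yield dict(zip(columns, fields))


# Illustrative sample: the hashes are those of an empty file.
SAMPLE = """%%%% HASHDEEP-1.0
%%%% size,md5,sha256,filename
## Invoked from: /tmp
## $ hashdeep -r data
##
0,d41d8cd98f00b204e9800998ecf8427e,e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855,/tmp/data/empty,file.txt
"""

records = list(parse_hashdeep(SAMPLE.splitlines()))
```

Limiting the number of splits keeps a file name such as "empty,file.txt" intact, since the file name is always the last column.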

VERIFICATION PROCESS

In audit mode (-a together with -k hashfile), hashdeep loads the known hashes and compares them against the files given on the command line. On its own, -a prints only whether the audit passed or failed. With -v, it reports the number of files in each category: files matched, files partially matched, files moved, new files found, and known files not found. Repeating -v lists each discrepancy, and a third -v lists the result for every file. This reporting helps detect modified files, missing data, and unexpected new files. The matching modes use the same known hashes to filter the input instead: -m lists files that match a known hash, and -x lists files that do not.
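The comparison of a known hash list against the current files can be sketched in a few lines. This is a simplified model and not hashdeep's actual implementation: it uses a single algorithm (SHA-256 via Python's hashlib), works on in-memory data, ignores partial matches, and treats any moved, new, or missing file as a failure, which is an assumption of this sketch. The names fingerprint and audit are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (size in bytes, hex digest)


def fingerprint(data: bytes) -> Entry:
    """Size and SHA-256 digest of some file content."""
    return len(data), hashlib.sha256(data).hexdigest()


def audit(known: Dict[str, Entry], current: Dict[str, Entry]) -> Dict[str, List[str]]:
    """Classify current files against known entries (path -> Entry)."""
    report: Dict[str, List[str]] = {"matched": [], "moved": [], "new": [], "not_found": []}
    unused = dict(known)  # known entries not yet claimed by an input file
    remaining = []
    # Pass 1: same path and same content.
    for path in sorted(current):
        if unused.get(path) == current[path]:
            report["matched"].append(path)
            del unused[path]
        else:
            remaining.append(path)
    # Pass 2: same content under a different path counts as moved;
    # anything else is new (a modified file also lands here).
    for path in remaining:
        old = next((p for p in sorted(unused) if unused[p] == current[path]), None)
        if old is not None:
            report["moved"].append(path)
            del unused[old]
        else:
            report["new"].append(path)
    # Known entries that no input file claimed.
    report["not_found"] = sorted(unused)
    return report


known = {
    "a.txt": fingerprint(b"alpha"),
    "b.txt": fingerprint(b"beta"),
    "c.txt": fingerprint(b"gamma"),
}
current = {
    "a.txt": fingerprint(b"alpha"),        # unchanged
    "moved/b.txt": fingerprint(b"beta"),   # same content, new location
    "c.txt": fingerprint(b"GAMMA"),        # modified in place
    "d.txt": fingerprint(b"delta"),        # not in the known list
}
report = audit(known, current)
passed = not (report["moved"] or report["new"] or report["not_found"])
```

Note that a file modified in place appears twice in this model: once as a new file, because its current hash is unknown, and once as a known file not found, because its recorded hash was never matched.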

HISTORY

The hashdeep command is a successor to a suite of deep-hashing tools, originally known as md5deep, sha1deep, etc. Developed by Jesse Kornblum, these tools were designed primarily for digital forensic purposes, allowing investigators to quickly and reliably calculate cryptographic hashes of files and entire directory structures. The hashdeep command was introduced to unify the functionality of these separate tools into a single, more versatile utility capable of supporting multiple hashing algorithms simultaneously, thereby streamlining integrity verification and data analysis workflows.

SEE ALSO

md5sum(1), sha1sum(1), sha256sum(1), sha512sum(1), cksum(1), find(1), xargs(1)
