hashdeep
Compute and compare file hashes
SYNOPSIS
hashdeep [OPTIONS] [FILE_OR_DIRECTORY...]
hashdeep -k hashfile [OPTIONS] [FILE_OR_DIRECTORY...]
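The two synopsis forms correspond to generating a hash list and checking files against a saved one. The following is a minimal Python sketch that only assembles the two command lines. The helper names `build_compute_cmd` and `build_audit_cmd` and the paths `evidence/` and `known.txt` are hypothetical, and the flags used (-c, -r, -a, -k, -v) should be confirmed against the help output of the installed version.

```python
import shlex


def build_compute_cmd(target, algorithms=("md5", "sha256")):
    """Argument list that hashes `target` recursively with the given algorithms."""
    return ["hashdeep", "-c", ",".join(algorithms), "-r", target]


def build_audit_cmd(target, known_file, verbosity=1):
    """Argument list that audits `target` against a previously saved hash list."""
    cmd = ["hashdeep", "-r", "-a", "-k", known_file]
    cmd += ["-v"] * verbosity  # each additional -v requests more audit detail
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    print(shlex.join(build_compute_cmd("evidence/")))
    print(shlex.join(build_audit_cmd("evidence/", "known.txt", verbosity=2)))
```

Either list can be passed to `subprocess.run`; redirecting the first command's standard output to a file produces the hash list that the second command reads.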
PARAMETERS
-r
Recursively process all files in the specified directories.
-k hashfile
Load known hashes from hashfile. Required by the matching and audit modes. May be given more than once to load several sets of known hashes.
-x
Negative matching mode. Display only those input files whose hashes do not appear in the known hashes loaded with -k. Requires -k.
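-a
Audit mode. Compare every input file against the known hashes loaded with -k. The audit passes only if each input file matches exactly one known entry; new files, missing files, and collisions make it fail. The exit status is 0 if the audit passes and non-zero otherwise.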
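-c algo[,algo...]
Computation mode. Compute hashes using the comma-separated list of algorithms. Legal values are md5, sha1, sha256, tiger, and whirlpool; the default is md5 and sha256.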
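-j count
Use count hashing threads. By default one hashing thread is created per CPU core; -j0 disables multi-threading. With more than one thread, the order of the output lines is not deterministic.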
-l
Print relative file paths, as given on the command line, instead of absolute paths. Cannot be combined with -b.
-m
Positive matching mode. Display only those input files whose hashes appear in the known hashes loaded with -k. Requires -k.
-M
Like -m, but also display the hashes of each matching file. -X is the corresponding variant of -x.
-s
Silent mode. Suppress all error messages.
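-v
Verbose mode. May be repeated. When checking files against known hashes, one -v prints the number of files in each category, a second lists each discrepancy, and a third reports on every file examined.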
-d
Output in DFXML (Digital Forensics XML) format instead of the default comma-separated listing.
-p size
Piecewise mode. Break each file into blocks of size bytes and hash each block separately. size may take a multiplier suffix: b, k, m, g, t, p, or e.
-b
Bare mode. Strip leading directory information from the file names displayed. Cannot be combined with -l.
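-o types
Expert mode. Process only files of the listed types: f (regular files), b (block devices), c (character devices), p (named pipes), l (symbolic links), s (sockets), d (Solaris doors), e (Windows PE executables).
-W filename
Write output to filename instead of standard output.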
-i size
Size threshold mode. Hash only files smaller than size. size may take a multiplier suffix: b, k, m, g, t, p, or e.
-f filename
Read the list of files to hash from filename, one file name per line.
-w
In positive matching mode, display the file name of the known hash that matched each input file.
-e
Display a progress indicator and an estimate of the time remaining for each file being processed.
-h
Display help message and exit.
-V
Display version information and exit.
DESCRIPTION
hashdeep is a command-line utility for computing and verifying cryptographic hashes of files, optionally recursing through directories. It is part of the hashdeep package (known upstream as md5deep), which also includes single-algorithm tools such as md5deep and sha1deep.
hashdeep can generate hash lists for entire directory trees and can compute several algorithms for each file in the same run: MD5, SHA-1, SHA-256, Tiger, and Whirlpool, with MD5 and SHA-256 used by default. This makes it useful for digital forensic analysis, data integrity verification, and identifying duplicate files.
A previously generated hash list can be loaded again later to check the same data, at which point hashdeep reports files that are new, missing, moved, or changed. Because the list records the size and hashes of every file, it serves as a compact baseline for large-scale integrity checks.
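To make the multi-algorithm idea concrete, here is a minimal Python sketch that reads each file once while feeding several digests, then emits one `size,hash,...,path` line per file. It illustrates the technique only and is not hashdeep's implementation; the function names `hash_file` and `hash_tree` are hypothetical, and Tiger and Whirlpool are omitted because Python's `hashlib` does not guarantee them.

```python
import hashlib
import os


def hash_file(path, algorithms=("md5", "sha256"), chunk_size=1 << 20):
    """Read `path` once and return (size, {algorithm: hex digest})."""
    digests = {name: hashlib.new(name) for name in algorithms}
    size = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            size += len(chunk)
            for digest in digests.values():
                digest.update(chunk)  # every algorithm sees the same bytes
    return size, {name: d.hexdigest() for name, d in digests.items()}


def hash_tree(root, algorithms=("md5", "sha256")):
    """Yield one 'size,hash,...,path' line per regular file under `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            if not os.path.isfile(path):
                continue  # skip broken links and special files
            size, hashes = hash_file(path, algorithms)
            fields = [str(size)] + [hashes[name] for name in algorithms] + [path]
            yield ",".join(fields)


if __name__ == "__main__":
    for line in hash_tree("."):
        print(line)
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, which matters when a tree contains multi-gigabyte files.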
CAVEATS
hashdeep has no dedicated option for following symbolic links; which file types are processed, including symbolic links, is governed by the expert-mode file type filter (-o). Because link handling can significantly change the scope of files processed, try a recursive run on a small tree before relying on it for a large one.
The known hashes loaded with -k are held in memory, so verifying against very large hash sets can consume substantial memory. hashdeep provides no option to cap this usage; split the hash set or the input if memory is limited.
The reliability of an integrity check depends on the strength of the hash algorithm. MD5 and SHA-1 are vulnerable to practical collision attacks, so where deliberate tampering is a concern, rely on SHA-256 rather than MD5 or SHA-1 alone.
OUTPUT FORMATS
By default hashdeep writes a short header followed by one comma-separated record per file. Header lines begin with %%%% and give the format version and the column names (by default size, md5, sha256, filename); lines beginning with ## are comments recording the working directory and the command line used. Each record lists the file size, one hash per selected algorithm, and the file name, so the same output can be loaded later as a list of known hashes or parsed by other tools. Several options change what is written: -l prints relative rather than absolute paths, -b strips directory information from file names, -p hashes each file block by block so that one file yields several records, and -d switches to DFXML (Digital Forensics XML).
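For automated processing, the default listing can be read with a few lines of code. The sketch below assumes the layout produced by hashdeep 4.x as I understand it: `%%%%` header lines (one of which names the columns), `##` comment lines, then comma-separated records with the file name last. The function name `parse_hashdeep_listing` and the sample paths are hypothetical; the sample digests are those of the three-byte string "abc".

```python
SAMPLE = """%%%% HASHDEEP-1.0
%%%% size,md5,sha256,filename
## Invoked from: /tmp
## $ hashdeep -r data
##
3,900150983cd24fb0d6963f7d28e17f72,ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad,/tmp/data/a,b.txt
"""


def parse_hashdeep_listing(text):
    """Parse a hashdeep-style listing into (columns, rows)."""
    columns = None
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("%%%%"):
            header = line[4:].strip()
            if "," in header:  # the column-name line, not the version line
                columns = header.split(",")
            continue
        if line.startswith("#"):
            continue  # comment lines carry no records
        if columns is None:
            raise ValueError("record found before the column header")
        # Split at most len(columns) - 1 times so commas in file names survive.
        values = line.split(",", len(columns) - 1)
        if len(values) != len(columns):
            raise ValueError(f"malformed record: {line!r}")
        row = dict(zip(columns, values))
        row["size"] = int(row["size"])
        rows.append(row)
    return columns, rows


if __name__ == "__main__":
    cols, rows = parse_hashdeep_listing(SAMPLE)
    print(cols)
    print(rows[0]["filename"])
```

Limiting the number of splits is the key design choice: the file name is the last column, so any commas it contains must not be treated as field separators.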
VERIFICATION PROCESS
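Verification uses a list of known hashes loaded with -k, normally the saved output of an earlier hashdeep run. In audit mode (-a), hashdeep compares every input file against that list and prints a single verdict, "Audit passed" or "Audit failed". The audit passes only if each input file matches exactly one known entry and no known entry is left unmatched. Adding -v prints the number of files in each category (matched, partially matched, moved, new, and known files not found); a second -v also lists each discrepancy, and a third reports on every file examined. The matching modes act as simpler filters: -m lists input files whose hashes are known, and -x lists those whose hashes are not. Together these reports make it possible to detect modified, missing, moved, or unexpected files.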
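The comparison logic can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not hashdeep's actual algorithm: each file is reduced to a `(size, digest)` pair, partial matches are not modelled, a changed file shows up as one new file plus one missing entry, and treating moved files as an audit failure is an assumption of the sketch. The function name `audit` is hypothetical.

```python
def audit(known, observed):
    """Classify observed files against known entries.

    Both arguments map file name -> (size, digest).
    """
    by_digest = {}
    for name, entry in known.items():
        by_digest.setdefault(entry, []).append(name)

    report = {"matched": [], "moved": [], "new": [], "missing": []}
    used = set()

    # Pass 1: name and (size, digest) both agree with a known entry.
    leftovers = []
    for name in sorted(observed):
        if known.get(name) == observed[name]:
            report["matched"].append(name)
            used.add(name)
        else:
            leftovers.append(name)

    # Pass 2: a known digest under a different name counts as moved;
    # anything else is new.
    for name in leftovers:
        candidates = [k for k in by_digest.get(observed[name], []) if k not in used]
        if candidates:
            report["moved"].append(name)
            used.add(candidates[0])
        else:
            report["new"].append(name)

    # Known entries that no observed file accounted for.
    report["missing"] = sorted(set(known) - used)
    report["passed"] = not (report["moved"] or report["new"] or report["missing"])
    return report


if __name__ == "__main__":
    known = {"a": (3, "h1"), "b": (5, "h2"), "c": (7, "h3")}
    observed = {"a": (3, "h1"), "d": (5, "h2"), "e": (9, "h9")}
    print(audit(known, observed))
```

Running exact matches first prevents a renamed copy from claiming the known entry of a file that is still in place.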
HISTORY
The hashdeep command is a successor to a suite of deep-hashing tools, originally known as md5deep, sha1deep, etc. Developed by Jesse Kornblum, these tools were designed primarily for digital forensic purposes, allowing investigators to quickly and reliably calculate cryptographic hashes of files and entire directory structures. The hashdeep command was introduced to unify the functionality of these separate tools into a single, more versatile utility capable of supporting multiple hashing algorithms simultaneously, thereby streamlining integrity verification and data analysis workflows.