md5sum
Verify file integrity using an MD5 hash
TLDR
Calculate the MD5 checksum for one or more files
Calculate and save the list of MD5 checksums to a file
Calculate an MD5 checksum from stdin
Read a file of MD5 checksums and filenames and verify all files have matching checksums
Only show a message for missing files or when verification fails
Only show a message when verification fails, ignoring missing files
Check a known MD5 checksum of a file
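The tasks listed above could be carried out roughly as follows. This is a sketch rather than canonical usage: the scratch directory, the file names a.txt, b.txt, and sums.md5, and the known_sum variable are all hypothetical stand-ins chosen for illustration.

```shell
# Work in a scratch directory so the commands are safe to run as-is.
cd "$(mktemp -d)"
printf 'alpha\n' > a.txt   # hypothetical sample files
printf 'beta\n' > b.txt

# Calculate the MD5 checksum for one or more files
md5sum a.txt b.txt

# Calculate and save the list of MD5 checksums to a file
md5sum a.txt b.txt > sums.md5

# Calculate an MD5 checksum from stdin
printf 'alpha\n' | md5sum

# Read a file of MD5 checksums and filenames and verify all files
md5sum --check sums.md5

# Only show a message for missing files or when verification fails
md5sum --check --quiet sums.md5

# Only show a message when verification fails, ignoring missing files
rm b.txt
md5sum --check --quiet --ignore-missing sums.md5

# Check a known MD5 checksum of a file.
# known_sum stands in for a checksum obtained from a trusted source;
# the expected line format is: checksum, two spaces, file name.
known_sum="$(md5sum a.txt | cut -d' ' -f1)"
echo "$known_sum  a.txt" | md5sum --check
```

With no FILE operand, --check reads the checksum list from standard input, which is what the last command relies on.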
SYNOPSIS
md5sum [OPTION]... [FILE]...
md5sum --check [OPTION]... [FILE]
PARAMETERS
-b, --binary
Read in binary mode.
-t, --text
Read in text mode (default on non-Windows systems). On GNU systems there is no difference between binary and text mode.
-c, --check
Read MD5 sums from FILEs and check them.
--ignore-missing
Don't fail or report status for missing files when checking.
--quiet
Don't print OK for each successfully verified file when checking.
--status
Don't output anything when checking; the exit status indicates success.
--strict
Exit non-zero for any invalid input line when checking.
-w, --warn
Warn about improperly formatted checksum lines when checking.
-z, --zero
End each output line with NUL, not newline, and disable file name escaping.
--tag
Create a BSD-style checksum.
--help
Display help information and exit.
--version
Output version information and exit.
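A few of these options are easier to follow in combination. The sketch below assumes GNU md5sum; the file names data.bin and data.md5 and the variables status_result and strict_result are hypothetical and exist only for this illustration.

```shell
cd "$(mktemp -d)"
printf 'sample data\n' > data.bin   # hypothetical sample file

# --tag writes BSD-style lines of the form: MD5 (data.bin) = <hash>
md5sum --tag data.bin > data.md5
cat data.md5

# --check accepts the BSD-style format as well as the default one.
# --status suppresses all output, so the result is read from the exit status.
if md5sum --check --status data.md5; then
    status_result="verified"
else
    status_result="mismatch"
fi
echo "$status_result"

# Append a malformed line. --warn reports it on stderr, and --strict
# turns it into a non-zero exit status even though data.bin still matches.
echo 'this is not a checksum line' >> data.md5
if md5sum --check --warn --strict data.md5; then
    strict_result="accepted"
else
    strict_result="rejected"
fi
echo "$strict_result"
```

Branching on the exit status with --status is a common choice in scripts, since it avoids parsing the "OK" and "FAILED" messages.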
DESCRIPTION
md5sum is a command-line utility that computes and verifies MD5 (Message-Digest Algorithm 5) checksums of files. An MD5 checksum is a 128-bit (16-byte) hash value, typically written as a 32-character hexadecimal number, that acts as a compact fingerprint of a file's content. Different files can share the same checksum, but an accidental change to a file almost always changes its checksum. Its primary purpose is to check data integrity during transmission or storage: by comparing the checksum of a file before and after transfer, users can detect whether the file has been altered or corrupted. md5sum is effective for detecting accidental corruption, but MD5 is no longer considered cryptographically secure where collision resistance is required, because practical collision attacks exist. For security-critical applications, stronger hash algorithms such as SHA-256 or SHA-512 are recommended.
CAVEATS
md5sum uses the MD5 algorithm, which has known cryptographic weaknesses, particularly regarding collision resistance. This means it is possible to find two different files that produce the same MD5 hash. Therefore, md5sum should not be used for security-critical applications where cryptographic collision resistance is paramount, such as digital signatures or verifying software authenticity against malicious tampering. It remains suitable for detecting accidental data corruption or verifying data integrity where an adversary is not involved. For stronger cryptographic assurances, consider using SHA-256 or SHA-512 based utilities.
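As a rough illustration of the SHA-256 alternative, GNU coreutils ships sha256sum, which follows the same generate-then-check workflow. The file names release.tar and release.sha256 below are hypothetical placeholders.

```shell
cd "$(mktemp -d)"
printf 'release payload\n' > release.tar   # hypothetical sample file

# Generate a SHA-256 checksum list (64 hexadecimal characters per hash).
sha256sum release.tar > release.sha256
cat release.sha256

# Verify it the same way as with md5sum.
sha256sum --check release.sha256
```

A stronger hash only guards against tampering if the checksum file itself comes from a trusted source.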
CHECKSUM VERIFICATION EXAMPLE
To verify a file's integrity, first generate its checksum and save it: md5sum myfile.txt > myfile.md5. Later, check the file against the saved checksum: md5sum -c myfile.md5. The command prints "myfile.txt: OK" if the checksum matches; otherwise it prints "myfile.txt: FAILED" and exits with a non-zero status.
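The workflow can be tried end to end in a scratch directory. In this sketch the file contents and the variables first_check and outcome are made up for the demonstration, and corruption is simulated by overwriting the file.

```shell
cd "$(mktemp -d)"
printf 'original contents\n' > myfile.txt

# Generate the checksum and save it.
md5sum myfile.txt > myfile.md5

# Verify while the file is unchanged.
first_check="$(md5sum -c myfile.md5)"
echo "$first_check"
# → myfile.txt: OK

# Simulate corruption by changing the file, then verify again.
printf 'changed contents\n' > myfile.txt
if md5sum -c myfile.md5; then
    outcome="intact"
else
    outcome="changed"
fi
echo "$outcome"
# → changed
```

The second check also prints "myfile.txt: FAILED" and a warning summary on standard error before the script reports the outcome.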
STANDARD INPUT/OUTPUT
md5sum can read content from standard input if no file is specified or if - is used as a filename: echo "hello world" | md5sum. It outputs the checksum and filename (or '-' for stdin) to standard output.
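A short sketch of the stdin behavior follows. The digest variable is a hypothetical name, and extracting the hash with cut is one common approach rather than a feature of md5sum itself.

```shell
# No FILE operand: read standard input. The name column shows "-".
echo "hello world" | md5sum
# → 6f5902ac237024bdd0c176cb93063dc4  -

# An explicit "-" operand also means standard input.
echo "hello world" | md5sum -

# echo appends a newline, which is part of the hashed data.
# printf without "\n" hashes the bare string and yields a different sum.
printf 'hello world' | md5sum
# → 5eb63bbbe01eeed093cb22bb8f5acdc3  -

# Keep only the hash by taking the first space-separated field.
digest="$(printf 'hello world' | md5sum | cut -d' ' -f1)"
echo "$digest"
```

The trailing newline is a frequent source of mismatches when comparing a shell-computed checksum with one produced by another tool.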
HISTORY
The MD5 algorithm was designed by Ronald Rivest in 1991 as a successor to MD4. The md5sum utility is part of the GNU Core Utilities (coreutils) package, which provides essential command-line tools for Unix-like operating systems. It was widely adopted because MD5 is fast and was initially believed to be strongly collision resistant. Weaknesses were reported as early as 1996, and practical collision attacks were demonstrated in 2004, which led to MD5's deprecation for security-sensitive applications. Despite this, md5sum remains commonly used for non-cryptographic integrity checks because of its ubiquity and ease of use.