foremost
Recover files based on their headers/footers
SYNOPSIS
foremost [-vVwq] [-a] [-b <size>] [-c <configfile>] [-d] [-T] [-i <file>] [-o <dir>] [-s <num>] [-t <type>]
PARAMETERS
-v
Enable verbose output.
-V
Show copyright information and exit.
-w
Write only the audit file, no actual file recovery. Useful for testing configurations.
-a
Write all headers and perform no error detection, so files that appear corrupted are written out as well.
-b <size>
Specify the block size used by foremost. Default is 512 bytes. The block size affects the names given to recovered files and the boundaries searched in quick mode.
-c <configfile>
Specify a configuration file to use. If none is given, foremost.conf in the current directory is used; if that does not exist, /etc/foremost.conf is used.
-d
Turn on indirect block detection. This works well for Unix file systems.
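-D <num>
Described as limiting the footer search to <num> bytes from the beginning of the header, for variable-length headers. This option does not appear in the upstream foremost(1) manual page; confirm with foremost -h before relying on it.
-e <num>
Described as limiting the header search to <num> bytes from the end of the footer, for variable-length footers. This option does not appear in the upstream foremost(1) manual page; confirm with foremost -h before relying on it.
-f <file>
Described as a deprecated way to name a file or device for recovered data. This option does not appear in the upstream foremost(1) manual page; use -o to set the output directory.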
-i <file>
Specify the input file (e.g., disk image, partition, or whole device) to scan. If omitted, standard input is used.
-o <dir>
Specify the output directory for recovered files. foremost will create subdirectories for each file type.
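-q
Quick mode. Only the start of each block (512 bytes by default) is searched for matching headers. This is considerably faster, but files that do not start on a block boundary, such as files embedded in other files, can be missed.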
-s <num>
Skip <num> blocks in the input file before beginning the search for headers.
-t <type>
Specify file types to search for (e.g., jpg, pdf, doc, or all). Multiple types can be separated by commas.
-T
Timestamp the output directory. The current date and time are appended to the output directory name, so repeated runs do not require deleting the previous output directory.
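-x
Described as creating an audit file named audit.txt in the output directory. This option does not appear in the upstream foremost(1) manual page; foremost writes audit.txt to the output directory on every run.
-X
Described as enabling per-file verbose output for all file types. This option does not appear in the upstream foremost(1) manual page; use -v for verbose output.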
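EXAMPLES
The options above are usually combined into a single invocation. The following is a minimal sketch that only assembles and prints such a command line; it assumes foremost is installed and on PATH if you choose to run the result. The helper name build_foremost_argv and the file names disk.img and recovered are hypothetical.

```python
import shlex


def build_foremost_argv(image, outdir, types=("jpg", "pdf"),
                        verbose=True, timestamp=False):
    """Return an argv list for a typical foremost run (nothing is executed)."""
    argv = ["foremost"]
    if verbose:
        argv.append("-v")          # verbose output
    if timestamp:
        argv.append("-T")          # timestamp the output directory
    # -t takes a comma-separated list of types; -i is the input, -o the output dir
    argv += ["-t", ",".join(types), "-i", image, "-o", outdir]
    return argv


argv = build_foremost_argv("disk.img", "recovered")
print(shlex.join(argv))
# To actually run it (requires foremost on PATH):
#   import subprocess; subprocess.run(argv, check=True)
```

Building the argument list separately from running it makes it easy to log the exact command used, which is helpful when a recovery has to be documented.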
DESCRIPTION
foremost is a console-based data carving program designed to recover files based on their headers, footers, and internal data structures. It can work on image files (like dd images) or directly on a raw storage device (e.g., hard drive, USB drive, memory card). foremost ignores the filesystem structure, making it effective for recovering files from corrupted, formatted, or partially overwritten media.
Common file types (e.g., jpg, doc, pdf) are handled by built-in extractors selected with -t. A configuration file (typically foremost.conf) can define additional file types by specifying magic numbers, file extensions, and maximum file sizes. When run, foremost extracts files into the specified output directory, organizing them by type (e.g., jpg, doc, pdf). It is commonly used in digital forensics and data recovery when filesystem-aware recovery tools are insufficient.
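The idea of carving by header and footer can be illustrated with a short sketch. This is not foremost's implementation, which also inspects internal data structures for some formats; it is a simplified illustration that assumes the whole input fits in memory and uses the well-known JPEG start and end markers. The function name carve is hypothetical.

```python
JPEG_HEADER = b"\xff\xd8\xff"   # JPEG start-of-image marker plus next marker byte
JPEG_FOOTER = b"\xff\xd9"       # JPEG end-of-image marker


def carve(data, header, footer, max_size):
    """Yield (offset, blob) for each header..footer span of at most max_size bytes."""
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            return
        # Only accept a footer that lies within max_size bytes of the header.
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1      # no footer in range; look for the next header
            continue
        end += len(footer)
        yield start, data[start:end]
        pos = end                # resume scanning after the carved file


# Synthetic input: two tiny fake "JPEGs" surrounded by unrelated bytes.
image = (b"junk" + b"\xff\xd8\xff\xe0AAAA\xff\xd9"
         + b"more" + b"\xff\xd8\xff\xe1BB\xff\xd9" + b"tail")
offsets = [offset for offset, _ in carve(image, JPEG_HEADER, JPEG_FOOTER, 1024)]
print(offsets)  # [4, 18]
```

Because the scan never consults a filesystem, it finds both files regardless of how the surrounding bytes are organized, but it also cannot tell whether the bytes between header and footer really belong together.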
CAVEATS
foremost recovers files based on signatures, ignoring filesystem metadata. This means:
- Recovered files may be fragmented, partial, or corrupted.
- File names and original directory structures are not preserved.
- It can be a resource-intensive and time-consuming process on large drives or images.
- Always work on a copy of the evidence or in a read-only manner to prevent further data loss.
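One way to confirm that a working copy was not altered during carving is to hash it before and after the run. The sketch below uses SHA-256 from the Python standard library; the function name sha256_of and the file name disk.img are hypothetical, and the choice of SHA-256 is an assumption rather than anything foremost requires.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical usage around a foremost run:
#   before = sha256_of("disk.img")
#   ... run foremost against disk.img ...
#   assert sha256_of("disk.img") == before
```

Reading in chunks keeps memory use constant, which matters for multi-gigabyte disk images.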
CONFIGURATION FILE (FOREMOST.CONF)
In addition to its built-in file types, foremost reads a configuration file (foremost.conf in the current directory, /etc/foremost.conf, or the file given with -c) that defines further file types to recognize and recover. Each entry specifies the file extension, whether the search is case-sensitive, the maximum file size, and the hexadecimal or ASCII values for the file's header and footer. Users can customize this file to add support for new or proprietary file types, or to adjust the recovery parameters for existing ones.
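To make the entry layout concrete, here is a minimal sketch of a parser for one configuration line. It assumes whitespace-separated fields in the order given above (extension, case sensitivity as y or n, maximum size, header, optional footer) and \xNN escapes for hexadecimal bytes, with \s standing for a space. Wildcards and any trailing keywords are not handled, the names Rule, decode_pattern, and parse_rule are hypothetical, and the sample entry is illustrative only.

```python
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    extension: str
    case_sensitive: bool
    max_size: int
    header: bytes
    footer: Optional[bytes]


def decode_pattern(text):
    """Turn \\xNN hex escapes and \\s (space) into raw bytes; other text is literal."""
    def repl(match):
        token = match.group(0)
        return " " if token == r"\s" else chr(int(token[2:], 16))
    return re.sub(r"\\x[0-9a-fA-F]{2}|\\s", repl, text).encode("latin-1")


def parse_rule(line):
    """Parse one configuration line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    ext, case, size, header = fields[:4]
    footer = decode_pattern(fields[4]) if len(fields) > 4 else None
    return Rule(ext, case.lower() == "y", int(size), decode_pattern(header), footer)


# Illustrative entry, not copied from any shipped foremost.conf.
rule = parse_rule(r"jpg y 20000000 \xff\xd8\xff\xe0\x00\x10 \xff\xd9")
print(rule.extension, rule.max_size, rule.header.hex(), rule.footer.hex())
```

Keeping the decoded header and footer as bytes means they can be passed directly to a byte-level search over the input image.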
OUTPUT DIRECTORY STRUCTURE
foremost organizes the extracted files into subdirectories of the specified output directory, one per file type (e.g., output/jpg, output/pdf). Each recovered file is named after the block number at which it was found, followed by the type's extension (e.g., 00001234.jpg). An audit.txt file is also created in the main output directory. It logs the carving process, including the offset of each recovered file on the input media and its new name.
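After a run, a quick per-type count is often the first thing worth checking. The sketch below walks an output directory laid out as described above and counts the files in each type subdirectory; the function name summarize_output and the directory name output are hypothetical, and nothing here parses audit.txt.

```python
from pathlib import Path


def summarize_output(outdir):
    """Return {type_name: file_count} for each subdirectory of outdir."""
    counts = {}
    for sub in Path(outdir).iterdir():
        if sub.is_dir():
            # Count regular files only; top-level files such as audit.txt are skipped
            counts[sub.name] = sum(1 for entry in sub.iterdir() if entry.is_file())
    return counts


# Hypothetical usage after foremost has written to ./output:
#   for file_type, count in sorted(summarize_output("output").items()):
#       print(f"{file_type}: {count}")
```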
HISTORY
foremost was originally developed by Kris Kendall and Jesse Kornblum while working for the United States Air Force Office of Special Investigations (AFOSI) and the Center for Information Systems Security Studies and Research (CISSR) at Naval Postgraduate School. It was designed as an open-source tool for digital forensic investigations, initially released around 2001-2002. Its development aimed to provide a reliable method for recovering data from damaged or fragmented media by focusing on file signatures rather than filesystem metadata, a technique known as 'data carving.' It has since been maintained and updated by the open-source community, remaining a staple in many digital forensics toolkits.