bzip2recover
Recover data from damaged bzip2 files
TLDR
Recover all intact blocks from a damaged .bz2 file
SYNOPSIS
bzip2recover filename
PARAMETERS
filename
The path to the damaged bzip2 compressed file from which to attempt data recovery. This is the only argument: bzip2recover accepts no option flags, and any other invocation prints a usage message and exits.
DESCRIPTION
bzip2recover attempts to recover data from corrupt bzip2 compressed files. bzip2 compresses data in independent blocks, each introduced by a 48-bit header pattern; bzip2recover scans the damaged .bz2 file for these patterns and writes every block it finds to a new, separate bzip2 file. The output files are named sequentially, such as rec00001file.bz2, rec00002file.bz2, and so on, where file is the name of the original damaged file. bzip2recover cannot repair the original file or salvage data from inside a corrupted block, but it can often retrieve large portions of a multi-block archive, so partial corruption of a large backup need not mean total data loss.
CAVEATS
bzip2recover cannot fix corrupt data within a damaged block; only blocks that are themselves undamaged and are introduced by an intact header can be salvaged. If a block's header is corrupted, that block, and possibly the data up to the next valid header, may be lost. bzip2recover does not verify the blocks it extracts, so a piece containing a damaged block will still fail when decompressed. The utility writes one output file per block found, named in the format recNNNNNoriginal_filename.bz2, so the recoverable data must then be reconstructed by decompressing these files in order. It works only on bzip2 files and is of no use for other archive types or for general file recovery.
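The naming format can be illustrated with a short Python sketch. The helper expected_piece_name is hypothetical and simply mirrors the recNNNNN pattern described here; placing the piece in the same directory as the damaged file is an assumption of the sketch.

```python
import os


def expected_piece_name(damaged_path, block_number):
    """Build the recNNNNN-style name for a given block number.

    Hypothetical helper: assumes a five-digit, zero-padded counter
    prefixed to the original file name, in the original's directory.
    """
    directory, base = os.path.split(damaged_path)
    return os.path.join(directory, "rec%05d%s" % (block_number, base))


# Zero padding makes plain lexicographic order match block order,
# so block 2 sorts before block 10.
names = [expected_piece_name("myarchive.bz2", n) for n in (10, 2, 1)]
ordered = sorted(names)
```

Because the counter is zero-padded, a sorted listing (or a shell wildcard) yields the pieces in block order without any numeric parsing.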
OUTPUT FILES
When successful, bzip2recover generates one or more new bzip2 files alongside the damaged file. They are named following the pattern recNNNNNoriginal_filename.bz2, where NNNNN is a five-digit, zero-padded sequence number (e.g., 00001, 00002), so a shell wildcard lists them in block order. Each file contains a single block from the original damaged file. To reconstruct the recoverable data, check each piece with bzip2 -t, set aside any that fail, then concatenate and decompress the rest in order. For example, to recover all possible data from myarchive.bz2, you might use: cat rec*myarchive.bz2 | bunzip2 > recovered_myarchive_data
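Where some pieces may still contain damaged blocks, it can help to decompress them one at a time and skip the ones that fail. The following Python sketch shows one way to do that with the standard bz2 module; the function name salvage and its parameters are hypothetical, and the approach is an illustration, not part of bzip2recover.

```python
import bz2
import glob


def salvage(pattern, out_path):
    """Decompress every readable piece matching `pattern`, in name order.

    Writes the concatenated output to `out_path` and returns (good, bad):
    the piece paths that decompressed cleanly and those that did not.
    """
    good, bad = [], []
    with open(out_path, "wb") as out:
        # Sorted order equals block order thanks to the zero-padded counter.
        for piece in sorted(glob.glob(pattern)):
            with open(piece, "rb") as f:
                raw = f.read()
            try:
                chunk = bz2.decompress(raw)
            except (OSError, ValueError):
                # OSError: invalid data; ValueError: stream ended early.
                bad.append(piece)
                continue
            out.write(chunk)
            good.append(piece)
    return good, bad
```

A call such as salvage("rec*myarchive.bz2", "recovered_myarchive_data") would leave gaps in the output wherever a piece was skipped, so the returned list of bad pieces is worth keeping to know which parts of the data are missing.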
HISTORY
bzip2recover is an integral part of the bzip2 compression suite, which was developed by Julian Seward. It was designed to address the practical need for data recovery in the event of partial file corruption, a common issue with large compressed archives. Its inclusion in the suite highlights the project's emphasis on data integrity and usability, complementing the primary goal of efficient compression.