bzegrep
Search compressed files for a pattern
TLDR
View documentation for the original command
SYNOPSIS
bzegrep [grep-options] [-e] PATTERN [FILE...]
PARAMETERS
-e PATTERN
Use PATTERN as the search pattern; useful when the pattern begins with a hyphen
-i, --ignore-case
Ignore case distinctions
-v
Invert match: select non-matching lines
-w
Select only lines matching whole words
-n
Prefix each line with its line number
-l
Output only names of files containing matches
-c
Output count of matching lines per file
-r, -R
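Recurse into directories in plain grep; the bzegrep wrapper decompresses only the files named on the command line, so do not rely on this option to traverse directories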
-E, --extended-regexp
Use extended regular expressions (already the default when invoked as bzegrep)
-F, --fixed-strings
Interpret pattern as fixed strings, not regex
--color[=WHEN]
Highlight matches with color (auto/always/never)
-H
Always print filename headers
-A NUM, --after-context=NUM
Print NUM lines of trailing context
-B NUM, --before-context=NUM
Print NUM lines of leading context
--version
Display version info
--help
Show help
DESCRIPTION
bzegrep searches files compressed with bzip2 (.bz2) for lines matching an extended regular expression. It is the bzgrep wrapper script invoked under a different name: when called as bzegrep it runs egrep (equivalent to grep -E) instead of grep. Each input file is decompressed on the fly with bzip2 -dc and the decompressed stream is piped to grep, so files never need to be uncompressed manually. This suits large compressed logs, archives, or datasets in sysadmin, data analysis, and backup workflows.
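The decompress-and-pipe behavior described above can be approximated in a few lines of shell. This is only a sketch of the idea, not the actual wrapper script; the function name bz_egrep_one is hypothetical, and it assumes bzip2 and a grep that supports -E are on PATH.

```shell
# Hypothetical helper, not part of bzip2: roughly what bzegrep does for one file.
# Assumes bzip2 and a grep supporting -E are available on PATH.
bz_egrep_one() {
    pattern=$1
    file=$2
    # -d decompresses, -c writes to standard output; grep -E matches
    # extended regular expressions against the decompressed stream.
    bzip2 -dc -- "$file" | grep -E -- "$pattern"
}
```

Calling bz_egrep_one 'error|fail' app.bz2 prints the matching lines; the function's exit status is that of grep, the last command in the pipeline.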
The wrapper streams data and processes files one after another without creating temporary decompressed copies, which keeps memory and disk use low. It accepts multiple files, reads standard input when no file is given, and prefixes each matching line with the file name when more than one file is searched. It does not walk directories itself.
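Options are passed through to grep, so most grep features work, including colored output and context lines (-A/-B/-C). Perl-compatible regex (-P) conflicts with the extended-regex matcher that bzegrep selects; use bzgrep -P if PCRE is needed. Because grep only ever sees a decompressed stream on standard input, options that depend on file names or directory traversal (such as -r or --include) do not behave as they do with plain grep, and byte offsets from -b refer to the decompressed data. For full grep compatibility, decompress first (for example, bzip2 -dk FILE) and run grep on the result.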
Common use cases: auditing errors in bzipped logs (bzegrep 'ERROR' access.bz2) or scanning several archives at once (bzegrep 'pattern' /logs/*.bz2). It is part of the bzip2 package on most Linux distributions.
CAVEATS
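grep reads a decompressed stream, so byte offsets reported by -b refer to the uncompressed data, not to positions in the .bz2 file. The wrapper has no option to decompress to temporary files; to use grep features that need a real file, decompress first at the cost of extra disk space. Directories are not searched recursively, so list the files explicitly or generate the list with find. Large files may be slow because of the CPU cost of decompression.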
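Because the wrapper only searches files named on its command line, one way to cover a directory tree is to let find enumerate the archives and run the decompress-and-match pipeline on each. The sketch below is an illustration under assumptions: bz_egrep_tree is a hypothetical name, and it assumes a POSIX sh, find, awk, bzip2, and a grep that supports -E.

```shell
# Hypothetical helper, not part of bzip2: search every .bz2 file below a directory.
bz_egrep_tree() {
    pattern=$1
    dir=$2
    # find passes batches of file names to an inner shell; the first
    # argument after "sh" is the pattern, the rest are the files.
    find "$dir" -type f -name '*.bz2' -exec sh -c '
        pattern=$1; shift
        for f do
            # Decompress, match, then prefix each hit with the file name.
            # The name is handed to awk via the environment to avoid quoting issues.
            bzip2 -dc -- "$f" | grep -E -- "$pattern" |
                F="$f" awk "{ print ENVIRON[\"F\"] \":\" \$0 }"
        done
    ' sh "$pattern" {} +
}
```

For example, bz_egrep_tree 'error|fail' /logs prints each match as path:line. The exit status of this sketch reflects find, not whether anything matched, so test the output if a match/no-match decision is needed.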
EXIT STATUS
0: matches found; 1: no matches; 2: errors (same as grep)
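In a script, the three status values listed above can be told apart with a case statement. The helper below is a minimal sketch; describe_status is a hypothetical name introduced only for illustration.

```shell
# Hypothetical helper: turn a grep-style exit status into a message.
describe_status() {
    case $1 in
        0) echo "matches found" ;;
        1) echo "no matches" ;;
        *) echo "error (status $1)" ;;
    esac
}
```

A typical call would be bzegrep -q 'error|fail' /var/log/app.bz2; describe_status $? where -q is the grep option that suppresses output so only the status is used.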
EXAMPLES
bzegrep 'error|fail' /var/log/app.bz2
bzegrep -i -l 'user123' /logs/*.bz2
bzegrep -c '^2023' access.bz2
HISTORY
bzip2 was written by Julian Seward and first released in 1996. The bzgrep script, for which bzegrep is an alternate name, was adapted from gzip's zgrep and mirrors its behavior for bzip2 files. It has been widely available in Linux distributions since the early 2000s.


