LinuxCommandLibrary

bzgrep

Search compressed bzip2 files for a pattern

TLDR

Search for a pattern within a compressed file

$ bzgrep "[search_pattern]" [path/to/file]

Recursively search files in a bzip2 compressed .tar archive for a pattern
$ bzgrep [[-r|--recursive]] "[search_pattern]" [path/to/tar_file]

Print 3 lines of [C]ontext around, [B]efore, or [A]fter each match
$ bzgrep [--context|--before-context|--after-context]=3 "[search_pattern]" [path/to/file]

Print file name and line number for each match
$ bzgrep [[-H|--with-filename]] [[-n|--line-number]] "[search_pattern]" [path/to/file]

Search for lines matching a pattern, printing only the matched text
$ bzgrep [[-o|--only-matching]] "[search_pattern]" [path/to/file]

Search stdin for lines that do not match a pattern
$ cat [path/to/bz_compressed_file] | bzgrep [[-v|--invert-match]] "[search_pattern]"

Use extended regex (supports ?, +, {}, (), and |), in case-insensitive mode
$ bzgrep [[-E|--extended-regexp]] [[-i|--ignore-case]] "[search_pattern]" [path/to/file]

SYNOPSIS

bzgrep [grep-options] <EXPRESSION> [FILE...]
bzgrep [grep-options] <EXPRESSION>

PARAMETERS

-i, --ignore-case
    Ignore case distinctions in patterns

-v, --invert-match
    Select non-matching lines

-n, --line-number
    Prefix each line with its line number

-l, --files-with-matches
    Output only names of files containing matches

-c, --count
    Suppress normal output; show count of matching lines

-r, --recursive
    Recurse into directories. Support depends on the bzgrep implementation; where it is missing, combine find with bzgrep

-E, --extended-regexp
    Use extended regular expressions

-F, --fixed-strings
    Interpret pattern as fixed strings, not regex

-H, --with-filename
    Always print filename with matches

--color[=WHEN]
    Highlight matches with color (auto, always, never)

-A NUM, --after-context=NUM
    Print NUM lines after each match

-B NUM, --before-context=NUM
    Print NUM lines before each match

DESCRIPTION

bzgrep is a utility that searches bzip2-compressed files (.bz2) for regular expression patterns without manual decompression. It wraps the grep command: bzip2 decompresses the input on the fly and the result is piped to grep for matching.
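The wrapping described above can be sketched by hand. This is a minimal illustration, not the wrapper's actual source: the file name app.log and its contents are made up for the example, and it assumes bzip2, grep, and bzgrep are installed.

```shell
# Build a small bzip2-compressed sample in a temporary directory.
tmp=$(mktemp -d)
printf 'alpha\nERROR disk full\nbeta\n' > "$tmp/app.log"
bzip2 "$tmp/app.log"    # writes app.log.bz2 and removes app.log

# Decompress to stdout (-d -c) and let grep do the matching.
by_hand=$(bzip2 -dc "$tmp/app.log.bz2" | grep 'ERROR')

# The same search through the wrapper.
by_wrapper=$(bzgrep 'ERROR' "$tmp/app.log.bz2")

echo "$by_hand"
echo "$by_wrapper"
```

Both commands should print the same matching line, which is why grep options generally carry over to bzgrep.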

This makes it well suited to analyzing large compressed logs, archives, or datasets. When given several .bz2 files, bzgrep processes them one after another and prefixes each match with the file name; with -n the line number is added, giving filename:line_number:content. With a single file, only the matching lines are printed. If no files are specified, it reads compressed data from stdin.
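A short sketch of the multi-file and stdin cases follows. The file names a.log and b.log and their contents are hypothetical, and the exact prefix format is assumed to follow grep's usual conventions.

```shell
# Two small compressed files in a temporary directory.
tmp=$(mktemp -d)
printf 'ok\nERROR one\n' > "$tmp/a.log"
printf 'ERROR two\nok\n' > "$tmp/b.log"
bzip2 "$tmp/a.log" "$tmp/b.log"

# Several files plus -n: each match carries file name and line number.
multi=$(cd "$tmp" && bzgrep -n 'ERROR' a.log.bz2 b.log.bz2)
echo "$multi"

# No file argument: compressed data is read from stdin instead.
from_stdin=$(bzgrep -n 'ERROR' < "$tmp/b.log.bz2")
echo "$from_stdin"
```

Running the search from inside the directory keeps the printed file names short; passing full paths would put the full paths in the prefix.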

It inherits most grep functionality by passing options through, supporting basic and extended regex, fixed strings, and options such as counting or inversion. Every search decompresses each file sequentially from the start, so run time is dominated by bzip2 decompression and scales with file size rather than with the number of matches.

It is common in sysadmin tasks, for example bzgrep 'ERROR' /var/log/app.bz2, and complements zgrep, which does the same job for gzip files.

CAVEATS

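Directory recursion is not dependable across implementations; use find with bzgrep to search directory trees.
Decompression is single-threaded, so large files are CPU-bound; memory use is bounded by the bzip2 block size, not the file size.
stdin is expected to carry bzip2-compressed data; handling of plain text varies by implementation and may produce errors or pass the text through unchanged.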
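For directory trees, one possible approach is to let find select the files and hand only regular .bz2 files to bzgrep. The sketch below uses a made-up logs/ layout and assumes -l behaves as it does in grep (print the names of files that contain a match).

```shell
# Hypothetical tree with compressed logs in nested directories.
tmp=$(mktemp -d)
mkdir -p "$tmp/logs/2023" "$tmp/logs/2024"
printf 'ok\ncritical: disk\n' > "$tmp/logs/2023/a.log"
printf 'critical: cpu\n' > "$tmp/logs/2024/b.log"
bzip2 "$tmp/logs/2023/a.log" "$tmp/logs/2024/b.log"

# find walks the tree; bzgrep only ever receives regular .bz2 files.
# sort makes the order of the result stable.
matches=$(find "$tmp/logs" -type f -name '*.bz2' -exec bzgrep -l 'critical' {} + | sort)
echo "$matches"
```

Because find does the traversal, this does not depend on whether the installed bzgrep supports -r.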

EXIT STATUS

0: matches found; 1: no matches; 2: errors (like grep(1))
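In scripts, the exit status can drive branching without parsing any output. A minimal sketch, assuming a made-up app.log with no matching lines and that -q is passed through to grep:

```shell
# Sample file that does not contain the pattern.
tmp=$(mktemp -d)
printf 'all good\n' > "$tmp/app.log"
bzip2 "$tmp/app.log"

# -q suppresses output, so only the exit status is inspected.
if bzgrep -q 'ERROR' "$tmp/app.log.bz2"; then
    status=0
else
    status=$?
fi

case $status in
    0) result="matches found" ;;
    1) result="no matches" ;;
    *) result="error" ;;
esac
echo "$result"
```

Capturing the status inside the if/else keeps the snippet safe under set -e, where a bare non-zero exit would otherwise abort the script.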

EXAMPLES

bzgrep -E 'error|fail' *.bz2 (search multiple files; -E enables | alternation)
bzgrep -i -n 'TODO' doc.bz2 (case-insensitive, with line numbers)
find /logs -name '*.bz2' -exec bzgrep 'critical' {} + (recursive with find)

HISTORY

bzgrep is distributed as part of the bzip2 suite, which Julian Seward first released in 1996. The suite has evolved through bzip2 1.0.8 (2019), and bzgrep has been bundled with most distributions, including Ubuntu and Fedora, since the early 2000s.

SEE ALSO

grep(1), bzip2(1), bzless(1), bzdiff(1), zgrep(1), egrep(1)
