regex

Search text using pattern matching

TLDR

Match any single character

$ .

Match the start of a line

$ ^[hello]

Match the end of a line

$ [world]$

Match zero or more repetitions of the preceding character

$ [a]*

Match a set of characters

$ [[abc]]

Match ranges of characters

$ [[a-z]][[3-9]]

Match anything but the specified character

$ [^[a]]

Match a boundary around a word

$ "\b[text]\b"
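The patterns above can be tried against a small input. The following is a minimal sketch, assuming GNU grep (the \b word boundary is an extension, not part of POSIX); the sample lines and variable names are invented for illustration.

```shell
# Use the C locale so that ranges such as [a-z] behave predictably.
export LC_ALL=C

# Hypothetical sample input: four short lines.
sample=$(printf '%s\n' 'hello world' 'say hello' 'cat 42' 'concatenate')

# ^ anchors the match to the start of a line.
starts=$(printf '%s\n' "$sample" | grep '^hello')

# $ anchors the match to the end of a line.
ends=$(printf '%s\n' "$sample" | grep 'hello$')

# . matches any single character; -c prints the number of matching lines.
any_char=$(printf '%s\n' "$sample" | grep -c 'c.t')

# * repeats the preceding character zero or more times.
repeated=$(printf '%s\n' "$sample" | grep -c 'hel*o')

# Ranges: a lowercase letter, a space, then a digit from 3 to 9.
ranges=$(printf '%s\n' "$sample" | grep '[a-z] [3-9]')

# [^c] matches any character except c; here, lines that do not start with c.
negated=$(printf '%s\n' "$sample" | grep -c '^[^c]')

# \b marks a word boundary (GNU extension), so "concatenate" is not matched.
word=$(printf '%s\n' "$sample" | grep '\bcat\b')

printf '%s\n' "$starts" "$ends" "$any_char" "$repeated" "$ranges" "$negated" "$word"
```

Every pattern is wrapped in single quotes so that the shell passes it to grep unchanged.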

SYNOPSIS

grep [-E|-P|-F] [options] pattern [file...]
sed [options] script [file...]
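find [path...] -regex pattern [actions]

The options below are grep options. sed and find accept a different set, and the same letter can mean something else there (for example, -i requests in-place editing in GNU sed).
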
-E
    Interpret pattern as an Extended Regular Expression (ERE).

-P
    Use Perl‑compatible regular expressions (PCRE) where supported (e.g., grep -P).

-F
    Treat pattern as a fixed string (no regex interpretation).

-i
    Ignore case distinctions in the pattern.

-v
    Invert the match – select non‑matching lines.

-o
    Print only the part of a line that matches the pattern.

-w
    Match only whole words.

-x
    Match only whole lines.

-e expr
    Specify a pattern; allows multiple -e options.

-r
    Recursive search (e.g., grep -r).
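A short sketch of how several of these options behave with grep. It assumes a grep that supports -o and -w (GNU and BSD grep both do); the log lines are made up, and -c (count matching lines) is used only to keep the output short.

```shell
export LC_ALL=C

# Hypothetical log excerpt used as input.
log=$(printf '%s\n' 'ERROR disk full' 'error: retry 3' 'info: ok' 'errors')

# -i: ignore case, so ERROR, error and errors all match.
count_ci=$(printf '%s\n' "$log" | grep -ci 'error')

# -v: keep only the lines that do NOT match.
inverted=$(printf '%s\n' "$log" | grep -vi 'error')

# -o: print only the matched text (here, a run of digits).
only=$(printf '%s\n' "$log" | grep -o '[0-9][0-9]*')

# -w: the match must be a whole word, so "errors" is excluded.
whole_word=$(printf '%s\n' "$log" | grep -w 'error')

# -x: the pattern must match the entire line.
whole_line=$(printf '%s\n' "$log" | grep -x 'errors')

# -e may be repeated; a line matches if any of the patterns matches.
either=$(printf '%s\n' "$log" | grep -c -e 'info' -e 'disk')

# -F: treat the pattern as a fixed string, so the dot is literal.
fixed=$(printf '%s\n' 'a.c' 'abc' | grep -F 'a.c')

printf '%s\n' "$count_ci" "$inverted" "$only" "$whole_word" "$whole_line" "$either" "$fixed"
```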

DESCRIPTION

regex is not a standalone executable on most Linux distributions. The name refers to regular expressions, the pattern language implemented by many text-processing utilities such as grep, sed, awk, find, and perl. A regular expression describes a search pattern in a concise syntax and is used for searching, validating, substituting, and extracting text.

POSIX defines two dialects: Basic Regular Expressions (BRE), the default in grep and sed, and Extended Regular Expressions (ERE), used by grep -E, sed -E, and awk. Some tools also support Perl-compatible regular expressions (PCRE), which add features such as look-ahead, look-behind, and non-greedy quantifiers.

Regular expressions are widely used for log analysis, configuration parsing, and bulk file renaming, where a single pattern can often replace several lines of shell scripting. This makes them a core skill for administrators and developers.
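The three uses named above (validation, extraction, substitution) can be sketched with grep and sed. This is an illustration only: the log line and its format are invented, and it assumes a grep with -o and a sed with -E.

```shell
export LC_ALL=C

# Hypothetical log line.
line='2024-05-17 login user=alice'

# Validation: -q prints nothing and reports the result through the exit status.
if printf '%s\n' "$line" | grep -qE '^[0-9]{4}-[0-9]{2}-[0-9]{2} '; then
    valid=yes
else
    valid=no
fi

# Extraction: -o prints only the part of the line that matched.
user=$(printf '%s\n' "$line" | grep -oE 'user=[a-z]+')

# Substitution: capture year, month and day, then emit them in reverse order.
swapped=$(printf '%s\n' "$line" |
    sed -E 's/^([0-9]{4})-([0-9]{2})-([0-9]{2})/\3.\2.\1/')

printf '%s\n' "$valid" "$user" "$swapped"
```

In the sed expression, \1, \2 and \3 refer back to the three parenthesized groups in the order they open.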

CAVEATS

Regex syntax differs between utilities depending on the dialect they use (BRE, ERE, or PCRE). For example, + is a literal plus sign in BRE but a quantifier in ERE. Not all options are available in every command, and some tools impose limits on pattern length or supported constructs. Quote patterns, preferably in single quotes, to prevent the shell from interpreting metacharacters such as *, $, and \.
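The dialect difference and the effect of quoting can be seen in a small experiment. This is a sketch with made-up input; it assumes a grep that supports -E and -o.

```shell
export LC_ALL=C

# Hypothetical input: three lines.
text=$(printf '%s\n' 'aaa' 'a+' 'b')

# BRE (plain grep): + is an ordinary character, so only the line "a+" matches.
bre=$(printf '%s\n' "$text" | grep 'a+')

# ERE (grep -E): + means "one or more", so both "aaa" and "a+" match.
ere_count=$(printf '%s\n' "$text" | grep -cE 'a+')

# Single quotes keep the backslash and the dollar sign away from the shell,
# so grep receives \$[0-9]+ and treats \$ as a literal dollar sign.
pattern='\$[0-9]+'
price=$(printf '%s\n' 'cost: $15' | grep -oE "$pattern")

printf '%s\n' "$bre" "$ere_count" "$price"
```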

CORE METACHARACTERS

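. – any single character except newline
^ – start of line
$ – end of line
* – zero or more repetitions of the preceding item
+ – one or more repetitions (ERE and PCRE)
? – zero or one repetition (ERE and PCRE)
[] – character class (bracket expression)
() – grouping (ERE and PCRE; written \( \) in BRE)
| – alternation (ERE and PCRE)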
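A brief sketch of ?, grouping, alternation, and + working together under grep -E. The word list is invented for illustration.

```shell
export LC_ALL=C

# Hypothetical word list, one word per line.
words=$(printf '%s\n' 'color' 'colour' 'colr' 'gray' 'grey' 'abab')

# ? makes the preceding "u" optional.
optional=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# ( ) groups and | alternates: "a" or "e" between "gr" and "y".
alternation=$(printf '%s\n' "$words" | grep -E '^gr(a|e)y$')

# + applied to a group: one or more repetitions of "ab" filling the whole line.
grouped=$(printf '%s\n' "$words" | grep -xE '(ab)+')

printf '%s\n' "$optional" "$alternation" "$grouped"
```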

POSIX CHARACTER CLASSES

[[:alnum:]] – alphanumeric
[[:alpha:]] – alphabetic
[[:digit:]] – digits
[[:space:]] – whitespace characters
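Character classes are written inside a bracket expression, which is why the brackets are doubled. A minimal sketch with invented input lines:

```shell
export LC_ALL=C

# Hypothetical input; the second line consists of three spaces.
data=$(printf '%s\n' 'abc123' '   ' '42' 'no-digits')

# Lines made up entirely of digits.
digits_only=$(printf '%s\n' "$data" | grep -x '[[:digit:]][[:digit:]]*')

# Lines made up entirely of letters and digits (the hyphen disqualifies "no-digits").
alnum_only=$(printf '%s\n' "$data" | grep -x '[[:alnum:]][[:alnum:]]*')

# Number of lines that are empty or contain only whitespace.
blank_count=$(printf '%s\n' "$data" | grep -c '^[[:space:]]*$')

# Lines that begin with a letter.
alpha_start=$(printf '%s\n' "$data" | grep '^[[:alpha:]]')

printf '%s\n' "$digits_only" "$alnum_only" "$blank_count" "$alpha_start"
```

Unlike a range such as a-z, these classes follow the current locale's definition of letters and digits.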

HISTORY

Regular expressions originated in the 1950s in the mathematician Stephen Kleene's work on regular sets. Ken Thompson brought them into computing in the late 1960s with his version of the QED editor and later built them into the Unix utilities ed and grep. POSIX later standardized BRE and ERE. Perl popularized a richer syntax in the 1990s, and the PCRE library, first released in 1997, made that syntax available to other programs; many modern Linux tools now optionally support it. Over time, regex has become a lingua franca for text processing across shells, scripting languages, and system utilities.