
grex

Generate regular expressions from examples

TLDR

Generate a simple regex

$ grex [space_separated_strings]

Generate a case-insensitive regex
$ grex [[-i|--ignore-case]] [space_separated_strings]

Replace digits with '\d'
$ grex [[-d|--digits]] [space_separated_strings]

Replace Unicode word characters with '\w'
$ grex [[-w|--words]] [space_separated_strings]

Replace spaces with '\s'
$ grex [[-s|--spaces]] [space_separated_strings]

Convert repeated substrings to {min,max} quantifier notation
$ grex [[-r|--repetitions]] [space_separated_strings]

SYNOPSIS

grex [OPTIONS] INPUT...
or
grex [OPTIONS] --file FILE
or
command | grex [OPTIONS] -

PARAMETERS

--digits, -d
    Convert any Unicode decimal digit to \d.

--non-digits, -D
    Convert any character that is not a Unicode decimal digit to \D.

--spaces, -s
    Convert any Unicode whitespace character to \s.

--non-spaces, -S
    Convert any character that is not a Unicode whitespace character to \S.

--words, -w
    Convert any Unicode word character to \w.

--non-words, -W
    Convert any character that is not a Unicode word character to \W.

--repetitions, -r
    Detect repeated, non-overlapping substrings and convert them to {min,max} quantifier notation.

--min-repetitions <QUANTITY>
    Minimum number of repetitions a substring needs before it is converted. Only effective together with --repetitions.

--min-substring-length <LENGTH>
    Minimum length a repeated substring needs before it is converted. Only effective together with --repetitions.

--escape, -e
    Replace all non-ASCII characters with Unicode escape sequences.

--with-surrogates
    Convert astral code points to surrogate pairs. Only effective together with --escape.

--ignore-case, -i
    Generate a case-insensitive regular expression by prefixing it with the inline flag (?i).

--capture-groups, -g
    Use capturing groups instead of non-capturing groups.

--verbose, -x
    Print the regular expression in verbose mode, spread over several indented lines.

--colorize, -c
    Print the regular expression with syntax highlighting.

--no-start-anchor
    Do not add the start-of-string anchor (^) to the generated regular expression.

--no-end-anchor
    Do not add the end-of-string anchor ($) to the generated regular expression.

--no-anchors
    Do not add start-of-string (^) and end-of-string ($) anchors to the generated regular expression.

--file <FILE>, -f <FILE>
    Read test cases from the given file, one per line.

--help, -h
    Display help information for the grex command.

--version, -v
    Print the version information for grex.

[INPUT...]
    One or more test cases, separated by spaces, from which grex generates the regular expression. A single - reads the test cases from standard input instead, one per line.
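
To make the effect of a few options concrete, the following Python sketch pairs three command lines with the pattern grex is assumed to print for each, then checks that every pattern still matches all of its test cases. The patterns are assumptions written down by hand, not captured output; run the commands yourself to confirm them.

```python
import re

# (command line, assumed grex output, test cases)
# The patterns are assumptions, not output captured from a grex run.
cases = [
    ("grex -r aa bcbc defdefdef", r"^(?:a{2}|(?:bc){2}|(?:def){3})$", ["aa", "bcbc", "defdefdef"]),
    ("grex -i big BIGGER", r"(?i)^big(?:ger)?$", ["big", "BIGGER"]),
    ("grex --no-anchors a aa aaa", r"a(?:aa?)?", ["a", "aa", "aaa"]),
]

results = {}
for command, pattern, samples in cases:
    regex = re.compile(pattern)
    # Every test case must be matched by the pattern derived from it.
    results[command] = all(regex.search(s) is not None for s in samples)
    print(f"{command:30} {pattern:36} {results[command]}")
```

Whatever flags are used, the generated expression should always match the test cases it was built from; the flags only change how compact or how general it is.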

DESCRIPTION

grex is a command-line utility written in Rust that generates a single regular expression from one or more test cases.

By default, the generated expression matches exactly the given test cases. Options such as --digits, --spaces, --words, and --repetitions opt in to shorthand character classes and quantifiers, which shortens the expression but also lets it match strings beyond the samples. This makes grex useful for drafting patterns for input validation, log parsing, or data cleanup from representative samples, particularly when there are many of them. The output should still be reviewed before it is used.
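
As a rough illustration of how the samples relate to the result, the Python sketch below takes the pattern that `grex a aa aaa` is assumed to print and checks it against the samples and against a few nearby strings. The pattern is an assumption; regenerate it locally before relying on it.

```python
import re

samples = ["a", "aa", "aaa"]
# Assumed output of `grex a aa aaa`; not captured from a real run.
pattern = re.compile(r"^a(?:aa?)?$")

# Every sample should be accepted.
accepted = [s for s in samples if pattern.fullmatch(s)]

# Strings that were not among the samples should be rejected.
others = ["", "aaaa", "ab"]
rejected = [s for s in others if not pattern.fullmatch(s)]

print("accepted:", accepted)
print("rejected:", rejected)
```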

CAVEATS

The output depends entirely on the samples provided. Without conversion options, the expression matches only the given test cases, which is often too narrow for real data; with options such as --digits or --words it can become broader than intended. grex does not produce backreferences or lookaheads/lookbehinds. Test every generated expression against inputs that should match and inputs that should not.
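
The trade-off between a narrow and a broad result can be sketched in Python. Both patterns below are assumptions about what grex prints for the samples 1, 2 and 3, without and with -d; the point is only how differently they treat inputs that were not among the samples.

```python
import re

# Both patterns are assumptions, not captured grex output.
strict = re.compile(r"^[1-3]$")   # assumed output of: grex 1 2 3
general = re.compile(r"^\d$")     # assumed output of: grex -d 1 2 3

# "\u0663" is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit.
unseen = ["7", "\u0663", "12", "x"]

report = {
    s: (strict.search(s) is not None, general.search(s) is not None)
    for s in unseen
}
for s, (narrow, broad) in report.items():
    print(f"{s!r}: strict={narrow} general={broad}")
```

The strict pattern rejects a plausible input such as 7, while the general one also accepts non-ASCII digits, because \d in Python string patterns covers all Unicode decimal digits.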

INPUT MODES

Test cases can be passed directly as command-line arguments separated by spaces; quote any test case that contains whitespace or shell metacharacters. For larger data sets or use in pipelines, grex can read test cases from a file (--file) or from standard input, treating each line as a separate test case.
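
A minimal Python sketch of the pipeline mode follows. It assumes that a lone `-` argument makes grex read test cases from standard input, which may depend on the installed version; `build_argv` and `infer_regex` are hypothetical helper names introduced here for illustration.

```python
import shutil
import subprocess


def build_argv(flags=()):
    # Assumption: a lone "-" input argument tells grex to read test cases from stdin.
    return ["grex", *flags, "-"]


def infer_regex(samples, flags=()):
    """Pipe one test case per line to grex; return its output, or None if unavailable."""
    if shutil.which("grex") is None:
        return None  # grex is not installed on this machine
    proc = subprocess.run(
        build_argv(flags),
        input="\n".join(samples) + "\n",
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip() if proc.returncode == 0 else None


print(build_argv(["-r"]))
print(infer_regex(["aa", "bcbc", "defdefdef"], flags=["-r"]))
```

Returning None when grex is missing or fails keeps the helper safe to call in scripts that run on machines without grex.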

OUTPUT COMPATIBILITY

grex has no option for selecting a target language or regex dialect. The generated expressions are Perl-compatible and are also accepted by Rust's regex crate. Engines in other environments such as JavaScript, Python, and Java should mostly accept them as well, but test the expression in the target engine first, because support for Unicode-aware shorthand classes and for the inline (?i) flag differs between engines.

HISTORY

grex is implemented in Rust and was created by Peter M. Stahl, with its first release in 2019. It started as a Rust port of regexgen, a JavaScript tool by Devon Govett, and aims to automate the tedious and error-prone task of writing regular expressions by hand from examples.

SEE ALSO

grep(1), sed(1), awk(1), perlre(1)
