grex
Generate regular expressions from examples
TLDR
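Generate a simple regex
grex string1 string2
Generate a case-insensitive regex
grex -i string1 string2
Replace digits with '\d'
grex -d string1 string2
Replace Unicode word characters with '\w'
grex -w string1 string2
Replace spaces with '\s'
grex -s string1 string2
Add {min, max} quantifier representation for repeating sub-strings
grex -r string1 string2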
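To make the {min, max} quantifier item above concrete, here is a minimal Python sketch of the idea. It is a deliberate simplification and not grex's algorithm: it only collapses runs of a single repeated character within one string, and the function name collapse_runs and its min_repetitions parameter are hypothetical.

```python
import re
from itertools import groupby

def collapse_runs(text, min_repetitions=2):
    """Rewrite runs of one repeated character as char{count}, anchored.

    Hypothetical helper for illustration only; handles single characters,
    not repeated multi-character substrings.
    """
    parts = []
    for char, run in groupby(text):
        count = len(list(run))
        escaped = re.escape(char)
        if count >= min_repetitions:
            # f-string braces are doubled to emit a literal {count}
            parts.append(f"{escaped}{{{count}}}")
        else:
            parts.append(escaped * count)
    return "^" + "".join(parts) + "$"

print(collapse_runs("aaab"))  # ^a{3}b$
```

The resulting pattern still matches only the original string; the quantifier merely shortens how the repetition is written.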
SYNOPSIS
grex [OPTIONS] STRING...
or
grex [OPTIONS] --file FILE
or
command | grex [OPTIONS] -
PARAMETERS
--capture-groups, -g
Replace the non-capturing groups in the generated regular expression with capturing groups.
--file <FILE>, -f <FILE>
Read the test cases from a file, one test case per line, instead of from the command line.
--spaces, -s
Convert every Unicode whitespace character in the input to \s. Without this option, input strings are already treated literally, and regex metacharacters in them are escaped.
--repetitions, -r
Detect repeated non-overlapping substrings and convert them to {min,max} quantifier notation.
--verbose, -x
Print the generated regular expression in verbose mode, spread over multiple lines for readability. No option is needed for strictness: by default the expression matches only the given strings.
--words, -w
Convert every Unicode word character in the input to \w.
--digits, -d
Convert every Unicode decimal digit in the input to \d. Each digit is replaced individually (for example, '123' becomes \d\d\d); combine with --repetitions to obtain quantifier notation.
--escape, -e
Replace all non-ASCII characters in the generated regular expression with Unicode escape sequences.
--ignore-case, -i
Generate a case-insensitive regular expression, in which letters match both upper and lower case. The expression is prefixed with the inline flag (?i).
--no-anchors
Do not add start-of-string (^) and end-of-string ($) anchors to the generated regular expression.
--colorize, -c
Apply syntax highlighting to the generated regular expression. grex always writes the expression to standard output; use shell redirection to save it to a file.
--help, -h
Display help information for the grex command.
--version, -v
Print the version information for grex.
[STRING...]
One or more test cases from which grex generates the regular expression, separated by spaces. A single hyphen (-) in place of the test cases makes grex read them from standard input, one per line.
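The effect of anchors and of case-insensitive matching is easy to check in any regex engine. The following sketch uses Python's re module; the three patterns are hand-written for illustration and are not captured grex output.

```python
import re

# Hand-written patterns in the anchored, non-capturing style; assumptions for illustration.
anchored = re.compile(r"^(?:cat|dog)$")
unanchored = re.compile(r"(?:cat|dog)")
ignore_case = re.compile(r"(?i)^(?:cat|dog)$")

# With anchors, the whole string must be one of the alternatives.
print(bool(anchored.search("hotdog")))    # False
# Without anchors, a match anywhere inside the string is enough.
print(bool(unanchored.search("hotdog")))  # True
# The inline (?i) flag lets letters match in either case.
print(bool(ignore_case.search("DOG")))    # True
print(bool(anchored.search("DOG")))       # False
```

Dropping the anchors is mainly useful when the generated expression is meant to be embedded in a larger pattern or used for searching within longer text.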
DESCRIPTION
grex is a command-line utility written in Rust that generates a single regular expression from one or more test strings.
By default, the generated expression is as specific as possible: it matches the provided examples and nothing else. Generalizing to character classes such as \d, \w, and \s, or to {min,max} quantifiers for repeated substrings, happens only when the corresponding options are given. This makes grex a quick way to obtain a first draft of a regex for tasks such as validating user input, parsing log files, or sanitizing data, especially when there are many example inputs. Because the result reflects only the samples it was given, it should be reviewed before use.
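The basic idea of deriving a pattern from examples can be sketched in a few lines of Python. This is a naive illustration, not grex's algorithm: it simply escapes each example and joins them into an anchored alternation, without merging common prefixes or suffixes. The function name naive_regex is hypothetical.

```python
import re

def naive_regex(examples):
    """Build an anchored alternation that matches exactly the given examples."""
    # Deduplicate, then sort longest first (ties alphabetically) for stable output.
    unique = sorted(set(examples), key=lambda s: (-len(s), s))
    # re.escape turns metacharacters such as '.' into literals.
    alternatives = "|".join(re.escape(s) for s in unique)
    return f"^(?:{alternatives})$"

pattern = naive_regex(["a.b", "abc", "x"])
print(pattern)                            # ^(?:a\.b|abc|x)$
print(bool(re.search(pattern, "a.b")))    # True
print(bool(re.search(pattern, "aXb")))    # False, the dot is literal
```

A real tool produces a much more compact expression for large inputs, but the contract is the same: every example matches, and nothing else does unless generalization is requested.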
CAVEATS
The generated expression reflects only the samples provided. Without generalization options it matches those samples and nothing else, which is often too narrow; with options such as --digits it can match far more than intended if the samples are insufficient or unrepresentative. Constructs such as backreferences and lookaheads/lookbehinds are not generated. The usefulness of the result depends on the quality and diversity of the test cases.
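The trade-off between a too narrow and a too broad pattern can be seen with a small Python check. Both patterns below are hand-written assumptions standing in for an exact and a digit-generalized expression derived from the same three samples.

```python
import re

samples = ["2021", "2022", "2023"]

# Exact: matches only the three samples.
exact = re.compile(r"^202[1-3]$")
# Generalized: every digit replaced by \d, so any four-digit string matches.
generalized = re.compile(r"^\d\d\d\d$")

unseen = ["2024", "0000", "9999"]

# Both patterns accept all of the original samples.
print(all(exact.fullmatch(s) for s in samples))        # True
print(all(generalized.fullmatch(s) for s in samples))  # True

# They differ on strings that were never provided as samples.
print([s for s in unseen if exact.fullmatch(s)])        # []
print([s for s in unseen if generalized.fullmatch(s)])  # ['2024', '0000', '9999']
```

Whether rejecting "2024" or accepting "0000" is the bug depends on the intent, which is why the sample set needs to cover the cases that matter.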
INPUT MODES
grex accepts test cases in three ways. They can be passed directly as command-line arguments, separated by spaces. They can be read from a file with --file, one test case per line. For larger data sets or pipelines, grex can also read from standard input (stdin) when a single hyphen (-) is given in place of the test cases; each line is then treated as a separate string.
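The two input paths described above can be mirrored in a small Python helper, for example in a wrapper script that prepares test cases before handing them to a regex generator. This is a sketch under stated assumptions: the helper name collect_test_cases is hypothetical, and treating a lone "-" argument as the signal to read standard input is the convention assumed by this sketch.

```python
import io

def collect_test_cases(args, stdin):
    """Return test cases from arguments, or from stdin when the sole argument is '-'."""
    if args == ["-"]:
        # splitlines() accepts \n and \r\n endings and an optional final line ending.
        return stdin.read().splitlines()
    return list(args)

# Direct arguments are used as-is.
print(collect_test_cases(["foo", "bar"], io.StringIO("")))        # ['foo', 'bar']
# With '-', each line of the stream becomes one test case.
print(collect_test_cases(["-"], io.StringIO("foo\r\nbar\nbaz")))  # ['foo', 'bar', 'baz']
```

In a real script the second argument would be sys.stdin; a StringIO object is used here so the example runs without a pipeline.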
OUTPUT COMPATIBILITY
grex has no option for selecting a target language. The expressions it produces are Perl-compatible regular expressions that are also compatible with Rust's regex crate. Regex engines in other environments, such as JavaScript, Python, and Java, should accept them in most cases, but this is not guaranteed, so a generated expression should be tested in the target engine before it is integrated into code.
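Whatever engine a generated expression is destined for, it is worth confirming there that it still matches the original examples. A minimal sketch for Python's re module follows; the helper name check_pattern is hypothetical, and the pattern is a hand-written example in the anchored, non-capturing style rather than captured tool output.

```python
import re

def check_pattern(pattern, examples):
    """Return the examples that the pattern fails to match in Python's re engine."""
    compiled = re.compile(pattern)  # raises re.error if the syntax is unsupported
    return [s for s in examples if compiled.fullmatch(s) is None]

failures = check_pattern(r"^a(?:aa?)?$", ["a", "aa", "aaa"])
print(failures)  # []

# A string outside the example set is reported as a failure.
print(check_pattern(r"^a(?:aa?)?$", ["aaaa"]))  # ['aaaa']
```

An empty list means every example matches; a re.error raised by re.compile would indicate syntax the target engine does not support.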
HISTORY
grex is a relatively modern command-line utility, implemented in the Rust programming language. It was created by Peter M. Stahl and first released in 2019 to automate the often tedious and error-prone process of manually crafting regular expressions from examples. It began as a Rust port of regexgen, a JavaScript tool by Devon Govett, and is distributed both as a command-line program and as a Rust library.