gawk
Process and manipulate text-based data
TLDR
Print the fifth column (a.k.a. field) in a space-separated file
Print the second column of the lines containing "foo" in a space-separated file
Print the last column of each line in a file, using a comma (instead of space) as a field separator
Sum the values in the first column of a file and print the total
Print every third line starting from the first line
Print different values based on conditions
Print all lines whose 10th column value is between a minimum and a maximum
Print a table of users with UID >= 1000, with a header and formatted output, using a colon as the separator (%-20s means a left-aligned string padded to 20 characters; %6s means a right-aligned string padded to 6 characters)
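The tasks listed above can be sketched as the following one-liners. This is an illustrative sketch, not the page's original commands: the sample files, their contents, and the shell variable names are all hypothetical, and a small passwd-style file stands in for /etc/passwd so the result is predictable. The min and max values are passed with -v.

```shell
# Assumption: gawk is on PATH. If it is not, fall back to the system awk,
# which is enough here because only POSIX awk features are used.
command -v gawk >/dev/null 2>&1 || gawk() { awk "$@"; }

# Hypothetical sample files, created only for this sketch.
tmp=$(mktemp -d)
printf '%s\n' 'a b c d e1' 'foo x2 c d e2' 'rebar h i j e3' 'foo x4 c d e4' > "$tmp/data.txt"
printf '%s\n' '1,x,red' '2,y,green' '3,z,blue' > "$tmp/data.csv"
printf '%s\n' 'r1 2 3 4 5 6 7 8 9 5' 'r2 2 3 4 5 6 7 8 9 15' 'r3 2 3 4 5 6 7 8 9 25' > "$tmp/wide.txt"
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' 'alice:x:1000:1000:Alice:/home/alice:/bin/zsh' > "$tmp/passwd"

# Fifth column of a space-separated file.
col5=$(gawk '{ print $5 }' "$tmp/data.txt")

# Second column of the lines containing "foo".
foo2=$(gawk '/foo/ { print $2 }' "$tmp/data.txt")

# Last column of each line, with a comma as the field separator.
last=$(gawk -F ',' '{ print $NF }' "$tmp/data.csv")

# Sum of the first column, printed once at the end.
total=$(gawk -F ',' '{ sum += $1 } END { print sum }' "$tmp/data.csv")

# Every third line, starting from the first.
third=$(gawk 'NR % 3 == 1' "$tmp/data.txt")

# Different output depending on conditions.
cond=$(gawk '{ if ($1 == "foo") print "exact match foo"; else if ($1 ~ /bar/) print "partial match bar"; else print "no match" }' "$tmp/data.txt")

# Lines whose 10th column lies between min and max (inclusive).
range=$(gawk -v min=10 -v max=20 '$10 >= min && $10 <= max' "$tmp/wide.txt")

# Table of users with UID >= 1000, with a header and aligned columns.
users=$(gawk -F ':' '
    BEGIN      { printf "%-20s %6s %s\n", "Name", "UID", "Shell" }
    $3 >= 1000 { printf "%-20s %6s %s\n", $1, $3, $7 }
' "$tmp/passwd")

printf '%s\n' "$col5" "$foo2" "$last" "$total" "$third" "$cond" "$range" "$users"
rm -rf "$tmp"
```

A pattern with no action, as in the "every third line" and range examples, prints each matching line unchanged.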
SYNOPSIS
gawk [options] [-f progfile | 'program'] [files]
PARAMETERS
-F fs
Set input field separator (FS variable).
-f file
Read AWK program source from file.
-v var=val
Assign val to var before program runs.
-b, --characters-as-bytes
Treat all input data as single-byte characters, regardless of the locale.
--posix
Enforce POSIX-compatible behavior.
--traditional
Disable GNU extensions for traditional AWK.
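--lint[=value]
Warn about dubious or non-portable constructs. value may be fatal (treat warnings as fatal errors), invalid (warn only about things that are actually invalid), or no-ext (do not warn about gawk extensions).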
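-mf n
Set a memory limit (maximum number of fields). Kept for compatibility with Bell Labs awk and ignored by gawk, which has no such predefined limits.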
-o[outfile]
Write a pretty-printed version of the program to outfile (default: awkprof.out).
--profile[=file]
Write profiling data (statement execution counts) to file (default: awkprof.out).
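--sandbox
Run in sandbox mode: disable the system() function, input redirection with getline, output redirection with print and printf, and loading of dynamic extensions.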
-V, --version
Print version and exit.
-h, --help
Print usage summary and exit.
--load ext
Load extension ext at startup.
-e 'prog'
Use prog as program source. May be given multiple times and combined with -f.
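As a minimal sketch of how -F, -v, and -f combine on one command line, the example below keeps the program in a file and passes a threshold in from the shell. The file names, the variable name limit, and the sample data are hypothetical.

```shell
# Assumption: gawk is on PATH. If it is not, fall back to the system awk,
# which also accepts -F, -v, and -f.
command -v gawk >/dev/null 2>&1 || gawk() { awk "$@"; }

tmp=$(mktemp -d)

# Hypothetical program file, read with -f.
cat > "$tmp/over.awk" <<'EOF'
# Print the name (field 1) of every record whose score (field 2) exceeds limit.
$2 > limit { print $1 }
EOF

# Hypothetical colon-separated input.
printf '%s\n' 'ann:90' 'bob:55' 'cy:72' > "$tmp/scores.txt"

# -F sets the field separator, -v assigns limit before the program runs.
over=$(gawk -F ':' -v limit=70 -f "$tmp/over.awk" "$tmp/scores.txt")
printf '%s\n' "$over"

rm -rf "$tmp"
```

Passing the threshold with -v keeps the program file free of hard-coded values and avoids splicing shell variables into the program text.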
DESCRIPTION
gawk is the GNU implementation of AWK, a pattern-matching language for data manipulation and reporting. It reads input one record at a time (by default, one line), splits each record into fields, tests it against patterns (often regular expressions), and executes the matching actions, such as printing reformatted data, performing calculations, or running control flow.
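Programs consist of rules of the form pattern { action }: the pattern selects records (an omitted pattern matches every record) and the action defines what to do with them (an omitted action prints the record). The special patterns BEGIN and END run before any input is read and after all input has been read, respectively. The language supports variables (e.g., FS for the input field separator, OFS for the output field separator), associative arrays, built-in functions (substr, match, sprintf), and user-defined functions.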
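The pieces described above (rules, BEGIN and END, OFS, an associative array, and a user-defined function) can be combined in a short sketch. The log lines, the array name count, and the function name pct are hypothetical and chosen only for illustration.

```shell
# Assumption: gawk is on PATH. If it is not, fall back to the system awk,
# which is enough here because only POSIX awk features are used.
command -v gawk >/dev/null 2>&1 || gawk() { awk "$@"; }

summary=$(printf '%s\n' 'GET /a 200' 'GET /b 404' 'POST /a 200' | gawk '
    # User-defined function: format n as a whole-number percentage of total.
    function pct(n, total) { return sprintf("%.0f%%", 100 * n / total) }

    BEGIN { OFS = "\t" }                 # runs before any input is read
    $3 ~ /^[0-9]+$/ { count[$3]++ }      # pattern selects records, action counts them
    END {                                # runs after all input has been read
        for (code in count)              # traversal order is unspecified
            print code, count[code], pct(count[code], NR)
    }
' | sort)
printf '%s\n' "$summary"
```

The output is piped through sort because the order in which for (code in count) visits array elements is not defined.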
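Well suited to log analysis, CSV processing, report generation, and system administration tasks. gawk reads from standard input or the named files and writes to standard output. GNU extensions include TCP/IP networking, internationalization (i18n), and dynamically loaded extensions via --load. Programs that use only POSIX awk features are portable; programs that rely on extensions may not run under other AWK implementations.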
CAVEATS
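GNU extensions reduce portability; use --posix to restrict gawk to POSIX behavior when a program must run under other AWK implementations. Memory use can grow large when big arrays are built, for example by storing the contents of a large file. Processing may be slower in multibyte locales.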
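One way to check whether a program depends on GNU extensions is to run it under --posix and see whether it is still accepted. The sketch below assumes gawk is installed and uses gensub(), a gawk extension, as the test case; the variable names are hypothetical.

```shell
# Assumption: gawk is installed. gensub() is a gawk extension, not POSIX awk.
prog='BEGIN { print gensub(/a/, "b", "g", "banana") }'

# Default mode: the extension is available.
default_out=$(gawk "$prog")
printf 'default mode: %s\n' "$default_out"

# POSIX mode: extensions are disabled, so the program is rejected.
if gawk --posix "$prog" >/dev/null 2>&1; then
    posix_status=accepted
else
    posix_status=rejected
fi
printf -- '--posix: %s\n' "$posix_status"
```

A program that passes under --posix is more likely, though not guaranteed, to run unchanged under other AWK implementations.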
SPECIAL PATTERNS
BEGIN runs once before any input is read; END runs once after all input has been read; /regex/ matches every record that contains a match for regex.
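A minimal sketch of the three patterns working together, using hypothetical log lines: BEGIN prints a heading, /error/ selects the matching lines, and END prints the count.

```shell
# Assumption: gawk is on PATH. If it is not, fall back to the system awk,
# which is enough here because only POSIX awk features are used.
command -v gawk >/dev/null 2>&1 || gawk() { awk "$@"; }

report=$(printf '%s\n' 'ok start' 'error disk full' 'ok done' 'error timeout' | gawk '
    BEGIN   { print "errors:" }        # once, before the first line
    /error/ { n++; print "  " $0 }     # for each line containing "error"
    END     { print "total: " n }      # once, after the last line
')
printf '%s\n' "$report"
```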
BUILT-IN VARIABLES
NR (number of records read so far), NF (number of fields in the current record), $0 (the whole record), $1..$n (individual fields; $NF is the last one).
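The variables above can be seen side by side in a short sketch that labels each input line with its record number, field count, last field, and full text. The input lines are hypothetical.

```shell
# Assumption: gawk is on PATH. If it is not, fall back to the system awk,
# which is enough here because only POSIX awk features are used.
command -v gawk >/dev/null 2>&1 || gawk() { awk "$@"; }

vars=$(printf '%s\n' 'alpha beta' 'one two three' |
    gawk '{ print NR ": " NF " fields, last=" $NF ", line=[" $0 "]" }')
printf '%s\n' "$vars"
```

Because NF holds the field count, $NF always refers to the last field, however many fields a line has.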
HISTORY
AWK was created in 1977 by Alfred Aho, Peter Weinberger, and Brian Kernighan at Bell Labs. gawk was first written in 1986 for the GNU Project. Arnold Robbins has worked on it since 1988 and later became its primary maintainer, adding extensions such as networking and profiling.


