cut
Extract sections from lines of text
TLDR
Print a specific [c]haracter/[f]ield range of each line
Print a field range of each line with a specific delimiter
Print a character range of each line of the specific file
Print specific fields of NUL-terminated records (e.g. as produced by find . -print0) rather than newline-terminated lines
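The four summaries above can be sketched as follows. The sample data, the variable names, and the temporary file are illustrative assumptions, and -z is a GNU cut option that may be missing from other implementations.

```shell
# Sample input (hypothetical): colon-separated records, one per line.
sample='alice:x:1000
bob:x:1001'

# Print a character range (characters 1-3) of each line.
chars=$(printf '%s\n' "$sample" | cut -c 1-3)

# Print a field (field 1) of each line, with ":" as the delimiter.
names=$(printf '%s\n' "$sample" | cut -d ':' -f 1)

# Print a character range of each line of a specific file.
tmpfile=$(mktemp)
printf '%s\n' "$sample" > "$tmpfile"
from_file=$(cut -c 1-3 "$tmpfile")
rm -f "$tmpfile"

# Print field 2 of NUL-terminated records (GNU cut).
# tr turns the NULs back into newlines so the result is printable.
nul_fields=$(printf 'a\tb\0c\td\0' | cut -z -f 2 | tr '\0' '\n')

printf '%s\n' "$chars" "$names" "$from_file" "$nul_fields"
```

With real data, the NUL-terminated form would typically sit between a producer such as find . -print0 and a consumer such as xargs -0.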
SYNOPSIS
cut OPTION... [FILE]...
PARAMETERS
-b, --bytes=LIST
Selects bytes from each line. LIST can be a single number, a range (e.g., 1-5), an open-ended range (e.g., 3- or -5), or a comma-separated list of numbers and ranges (e.g., 1,3-5,9). Positions are numbered from 1.
-c, --characters=LIST
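Selects characters from each line. LIST has the same format as for -b. Multi-byte support depends on the implementation: GNU cut has historically treated -c the same as -b, whereas BSD cut and some distribution-patched builds count characters according to the current locale.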
-d, --delimiter=DELIM
Uses DELIM instead of TAB as the field delimiter when used with -f. DELIM must be a single character.
-f, --fields=LIST
Selects fields from each line, using the same LIST format as -b and -c. Fields are separated by TAB unless -d specifies a different delimiter.
-s, --only-delimited
Suppresses lines that do not contain the specified delimiter character. Only useful when used with -f.
--output-delimiter=STRING
Uses STRING as the output delimiter. By default, the output delimiter is the input delimiter (TAB, or the character given with -d).
--complement
Inverts the selection; selects everything except for the specified bytes, characters, or fields.
-z, --zero-terminated
Line delimiter is NUL, not newline. Output lines are NUL-terminated. Useful for piping with xargs -0.
--help
Displays help information and exits.
--version
Outputs version information and exits.
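As a rough illustration of the LIST forms and of two of the long options described above, the sketch below runs cut on a single hypothetical comma-separated line. The variable names are made up for the example, and --complement and --output-delimiter are GNU extensions that other implementations may not provide.

```shell
# Hypothetical input: one line with five comma-separated fields.
line='a,b,c,d,e'

# A single field plus a closed range: fields 1, 3 and 4.
mixed=$(printf '%s\n' "$line" | cut -d ',' -f 1,3-4)            # a,c,d

# Open-ended ranges: "-2" means fields 1-2, "4-" means field 4 to the end.
open_ended=$(printf '%s\n' "$line" | cut -d ',' -f -2,4-)       # a,b,d,e

# --complement: everything except field 2.
all_but_second=$(printf '%s\n' "$line" | cut -d ',' --complement -f 2)   # a,c,d,e

# --output-delimiter: join the selected fields with a different string.
rejoined=$(printf '%s\n' "$line" | cut -d ',' -f 1,3 --output-delimiter=' | ')   # a | c

printf '%s\n' "$mixed" "$open_ended" "$all_but_second" "$rejoined"
```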
DESCRIPTION
The cut command is a standard Unix utility used to extract specific sections or fields from each line of a file or standard input. It provides flexible ways to select parts of lines by byte position (using -b), character position (using -c), or delimited fields (using -f).
This makes it useful for processing structured text such as log files, simple CSV content, or the output of other commands, where only certain columns or segments are needed. You specify the selection with exactly one of -b, -c, or -f; with -f, the -d option sets the field delimiter. cut reads the named files, or standard input when no file is given or the file is -, and writes the results to standard output. It is frequently used in shell pipelines alongside commands like grep, sort, or awk.
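A typical pipeline might look like the sketch below, which filters records with grep, extracts one field with cut, and sorts the result. The passwd-style records are invented sample data, not read from a real system file.

```shell
# Hypothetical passwd-style records: name:password:uid:gid:comment:home:shell
records='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Keep the accounts whose shell is bash, extract the login name, and sort.
bash_users=$(printf '%s\n' "$records" | grep '/bin/bash$' | cut -d ':' -f 1 | sort)

printf '%s\n' "$bash_users"
```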
CAVEATS
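cut operates on a line-by-line basis and doesn't understand complex data structures or nested fields. For example, it does not honor quoting, so a quoted CSV field that contains a comma is split at that comma. For more advanced parsing, tools like awk or sed are often more appropriate.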
When used with -f, if a line does not contain the specified delimiter, cut will print the entire line by default, unless the -s option is also used.
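The difference -s makes can be seen on a two-line input where only the first line contains the delimiter. The input and variable names here are illustrative.

```shell
# Hypothetical input: only the first line contains "=".
input='key=value
no delimiter here'

# Without -s, the line lacking the delimiter is passed through unchanged.
without_s=$(printf '%s\n' "$input" | cut -d '=' -f 2)

# With -s, that line is dropped from the output.
with_s=$(printf '%s\n' "$input" | cut -d '=' -s -f 2)

printf 'without -s:\n%s\n' "$without_s"
printf 'with -s:\n%s\n' "$with_s"
```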
It is not designed for editing files in place; it reads input and writes to standard output, requiring shell redirection to save changes to a file.
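One common way to get the effect of an in-place edit is to write to a temporary file and then replace the original, as sketched below. The directory, the file name data.csv, and its contents are assumptions made for the example.

```shell
# Work in a scratch directory so the example does not touch real files.
workdir=$(mktemp -d)
printf 'id,name,email\n1,Ann,ann@example.com\n' > "$workdir/data.csv"

# Write the selection to a temporary file first, then replace the original.
# Redirecting straight onto data.csv would make the shell truncate the file
# before cut has a chance to read it.
cut -d ',' -f 1,2 "$workdir/data.csv" > "$workdir/data.csv.tmp" &&
  mv "$workdir/data.csv.tmp" "$workdir/data.csv"

result=$(cat "$workdir/data.csv")
rm -rf "$workdir"

printf '%s\n' "$result"
```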
BYTE VS. CHARACTER MODE
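The -b option operates on byte positions, which is suitable for fixed-width records or binary data. The -c option operates on character positions, which is generally more intuitive for human-readable text. With multi-byte encodings like UTF-8 the two can differ, but only where the implementation is multibyte-aware: GNU cut has historically treated -c exactly like -b, while BSD cut and some distribution-patched builds count characters according to the locale.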
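The sketch below shows how -b can split a multi-byte character. The sample word is an assumption, written with octal escapes so the example does not depend on the encoding of the script itself. No expected output is given for -c because it depends on the implementation and locale.

```shell
# "é" is two bytes in UTF-8 (0xC3 0xA9), so this word is 6 bytes, 5 characters.
word=$(printf 'h\303\251llo')

# -b 1-2 keeps "h" plus only the first byte of "é", leaving an invalid
# UTF-8 sequence. Counting bytes gives 2 bytes of data plus the newline.
byte_count=$(printf '%s\n' "$word" | cut -b 1-2 | wc -c | tr -d ' ')

# -c 1-2 yields "hé" on a multibyte-aware cut in a UTF-8 locale, but the
# same bytes as -b 1-2 on implementations that treat -c like -b.
char_result=$(printf '%s\n' "$word" | cut -c 1-2)

printf 'bytes kept by -b 1-2 (including newline): %s\n' "$byte_count"
```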
DEFAULT DELIMITER
When using the -f (fields) option without explicitly specifying a delimiter using -d, cut uses the TAB character as the default field delimiter. This is a common point of confusion for new users who might expect space-delimited fields to be the default.
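The following sketch contrasts tab-separated and space-separated input. The sample strings are illustrative, and squeezing repeated spaces with tr -s is one possible workaround rather than the only one.

```shell
# Tab-separated input: -f works without -d.
tab_field=$(printf 'one\ttwo\tthree\n' | cut -f 2)                        # two

# Space-separated input contains no TAB, so the whole line comes back.
no_tab=$(printf 'one two three\n' | cut -f 2)                             # one two three

# Naming the delimiter explicitly gives the expected field.
space_field=$(printf 'one two three\n' | cut -d ' ' -f 2)                 # two

# Every single space is a delimiter, so runs of spaces create empty fields.
# Squeezing them with tr -s first is a common workaround.
squeezed=$(printf 'one   two  three\n' | tr -s ' ' | cut -d ' ' -f 2)     # two

printf '%s\n' "$tab_field" "$no_tab" "$space_field" "$squeezed"
```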
HISTORY
The cut command is a long-standing utility on Unix-like operating systems. It first appeared in AT&T System III UNIX and is specified by POSIX and the Single UNIX Specification (SUS). Its design emphasizes simplicity and efficiency, making it a common choice for data extraction tasks in shell scripts, especially when the greater complexity or overhead of tools like awk is not required.