cut

Extract sections from lines of text

TLDR

Print the fifth character on each line

$ [command] | cut [[-c|--characters]] 5

Print the fifth to tenth character of each line of the specified file

$ cut [[-c|--characters]] 5-10 [path/to/file]

Split each line in a file by a delimiter into fields and print fields two and six (default delimiter is TAB)

$ cut [[-f|--fields]] 2,6 [path/to/file]

Split each line by the specified delimiter and print everything from the second field onward

$ [command] | cut [[-d|--delimiter]] "[delimiter]" [[-f|--fields]] 2-

Use space as a delimiter and print only the first 3 fields

$ [command] | cut [[-d|--delimiter]] " " [[-f|--fields]] -3

Print specific fields of lines that use NUL to terminate lines instead of newlines

$ [find . -print0] | cut [[-z|--zero-terminated]] [[-d|--delimiter]] "[/]" [[-f|--fields]] [2]

-b, --bytes=LIST
    Selects bytes from each line. LIST can be a single number, a range (e.g., 1-5), or a comma-separated list of numbers and ranges (e.g., 1,3-5,9).

-c, --characters=LIST
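    Selects characters from each line. LIST uses the same format as -b. Multi-byte handling depends on the implementation: GNU coreutils has historically treated -c the same as -b, so a multi-byte character (for example in UTF-8) may be split.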

-d, --delimiter=DELIM
    Uses DELIM instead of TAB for field delimiter when used with -f.

-f, --fields=LIST
    Selects fields from each line. LIST uses the same format as -b and -c. Fields are separated by TAB unless -d specifies a different delimiter.

-s, --only-delimited
    Suppresses lines that do not contain the specified delimiter character. Only useful when used with -f.

--output-delimiter=STRING
    Uses STRING as the output delimiter. The default output delimiter is the input delimiter (TAB, or the character given with -d).

--complement
    Inverts the selection; selects everything except for the specified bytes, characters, or fields.

-z, --zero-terminated
    Line delimiter is NUL, not newline. Output lines are NUL-terminated. Useful for piping with xargs -0.

--help
    Displays help information and exits.

--version
    Outputs version information and exits.
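
As a rough illustration of how the LIST syntax and the selection options combine, the sketch below runs a few of them against one made-up record. The sample data is hypothetical, and --output-delimiter, --complement, and -z are assumed to be available as in GNU coreutils; other implementations may not provide them.

```shell
# Made-up colon-delimited record, used only for illustration.
line='alpha:beta:gamma:delta:epsilon'

# LIST mixing a single number and a range: field 1 and fields 3 through 4.
picked=$(printf '%s\n' "$line" | cut -d ':' -f 1,3-4)
echo "$picked"    # alpha:gamma:delta

# Same selection, joined with a different output delimiter.
joined=$(printf '%s\n' "$line" | cut -d ':' -f 1,3-4 --output-delimiter=' | ')
echo "$joined"    # alpha | gamma | delta

# Everything except field 2.
rest=$(printf '%s\n' "$line" | cut -d ':' -f 2 --complement)
echo "$rest"      # alpha:gamma:delta:epsilon

# NUL-terminated records; tr turns the NULs into newlines for display.
zero=$(printf 'a/b\0c/d\0' | cut -z -d '/' -f 2 | tr '\0' '\n')
echo "$zero"      # b and d, one per line
```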

DESCRIPTION

The cut command is a standard Unix utility used to extract specific sections or fields from each line of a file or standard input. It provides flexible ways to select parts of lines by byte position (using -b), character position (using -c), or delimited fields (using -f).

This makes it well suited to processing structured text, such as log files, simple CSV content, or the output of other commands, where only certain columns or segments are needed. You specify the selection with -b, -c, or -f, and with -f you can add -d to define the field delimiter. cut reads from the files named on the command line, or from standard input when none are given, and writes the results to standard output. It is frequently used in shell pipelines alongside commands like grep, sort, or awk.
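
A minimal sketch of cut inside a pipeline, assuming a small comma-separated data set that is made up for this example. Note that cut does not understand CSV quoting, so this only works when no field contains the delimiter.

```shell
# Hypothetical data with the columns: name,department,city
data='ann,sales,Oslo
bob,ops,Rome
cy,sales,Lima'

# Pull out the department column, then list each department once.
depts=$(printf '%s\n' "$data" | cut -d ',' -f 2 | sort -u)
echo "$depts"    # ops and sales, one per line
```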

CAVEATS

cut operates on a line-by-line basis and doesn't understand complex data structures or nested fields. For more advanced parsing, tools like awk or sed are often more appropriate.
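
One place where this limitation shows up is input aligned with runs of spaces. The sketch below, using a made-up line, contrasts cut with awk: cut treats every single delimiter as a field boundary, while awk by default splits on runs of whitespace.

```shell
# Three spaces separate "one" and "two" in this sample line.
row='one   two three'

# cut sees empty fields between consecutive spaces, so field 2 is empty.
with_cut=$(printf '%s\n' "$row" | cut -d ' ' -f 2)
echo "[$with_cut]"    # []

# awk collapses the run of spaces, so the second field is "two".
with_awk=$(printf '%s\n' "$row" | awk '{ print $2 }')
echo "[$with_awk]"    # [two]
```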

When used with -f, if a line does not contain the specified delimiter, cut will print the entire line by default, unless the -s option is also used.
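
The difference is easiest to see side by side. This sketch uses made-up input in which the middle line has no colon:

```shell
input='a:b
no delimiter here
c:d'

# Without -s the undelimited line passes through unchanged.
default_out=$(printf '%s\n' "$input" | cut -d ':' -f 2)
echo "$default_out"    # b, "no delimiter here", d

# With -s the undelimited line is dropped.
strict_out=$(printf '%s\n' "$input" | cut -d ':' -f 2 -s)
echo "$strict_out"     # b, d
```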

It is not designed for editing files in place; it reads input and writes to standard output, requiring shell redirection to save changes to a file.
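
One common way to replace a file with its cut output is to write to a temporary file first. The sketch below assumes mktemp is available and uses a hypothetical file name; redirecting directly onto the input file would truncate it before cut reads it.

```shell
# Scratch directory and hypothetical input file for the demonstration.
workdir=$(mktemp -d)
file="$workdir/users.txt"
printf 'root:x:0\ndaemon:x:1\n' > "$file"

# Write the result next to the original, then replace it only on success.
cut -d ':' -f 1 "$file" > "$file.tmp" && mv "$file.tmp" "$file"

result=$(cat "$file")
echo "$result"    # root and daemon, one per line

rm -r "$workdir"
```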

BYTE VS. CHARACTER MODE

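The -b option selects by byte position, which suits fixed-width records of single-byte data. The -c option selects by character position, which is more intuitive for human-readable text. The two can differ only for multi-byte encodings such as UTF-8, and only where the implementation is multi-byte aware: GNU coreutils has historically treated -c as equivalent to -b, so check how the installed cut behaves before relying on -c with non-ASCII text.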
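
The byte-oriented behavior of -b can be checked with a short UTF-8 string. This is a sketch: the string is written with octal escapes so that it does not depend on the encoding of the script file, and no result is shown for -c because it varies by implementation and locale.

```shell
# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251); this spells "été".
text=$(printf '\303\251t\303\251')

# -b 1 keeps only the first byte, which is half of the first character.
byte_count=$(printf '%s\n' "$text" | cut -b 1 | wc -c | tr -d ' ')
echo "$byte_count"    # 2: one byte of "é" plus the newline

# -b 1-2 keeps both bytes of the first character.
first_char=$(printf '%s\n' "$text" | cut -b 1-2)

# -c 1 is implementation-dependent: it may print the whole "é" or one byte.
printf '%s\n' "$text" | cut -c 1 | wc -c
```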

DEFAULT DELIMITER

When using the -f (fields) option without explicitly specifying a delimiter using -d, cut uses the TAB character as the default field delimiter. This is a common point of confusion for new users who might expect space-delimited fields to be the default.
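
A short sketch of that confusion, using made-up input. The space-separated line contains no TAB, so without -d it is printed whole.

```shell
spaced='one two three'
tabbed=$(printf 'one\ttwo\tthree')

# No TAB in the line, so cut finds no delimiter and prints it unchanged.
no_d=$(printf '%s\n' "$spaced" | cut -f 2)
echo "$no_d"           # one two three

# Naming the space as the delimiter gives the expected field.
with_d=$(printf '%s\n' "$spaced" | cut -d ' ' -f 2)
echo "$with_d"         # two

# TAB-separated input works without -d.
tab_default=$(printf '%s\n' "$tabbed" | cut -f 2)
echo "$tab_default"    # two
```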

HISTORY

The cut command is a long-standing utility in Unix-like operating systems. It first appeared in AT&T System III UNIX and is specified by POSIX and the Single Unix Specification (SUS). Its design emphasizes simplicity and efficiency, which makes it a common choice for data extraction in shell scripts when the greater complexity or overhead of tools like awk is not required.
