LinuxCommandLibrary

cut

Extract sections from lines of text

TLDR

Print a specific [c]haracter/[f]ield range of each line

$ [command] | cut --[characters|fields] [1|1,10|1-10|1-|-10]

Print a field range of each line with a specific delimiter
$ [command] | cut [[-d|--delimiter]] "[delimiter]" [[-f|--fields]] [1|1,10|1-10|1-|-10]

Print a character range of each line of a specific file
$ cut [[-c|--characters]] [1] [path/to/file]

Print specific fields of NUL-terminated records (e.g. the output of find . -print0) instead of newline-terminated lines
$ [command] | cut [[-z|--zero-terminated]] [[-f|--fields]] [1]

SYNOPSIS

cut OPTION... [FILE]...

PARAMETERS

-b, --bytes=LIST
    Select only these bytes.

-c, --characters=LIST
    Select only these characters.

-d, --delimiter=DELIM
    Use DELIM instead of TAB for field delimiter.

-f, --fields=LIST
    Select only these fields; also print any line that contains no delimiter character, unless the -s option is specified.

-n
    With `-b`: don't split multibyte characters.

-s, --only-delimited
    Do not print lines not containing delimiters.

--output-delimiter=STRING
    Use STRING as the output delimiter. The default is to use the input delimiter.

-z, --zero-terminated
    Use NUL instead of newline as the line delimiter.

--help
    Display a help message and exit.

--version
    Output version information and exit.
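
As a rough illustration of how `-s` and `--output-delimiter` interact with `-f`, the sketch below runs `cut` on a small made-up sample. The sample data and variable names are assumptions for the example, and `--output-delimiter` is a GNU extension that may not be available in other implementations.

```shell
# Hypothetical sample: two colon-delimited lines and one line with no delimiter.
sample=$(printf '%s\n' 'alice:x:1000' 'no delimiter here' 'bob:x:1001')

# Without -s, the line that has no ':' is passed through unchanged.
with_undelimited=$(printf '%s\n' "$sample" | cut -d ':' -f 1)

# With -s, lines lacking the delimiter are dropped.
only_delimited=$(printf '%s\n' "$sample" | cut -d ':' -f 1 -s)

# --output-delimiter replaces ':' between the selected fields in the output.
rejoined=$(printf '%s\n' "$sample" | cut -d ':' -f 1,3 -s --output-delimiter=' -> ')

printf '%s\n' "$with_undelimited" "$only_delimited" "$rejoined"
```

Here `only_delimited` holds just `alice` and `bob`, while `with_undelimited` also contains the undelimited line in between.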

DESCRIPTION

`cut` extracts sections from each line of a file or of standard input. Sections are selected by byte position (`-b`), character position (`-c`), or delimiter-separated field (`-f`). Common uses include pulling columns out of delimited data, slicing fixed-width records, and trimming strings. It is often combined with other commands through pipes to build larger transformations. `cut` is meant for simple extraction: the field delimiter is a single character and there is no pattern matching, so tasks that need regular expressions or more elaborate data manipulation are usually better served by `awk` or `sed`.

Its short, readable option syntax makes it convenient in shell scripts.
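
A minimal sketch of the pipeline usage described above, assuming some made-up passwd-style records (the data and variable names are hypothetical and only serve the example):

```shell
# Hypothetical passwd-style records, one per line.
records=$(printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
    'alice:x:1000:1000:Alice:/home/alice:/bin/bash')

# Field 7 is the login shell; sort -u reduces the column to its distinct values.
unique_shells=$(printf '%s\n' "$records" | cut -d ':' -f 7 | sort -u)

# Fields 1 and 7 give "user:shell" pairs, still joined by the input delimiter.
user_shells=$(printf '%s\n' "$records" | cut -d ':' -f 1,7)

printf '%s\n' "$unique_shells" "$user_shells"
```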

CAVEATS

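The `-n` option is only meaningful together with `-b` and multibyte character encodings such as UTF-8, where POSIX defines it to keep `-b` from splitting a multibyte character. GNU `cut` accepts `-n` but currently ignores it, so on GNU systems a byte range can cut through a multibyte character and produce invalid output whether or not `-n` is given.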
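
The following sketch shows a byte range splitting a multibyte character. It assumes UTF-8 data, and the variable names are hypothetical.

```shell
# "é" is two bytes in UTF-8 (0xC3 0xA9), written in octal so the script
# does not depend on how this file itself is encoded.
e_acute=$(printf '\303\251')

# -b 1 keeps only the first byte of the character; counting the output
# (one byte plus the trailing newline) shows the character was split.
first_byte_count=$(printf '%s\n' "$e_acute" | cut -b 1 | wc -c)

# -b 1-2 keeps both bytes, so the character survives intact.
both_bytes=$(printf '%s\n' "$e_acute" | cut -b 1-2)

echo "bytes written by 'cut -b 1': $first_byte_count"
```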

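When `-f` is used without `-d`, the delimiter is a single TAB character, so space-separated input is treated as one field per line. `cut` also never merges adjacent delimiters: each occurrence starts a new field, so runs of spaces or tabs used for alignment produce empty fields.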
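
A small sketch of the whitespace pitfall, using a made-up line in which two spaces separate the columns. Squeezing repeated spaces with `tr -s` first is one common workaround, not the only one.

```shell
# Hypothetical column-aligned line: two spaces separate the fields.
line='alice  1000'

# Each space is its own delimiter, so field 2 is the empty string between them.
naive=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# Squeezing repeated spaces first makes field 2 the intended value.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

printf 'naive=[%s] squeezed=[%s]\n' "$naive" "$squeezed"
```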

LIST SYNTAX

The LIST parameter used with `-b`, `-c`, and `-f` specifies which bytes, characters, or fields to extract. Positions are numbered from 1. LIST can be a single number, a range (e.g., `1-5`), a comma-separated list of numbers (e.g., `1,3,5`), or a combination of ranges and numbers (e.g., `1-3,5,8-10`). A missing start number in a range means 'the first,' and a missing end number means 'the last.' For example, `-c -5` means characters 1 through 5, and `-c 5-` means character 5 through the end of the line.
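
The LIST forms can be tried on a short sample string, as in the sketch below (the string and variable names are made up for the example). The last command also shows that selected positions are written in input order, regardless of the order in which LIST names them.

```shell
letters='abcdefghij'

# Characters 1 through 5.
first_five=$(printf '%s\n' "$letters" | cut -c -5)

# Character 5 through the end of the line.
from_fifth=$(printf '%s\n' "$letters" | cut -c 5-)

# Ranges and single positions combined.
mixed=$(printf '%s\n' "$letters" | cut -c 1-3,5,8-10)

# LIST order does not reorder the output: this still yields positions 1-3, then 5.
reordered=$(printf '%s\n' "$letters" | cut -c 5,1-3)

printf '%s\n' "$first_five" "$from_fifth" "$mixed" "$reordered"
```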

EXIT STATUS

`cut` returns an exit status of 0 on success and a non-zero value on error. Errors include invalid options, a file that does not exist, and a file that cannot be read because of insufficient permissions.
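
A script can branch on the exit status, as in this sketch. The path is hypothetical and assumed not to exist.

```shell
# Hypothetical input path, assumed not to exist on this system.
missing='/nonexistent/cut-example-input'

# Discard the diagnostic; only the exit status matters here.
if cut -f 1 "$missing" 2>/dev/null; then
    status=0
else
    status=$?   # exit status of the failed cut command
fi

# A successful run on valid input returns 0.
printf 'a\tb\n' | cut -f 1 >/dev/null
ok_status=$?

echo "missing file: $status, valid input: $ok_status"
```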

HISTORY

`cut` has long been a standard utility on Unix-like operating systems, originating in early versions of Unix as a basic tool for extracting columns or sections of lines from files. It is specified by POSIX, which keeps its behavior consistent across Unix-like systems. Its core functionality has changed little since then and remains focused on simple, efficient extraction.

SEE ALSO

awk(1), sed(1), grep(1)
