
csvcut

Extract columns from CSV files

TLDR

Print indices and names of all columns

$ csvcut [[-n|--names]] [data.csv]

Extract the first and third columns
$ csvcut [[-c|--columns]] [1,3] [data.csv]

Extract all columns except the fourth one
$ csvcut [[-C|--not-columns]] [4] [data.csv]

Extract the columns named "id" and "first name" (in that order)
$ csvcut [[-c|--columns]] [id,"first name"] [data.csv]

SYNOPSIS

csvcut [options] [FILE]

PARAMETERS

-c COLS, --columns COLS
    Comma-separated list of column names, indices (starting at 1), or ranges to keep, e.g. "1,id,3-5" (default: all columns)

-C COLS, --not-columns COLS
    Comma-separated list of column names, indices, or ranges to exclude (default: no columns)

-n, --names
    Print column names and indices without data

-H, --no-header-row
    Treat first row as data, not headers

--zero
    Use zero-based column indices instead of the default 1-based numbering

-d DELIM, --delimiter DELIM
    Field delimiter for input (default: comma)

-t, --tabs
    Input uses tabs as delimiter

-q QUOTECHAR, --quotechar QUOTECHAR
    Quote character for input (default: ")

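-u {0,1,2,3}, --quoting {0,1,2,3}
    Quoting style of the input: 0 quote minimal, 1 quote all, 2 quote non-numeric, 3 quote none

-b, --no-doublequote
    Quote characters inside fields are not escaped by doubling them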

-p ESCAPECHAR, --escapechar ESCAPECHAR
    Character used to escape the delimiter or quote character in the input

-x, --delete-empty-rows
    After cutting, delete rows that are completely empty

-K SKIPLINES, --skip-lines SKIPLINES
    Number of initial lines to skip

-S, --skipinitialspace
    Ignore whitespace immediately following the delimiter
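
The input options can be combined. The following is a minimal sketch, assuming csvkit is installed; people.tsv is a hypothetical sample file created only for illustration.

```shell
# Hypothetical tab-separated sample file with no header row.
printf '1\tJane\tOslo\n2\tRick\tLima\n' > people.tsv

# Run csvcut only if it is available on this system.
if command -v csvcut >/dev/null 2>&1; then
  # -t reads tab-delimited input; -H treats the first row as data,
  # so columns are addressed by index rather than by name.
  csvcut -t -H -c 1,3 people.tsv
fi
```

With -H, csvkit generates placeholder header names (a, b, c, ...) for the output, and the result is written as comma-separated CSV regardless of the input delimiter.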

DESCRIPTION

csvcut is a command-line utility from the csvkit suite that selects columns from CSV files. Unlike cut, it parses CSV rather than splitting on a delimiter character, so it understands the header row, quoted fields, and delimiters embedded in quoted fields. Columns can be selected by name (e.g., name,age) or by index (e.g., 1,3), excluded with -C, or listed with -n. Options for the delimiter, quote character, and escaping let it read other CSV dialects.
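
The difference from cut shows up as soon as a field contains the delimiter. The following is a minimal sketch, assuming csvkit is installed; people.csv and its contents are hypothetical sample data.

```shell
# Hypothetical sample file with a comma inside a quoted field.
cat > people.csv <<'EOF'
id,name,city
1,"Doe, Jane",Oslo
2,"Roe, Rick",Lima
EOF

# cut splits on every comma, so the quoted name is broken in two
# and only the part before the embedded comma is printed.
cut -d, -f2 people.csv

# csvcut parses the quoting and returns the whole name field.
if command -v csvcut >/dev/null 2>&1; then
  csvcut -c name people.csv
fi
```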

csvcut is useful for subsetting and reordering the columns of a dataset. Its output is valid CSV, so it pipes cleanly into other csvkit tools such as csvlook or csvstat. It reads from a file or, when no file is given, from standard input, which makes it easy to use in scripts and pipelines. For malformed CSV, run csvclean first. Where possible it avoids loading the entire file into memory, so it remains usable on large files.
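
Reading from standard input and piping into another tool can be sketched as follows, assuming csvkit (which provides both csvcut and csvlook) is installed; sales.csv is a hypothetical sample file.

```shell
# Hypothetical sample data, written to a file for illustration.
cat > sales.csv <<'EOF'
region,product,units
north,widget,12
south,gadget,7
EOF

if command -v csvcut >/dev/null 2>&1; then
  # With no FILE argument, csvcut reads CSV from standard input.
  cat sales.csv | csvcut -c region,units

  # The output is CSV, so it can feed another csvkit tool directly.
  csvcut -c region,units sales.csv | csvlook
fi
```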

CAVEATS

csvcut assumes mostly well-formed CSV; use csvclean to find and fix problems first. Column names that contain spaces or shell-special characters must be quoted on the command line. Large files are processed efficiently, though heavily quoted input may parse more slowly.
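
Quoting a column name that contains a space might look like this. It is a minimal sketch, assuming csvkit is installed; contacts.csv is a hypothetical sample file.

```shell
# Hypothetical sample file whose second header contains a space.
cat > contacts.csv <<'EOF'
id,first name,email
1,Jane,jane@example.com
EOF

if command -v csvcut >/dev/null 2>&1; then
  # Quote the whole -c argument so the shell passes
  # "id,first name" to csvcut as a single word.
  csvcut -c 'id,first name' contacts.csv
fi
```

Without the quotes, the shell would split the argument at the space and csvcut would receive "name" as a separate argument.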

EXAMPLES

csvcut -c name,age data.csv
Keep 'name' and 'age' columns.

csvcut -C 1,3 input.csv
Exclude first and third columns.

csvcut -n file.csv | head
List the first ten column names with their indices.

HISTORY

Part of csvkit, a suite of Python-based CSV tools created by Christopher Groskopf and first released in 2011. It has evolved through community contributions and is now at version 2.x, with an emphasis on cross-platform compatibility and dialect support.

SEE ALSO

csvlook(1), csvgrep(1), csvstat(1), cut(1)
