csvkit

Suite of command-line CSV processing tools

TLDR

Display a CSV file in a readable table format
$ csvlook [data.csv]
Get statistics about a CSV file
$ csvstat [data.csv]
Select specific columns
$ csvcut -c [col1,col2] [data.csv]
Sort by a column
$ csvsort -c [column] [data.csv]
Query CSV with SQL
$ csvsql --query "[SELECT * FROM data WHERE id > 100]" [data.csv]
Stack multiple CSV files vertically
$ csvstack [file1.csv] [file2.csv]

SYNOPSIS

csvkit is a suite of utilities for working with CSV files. Each tool is invoked by its own name, such as csvlook or csvcut.

DESCRIPTION

csvkit is a comprehensive suite of command-line tools for working with CSV files. It brings database-like operations such as selecting, filtering, sorting, and joining to tabular data without requiring a database, and follows the Unix philosophy of small tools that each do one job.
The tools handle CSV quoting and escaping correctly, avoiding the pitfalls of using awk, sed, or cut directly on CSV data. They support various input encodings and delimiters, making them versatile for real-world data processing.
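As a rough illustration of the quoting pitfall, the sketch below compares cut with csvcut on a field that contains a comma. The file people.csv and its contents are made up for the example, and csvkit is assumed to be installed.

```shell
# Hypothetical sample data: the name field contains a comma inside quotes.
tmpdir=$(mktemp -d)
printf '%s\n' 'id,name,city' '1,"Doe, Jane",Berlin' > "$tmpdir/people.csv"

# cut splits on every comma, so the quoted name is torn in half.
naive=$(cut -d, -f2 "$tmpdir/people.csv")

# csvcut parses the quoting and returns the whole field.
parsed=$(csvcut -c name "$tmpdir/people.csv")

printf 'cut:\n%s\n\ncsvcut:\n%s\n' "$naive" "$parsed"

rm -rf "$tmpdir"
```

The cut result stops at the comma inside the quotes, while csvcut keeps the name as one value.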
csvkit is particularly useful for data journalism, quick data exploration, ETL processes, and as part of data pipelines. All tools can read from stdin and write to stdout for easy chaining.
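A minimal sketch of chaining through stdin and stdout, assuming csvkit is installed. The file scores.csv and its columns are invented for the example.

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 'name,team,score' 'alice,red,70' 'bob,blue,95' 'carol,red,85' > "$tmpdir/scores.csv"

# Keep two columns, then sort by score from highest to lowest.
# csvsort reads the output of csvcut from stdin.
ranked=$(csvcut -c name,score "$tmpdir/scores.csv" | csvsort -c score -r)

# Render the result as a table for reading in the terminal.
printf '%s\n' "$ranked" | csvlook

rm -rf "$tmpdir"
```

Each stage emits plain CSV, so any step can be dropped or swapped without touching the rest of the pipeline.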

INCLUDED TOOLS

in2csv
Convert other tabular formats, such as Excel and JSON, to CSV.
csvlook
Display CSV in a human-readable table format.
csvstat
Generate statistics for CSV columns.
csvcut
Select and reorder columns.
csvgrep
Filter rows by column values.
csvsort
Sort rows by columns.
csvjoin
Join CSV files on a common column.
csvstack
Concatenate CSV files vertically.
csvsql
Query CSV files using SQL.
csvjson
Convert CSV to JSON.
csvclean
Validate and fix CSV formatting issues.
csvformat
Convert CSV to other delimited formats.
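Two of the tools listed above that have no TLDR entry, csvgrep and csvformat, can be sketched as follows. The sample file and its values are hypothetical, and csvkit is assumed to be installed.

```shell
# Hypothetical sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 'id,name,city' '1,alice,Berlin' '2,bob,Paris' '3,carol,Berlin' > "$tmpdir/people.csv"

# csvgrep -m keeps the rows whose city column contains the given string.
berlin=$(csvgrep -c city -m Berlin "$tmpdir/people.csv")

# csvformat -T rewrites the filtered rows as tab-separated values.
tsv=$(printf '%s\n' "$berlin" | csvformat -T)

printf '%s\n\n%s\n' "$berlin" "$tsv"

rm -rf "$tmpdir"
```

csvgrep also accepts -r for a regular expression in place of -m, which may suit looser matching.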

CAVEATS

Some operations load entire files into memory. Type inference can misclassify data; for example, codes with leading zeros may be read as numbers. Performance may be slower than specialized tools for very large files. Requires a Python installation.
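One way to sidestep the type inference caveat is the -I (--no-inference) option, which the tools that infer types accept. The sketch below assumes csvkit is installed and uses a made-up zips.csv. Whether the default run changes the leading zero may depend on the installed version.

```shell
# Hypothetical sample data: zip codes that start with a zero.
tmpdir=$(mktemp -d)
printf '%s\n' 'zip,town' '02134,Boston' '10001,New York' > "$tmpdir/zips.csv"

# Default behaviour: column types are inferred, so zip may be treated as a number.
inferred=$(csvsort -c town "$tmpdir/zips.csv")

# With -I every value is kept as text, exactly as it appears in the file.
literal=$(csvsort -I -c town "$tmpdir/zips.csv")

printf 'inferred:\n%s\n\nno inference:\n%s\n' "$inferred" "$literal"

rm -rf "$tmpdir"
```

Comparing the two results shows whether inference altered any values in a given file.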

HISTORY

csvkit was created by Christopher Groskopf and first released in 2011. It was designed to provide data journalists and analysts with powerful command-line tools for processing CSV data, becoming a standard toolkit in the data science community.

SEE ALSO

miller(1), xsv(1), jq(1), awk(1)
