csvtool
Manipulate CSV (Comma Separated Values) files
TLDR
Extract the second column from a CSV file
Extract the second and fourth columns from a CSV file
Extract lines from a CSV file where the second column exactly matches 'Foo'
Extract lines from a CSV file where the second column starts with 'Bar'
Find lines in a CSV file where the second column ends with 'Baz' and then extract the third and sixth columns
SYNOPSIS
csvtool subcommand [options] [input_file...]
PARAMETERS
-t char
Specify input field separator (default ',').
--separator char
Long form for specifying input field separator.
-T char
Specify output field separator (default ',').
--output-separator char
Long form for specifying output field separator.
-u
Do not quote output fields.
--unquoted
Long form for not quoting output fields.
-q char
Specify output quote character (default '"').
--quote char
Long form for specifying output quote character.
-U char
Separate unquoted fields with specified character.
--unquoted-separator char
Long form for separating unquoted fields with specified character.
-n
Do not treat the first line as a header.
--no-header
Long form for not treating the first line as a header.
-H
Explicitly treat the first line as a header (default for some commands).
--header
Long form for explicitly treating the first line as a header.
-v
Show version information and exit.
--version
Long form for showing version information.
-h
Display help message and exit.
--help
Long form for displaying help message.
subcommand-specific-options
Many subcommands accept additional options specific to their function (e.g., `-f`, `-r`, `-n`). Run `csvtool subcommand --help` for details.
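The separator and quoting options listed above describe a read-with-one-dialect, write-with-another conversion. The following is a minimal sketch of that idea using Python's standard `csv` module. It is not how csvtool is implemented, and the function name `convert` and its parameters are hypothetical: `in_sep` plays the role of `-t`, `out_sep` of `-T`, `quote_char` of `-q`, and `quote_output=False` of `-u`.

```python
import csv
import io


def convert(text, in_sep=",", out_sep=",", quote_char='"', quote_output=True):
    """Re-emit CSV text with a different separator and quoting policy.

    Hypothetical helper for illustration only; not part of csvtool.
    """
    # newline="" lets the csv module see line endings inside quoted fields.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=in_sep)
    out = io.StringIO(newline="")
    if quote_output:
        # Quote only the fields that need it (those containing the
        # separator, the quote character, or a line break).
        writer = csv.writer(out, delimiter=out_sep, quotechar=quote_char,
                            quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    else:
        # Without quoting, a separator inside a field must be escaped,
        # otherwise the output would be ambiguous.
        writer = csv.writer(out, delimiter=out_sep, quoting=csv.QUOTE_NONE,
                            escapechar="\\", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


semicolon_data = 'name;city\n"Doe, J";Oslo\n'
print(convert(semicolon_data, in_sep=";", out_sep=","))
print(convert(semicolon_data, in_sep=";", out_sep=",", quote_char="'"))
print(convert(semicolon_data, in_sep=";", out_sep=",", quote_output=False))
```

The first call has to quote `Doe, J` because the field contains the new output separator. This is the situation in which disabling output quoting becomes risky.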
DESCRIPTION
csvtool is a command-line utility for processing Comma Separated Values (CSV) files. Unlike general-purpose text tools such as awk or sed, csvtool parses CSV as structured data: it handles quoted fields, alternative delimiters, and newlines embedded in fields. Its subcommands cover selecting and reordering columns, merging multiple CSV files, splitting large files, sorting, finding unique records, and counting rows. Because it follows CSV quoting rules, field contents survive operations that line-oriented tools would split or corrupt.
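The difference between line-oriented splitting and CSV-aware parsing can be seen in a few lines. The sketch below uses Python's standard `csv` module as a stand-in for any CSV-aware parser. The sample data and variable names are made up for illustration, and the code says nothing about csvtool's internals.

```python
import csv
import io

# One header row and one record. The record's second field contains a
# comma and its third field contains a newline; both are quoted.
raw = 'id,name,notes\n1,"Smith, Jane","line one\nline two"\n'

# Naive approach: split on line breaks, then on commas.
naive_rows = [line.split(",") for line in raw.splitlines()]

# CSV-aware approach: the parser honours the quotes.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows), "rows from naive splitting")
print(len(parsed_rows), "rows from CSV parsing")
print(parsed_rows[1])
```

Naive splitting reports three rows and breaks `Smith, Jane` into two fields, while the CSV parser returns the two logical rows with three fields each.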
CAVEATS
csvtool may not be installed by default on every Linux distribution and can require separate installation. For very large datasets or highly complex transformations, a scripting approach (e.g., Python with Pandas) or a database may offer better performance or flexibility. The 'select' subcommand uses its own expression language, whose syntax must be learned separately.
KEY FEATURES
csvtool handles CSV specifics directly, including quoted fields and varied delimiters, so field contents are preserved. It provides subcommands for common data tasks such as column extraction, row filtering, sorting, and merging. For everyday command-line work on structured data, it is a fast and reliable alternative to general-purpose text utilities.
BASIC USAGE EXAMPLES
Below are some common usage examples for csvtool:
Extract columns 1 and 3 from a file:
csvtool col 1,3 input.csv
Sort a CSV file by the second column numerically:
csvtool sort -n 2 input.csv
Filter rows where the first column equals 'active':
csvtool select '{1} == "active"' input.csv
Concatenate multiple CSV files:
csvtool cat file1.csv file2.csv > output.csv
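To cross-check the output of commands like the four above, or to reproduce them where csvtool is unavailable, the same operations can be sketched with Python's standard `csv` module. All function names here are hypothetical, columns are 1-based to match the examples, and no header row is assumed. This shows the intended semantics as read from the descriptions above, not csvtool's actual behaviour in edge cases.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


def select_columns(rows, columns):
    """Keep the given 1-based columns, in the order listed."""
    return [[row[c - 1] for c in columns] for row in rows]


def sort_numeric(rows, column):
    """Sort rows by the numeric value of a 1-based column."""
    return sorted(rows, key=lambda row: float(row[column - 1]))


def filter_equal(rows, column, value):
    """Keep rows whose 1-based column equals value exactly."""
    return [row for row in rows if row[column - 1] == value]


def concatenate(*tables):
    """Append the rows of several tables, one table after another."""
    return [row for table in tables for row in table]


rows = read_rows("active,10,x\nidle,2,y\nactive,7,z\n")
print(select_columns(rows, [1, 3]))      # columns 1 and 3
print(sort_numeric(rows, 2))             # numeric sort on column 2
print(filter_equal(rows, 1, "active"))   # first column equals 'active'
print(concatenate(rows[:1], rows[1:]))   # two tables joined end to end
```

Sorting on `float(...)` rather than on the raw string is what makes the sort numeric: as strings, `10` would sort before `2`.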
HISTORY
csvtool was developed by Richard W.M. Jones as a lightweight command-line tool for common CSV manipulation tasks. Its design prioritizes correct handling of CSV specifics such as quoted fields and delimiters, and follows the Unix philosophy of small, composable tools. It is used for quick, reliable CSV processing at the command line without setting up a scripting language environment.