Show all stats for all columns
$ csvstat data.csv

Show all stats for columns 2 and 4
$ csvstat -c 2,4 data.csv

Show sums for all columns
$ csvstat --sum data.csv

Show the max value length for column 3
$ csvstat -c 3 --len data.csv

Show the number of unique values in the "name" column
$ csvstat -c name --unique data.csv
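
As a rough illustration of what the --unique, --len, and --sum figures mean, the following sketch computes them with Python's standard csv module. It is not csvstat's implementation (csvstat also infers column types), and the sample data and the helper name column_values are hypothetical.

```python
import csv
import io

# Hypothetical sample data, standing in for data.csv.
SAMPLE = """name,city,score
ada,London,10
bob,Paris,7
ada,Berlin,3
"""


def column_values(text, column):
    """Return the values of one named column as a list of strings."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader]


names = column_values(SAMPLE, "name")
scores = column_values(SAMPLE, "score")

# Roughly what --unique counts: the number of distinct values.
unique_names = len(set(names))

# Roughly what --len reports: the length of the longest value.
longest_name = max(len(value) for value in names)

# Roughly what --sum reports for a numeric column.
score_sum = sum(int(value) for value in scores)

print(unique_names, longest_name, score_sum)
```

The values are read as strings and converted by hand here; csvstat decides per column whether a value is a number, date, or text before computing its statistics.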


usage: csvstat [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
               [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-S] [-H]
               [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [--csv] [-n]
               [-c COLUMNS] [--type] [--nulls] [--unique] [--min] [--max]
               [--sum] [--mean] [--median] [--stdev] [--len] [--freq]
               [--freq-count FREQ_COUNT] [--count] [-y SNIFF_LIMIT]
               [FILE]

Print descriptive statistics for each column in a CSV file.

positional arguments:

FILE

The CSV file to operate on. If omitted, will accept input as piped data via STDIN.
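
The "optional FILE, otherwise STDIN" behavior is a common command-line pattern. A minimal sketch with argparse follows; the program name stats-sketch and the function names are hypothetical, and this is not csvstat's own code.

```python
import argparse
import sys


def build_parser():
    """Build a parser with one optional positional argument, FILE."""
    parser = argparse.ArgumentParser(prog="stats-sketch")
    parser.add_argument(
        "file",
        metavar="FILE",
        nargs="?",  # zero or one value, so FILE may be omitted
        help="The CSV file to operate on. If omitted, read STDIN.",
    )
    return parser


def open_input(argv):
    """Return an open text stream: the named file, or STDIN if none given."""
    args = build_parser().parse_args(argv)
    if args.file is None:
        return sys.stdin
    # newline="" is what the csv module recommends for files it will parse.
    return open(args.file, newline="", encoding="utf-8")
```

With this shape, a call such as open_input([]) falls back to STDIN, which is what allows piped usage.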

optional arguments:

-h, --help

show this help message and exit

-d DELIMITER, --delimiter DELIMITER

Delimiting character of the input CSV file.

-t, --tabs

Specify that the input CSV file is delimited with tabs. Overrides "-d".

-q QUOTECHAR, --quotechar QUOTECHAR

Character used to quote strings in the input CSV file.

-u {0,1,2,3}, --quoting {0,1,2,3}

Quoting style used in the input CSV file. 0 = Quote Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 = Quote None.

-b, --no-doublequote

Whether or not double quotes are doubled in the input CSV file.

-p ESCAPECHAR, --escapechar ESCAPECHAR

Character used to escape the delimiter if --quoting 3 ("Quote None") is specified and to escape the QUOTECHAR if --no-doublequote is specified.

-z FIELD_SIZE_LIMIT, --maxfieldsize FIELD_SIZE_LIMIT

Maximum length of a single field in the input CSV file.

-e ENCODING, --encoding ENCODING

Specify the encoding of the input CSV file.

-S, --skipinitialspace

Ignore whitespace immediately following the delimiter.

-H, --no-header-row

Specify that the input CSV file has no header row. Will create default headers (a,b,c,...).

-K SKIP_LINES, --skip-lines SKIP_LINES

Specify the number of initial lines to skip before the header row (e.g. comments, copyright notices, empty rows).

-v, --verbose

Print detailed tracebacks when errors occur.

-l, --linenumbers

Insert a column of line numbers at the front of the output. Useful when piping to grep or as a simple primary key.

--zero

When interpreting or displaying column numbers, use zero-based numbering instead of the default 1-based numbering.

-V, --version

Display version information and exit.

--csv

Output results as a CSV, rather than text.

-n, --names

Display column names and indices from the input CSV and exit.

-c COLUMNS, --columns COLUMNS

A comma-separated list of column indices, names or ranges to be examined, e.g. "1,id,3-5". Defaults to all columns.

--type

Only output data type.

--nulls

Only output whether columns contain nulls.

--unique

Only output counts of unique values.

--min

Only output smallest values.

--max

Only output largest values.

--sum

Only output sums.

--mean

Only output means.

--median

Only output medians.

--stdev

Only output standard deviations.

--len

Only output the length of the longest values.

--freq

Only output lists of frequent values.

--freq-count FREQ_COUNT

The maximum number of frequent values to display.

--count

Only output total row count.

-y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT

Limit CSV dialect sniffing to the specified number of bytes. Specify "0" to disable sniffing entirely.
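
The dialect options above can be explored with Python's standard csv module, whose quoting constants have the same numeric values (0 to 3) as the -u/--quoting choices. The sketch below shows what each quoting style looks like on one row and how sniffing can be limited to a prefix of the input. It is an illustration under assumptions, not csvstat's code: the helper names render and sniff_delimiter are hypothetical, the limit is counted in characters rather than bytes, and returning None when sniffing is disabled is a choice made for this sketch.

```python
import csv
import io

# The -u/--quoting numbers have the same values as these csv constants.
QUOTING = {
    0: csv.QUOTE_MINIMAL,
    1: csv.QUOTE_ALL,
    2: csv.QUOTE_NONNUMERIC,
    3: csv.QUOTE_NONE,
}


def render(row, quoting):
    """Write one row with the given quoting style and return the text."""
    buf = io.StringIO()
    # With QUOTE_NONE nothing is quoted, so an escape character is needed
    # for delimiters inside values (the role -p/--escapechar plays above).
    escapechar = "\\" if quoting == csv.QUOTE_NONE else None
    writer = csv.writer(
        buf, quoting=quoting, escapechar=escapechar, lineterminator="\n"
    )
    writer.writerow(row)
    return buf.getvalue().rstrip("\n")


def sniff_delimiter(text, sniff_limit):
    """Guess the delimiter from at most sniff_limit characters.

    Returns None when sniff_limit is 0, i.e. sniffing is disabled.
    """
    if sniff_limit == 0:
        return None
    dialect = csv.Sniffer().sniff(text[:sniff_limit], delimiters=",;\t|")
    return dialect.delimiter


ROW = ["id", 7, "a,b"]  # hypothetical row: text, number, text with a comma
for number, constant in QUOTING.items():
    print(number, render(ROW, constant))

SAMPLE = "name;age\nada;36\nbob;41\n"  # hypothetical semicolon-delimited input
print(sniff_delimiter(SAMPLE, 1024))
```

A smaller limit makes the guess cheaper on large files but gives the sniffer less evidence, which is why passing -d explicitly is the safer route when the delimiter is known.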


The full documentation for csvstat is maintained as a Texinfo manual. If the info and csvstat programs are properly installed at your site, the command info csvstat should give you access to the complete manual.
