LinuxCommandLibrary

csvpy

Execute Python code on CSV files

TLDR

Load a CSV file into a CSVKitReader object

$ csvpy [data.csv]

Load a CSV file into a CSVKitDictReader object

$ csvpy --dict [data.csv]

SYNOPSIS

csvpy [OPTIONS] FILE
FILE is the path to a CSV file. According to the csvkit documentation, csvpy does not accept input piped via standard input.

PARAMETERS

-H, --no-header-row
    Specify that the input CSV file has no header row. Default column names (a, b, c, ...) are created.

--dict
    Load the CSV file into a DictReader, so each row is a dictionary keyed by column name.

--agate
    Load the CSV file into an agate table.

-l, --linenumbers
    Insert a column of line numbers at the front of the data.

-d CHAR, --delimiter CHAR
    Specify the character used to separate fields in the input CSV.

-t, --tabs
    Use tabs as the delimiter. This option overrides any character set by -d.

-q CHAR, --quotechar CHAR
    Specify the character used to quote fields containing special characters.

-u NUM, --quoting NUM
    Control how fields are quoted (0: QUOTE_MINIMAL, 1: QUOTE_ALL, 2: QUOTE_NONNUMERIC, 3: QUOTE_NONE).

-b, --no-doublequote
    Disable the escaping of quote characters by doubling them within a quoted field.

-p CHAR, --escapechar CHAR
    Specify a character to escape special characters not covered by quoting.

-z INT, --maxfieldsize INT
    Set the maximum allowed length of a single field in the input CSV file.

-e ENCODING, --encoding ENCODING
    Specify the character encoding of the input file (e.g., 'utf-8', 'latin-1').

-S, --skipinitialspace
    Skip whitespace immediately following the delimiter.

-I, --no-inference
    Disable type inference when parsing the input.

-V, --version
    Show the program's version number and exit.

-L LOCALE, --locale LOCALE
    Specify the locale (for example en_US) of any formatted numbers in the input.
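
The dialect options above (-d, -q, -u, -b, -p, -S) correspond closely to keyword arguments of Python's standard csv module. The following is a minimal sketch of that correspondence, using csv.reader as a stand-in rather than csvkit's own reader; the sample strings are hypothetical.

```python
import csv
import io

# The numeric values accepted by -u match these constants in Python's csv module.
QUOTING_MODES = {
    0: csv.QUOTE_MINIMAL,
    1: csv.QUOTE_ALL,
    2: csv.QUOTE_NONNUMERIC,
    3: csv.QUOTE_NONE,
}

# Hypothetical input: semicolon-delimited, single-quoted,
# with a space after each delimiter and a doubled quote inside a field.
sample = "name; note\n'Ada; Countess'; 'said ''hi'''\n"

reader = csv.reader(
    io.StringIO(sample),
    delimiter=";",              # comparable to -d ";"
    quotechar="'",              # comparable to -q "'"
    quoting=QUOTING_MODES[0],   # comparable to -u 0
    doublequote=True,           # the default; -b would turn this off
    skipinitialspace=True,      # comparable to -S
)
rows = list(reader)

# Hypothetical input that escapes quotes with a backslash instead of doubling them.
escaped_sample = 'a,"he said \\"hi\\""\n'

escaped_reader = csv.reader(
    io.StringIO(escaped_sample),
    doublequote=False,          # comparable to -b
    escapechar="\\",            # comparable to -p "\\"
)
escaped_rows = list(escaped_reader)

print(rows)
print(escaped_rows)
```

Whether csvkit forwards each option unchanged to the underlying parser is an assumption here; the sketch only shows what the equivalent settings do in the standard library.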

DESCRIPTION

csvpy is a command-line utility from the
csvkit suite that loads a CSV file into a
CSV reader object and then drops into an
interactive Python shell.
By default the data is exposed as a reader object named
reader, which yields each row as a list of values;
with --dict, each row is a dictionary keyed by column name.
If IPython is installed, csvpy uses it;
otherwise it falls back to the standard Python shell.
It provides a convenient way to get a Python-friendly
view of your tabular data without writing
parsing logic from scratch, making data accessible
for immediate programmatic use.
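
To illustrate the difference between rows as lists and rows as dictionaries (the default and --dict forms shown in the TLDR), here is a small sketch. It uses Python's standard csv module as a stand-in for csvkit's reader classes, and the sample data is hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for data.csv.
sample = "id,name\n1,Ada\n2,Grace\n"

# List-style reader: every row is a list of strings,
# and the header is simply the first row.
list_rows = list(csv.reader(io.StringIO(sample)))

# Dict-style reader: the header row supplies the keys for each data row.
dict_rows = list(csv.DictReader(io.StringIO(sample)))

print(list_rows[0])
print(dict_rows[0]["name"])
```

The list form is handy for positional access, while the dict form reads better when columns are referred to by name.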

CAVEATS

csvpy starts an interactive Python session instead of writing data to
standard output, so it cannot serve as a stage in a shell pipeline.
The reader is an iterator, so rows are consumed as they are read;
for very large CSV files, collecting every row at once
(for example with list(reader)) may consume significant memory.
It is part of the
csvkit suite, so it requires
csvkit to be installed to function.
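
One way to keep memory use bounded with large files is to process rows one at a time instead of building a full list. The sketch below shows the pattern with Python's standard csv module as a stand-in; the generated sample data and the variable names are hypothetical.

```python
import csv
import io

# Hypothetical data: 1000 rows of "id,value".
sample = "id,value\n" + "".join(f"{i},{i * i}\n" for i in range(1000))

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # consume the header row first

total = 0
row_count = 0
for row in reader:
    # Only the current row is held in memory at any time.
    total += int(row[1])
    row_count += 1

# The reader is an iterator, so it is now exhausted.
leftover = list(reader)

print(header, row_count, total, leftover)
```

Once the loop finishes, the reader yields nothing further, so a second pass requires reopening the file.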

USE CASES

csvpy is commonly used for:
1. Exploring CSV data in an interactive Python session for testing or quick prototyping.
2. Quickly inspecting the structure of a CSV file in a Pythonic way, showing how csvkit interprets columns and rows.
3. Debugging CSV parsing issues by seeing exactly how csvkit translates the file's contents into Python objects.
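
As an example of the debugging use case, a common check is to look for rows whose field count differs from the header. The sketch below uses Python's standard csv module as a stand-in for the reader object; the sample data and the name bad_rows are hypothetical.

```python
import csv
import io

# Hypothetical file with one short row and one row with an extra field.
sample = "id,name,city\n1,Ada,London\n2,Grace\n3,Alan,Wilmslow,extra\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)

# start=2 because the header occupies line 1 of the file.
# This numbering assumes no field contains an embedded newline.
bad_rows = [
    (line_number, row)
    for line_number, row in enumerate(reader, start=2)
    if len(row) != len(header)
]

for line_number, row in bad_rows:
    print(f"line {line_number}: expected {len(header)} fields, got {len(row)}")
```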

PIPING AND INPUT

csvpy does not accept input piped via standard input; csvkit's documentation attributes this to platform limitations. To explore the output of another command, write it to a file first and pass that file to csvpy:
some_command_generating_csv > my_data.csv
csvpy my_data.csv
Because csvpy opens an interactive shell rather than emitting tabular data, it is also not suited to feeding other text-processing command-line tools like grep or awk.

HISTORY

csvpy is a component of the
csvkit project, an open-source suite of command-line tools for
working with CSV files.
csvkit was initially developed by Christopher Groskopf
around 2011 to provide robust, text-based tools for common
CSV data processing tasks, drawing inspiration from Unix
utilities.
csvpy's specific role was to bridge the gap
between raw CSV data and its direct use within
Python environments, making it easier for developers
and data analysts to integrate CSV content into
scripts or explore data interactively in Python.
It reflects csvkit's underlying Pythonic approach to CSV parsing.

SEE ALSO

csvkit(1), csvcut(1), csvgrep(1), csvlook(1), csvstat(1)
