csvpy
Load a CSV file into a Python reader object and open an interactive Python shell
TLDR
Load a CSV file into a CSVKitReader object
csvpy data.csv
Load a CSV file into a CSVKitDictReader object
csvpy --dict data.csv
SYNOPSIS
csvpy [OPTIONS] FILE
FILE is the path to a CSV file. Unlike most csvkit tools, csvpy does not accept piped data on standard input.
PARAMETERS
-H, --no-header-row
Specify that the input CSV file has no header row. Default column names (a, b, c, ...) are created instead.
--dict
Load the CSV file into a DictReader, so each row is a dictionary keyed by column name.
--agate
Load the CSV file into an agate table instead of a row-by-row reader.
-d CHAR, --delimiter CHAR
Specify the character used to separate fields in the input CSV.
-t, --tabs
Use tabs as the delimiter. This option overrides any character set by -d.
-q CHAR, --quotechar CHAR
Specify the character used to quote fields containing special characters.
-u NUM, --quoting NUM
Control how fields are quoted (0: QUOTE_MINIMAL, 1: QUOTE_ALL, 2: QUOTE_NONNUMERIC, 3: QUOTE_NONE).
-b, --no-doublequote
Disable the escaping of quote characters by doubling them within a quoted field.
-p CHAR, --escapechar CHAR
Specify the character used to escape the delimiter when quoting is disabled (-u 3), and to escape the quote character when -b is given.
-z INT, --maxfieldsize INT
Set the maximum allowed size of a single field in bytes to prevent runaway memory usage.
-e ENCODING, --encoding ENCODING
Specify the character encoding of the input file (e.g., 'utf-8', 'latin-1').
-S, --skipinitialspace
Skip whitespace immediately following the delimiter.
-I, --no-inference
Disable type inference when parsing the input. This matters when the file is loaded into an agate table with --agate.
-V, --version
Show the program's version number and exit.
-L LOCALE, --locale LOCALE
Specify the locale (e.g., 'en_US') of any formatted numbers in the input, so that values such as 1,234.5 are parsed correctly.
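The dialect options above correspond closely to parameters of Python's standard csv module, and the numeric values accepted by -u match that module's QUOTE_* constants. The following is a minimal sketch of that correspondence using only the standard library; the parse helper and the sample text are hypothetical, and this is not csvkit's own code.

```python
import csv
import io

# Hypothetical sample: semicolon-delimited, single-quoted fields,
# with a space after each delimiter.
sample = "id; name; note\n1; 'Ann; Lee'; 'ok'\n"


def parse(text, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL,
          doublequote=True, escapechar=None, skipinitialspace=False):
    """Parse CSV text with the stdlib parameters that mirror
    -d, -q, -u, -b (doublequote=False), -p and -S."""
    return list(csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=quoting,
        doublequote=doublequote,
        escapechar=escapechar,
        skipinitialspace=skipinitialspace,
    ))


# Roughly what the flags  -d ';' -q "'" -S  would ask the parser to do.
rows = parse(sample, delimiter=";", quotechar="'", skipinitialspace=True)
print(rows)

# The numbers accepted by -u are the stdlib quoting constants.
quoting_modes = {
    0: csv.QUOTE_MINIMAL,
    1: csv.QUOTE_ALL,
    2: csv.QUOTE_NONNUMERIC,
    3: csv.QUOTE_NONE,
}
```

The quoted field keeps its embedded semicolon because the quote character, not the delimiter, decides where the field ends.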
DESCRIPTION
csvpy is a command-line utility from the
csvkit suite that loads a CSV file into a
Python reader object and then drops into an
interactive Python shell.
By default the file is loaded into a CSV reader,
available in the session as reader, which yields
each row as a list of values.
With --dict, each row is a dictionary keyed by column name,
and with --agate the whole file is loaded into an agate table.
If IPython is installed, csvpy uses it for the shell;
otherwise it falls back to the standard Python shell.
Instead of printing or transforming the data,
it hands you the parsed rows directly,
showing how csvkit reads the file.
It provides a convenient way to get a Python-friendly
view of your tabular data without writing
parsing logic from scratch.
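As a rough illustration of working with such reader objects, the sketch below uses the standard library's csv.reader and csv.DictReader as stand-ins for the objects csvpy provides. The sample data and the column names are made up for the example, and the real csvkit reader classes may differ in detail.

```python
import csv
import io

# Made-up sample data standing in for a CSV file on disk.
text = "name,age\nAnn,34\nBob,27\n"

# Stand-in for the default reader object: rows arrive as lists of strings.
reader = csv.reader(io.StringIO(text))
header = next(reader)                 # first row holds the column names
rows = list(reader)                   # remaining rows
age_index = header.index("age")
mean_age = sum(int(row[age_index]) for row in rows) / len(rows)

# Stand-in for the --dict variant: rows arrive as dictionaries.
dict_rows = list(csv.DictReader(io.StringIO(text)))
names = [row["name"] for row in dict_rows]

print(header, rows, mean_age, names)
```

The list form is convenient for positional access, while the dictionary form avoids looking up column indexes by hand.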
CAVEATS
csvpy is an interactive tool: it opens a Python shell rather than writing
to standard output, so it cannot feed other command-line tools in a pipeline.
It also cannot read piped data from standard input; the input must be a file path.
The default reader yields rows one at a time and can be consumed only once
per session, while the --agate option loads the entire file into memory,
which can be significant for very large CSV files.
It is part of the
csvkit suite, so it requires
csvkit to be installed.
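For large files it usually helps to iterate over a reader lazily instead of materializing every row. The sketch below shows that pattern with the standard library's csv.reader as a stand-in for the reader object; the generated data is hypothetical. It also shows that such a reader is a one-pass iterator.

```python
import csv
import io
import itertools

# Hypothetical data: a header plus 1000 generated rows.
text = "id,value\n" + "".join(f"{i},{i * i}\n" for i in range(1000))

reader = csv.reader(io.StringIO(text))
header = next(reader)

# Peek at the first three rows without reading the rest of the file.
preview = list(itertools.islice(reader, 3))

# Count the remaining rows one at a time, holding only one row in memory.
remaining = sum(1 for _ in reader)

# The reader is now exhausted; iterating again yields nothing.
leftover = list(reader)

print(preview, remaining, leftover)
```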
USE CASES
csvpy is commonly used for:
1. Prototyping Python code against real CSV data in an interactive session before moving it into a script.
2. Quickly inspecting the structure of a CSV file in a Pythonic way, showing how csvkit interprets columns and rows.
3. Debugging CSV parsing issues by seeing exactly how csvkit translates the file's contents into Python objects.
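A typical debugging step of this kind is checking for rows whose field count does not match the header. The sketch below illustrates the idea with the standard library's csv.reader as a stand-in for the reader object; the sample text is made up, and line_num is an attribute of the stdlib reader that other reader classes may not share.

```python
import csv
import io

# Made-up sample: the last row is missing a field.
raw = 'id,name,city\n1,Ann,Oslo\n2,"Lee, Bob",Rome\n3,Cy\n'

reader = csv.reader(io.StringIO(raw))
header = next(reader)

# Collect (line number, row) for every row that is too short or too long.
bad_rows = [
    (reader.line_num, row)
    for row in reader
    if len(row) != len(header)
]

print(bad_rows)
```

The quoted comma in the second data row does not trigger a false alarm, because the parser treats "Lee, Bob" as a single field.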
PIPING AND INPUT
csvpy does not accept piped data on standard input and reports an error when no input file is given. The csvkit documentation attributes this to platform limitations.
To explore the output of another command, write it to a file first:
some_command_generating_csv > my_data.csv
csvpy my_data.csv
Because csvpy opens an interactive shell instead of writing to standard output, it is also not suited to piping into text-processing tools like grep or awk.
HISTORY
csvpy is a component of the
csvkit project, an open-source suite of command-line tools for
working with CSV files.
csvkit was initially developed by Christopher Groskopf
around 2011 to provide robust, text-based tools for common
CSV data processing tasks, drawing inspiration from Unix
utilities.
csvpy's specific role was to bridge the gap
between raw CSV data and its direct use within
Python environments, making it easier for developers
and data analysts to integrate CSV content into
scripts or explore data interactively in Python.
It reflects csvkit's underlying Pythonic approach to CSV parsing.