csvsql
Execute SQL queries against CSV files
TLDR
Generate a CREATE TABLE SQL statement for a CSV file
Import a CSV file into an SQL database
Run an SQL query on a CSV file
SYNOPSIS
csvsql [OPTION...] [FILE]
PARAMETERS
-h, --help
Show help message and exit
-v, --version
Show program's version number and exit
--db=DB
Database connection string (SQLite by default)
--insert
Generate INSERT statements
--create[=/path]
Generate CREATE TABLE statements
--tables
Generate both CREATE and INSERT statements
--query=QUERY
Execute given SQL query on CSV
--where=WHERE
WHERE clause for INSERT statements
-s SCHEMA, --schema=SCHEMA
Schema to create tables in
--no-inference
Disable type inference; use TEXT
--types={text|float|numeric|integer|boolean|date}
Override types for all columns
--blanks
Skip generating SQL for blank rows
-f FIELDS, --primary-key=FIELDS
Fields to use as primary key
--overwrite
Drop table if it exists before creating
-d DELIM, --delimiter=DELIM
Field delimiter (default: ',')
-t, --tabs
Tab-delimited input
-q QUOTE, --quote=QUOTE
Quote character
-u, --doublequote
Double-quote characters within fields
--escape=CHAR
Escape character for fields
-z MAX, --maxfieldsize=MAX
Maximum field size in bytes
-e LTERM, --lineterminator=LTERM
Line terminator
--encoding=ENCODING
Input file encoding
-H, --no-header-row
Treat first row as data, not headers
-c COLS, --columns=COLS
Operate on specific columns only
--excel
Use Excel-style escaping
DESCRIPTION
csvsql is a powerful command-line utility from the csvkit suite designed for converting CSV data into SQL statements or querying CSV files with SQL. It excels at generating CREATE TABLE statements with inferred column types, producing INSERT statements for database loading, or executing custom SQL queries directly on CSV input using an in-memory SQLite database.
Key use cases include data migration from CSV to relational databases like PostgreSQL or MySQL, data exploration via SQL without loading into a full DB, and automating ETL processes. Type inference automatically detects integers, floats, booleans, dates, and text, but can be overridden. It supports standard CSV variations with custom delimiters, quotes, and encodings.
When no database is specified, it uses SQLite for queries. For production, connect to external DBs via connection strings. Blank rows can be skipped, primary keys defined, and tables overwritten. Ideal for data engineers and analysts bridging flat files with SQL workflows, saving time on boilerplate scripting.
CAVEATS
Type inference may misclassify data; use --no-inference or --types for precision.
Large CSVs limited by SQLite memory; use --db for external DBs.
Requires csvkit package installed.
EXAMPLE
csvsql --create --insert data.csv > schema.sql
csvsql --query "SELECT * FROM stdin WHERE col1 > 10" data.csv
HISTORY
Part of csvkit, created by Christopher Filler in 2010. Evolved through Python 2/3 transitions; now v2+ maintained on GitHub (sf csvkit org). Widely used in data pipelines since early 2010s.


