LinuxCommandLibrary

csvsql

SQL query executor and database importer for CSV

TLDR

Execute SQL query on a CSV file

$ csvsql --query "[SELECT * FROM data WHERE id > 10]" [data.csv]
copy
Generate CREATE TABLE statement for a CSV
$ csvsql [data.csv]
copy
Import CSV into database
$ csvsql --db [postgresql:///mydb] --insert [data.csv]
copy
Query multiple files (tables named after filenames)
$ csvsql --query "[SELECT * FROM file1 JOIN file2 ON file1.id = file2.id]" [file1.csv] [file2.csv]
copy
Create table and insert data into SQLite
$ csvsql --db [sqlite:///data.db] --insert --create-if-not-exists [data.csv]
copy
Query with aggregation
$ csvsql --query "[SELECT category, COUNT(*) FROM data GROUP BY category]" [data.csv]
copy

SYNOPSIS

csvsql [options] file...

DESCRIPTION

csvsql is part of csvkit that enables SQL queries on CSV files or imports CSV data into databases. It creates an in-memory SQLite database for queries or connects to external databases for import operations.
Without --query, it outputs CREATE TABLE statements suitable for the data, useful for generating schema. With --query, it executes SQL against the CSV data, supporting joins, aggregations, and all SQL operations.
For database import, csvsql supports various databases through SQLAlchemy connection strings, including PostgreSQL, MySQL, SQLite, and others. It handles type inference to create appropriate column definitions.

PARAMETERS

--query SQL

Execute SQL query on the CSV data.
--db CONNECTION
Database connection string (SQLAlchemy format).
--insert
Insert data into database (requires --db).
--create-if-not-exists
Create table if it doesn't exist.
--tables NAMES
Comma-separated table names (default: filenames).
--no-create
Don't generate CREATE TABLE statement.
-d CHAR, --delimiter CHAR
Field delimiter (default: comma).
-e ENCODING, --encoding ENCODING
Input file encoding.
--no-inference
Disable type inference.

CAVEATS

In-memory queries load all data into RAM. Complex queries on large files may be slow. Type inference can misidentify columns. Database imports require appropriate drivers installed.

HISTORY

csvsql is part of csvkit, created by Christopher Groskopf in 2011. It bridges the gap between CSV files and databases, enabling SQL-powered data analysis without manual database setup.

SEE ALSO

csvkit(1), sqlite3(1), sql2csv(1)

> TERMINAL_GEAR

Curated for the Linux community

Copied to clipboard

> TERMINAL_GEAR

Curated for the Linux community