textql
Execute SQL queries against text-based data
TLDR
Print the lines in the specified CSV file that match an SQL query to stdout
Query a TSV file
Query file with header row
Read data from stdin
Join two files on a specified common column
Format output using an output delimiter with an output header line
SYNOPSIS
textql [options] [--query <SQL query> | --sql <file>] [<input_file>...]
textql [options] <SQL query>
PARAMETERS
-header, --header
Treat the first row of input files as column headers. These headers will be used as column names in SQL queries.
-dlm <char>, --delimiter <char>
Specify the input field delimiter (e.g., ',' for CSV, '\t' for TSV). Defaults to comma.
-tab-table
Shorthand for --delimiter='\t', indicating that input files are tab-separated.
-output-dlm <char>, --output-delimiter <char>
Specify the output field delimiter. Defaults to the input delimiter if not explicitly set.
-output-header, --output-header
Include column headers in the query output. Useful for generating new delimited files.
-source <file>..., --source <file>...
Specify one or more input file paths for the query. Can be omitted if data is piped via stdin.
-query <query>, --query <query>
The SQL query string to execute. It's recommended to enclose the query in quotes.
-sql <file>, --sql <file>
Path to a file containing the SQL query to execute, allowing for complex queries.
-output-file <file>, -o <file>
Direct the query output to the specified file instead of printing to stdout.
-db-path <path>, --db-path <path>
Path to a SQLite database file. Useful for persistent tables, indexing, or more complex multi-step queries.
-save-to <table>, --save-to <table>
Save the results of the query into a specified table within the database given by -db-path.
-console, --console
Start an interactive SQLite console with the input files already loaded as temporary tables, for direct SQL interaction.
-pretty, --pretty
Format output in a human-readable, tabular form, especially useful for displaying results directly in the terminal.
-version
Print the textql version and exit.
DESCRIPTION
textql is a versatile command-line utility that enables users to execute SQL queries directly on CSV, TSV, or any other delimited text files. It eliminates the need for manual data imports into a traditional database by treating each input file as a virtual table. Built upon the robust SQLite engine, textql allows for powerful data manipulation, including filtering, sorting, aggregation, and joining across multiple files, using familiar SQL syntax.
It's an invaluable tool for ad-hoc data analysis, quick reporting, and automating data transformation tasks on flat files. Whether you need to extract specific columns, count occurrences, or combine data from several spreadsheets, textql provides a fast and efficient solution right from your terminal.
CAVEATS
While powerful, textql treats input files as temporary tables by default unless the --db-path option is used, meaning changes are not saved back to the original files directly. Its SQL dialect is based on SQLite, so specific functions or syntaxes from other SQL implementations (e.g., PostgreSQL, MySQL) might not be supported. Performance for extremely large files (gigabytes+) can be slower than dedicated database systems, though it generally performs well for typical command-line data processing tasks.
TABLE NAMING CONVENTION
When processing files, textql automatically assigns table names based on the input filename. For example, 'data.csv' becomes the table 'data' within the SQL context. If multiple files are provided, each corresponds to its own table name based on its base name (filename without extension).
PIPING INPUT
textql can accept input directly from stdin. When piping data, the input is typically treated as a single table named 'stdin'. This allows for chaining commands, e.g., 'cat file.csv | textql "SELECT * FROM stdin WHERE column > 10"'.
HISTORY
textql was initially developed by Paul Pellegrini and released as an open-source project written in Go. Its creation was motivated by the desire to bring the power and familiarity of SQL to the common use case of processing delimited text files directly from the command line, without the overhead of importing data into a full-fledged database. It quickly gained traction among data analysts and system administrators for its simplicity, efficiency, and cross-platform compatibility.