textql
Execute SQL queries against text-based data
TLDR
Print the lines in the specified CSV file that match an SQL query to stdout
Query a TSV file
Query file with header row
Read data from stdin
Join two files on a specified common column
Format output using an output delimiter with an output header line
SYNOPSIS
textql [options]
PARAMETERS
--input <file>
Specifies the input file to query. If not provided, textql will try to read from STDIN.
--sql <query>
The SQL query to execute against the data. Use single quotes to encapsulate your query to avoid shell interpretation.
--dl <delimiter>
Sets the delimiter for the input file. Common delimiters include comma (,), tab (\t), and pipe (|). Defaults to comma.
--header
Specifies that the first row of the input file should be treated as the header row, defining the column names. If not provided, columns will be named col1, col2, etc.
--output-table
The name of the table you want to use to query your file.
--output-dl <delimiter>
Delimiter for output, defaults to comma.
--max-lines <number>
Sets the maximum number of lines to read from input to determine schema.
--verbose
Enables verbose output, providing more information about the process.
--version
Displays the version of textql.
DESCRIPTION
textql is a command-line tool that allows you to treat text-based data files like CSV, TSV, and log files as relational tables and query them using SQL.
It automatically infers the schema from the file contents, making it easy to extract, transform, and analyze data without complex scripting.
It helps you to analyze big amount of data in a simple way.
CAVEATS
textql relies on SQLite under the hood, so complex queries that may be slow in SQLite will also be slow in textql.
The automatic schema inference may not always be perfect, especially with complex or inconsistent data files.
You may need to manually adjust the delimiter or column names.
<B>EXAMPLES</B>
1. Querying a CSV file:textql --input data.csv --sql 'SELECT * FROM data WHERE column1 > 10'
2. Using a tab-separated file:textql --input data.tsv --dl '\t' --header --sql 'SELECT column2, column3 FROM data'
3. Reading from STDIN:cat data.csv | textql --sql 'SELECT * FROM stdin WHERE col1 LIKE "A%"'
<B>PERFORMANCE CONSIDERATIONS</B>
For very large files, consider using the `--max-lines` parameter to limit the number of lines used for schema inference. Optimizing your SQL queries is also crucial for performance. Using indexes and avoiding full table scans can significantly improve query speed.
HISTORY
textql was created to provide a more intuitive way to query text-based data files compared to traditional tools like `awk` and `sed`.
It leverages the power and familiarity of SQL, making data analysis more accessible to a wider audience. The tool has gained popularity as a quick and easy solution for ad-hoc data exploration and transformation.