LinuxCommandLibrary

polars

Query and convert data files via CLI

TLDR

Read and display CSV file
$ polars read [file.csv]
Query with SQL
$ polars sql "SELECT * FROM '[file.csv]' WHERE value > 100"
Convert CSV to Parquet
$ polars convert [input.csv] [output.parquet]
Show schema of file
$ polars schema [file.parquet]
Filter and output as JSON
$ polars sql "SELECT name, score FROM '[data.csv]' ORDER BY score DESC LIMIT 10" -o json
Join two files
$ polars sql "SELECT * FROM '[a.csv]' AS a JOIN '[b.csv]' AS b ON a.id = b.id"

SYNOPSIS

polars command [options] [args...]

DESCRIPTION

polars is the command-line interface for Polars, a fast DataFrame library. It provides SQL querying and format conversion for data files without writing code.
The sql command executes SQL queries directly against files. Inside the query, refer to each file by its quoted path in place of a table name. Polars' query engine optimizes the query plan before execution, which helps on large datasets.
Supported formats include CSV, Parquet, JSON, and Arrow. The convert command transforms between formats, which is useful for creating compact, columnar Parquet files from CSV sources.
Polars uses the Apache Arrow columnar memory format internally, which keeps memory overhead low when processing large datasets. The query optimizer applies predicate pushdown (filtering rows during the scan) and projection pushdown (reading only the columns a query needs).

PARAMETERS

-o, --output format
Output format: csv, json, parquet, table.
--delimiter char
CSV delimiter character.
--no-header
CSV has no header row.
-n, --limit rows
Limit output rows.
-h, --help
Display help information.
-V, --version
Display version information.

COMMANDS

read file
Read and display data file.
sql query
Execute SQL query on file(s).
schema file
Display schema/column information.
convert input output
Convert between formats (CSV, Parquet, JSON, Arrow).

CAVEATS

The CLI provides a subset of the Polars library's features. Complex transformations may require the Python or Rust API. Very large datasets are better stored as Parquet, which is columnar and compressed, than as CSV. The SQL dialect differs from standard SQL in places.

HISTORY

Polars was created by Ritchie Vink in 2020 as a fast alternative to pandas. It is written in Rust with Python bindings and quickly gained popularity for performance-critical data processing. The library builds on Apache Arrow and lazy evaluation. The CLI tool was added to support command-line data workflows. Polars is now widely used for large-scale data analysis.

SEE ALSO

duckdb(1), datafusion-cli(1), xsv(1), miller(1)
