agg
Aggregate input data
TLDR
Create a GIF
Create a GIF that is 80 columns wide and 25 rows in height
Create a GIF with a font size of 24 pixels
Display help
SYNOPSIS
agg [OPTIONS] '<sql_query>' [FILE ...]
PARAMETERS
-d, --delimiter <DELIM>
Field delimiter (default: auto-detect from first line)
-q, --quote <QUOTE>
Quote character (default: ")
-H, --no-header
Input lacks header row
-i, --ignore-case
Case-insensitive string matching
-I, --ignore-case-names
Case-insensitive column name matching
-s, --silent
Suppress error messages
-S, --stats
Print execution statistics
-n, --no-header-output
Omit column names from output
-N, --names-from-sql
Derive output column names from SQL query
--help
Show this help message
--version
Print version information
DESCRIPTION
agg is a command-line tool for aggregating tabular data in CSV or TSV files using a simplified SQL-like query syntax. It is designed for speed on large datasets and targets GROUP BY operations, filtering, and aggregates such as COUNT, SUM, AVG, MIN, MAX, STDDEV, FIRST, and LAST.
Unlike line-based tools such as awk or grep, agg parses structured data and automatically detects delimiters, quotes, and headers. Enclose the query in single quotes to protect the SQL from shell expansion, e.g., agg 'select avg(age) group by city' data.csv.
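To illustrate why the quoting matters, the sketch below uses Python's shlex module to approximate how a POSIX shell splits the command line into arguments. It is only an approximation: shlex does not perform globbing or treat parentheses specially, both of which a real shell would do with an unquoted query.

```python
import shlex

# Quoted: the whole query reaches the program as a single argument.
quoted = shlex.split("agg 'select avg(age) group by city' data.csv")

# Unquoted: the query is broken into separate words, so the program
# can no longer tell where the query ends and the file list begins.
unquoted = shlex.split("agg select avg(age) group by city data.csv")

print(quoted)
print(unquoted)
```

In a real shell the unquoted form would fail even earlier, because unquoted parentheses are shell syntax.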
It supports multiple input files, piping via stdin, case-insensitive matching, and customizable output. Written in Rust, agg is intended to handle multi-gigabyte inputs, which suits log analysis, data exploration, and ad-hoc reporting in data science workflows. It does not support joins, updates, or the rest of full SQL; it is focused purely on aggregation.
Output includes a header row by default; -n suppresses it, and -N derives the output column names from the query. With -S, agg also prints parsing and execution times, which helps when tuning queries.
CAVEATS
Only aggregate SELECT queries with GROUP BY are supported; JOINs, subqueries, and non-aggregate SELECT are not. The SQL dialect is limited, and input must be structured (delimited) data.
BASIC EXAMPLE
agg 'select count(*), avg(salary) group by dept' employees.csv
Groups by department, counts rows, averages salary.
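For readers who want to check what this query computes, here is a minimal Python sketch of the same aggregation. It is not how agg is implemented; the sample rows and the column names dept and salary are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for employees.csv; the columns are assumed.
SAMPLE = """dept,salary
eng,100
eng,120
ops,90
"""

def count_and_avg_by_dept(text):
    """Return {dept: (row_count, mean_salary)} for CSV text with a header row."""
    totals = defaultdict(lambda: [0, 0.0])  # dept -> [count, salary sum]
    for row in csv.DictReader(io.StringIO(text)):
        acc = totals[row["dept"]]
        acc[0] += 1
        acc[1] += float(row["salary"])
    return {dept: (n, total / n) for dept, (n, total) in totals.items()}

result = count_and_avg_by_dept(SAMPLE)
print(result)
```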
FILTER EXAMPLE
agg 'select sum(revenue) where region="EU" group by product' sales.tsv
Filters region, sums revenue per product.
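The filter-then-group logic can be sketched in Python as follows. This is an illustrative equivalent under assumed data, not agg's own code: the tab-separated sample and the columns product, region, and revenue are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for sales.tsv (tab-separated); the columns are assumed.
SAMPLE_TSV = (
    "product\tregion\trevenue\n"
    "widget\tEU\t10\n"
    "widget\tUS\t7\n"
    "gadget\tEU\t5\n"
    "widget\tEU\t2.5\n"
)

def sum_revenue_by_product(text, region="EU"):
    """Sum revenue per product, keeping only rows whose region matches."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        if row["region"] == region:  # the WHERE clause
            totals[row["product"]] += float(row["revenue"])  # SUM ... GROUP BY
    return dict(totals)

eu_totals = sum_revenue_by_product(SAMPLE_TSV)
print(eu_totals)
```

The filter is applied before grouping, so rows from other regions never contribute to a product's total.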
HISTORY
Created by Mark Riedl in 2018 as a Rust-based alternative to slow CSV aggregation tools. First GitHub release in March 2018; actively maintained with performance improvements for large-scale data processing.


