LinuxCommandLibrary

agg

Aggregate input data

TLDR

Create a GIF

$ agg [path/to/demo.cast] [path/to/demo.gif]

Create a GIF that is 80 columns wide and 25 rows in height
$ agg --cols 80 --rows 25 [path/to/demo.cast] [path/to/demo.gif]

Create a GIF with a font size of 24 pixels
$ agg --font-size 24 [path/to/demo.cast] [path/to/demo.gif]

Display help
$ agg [[-h|--help]]

SYNOPSIS

agg [OPTIONS] '<sql_query>' [FILE ...]

PARAMETERS

-d, --delimiter <DELIM>
    Field delimiter (default: auto-detect from first line)

-q, --quote <QUOTE>
    Quote character (default: ")

-H, --no-header
    Input lacks header row

-i, --ignore-case
    Case-insensitive string matching

-I, --ignore-case-names
    Case-insensitive column name matching

-s, --silent
    Suppress error messages

-S, --stats
    Print execution statistics

-n, --no-header-output
    Omit column names from output

-N, --names-from-sql
    Derive output column names from SQL query

--help
    Show this help message

--version
    Print version information

DESCRIPTION

Agg is a high-performance command-line tool for aggregating tabular data in CSV or TSV files using a simplified SQL-like query syntax. Designed for speed on large datasets, it excels at GROUP BY operations, filtering, and computing aggregates like COUNT, SUM, AVG, MIN, MAX, STDDEV, and FIRST/LAST.

Unlike line-based tools such as awk or grep, agg parses structured data and automatically detects delimiters, quotes, and headers. Enclose the query in single quotes so that the shell does not expand or split it, e.g., agg 'select avg(age) group by city' data.csv.
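To illustrate what detecting the delimiter from the first line can look like, here is a minimal Python sketch. It is not agg's implementation: the candidate list, the "most frequent character wins" rule, and the names guess_delimiter and read_rows are all assumptions made for this example.

```python
import csv
import io

# Assumed candidate set for illustration; not taken from agg.
CANDIDATES = [",", "\t", ";", "|"]


def guess_delimiter(first_line):
    """Return the candidate that occurs most often in the first line."""
    counts = {d: first_line.count(d) for d in CANDIDATES}
    best = max(CANDIDATES, key=lambda d: counts[d])
    # Fall back to a comma when no candidate appears at all.
    return best if counts[best] > 0 else ","


def read_rows(text):
    """Parse delimited text into a list of dicts keyed by the header row."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    reader = csv.DictReader(io.StringIO(text),
                            delimiter=guess_delimiter(first_line))
    return list(reader)


if __name__ == "__main__":
    print(read_rows("city\tage\nOslo\t31\nRome\t45\n"))
```

A heuristic this simple fails on single-column files and on headers that contain the delimiter inside quotes, which is presumably why an explicit -d option exists.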

It supports multiple input files, stdin piping, case-insensitive matching, and customizable output. Written in Rust, agg processes gigabytes quickly, making it suitable for log analysis, data exploration, and ad-hoc reporting in data science workflows. It does not support joins, updates, or the full SQL language; it is focused purely on aggregation.

Output includes a header row by default (suppress it with -n), and -N derives the output column names from the query. The -S option prints parsing and execution times, which helps when tuning a query.

CAVEATS

Supports only aggregate SELECT with GROUP BY; no JOINs, subqueries, or non-aggregate SELECT. Limited SQL dialect. Requires structured input.

BASIC EXAMPLE

agg 'select count(*), avg(salary) group by dept' employees.csv
Groups rows by dept and reports the row count and the average salary for each group.
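To make the result of this query concrete, the following Python sketch computes the same count and average per group with the standard library. It only mirrors the query's semantics and says nothing about how agg works internally; the function name count_and_avg and the sample data are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def count_and_avg(text, group_col, value_col):
    """Return {group: (row_count, average)} for comma-separated text."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {key: (len(vals), sum(vals) / len(vals))
            for key, vals in groups.items()}


if __name__ == "__main__":
    sample = "dept,salary\neng,100\neng,200\nops,50\n"
    print(count_and_avg(sample, "dept", "salary"))
    # prints {'eng': (2, 150.0), 'ops': (1, 50.0)}
```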

FILTER EXAMPLE

agg 'select sum(revenue) where region="EU" group by product' sales.tsv
Keeps only rows whose region is EU, then sums revenue for each product.
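The order of operations matters here: the WHERE condition is applied to each row before the row contributes to its group. A minimal Python sketch of that behavior on tab-separated input follows, using running totals instead of collecting rows. The function name sum_where and the sample data are hypothetical, and the code is an equivalent computation, not agg's own logic.

```python
import csv
import io
from collections import defaultdict


def sum_where(text, group_col, value_col, filter_col, filter_value,
              delimiter="\t"):
    """Sum value_col per group_col over rows where filter_col matches."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        if row[filter_col] != filter_value:
            continue  # filtered rows never reach the aggregation step
        totals[row[group_col]] += float(row[value_col])
    return dict(totals)


if __name__ == "__main__":
    sample = ("region\tproduct\trevenue\n"
              "EU\tpen\t10\n"
              "US\tpen\t99\n"
              "EU\tpen\t5\n"
              "EU\tink\t2.5\n")
    print(sum_where(sample, "product", "revenue", "region", "EU"))
    # prints {'pen': 15.0, 'ink': 2.5}
```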

HISTORY

Created by Mark Riedl in 2018 as a Rust-based alternative to slow CSV aggregation tools. First GitHub release in March 2018; actively maintained with performance improvements for large-scale data processing.

SEE ALSO

awk(1), grep(1), mlr(1), csvkit(1)
