mlr
Process, reshape, and analyze tabular data
TLDR
Pretty-print a CSV file in a tabular format
Receive JSON data and pretty-print the output
Sort alphabetically on a field
Sort in descending numerical order on a field
Convert CSV to JSON, perform calculations, and display the results
Receive JSON and format the output as vertical JSON
Filter lines of a compressed CSV file, treating numbers as strings (-S)
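The entries above are listed without their commands. The following is a sketch of one plausible command per entry, not the only way to do each task. The file `example.csv`, its field names (`name`, `qty`, `price`), the computed field `total`, and the regular expression are hypothetical placeholders chosen for illustration.

```shell
# Create a small sample file (hypothetical data) so the commands run as-is.
cat > example.csv <<'EOF'
name,qty,price
pear,3,0.50
apple,10,0.25
fig,7,1.20
EOF

# Pretty-print a CSV file in a tabular format.
mlr --icsv --opprint cat example.csv

# Receive JSON data and pretty-print the output.
echo '{"hello": "world"}' | mlr --ijson --opprint cat

# Sort alphabetically on a field.
mlr --icsv --opprint sort -f name example.csv

# Sort in descending numerical order on a field.
mlr --icsv --opprint sort -nr qty example.csv

# Convert CSV to JSON, computing a new field and displaying it.
mlr --icsv --ojson put '$total = $qty * $price' example.csv

# Receive JSON and format the output as vertical JSON (one key per line).
echo '{"hello": "world", "foo": "bar"}' | mlr --ijson --ojson --jvstack cat

# Filter lines of a gzip-compressed CSV file, treating numbers as strings.
# --prepipe runs the given command on each input file before Miller reads it.
# The -S flag matters for Miller 5; newer releases are assumed to accept it
# without needing it.
gzip -c example.csv > example.csv.gz
mlr --prepipe 'gunzip' --csv filter -S '$qty =~ "^1"' example.csv.gz
```

The `--icsv`, `--ijson`, `--opprint`, and `--ojson` flags set the input and output formats separately, which is how one format is converted to another.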
SYNOPSIS
mlr [global options] verb [verb options] [file...]
PARAMETERS
--csv
Input and output data as CSV.
--tsv
Input and output data as TSV.
--json
Input and output data as JSON.
--jsonl
Input and output data as JSON Lines.
-n
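Read no input at all, neither files nor standard input; useful with put when only begin/end blocks are needed. For CSV without a header line, use --implicit-csv-header (input) and --headerless-csv-output (output).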
filter 'expression'
Filter records based on an expression.
put 'statement'
Evaluate statements for each record to compute fields.
cut -f field1,field2,...
Select specified fields.
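stats1 -a accumulator1,accumulator2,... -f field1,field2,...
Calculate descriptive statistics (such as sum, mean, min, max, count) over the specified fields.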
uniq -g field1,field2,...
Print the distinct values of the specified fields (use -a instead of -g to compare whole records). Add -c to count how many records were grouped.
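The verbs above can be tried individually and then combined. The sketch below assumes a hypothetical file `orders.csv` with made-up fields `region`, `item`, and `amount`; the computed field `doubled` is likewise only illustrative. It uses `stats1`, Miller's single-field statistics verb, and the `then` keyword, which chains several verbs in one invocation.

```shell
# Hypothetical sample data for illustration.
cat > orders.csv <<'EOF'
region,item,amount
east,pen,5
west,ink,12
east,ink,7
west,pen,3
east,pen,5
EOF

# filter: keep records whose amount exceeds 4.
mlr --csv filter '$amount > 4' orders.csv

# put: add a computed field to each record.
mlr --csv put '$doubled = $amount * 2' orders.csv

# cut: keep only the named fields.
mlr --csv cut -f region,amount orders.csv

# stats1: sum of amount per region (-g groups by a field).
mlr --csv stats1 -a sum -f amount -g region orders.csv

# uniq: distinct values of a field, with a count for each (-c).
mlr --csv uniq -g region -c orders.csv

# Chain verbs with `then`, reading CSV and writing JSON.
mlr --icsv --ojson \
  filter '$amount > 4' \
  then cut -f region,amount \
  then stats1 -a sum -f amount -g region \
  orders.csv
```

Chaining with `then` passes records from one verb to the next inside a single process, so no intermediate files or shell pipes are needed.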
DESCRIPTION
Miller (mlr) is a command-line tool for processing CSV, TSV, JSON, JSON Lines, and other record-oriented formats. It provides verbs for filtering, grouping, aggregating, reformatting, and reshaping data, and it addresses fields by name rather than by position. It is designed as a more expressive alternative to combining `awk`, `sed`, `cut`, `join`, and similar utilities, offering a single interface across data formats and tasks.
Miller is useful for tasks ranging from simple data extraction and reformatting to more involved analysis and reporting. It supports a wide range of data formats, ships a large library of built-in functions, and has its own expression language for `put` and `filter`. Most verbs process records as a stream rather than loading the whole input, which keeps memory use modest on large datasets. Formats are chosen with flags such as `--csv` or `--json`; when no format flag is given, Miller assumes its native key=value (DKVP) format.
EXAMPLE USES
To cut specific fields from a CSV file: mlr --csv cut -f field1,field2 input.csv
To filter records where field equals 'value': mlr --json filter '$field == "value"' input.json
To calculate the sum of a specific field: mlr --tsv stats1 -a sum -f field input.tsv
HISTORY
Miller was written by John Kerl to address the limitations of position-based utilities like `awk`, `sed`, and `cut` when working with name-indexed data in various formats. Originally implemented in C, it was rewritten in Go for version 6. It is used by data scientists, engineers, and system administrators, and it continues to receive regular updates and contributions from the open-source community.