tsv-filter
Filter rows in tab-separated value (TSV) data
TLDR
Print the lines where a specific column is numerically equal to a given number
Print the lines where a specific column is [eq]ual/[n]ot [e]qual/[l]ess [t]han/[l]ess than or [e]qual/[g]reater [t]han/[g]reater than or [e]qual to a given number
Print the lines where a specific column is [eq]ual/[n]ot [e]qual/part of/not part of a given string
Filter for non-empty fields
Print the lines where a specific column is empty
Print the lines that satisfy two conditions
Print the lines that match at least one condition
Count matching lines, interpreting first line as a [H]eader
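The summaries above can be sketched as concrete invocations. The file `sample.tsv` and its columns are made up for illustration, and the calls are guarded so the snippet is harmless on a machine without `tsv-filter` installed. Each test takes its argument as FIELD:VALUE, with fields numbered from 1.

```shell
# Hypothetical sample data: header line, then name / score / note.
printf 'name\tscore\tnote\nalice\t90\tok\nbob\t75\t\ncarol\t90\tlate\n' > sample.tsv

if command -v tsv-filter >/dev/null 2>&1; then
    # Numeric tests take FIELD:NUMBER (--eq, --ne, --lt, --le, --gt, --ge).
    tsv-filter -H --eq 2:90 sample.tsv
    tsv-filter -H --ge 2:80 sample.tsv

    # String tests take FIELD:STRING.
    tsv-filter -H --str-eq 1:alice sample.tsv
    tsv-filter -H --str-in-fld 3:at sample.tsv   # substring match

    # Non-empty and empty fields.
    tsv-filter -H --not-empty 3 sample.tsv
    tsv-filter -H --empty 3 sample.tsv

    # Two tests are ANDed by default; --or switches to "at least one".
    tsv-filter -H --eq 2:90 --str-eq 3:ok sample.tsv
    tsv-filter -H --or --eq 2:75 --str-eq 3:late sample.tsv

    # Count matching lines instead of printing them.
    tsv-filter -H --count --eq 2:90 sample.tsv
fi
```

`-H` is used throughout because the sample file starts with a header line; drop it for files without one.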
SYNOPSIS
tsv-filter [options] [file...]
PARAMETERS
-h
Display help message.
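-H, --header
Treat the first line of each file as a header; the header is written to the output and is not tested.
-d, --delimiter CHR
Field delimiter (default: TAB).
--eq FIELD:NUM
Keep rows where the field is numerically equal to NUM. Fields are numbered from 1.
--ne FIELD:NUM
Keep rows where the field is numerically not equal to NUM.
--gt FIELD:NUM
Keep rows where the field is greater than NUM.
--lt FIELD:NUM
Keep rows where the field is less than NUM.
--ge FIELD:NUM
Keep rows where the field is greater than or equal to NUM.
--le FIELD:NUM
Keep rows where the field is less than or equal to NUM.
--str-eq FIELD:STR
Keep rows where the field is equal to STR, compared as a string.
--str-ne FIELD:STR
Keep rows where the field is not equal to STR, compared as a string.
--regex FIELD:REGEX
Keep rows where the field matches the regular expression.
--iregex FIELD:REGEX
Same as --regex, but case-insensitive.
--empty FIELD
Keep rows where the field is empty.
--not-empty FIELD
Keep rows where the field is not empty.
--or
Keep rows that satisfy at least one test (by default all tests must be satisfied).
-v, --invert
Invert the filter: output the rows that do not satisfy the tests.
-c, --count
Print only the number of matching rows.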
DESCRIPTION
The `tsv-filter` command filters Tab-Separated Values (TSV) data. It reads rows from files or standard input, applies one or more tests to individual fields (numeric comparison, string comparison, regular expression match, empty or non-empty), and writes only the rows that pass. When several tests are given, a row must satisfy all of them unless --or is specified.
Each test names the field it applies to, so different columns can be checked with different operators in a single invocation. Because the output is itself TSV, the command fits naturally into shell pipelines for extracting subsets of large files before further analysis or processing.
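As a small sketch of this kind of pipeline use, the snippet below keeps the rows of a made-up file `scores.tsv` whose second field is at least 80 and hands the names on to `cut` and `sort`. The `awk` line is not part of `tsv-filter`; it is included only as a rough equivalent of the same row selection, for intuition.

```shell
# Hypothetical sample data (no header): name, score.
printf 'alice\t90\nbob\t75\ncarol\t82\n' > scores.tsv

# Keep rows whose 2nd field is >= 80, then pass the names to other tools.
if command -v tsv-filter >/dev/null 2>&1; then
    tsv-filter --ge 2:80 scores.tsv | cut -f1 | sort
fi

# For intuition only: roughly the same row selection written in awk.
# Prints alice and carol, one per line.
awk -F'\t' '$2 >= 80' scores.tsv | cut -f1 | sort
```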
CAVEATS
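The tool assumes a simple TSV format: fields are separated by single tab characters, and there is no quoting or escaping of tabs or newlines inside a field. Numeric comparisons (--eq, --ne, --gt, --lt, --ge, --le) expect the field to contain a number; a non-numeric value, including a header line that is read without -H, typically stops processing with an error rather than being skipped.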
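One possible way to work around the numeric caveat is sketched below on made-up data. It assumes a tsv-utils release that provides the `--is-numeric` test and evaluates tests in the order they are listed, so check `tsv-filter --help` on your installation before relying on it.

```shell
# Hypothetical data: header line plus one row with a non-numeric score.
printf 'name\tscore\nalice\t90\nbob\tn/a\ncarol\t82\n' > mixed.tsv

if command -v tsv-filter >/dev/null 2>&1; then
    # -H keeps the header line out of the comparison; --is-numeric is
    # listed first so that (assuming in-order evaluation) the numeric test
    # is only attempted on values that parse as numbers.
    tsv-filter -H --is-numeric 2 --gt 2:85 mixed.tsv
fi
```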
EXAMPLES
- To filter rows where the value in the 2nd column is equal to the string 'example':
tsv-filter --str-eq 2:example file.tsv
- To filter rows where the value in the 3rd column is greater than 100:
tsv-filter --gt 3:100 file.tsv
- To filter rows where the value in the 1st column matches the regular expression 'pattern':
tsv-filter --regex 1:pattern file.tsv