LinuxCommandLibrary

compseq

Calculate the composition of unique words in sequences

TLDR

Count observed frequencies of words in a FASTA file, supplying parameter values at the interactive prompts

$ compseq [path/to/file.fasta]

Count observed frequencies of amino acid pairs from a FASTA file, save output to a text file
$ compseq [path/to/input_protein.fasta] -word 2 [path/to/output_file.comp]

Count observed frequencies of hexanucleotides from a FASTA file, save output to a text file and ignore zero counts
$ compseq [path/to/input_dna.fasta] -word 6 [path/to/output_file.comp] -nozero

Count observed frequencies of codons in a particular reading frame, ignoring any overlapping counts (i.e. move the window along by the word length, 3)
$ compseq -sequence [path/to/input_rna.fasta] -word 3 [path/to/output_file.comp] -nozero -frame [1]

Count observed frequencies of codons frame-shifted by 3 positions, ignoring any overlapping counts (should report all codons except the first one)
$ compseq -sequence [path/to/input_rna.fasta] -word 3 [path/to/output_file.comp] -nozero -frame 3

Count amino acid triplets in a FASTA file and compare to a previous run of compseq to calculate expected and normalized frequency values
$ compseq -sequence [path/to/human_proteome.fasta] -word 3 [path/to/output_file1.comp] -nozero -infile [path/to/output_file2.comp]

Approximate the above command without a previously prepared file, by calculating expected frequencies using the single base/residue frequencies in the supplied input sequence(s)
$ compseq -sequence [path/to/human_proteome.fasta] -word 3 [path/to/output_file.comp] -nozero -calcfreq

Display help (use -help -verbose for more information on associated and general qualifiers)
$ compseq -help
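The difference between the default overlapping count and the in-frame count used in the codon examples above can be illustrated with a small Python sketch. It is not compseq's implementation: the function name count_words and its step and offset parameters are hypothetical, and how a given -frame value maps to a start offset is deliberately left open here.

```python
from collections import Counter


def count_words(seq, word, step=1, offset=0):
    """Count words of length `word` in `seq`.

    `offset` is the 0-based start position and `step` is how far the
    window moves each time (hypothetical parameters for illustration).
    """
    seq = seq.upper()
    last_start = len(seq) - word  # last position where a full word fits
    return Counter(seq[i:i + word] for i in range(offset, last_start + 1, step))


seq = "ATGGCCATG"

# Window moves one position at a time, so words overlap.
overlapping = count_words(seq, 3)

# Window moves by the word length, so only one reading frame is counted.
in_frame = count_words(seq, 3, step=3)

# Same, but starting three positions in, which skips the first codon.
shifted = count_words(seq, 3, step=3, offset=3)

for label, counts in [("overlapping", overlapping), ("in frame", in_frame), ("shifted", shifted)]:
    print(label, dict(counts))
```

With overlapping windows the nine-base sequence yields seven words, while the in-frame count yields only the three codons ATG, GCC and ATG.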

SYNOPSIS

compseq [-sequence] input_file [-word size] [-outfile] output_file [options]

DESCRIPTION

compseq is a program from the EMBOSS suite (European Molecular Biology Open Software Suite). It counts every word of a given length (-word) in one or more nucleotide or protein sequences and writes a table listing each word's observed count and observed frequency, together with an expected frequency and the observed/expected ratio.

By default the counting window moves along the sequence one position at a time, so counted words overlap. With -frame, only words in a single reading frame are counted and the window advances by the word length, as in the codon examples above.

Expected frequencies can come from three sources. Without further options, every possible word is assumed to be equally likely. With -calcfreq, they are calculated from the single base/residue frequencies of the input sequence(s). With -infile, they are read from the output of a previous compseq run.

The input sequence and the output file can be given positionally, as in most of the examples above, or with the -sequence and -outfile qualifiers. -nozero omits words that were never observed from the output.
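The TLDR entry for -calcfreq describes expected frequencies derived from the single base/residue frequencies of the input. One way to read that is as a product of per-residue frequencies, sketched below in Python. This is an assumption about the general idea, not a statement of how compseq computes its values; the function name observed_vs_expected and the tuple layout are hypothetical.

```python
from collections import Counter
from math import prod


def observed_vs_expected(seq, word):
    """Return {word: (count, observed_freq, expected_freq, ratio)}.

    Expected frequency is modelled (as an assumption) as the product of
    the single-residue frequencies of the letters in the word.
    """
    seq = seq.upper()

    # Single base/residue frequencies in the input sequence.
    single = Counter(seq)
    total_single = sum(single.values())
    residue_freq = {r: n / total_single for r, n in single.items()}

    # Overlapping word counts (window moves one position at a time).
    words = Counter(seq[i:i + word] for i in range(len(seq) - word + 1))
    total_words = sum(words.values())

    table = {}
    for w, n in words.items():
        observed = n / total_words
        expected = prod(residue_freq[c] for c in w)
        table[w] = (n, observed, expected, observed / expected)
    return table


table = observed_vs_expected("AACC", 2)
for w, row in sorted(table.items()):
    print(w, row)
```

A ratio well above or below 1 suggests that a word is over- or under-represented relative to what the residue composition alone would predict.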

CAVEATS

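compseq is only available once EMBOSS is installed. Unless -calcfreq or -infile is given, expected frequencies assume that all words are equally likely, which is rarely true of real sequences, so the observed/expected ratios should be read with care. The number of possible words grows exponentially with -word (4^6 = 4096 hexanucleotides, for example), so large word sizes produce very long tables; -nozero keeps the output manageable.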

EXAMPLE USAGE

Count dinucleotides non-interactively from a script, suppressing prompts with the general EMBOSS qualifier -auto
$ compseq -sequence [path/to/input_dna.fasta] -word 2 -outfile [path/to/output_file.comp] -auto

DEFINITION SOURCE

compseq is a standalone executable installed by the EMBOSS package (commonly packaged as emboss). Check that it is available with command -v compseq.

HISTORY

compseq is part of EMBOSS, the European Molecular Biology Open Software Suite, an open-source collection of sequence analysis programs first described by Rice, Longden and Bleasby in 2000.

SEE ALSO

wordcount(1), cusp(1), freak(1)
