LinuxCommandLibrary

chars

Display names and codes for ASCII and Unicode characters and code points

TLDR

Look up a character by its value

$ chars 'ß'

Look up a character by its Unicode code point

$ chars U+1F63C

Look up possible characters given an ambiguous code point

$ chars 10

Look up a control character

$ chars "^C"
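
The last two lookups rest on simple arithmetic, sketched here in plain POSIX shell without calling chars itself. A bare number such as 10 can be read as decimal, hexadecimal, or octal, and caret notation such as ^C names the control character whose code is the letter's code with bit 0x40 flipped. The function names are hypothetical, and whether chars considers exactly these three readings is an assumption.

```sh
#!/bin/sh
# Hypothetical helpers; they illustrate the arithmetic only and do not call chars.

# List the code points a bare digit string could denote.
# Assumes the input has no leading zero and uses only the digits 0-7,
# so that it is valid in all three bases.
candidates() {
    printf 'decimal U+%04X\n' "$(( $1 ))"
    printf 'hex U+%04X\n' "$(( 0x$1 ))"
    printf 'octal U+%04X\n' "$(( 0$1 ))"
}

# Map caret notation (^C, ^[, ^?) to a code point: letter code XOR 0x40.
caret_to_codepoint() {
    letter=${1#^}
    code=$(printf '%d' "'$letter")
    printf 'U+%04X\n' "$(( code ^ 64 ))"
}

candidates 10
caret_to_codepoint '^C'
```

For the input 10 the three readings are U+000A (line feed), U+0010 (data link escape), and U+0008 (backspace), which is why a bare number can match more than one character.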

SYNOPSIS

chars [options] [character...]

PARAMETERS

-a, --all
    Print all known information about the characters

-b, --blocks
    Print Unicode blocks

-c, --categories
    Print Unicode categories

-C, --combinations
    Print possible precomposed characters

-d, --decomposition
    Print canonical decomposition

-f, --fields=LIST
    Specify fields (comma-separated: name,codepoint,category,subcategory,combining,block,decomp,jamo,uppercase,lowercase,titlecase,mirrored,decimal,digit,numeric,unicode1,bidi)

-F, --format=FORMAT
    Output format: table (default), json, csv

-n, --names
    Print Unicode names

-p, --permutations
    Print possible permutations of input characters

-s, --separator=SEP
    Separator string (default: tab)

-w, --width=N
    Set output width (default: 80)

-h, --help
    Display help message and exit

--version
    Output version information

DESCRIPTION

chars is a command-line tool that displays details about Unicode characters. Characters can be specified by hexadecimal code point (e.g., U+0041 or 0x41), by official Unicode name (e.g., LATIN CAPITAL LETTER A), or directly as UTF-8 text (e.g., A or π).
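
The two hexadecimal spellings name the same number, which a short POSIX shell sketch can make concrete. The helper to_decimal is hypothetical and does not call chars; it only shows that U+0041 and 0x41 both reduce to the value 65.

```sh
#!/bin/sh
# Hypothetical helper: normalize a code point written as U+XXXX or 0xXXXX
# to its decimal value. Illustrative only; it does not call chars.
to_decimal() {
    case $1 in
        U+*|u+*|0x*|0X*) hex=${1#??} ;;   # drop the two-character prefix
        *) echo "unrecognized code point: $1" >&2; return 1 ;;
    esac
    printf '%d\n' "$(( 0x$hex ))"
}

to_decimal U+0041
to_decimal 0x41
to_decimal U+1F63C
```

The first two calls print 65 and the third prints 128572.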

By default, it prints each character's name and code point in a tab-separated table, which suits quick lookups. The -a, --all option prints everything known about a character, including category, subcategory, combining class, block, decomposition, case mappings, mirroring, and numeric values. The -f option selects individual fields from a comma-separated list such as name,codepoint,category,block,decomp.

Output formats are table (default), JSON, and CSV; the latter two are intended for scripting and parsing. For character sequences, -p prints permutations and -C prints precomposed combinations, which helps with typography, text normalization, and encoding debugging. The output width (-w) and the field separator (-s) can also be adjusted.

chars is aimed at developers, linguists, and sysadmins who work with Unicode's more than 140,000 characters across scripts, emoji, and symbols. A UTF-8 locale is required for characters to render correctly.

CAVEATS

A UTF-8 locale and a UTF-8 capable terminal are required for characters to display properly. Some wide characters (e.g., emoji) may wrap or be truncated in narrow terminals. The available Unicode data is limited to the Unicode version bundled with the installed release of chars.
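
Whether the current locale uses UTF-8 can be checked before running a lookup. The following is a minimal POSIX shell sketch; is_utf8_charmap is a hypothetical helper, and the set of accepted spellings is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# locale(1) with the charmap keyword prints the active character encoding.
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "UTF-8 locale detected"
else
    echo "warning: locale is not UTF-8; characters may not render correctly" >&2
fi
```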

EXAMPLES

$ chars A
LATIN CAPITAL LETTER A   U+0041

$ chars -a ☃
Prints all known properties of SNOWMAN (U+2603).

$ chars --format=json U+1F600 😀
Prints JSON output: {"name":"GRINNING FACE",...}

$ chars -f name,block,category é
LATIN SMALL LETTER E WITH ACUTE   LATIN-1 SUPPLEMENT   Ll

HISTORY

chars is developed as a standalone open-source project (antifuchs/chars on GitHub) and is written in Rust; it is not part of util-linux.

SEE ALSO

unichar(1), unicode(7)
