uc
Unicode-aware uppercase filter
SYNOPSIS
uc [file ...]
DESCRIPTION
uc reads text from standard input (or the named files), applies the full Unicode uppercase mapping (toUpper), and writes the result to standard output. Unlike a naive tr 'a-z' 'A-Z', it handles case mappings that change length (German ß → SS, the ﬃ ligature → FFI) and works on non-Latin scripts such as Greek, Cyrillic, and Armenian.
It ships as one of roughly thirty small filter scripts in Tom Christiansen's Unicode::Tussle Perl distribution, alongside companions such as lc (lowercase), tc (titlecase), nfd/nfc/nfkd/nfkc (normalization), ucsort, uniwc, and tcgrep. Together they serve as Unicode-aware alternatives to several GNU coreutils and related text tools.
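The difference between an ASCII-only translation and a full Unicode uppercase mapping can be illustrated with a short sketch. This is Python rather than the Perl that uc is written in, and it is not uc's implementation; it relies on Python's str.upper, which also applies the full, locale-independent mapping. The function names ascii_upper and full_upper are made up for this illustration.

```python
import string

def ascii_upper(text: str) -> str:
    """Mimic tr 'a-z' 'A-Z': only the 26 ASCII letters are mapped."""
    table = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
    return text.translate(table)

def full_upper(text: str) -> str:
    """Apply the full Unicode uppercase mapping; the result may be longer."""
    return text.upper()

# Sample inputs: German sharp s, the ffi ligature (U+FB03), Greek, Cyrillic.
samples = ["straße", "o\ufb03ce", "αβγ", "привет"]

for word in samples:
    print(f"{word!r}: ascii={ascii_upper(word)!r} full={full_upper(word)!r}")
```

With the ASCII-only table, ß, the ligature, and the Greek and Cyrillic letters pass through unchanged. The full mapping turns "straße" (6 characters) into "STRASSE" (7 characters).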
CAVEATS
The name uc is ambiguous and collides with Perl's built-in uc() function and a number of unrelated tools on other platforms. The script is not installed by default on most distributions; install the full set with cpanm Unicode::Tussle. Because the mapping is locale-independent, language-tailored rules (for example Turkish dotted/dotless I) are not applied.
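The locale-independence caveat can be seen in a small sketch, again in Python for illustration only. The default mapping sends i to I regardless of language. Turkish tailoring would need an extra step, shown here as a hypothetical helper named turkish_upper, which is not something uc provides.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

def default_upper(text: str) -> str:
    """Locale-independent uppercase, comparable to what the text above describes."""
    return text.upper()

def turkish_upper(text: str) -> str:
    """Hypothetical Turkish tailoring (an assumption for illustration):
    map dotted i to dotted capital I before applying the default mapping.
    Dotless ı (U+0131) already uppercases to plain I by default."""
    return text.replace("i", DOTTED_CAPITAL_I).upper()

print(default_upper("istanbul"))   # ISTANBUL
print(turkish_upper("istanbul"))   # İSTANBUL
```

Text that needs Turkish or Azerbaijani casing therefore has to be handled by a locale-aware tool or a preprocessing step of this kind.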
HISTORY
Unicode::Tussle grew out of a collection of one-off scripts by Tom Christiansen presented at OSCON 2011 and is now packaged on CPAN by brian d foy. The distribution is a commonly cited reference for "Unicode coreutils" in Perl.
