LinuxCommandLibrary

enca

character encoding detection and conversion

TLDR

Detect encoding of file
$ enca [file.txt]
Detect with language hint
$ enca -L [czech] [file.txt]
Convert encoding
$ enca -x [UTF-8] [file.txt]
Detect and show detailed output
$ enca -d [file.txt]
Process multiple files
$ enca -L [russian] [*.txt]

SYNOPSIS

enca [options] [files...]

DESCRIPTION

enca (Extremely Naive Charset Analyser) detects the character encoding of text files using language-based statistical heuristics, and can convert files between encodings using its built-in converter or external ones such as iconv. It is particularly strong on legacy 8-bit charsets used for Slavic and Central/Eastern European languages (ISO-8859-2/5, KOI8-R, CP1250/1251, Mazovia, T.61, ...), where simpler tools like file -i struggle.

Detection depends on knowing the language of the text. Pass it with -L; without it, enca tries to derive the language from the current locale, and may fail or return ambiguous matches if that locale is not a language it supports. The output is one detected encoding per file by default; -d selects a more detailed report.

Conversion is performed in place with -x ENCODING, so keep a copy of the original if you may need it. When the target charset lacks some characters, an external converter such as cstocs can be tried via -C (converter list) or -E (external converter program).
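Because -x rewrites files in place, a script may prefer to detect with enca and convert into a separate file. The following is a minimal sketch of that idea, not part of enca itself: is_utf8_compatible and to_utf8_copy are hypothetical helper names, and the sketch assumes that -i prints the detected encoding under a name iconv accepts.

```shell
#!/bin/sh
# Sketch: detect with enca, convert with iconv into a new file.
# The original file is left untouched. Helper names are hypothetical.

# Return 0 when the reported encoding needs no conversion to UTF-8.
is_utf8_compatible() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8|ASCII|US-ASCII) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: to_utf8_copy FILE LANGUAGE
to_utf8_copy() {
    file=$1
    lang=$2
    command -v enca >/dev/null 2>&1 || { echo "enca not found" >&2; return 2; }
    # Assumption: -i prints the detected encoding under its iconv name
    # and enca exits non-zero when it cannot decide.
    enc=$(enca -L "$lang" -i "$file") || return 1
    if is_utf8_compatible "$enc"; then
        echo "$file: already $enc, nothing to do"
        return 0
    fi
    iconv -f "$enc" -t UTF-8 "$file" > "$file.utf8" &&
        echo "$file: $enc -> UTF-8 written to $file.utf8"
}
```

Calling to_utf8_copy notes.txt czech would leave notes.txt as it is and write the converted text to notes.txt.utf8, provided enca recognises the encoding.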

PARAMETERS

FILES
Files to analyze.
-L LANGUAGE
Hint language for detection.
-x ENCODING
Convert to specified encoding.
-d
Show detailed detection info.
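-g, --guess
Detect and report the encoding without converting (the default mode when invoked as enca).
-i, --iconv-name
Print the detected encoding under the name iconv uses.
-l, --list WORD
List what enca knows about, e.g. --list languages.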
--help
Display help information.

CAVEATS

Detection is heuristic and can be wrong. Short files may be ambiguous, and some encodings cannot be distinguished from one another on typical text. Results are best when a specific language hint is given with -L.
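Since a guess can fail, a script should check enca's exit status rather than trust the printed name. The sketch below assumes the meanings given in enca(1) as I read it (0: all files handled, 1: enca could not decide or convert, 2: serious trouble such as I/O errors); verify this against the manual of your version. explain_enca_status and check_file are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: map enca's exit status to a short label so a script can tell
# "could not decide" apart from a hard error.
# Assumed meanings (check `man enca`): 0 ok, 1 undecided, 2 error.
explain_enca_status() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "undecided" ;;
        2) echo "error" ;;
        *) echo "unexpected" ;;   # e.g. 127 when enca is not installed
    esac
}

# Usage: check_file FILE LANGUAGE
check_file() {
    enca -L "$2" -i "$1" >/dev/null 2>&1
    status=$?
    echo "$1: $(explain_enca_status "$status")"
}
```

A file reported as undecided is a candidate for a retry with a different -L value or for manual inspection.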

HISTORY

enca was developed for handling the encoding diversity in Central/Eastern European computing, where many incompatible character sets were historically used for the same languages.

SEE ALSO

file(1), iconv(1), chardet(1)
