ucs2any

Convert UCS-2 files to another encoding

SYNOPSIS

ucs2any [OPTION...] TARGET_ENCODING [FILE...]

PARAMETERS

TARGET_ENCODING
    The character encoding for the output. This is a required positional argument, not an option. Examples: latin1, utf8, koi8-r.

-f, --force
    Forces the conversion even when the input contains ambiguous or unrepresentable characters, which may cause data loss or character substitution.

-v, --verbose
    Displays detailed information about the conversion process, including statistics and any warnings.

-i, --info, -l, --list
    Lists all supported character encodings and conversion methods available to recode (and thus ucs2any).

-h, --help
    Displays a help message and exits.

-V, --version
    Displays version information and exits.

DESCRIPTION

ucs2any is a utility, often installed as a symbolic link to the more general recode command, for converting text files encoded in UCS-2 to other character encodings. UCS-2 is a fixed-width, 16-bit Unicode encoding limited to the Basic Multilingual Plane. Because the input encoding is implicitly UCS-2, the user only has to specify the target encoding.

The command is useful for older Unicode files or data streams stored in UCS-2 that must be transformed for systems that expect UTF-8 or a locale-specific encoding. It processes files given as arguments or reads from standard input, and writes the converted output to standard output.

CAVEATS

ucs2any is typically a specialized invocation of the recode command. Its capabilities and supported encodings are inherited from recode.

UCS-2 covers only the Basic Multilingual Plane (BMP) of Unicode, U+0000 to U+FFFF. Characters in the supplementary planes cannot be represented in pure UCS-2. If data labeled as UCS-2 contains such characters, it is probably UTF-16, which encodes them as surrogate pairs (16-bit code units in the range 0xD800 to 0xDFFF), and it cannot be converted correctly as UCS-2.
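One way to check whether data labeled as UCS-2 is really UTF-16 is to scan it for 16-bit code units in the surrogate range (0xD800 to 0xDFFF), which pure UCS-2 text does not use. The following Python sketch is independent of ucs2any and recode. The function name find_surrogates is hypothetical, and the input is assumed to be little-endian with no byte order mark.

```python
def find_surrogates(data: bytes, byteorder: str = "little") -> list[int]:
    """Return the byte offsets of 16-bit code units in the surrogate range.

    Assumption: the data has no byte order mark and the byte order is known.
    """
    if len(data) % 2:
        raise ValueError("UCS-2 data must contain an even number of bytes")
    offsets = []
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], byteorder)
        if 0xD800 <= unit <= 0xDFFF:
            offsets.append(i)
    return offsets


# BMP-only text: valid as UCS-2, so no surrogates are found.
bmp_only = "héllo".encode("utf-16-le")
# U+1F600 lies outside the BMP, so UTF-16 stores it as a surrogate pair.
with_emoji = "hi 😀".encode("utf-16-le")

print(find_surrogates(bmp_only))    # []
print(find_surrogates(with_emoji))  # [6, 8]
```

An empty result suggests the data can safely be treated as UCS-2. Any reported offset suggests the source is UTF-16 and should be converted as such.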

Conversion to character sets that cannot represent all characters present in the UCS-2 input may result in character loss or replacement with a substitute character (e.g., '?').
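The substitution behavior can be illustrated with Python's built-in codecs. This is a sketch of the general effect, not of how ucs2any or recode implements it. The helper convert_ucs2 is hypothetical, and the input is assumed to be little-endian, BMP-only, and without a byte order mark, so Python's utf-16-le codec decodes it the same way a UCS-2 decoder would.

```python
def convert_ucs2(data: bytes, target: str, errors: str = "replace") -> bytes:
    """Decode UCS-2 (assumed little-endian, BMP only) and re-encode as target.

    With errors="replace", Python substitutes '?' for every character the
    target encoding cannot represent. With errors="strict", it raises
    UnicodeEncodeError instead.
    """
    text = data.decode("utf-16-le")
    return text.encode(target, errors=errors)


source = "naïve Привет".encode("utf-16-le")

# Latin-1 can represent 'ï' but none of the six Cyrillic letters.
print(convert_ucs2(source, "latin-1"))  # b'na\xefve ??????'
```

Passing errors="strict" makes the loss visible as an exception instead of silent replacement, which is the safer choice when the target encoding is not known to cover the input.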

IMPLICIT SOURCE ENCODING

When invoked as ucs2any, the command automatically assumes the input character encoding is UCS-2. This simplifies the command line as you do not need to explicitly specify ucs2 as the source.

STANDARD INPUT/OUTPUT

If no FILE arguments are provided, ucs2any reads from standard input (stdin) and writes the converted output to standard output (stdout). This allows for piping data from other commands or for interactive conversion.
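The same filter pattern (binary stream in, converted stream out) can be sketched in Python. This is an illustration only and does not reproduce ucs2any's implementation. The function filter_stream is hypothetical, the input is assumed to be little-endian UCS-2 without a byte order mark, and UTF-8 is an arbitrary default target.

```python
import io


def filter_stream(src, dst, target: str = "utf-8") -> None:
    """Read UCS-2 bytes from src and write them to dst in the target encoding.

    src and dst are binary file-like objects. Assumption: the input is
    little-endian, BMP only, and small enough to read in one call.
    """
    data = src.read()
    dst.write(data.decode("utf-16-le").encode(target, errors="replace"))


# In-memory streams stand in for stdin and stdout in this demonstration.
demo_in = io.BytesIO("été".encode("utf-16-le"))
demo_out = io.BytesIO()
filter_stream(demo_in, demo_out)
print(demo_out.getvalue())  # b'\xc3\xa9t\xc3\xa9'
```

To use it as a real pipe filter, pass sys.stdin.buffer and sys.stdout.buffer (after importing sys) in place of the in-memory streams.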

HISTORY

The recode project, which includes ucs2any, is a long-standing character set conversion utility in the Unix/Linux ecosystem. It was developed to manage the many character encodings in use, especially before UTF-8 became dominant. ucs2any addresses the specific need to convert UCS-2, an older fixed-width Unicode encoding, to more modern or locale-specific encodings. Although iconv has become the more common tool for encoding conversion on many contemporary systems, recode and its symlinks such as ucs2any are still used in environments where they are preferred or already integrated.