iconv

Convert text file character encoding

TLDR

Convert file to a specific encoding, and print to stdout

$ iconv [[-f|--from-code]] [from_encoding] [[-t|--to-code]] [to_encoding] [input_file]

Convert file to the current locale's encoding, and output to a file
$ iconv [[-f|--from-code]] [from_encoding] [input_file] > [output_file]

List supported encodings
$ iconv [[-l|--list]]

SYNOPSIS

iconv [options] [-f from-encoding] [-t to-encoding] [inputfile]...

PARAMETERS

-f ENCODING, --from-code=ENCODING
    Input character encoding (default from locale)

-t ENCODING, --to-code=ENCODING
    Output character encoding (default from locale)

-l, --list
    List all known coded character sets

-c
    Discard characters unconvertible to output encoding

-o FILE, --output=FILE
    Output to FILE instead of stdout

-s, --silent
    Suppress warnings about conversion problems

--verbose
    Print progress information

--help
    Display usage summary and exit

--version
    Output version information and exit

DESCRIPTION

iconv is a standard Unix command-line utility that converts text (from stdin or from files) from one character encoding to another, writing the result to stdout or to a file. It is built on the system's iconv(3) library; depending on the implementation, it supports many encodings, including UTF-8, ISO-8859-1, CP1252, and Shift_JIS. Typical uses in internationalization (i18n) work include migrating legacy files to Unicode, cleaning web-scraped data, and preparing CSVs for cross-platform use.

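Specify the source encoding with -f and the target encoding with -t. The target encoding name can carry a suffix: //IGNORE skips characters that cannot be represented in the target, and //TRANSLIT replaces them with similar-looking approximations where possible. By default, an invalid input sequence or an unconvertible character stops the conversion with a non-zero exit status; use -c to omit such characters from the output instead. iconv processes its input sequentially and works well in pipelines; memory use on very large files depends on the implementation.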
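The difference between the default behavior and -c can be seen in a small sketch. The scratch directory and file name are illustrative assumptions, and the exact exit status may differ between implementations:

```shell
# Scratch directory and file names are illustrative, not from the original page.
tmp=$(mktemp -d)

# "café" in UTF-8: the é is the two bytes 0xC3 0xA9 (octal 303 251).
printf 'caf\303\251\n' > "$tmp/in.txt"

# Strict conversion: ASCII has no é, so iconv reports an error.
if iconv -f UTF-8 -t ASCII "$tmp/in.txt" > /dev/null 2>&1; then
    strict_status=0
else
    strict_status=$?
fi

# With -c the unconvertible character is left out of the output.
# Some implementations still return a non-zero status here, hence "|| true".
dropped=$(iconv -c -f UTF-8 -t ASCII "$tmp/in.txt" 2>/dev/null || true)

echo "strict exit status: $strict_status"
echo "output with -c:     $dropped"

rm -r "$tmp"
```

Because -c changes the text without any marker in the output, it is worth checking the exit status or comparing file sizes when data loss matters.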

Common pitfalls include mismatched endianness (UCS-2BE vs. UCS-2LE) and relying on the locale's default encoding when -f or -t is omitted. Verify a file's actual encoding with file or hexdump before converting. iconv is available on Linux, the BSDs, and macOS, which makes it a dependable building block for scripts and data-processing pipelines.
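The endianness pitfall can be illustrated with a byte dump. This sketch uses od rather than hexdump only because od is specified by POSIX; the variable names are illustrative:

```shell
# Encode the single character "A" (U+0041) in both byte orders and dump the bytes.
le=$(printf 'A' | iconv -f UTF-8 -t UCS-2LE | od -An -tx1 | tr -d ' \n')
be=$(printf 'A' | iconv -f UTF-8 -t UCS-2BE | od -An -tx1 | tr -d ' \n')
echo "UCS-2LE bytes: $le"
echo "UCS-2BE bytes: $be"

# Decode little-endian data as if it were big-endian: no error is reported,
# but the bytes 41 00 are read as U+4100, a different character.
wrong=$(printf 'A' | iconv -f UTF-8 -t UCS-2LE | iconv -f UCS-2BE -t UTF-8 | od -An -tx1 | tr -d ' \n')
echo "misread as big-endian, UTF-8 bytes: $wrong"
```

The misread conversion succeeds without any warning, which is why checking the byte order up front is safer than inspecting the output afterwards.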

CAVEATS

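Encoding support varies by implementation (glibc vs. libiconv); list the available encodings with -l. Invalid sequences cause the conversion to fail unless -c or //IGNORE is used. Round-trip conversions may lose data, because characters such as emoji cannot be represented in legacy encodings. iconv does not auto-detect the input encoding.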
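One way to check whether a conversion loses data is to convert there and back and compare the bytes. The helper name roundtrip and the file names below are hypothetical, and the sketch assumes UTF-8 input and ISO-8859-1 as the legacy encoding:

```shell
tmp=$(mktemp -d)

# roundtrip FILE: hypothetical helper that converts FILE (assumed UTF-8) to
# ISO-8859-1 and back, then reports whether the bytes survived unchanged.
roundtrip() {
    # -c drops unrepresentable characters; the status may still be non-zero.
    iconv -c -f UTF-8 -t ISO-8859-1 "$1" > "$tmp/legacy.txt" 2>/dev/null || true
    iconv -f ISO-8859-1 -t UTF-8 "$tmp/legacy.txt" > "$tmp/back.txt"
    if cmp -s "$1" "$tmp/back.txt"; then echo lossless; else echo lossy; fi
}

# "café" fits in ISO-8859-1; the emoji U+1F600 (bytes F0 9F 98 80) does not.
printf 'caf\303\251\n' > "$tmp/latin.txt"
printf 'ok \360\237\230\200\n' > "$tmp/emoji.txt"

latin_result=$(roundtrip "$tmp/latin.txt")
emoji_result=$(roundtrip "$tmp/emoji.txt")
echo "latin.txt: $latin_result"
echo "emoji.txt: $emoji_result"

rm -r "$tmp"
```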

EXAMPLES

Convert ISO-8859-1 to UTF-8:
iconv -f ISO-8859-1 -t UTF-8 file.txt > file-utf8.txt

Omit invalid chars:
iconv -c -f CP1252 -t UTF-8//IGNORE input.txt

Transliterate:
iconv -f ISO-8859-1 -t ASCII//TRANSLIT text.txt
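Convert several files in a loop, writing each result with -o. This is a sketch only: the directories and sample files are created here for illustration, and it assumes the -o option behaves as listed under PARAMETERS:

```shell
# Illustrative source and destination directories.
src=$(mktemp -d)
out=$(mktemp -d)

# Two sample ISO-8859-1 files: é is byte 0xE9 (octal 351), ï is 0xEF (octal 357).
printf 'caf\351\n' > "$src/a.txt"
printf 'na\357ve\n' > "$src/b.txt"

# Write each converted file under the same name in the destination directory.
for f in "$src"/*.txt; do
    iconv -f ISO-8859-1 -t UTF-8 -o "$out/$(basename "$f")" "$f"
done

# Dump the first result to confirm é became the UTF-8 bytes c3 a9.
a_bytes=$(od -An -tx1 "$out/a.txt" | tr -d ' \n')
echo "a.txt as UTF-8: $a_bytes"
```

Writing to a separate directory keeps the originals intact, so a failed or wrongly specified conversion can simply be rerun.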

HISTORY

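The iconv interface first appeared on HP-UX and was standardized in XPG4; it is part of the Single UNIX Specification and POSIX. The glibc implementation was written mainly by Ulrich Drepper, and Bruno Haible wrote the standalone GNU libiconv. On GNU/Linux systems the iconv program ships with glibc, not coreutils.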

SEE ALSO

recode(1), enca(1)
