LinuxCommandLibrary

convmv

Convert file names from one encoding to another

TLDR

Test filename encoding conversion (don't actually change the filename)

$ convmv -f [from_encoding] -t [to_encoding] [input_file]

Convert filename encoding and rename the file to the new encoding
$ convmv -f [from_encoding] -t [to_encoding] --notest [input_file]

SYNOPSIS

convmv -f ENCODING_FROM -t ENCODING_TO [OPTIONS] [FILE_OR_DIR...]

PARAMETERS

-f ENCODING_FROM
    Specifies the current character encoding of the filenames that need to be converted.

-t ENCODING_TO
    Specifies the target character encoding to which filenames should be converted.

-r
    Recursively converts filenames in subdirectories as well as the specified directory.

--notest
    Performs the actual renaming. Without this option, convmv runs as a dry run and only prints what it would do. There is no separate option to enable the dry run; it is simply the default.

--list
    Lists all available character encodings that convmv supports for conversion.

--replace
    If the target filename already exists, overwrites it, but only when the two files have identical content.

-i
    Prompts for confirmation before converting each individual file.

--parsable
    Prints the planned actions in a machine-parsable format. convmv has no dedicated log-file option; to keep a record of a run, redirect its output to a file.

DESCRIPTION

convmv converts filenames from one character encoding to another. It is particularly useful for files created on systems that used a different encoding (e.g., ISO-8859-1, EUC-JP) whose names must display and work correctly on a system that uses UTF-8. It converts only the names of files, never their content.

By default the command runs as a dry run, so users can preview the changes before applying them, which matters because a bad rename can be hard to undo. It can operate on individual files, on directories, or recursively on every name in a directory tree, which makes it suitable for migrating large archives between systems with different locale settings. convmv is a Perl script and relies on Perl's Encode module for the encoding conversion itself.
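
The preview-then-apply workflow on a directory tree can be sketched as follows. This is a minimal example under stated assumptions: the scratch directory, the sample filename, and the ISO-8859-1 source encoding are all illustrative, and the script assumes a Linux filesystem that accepts arbitrary bytes in filenames. If convmv is not installed, the conversion step is skipped.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical sample data, not real files).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"

# "café.txt" encoded as ISO-8859-1: é is the single byte 0xE9 (octal 351).
latin1_name=$(printf 'caf\351.txt')
: > "$demo_dir/sub/$latin1_name"

if command -v convmv >/dev/null 2>&1; then
    # Preview only: without --notest, nothing is renamed.
    convmv -f iso-8859-1 -t utf-8 -r "$demo_dir"
    # Apply the renames after reviewing the preview output.
    convmv -f iso-8859-1 -t utf-8 -r --notest "$demo_dir"
else
    echo "convmv not found; skipping conversion" >&2
fi

# Dump the resulting names byte by byte to see which encoding is on disk.
ls "$demo_dir/sub" | od -c
```

Running the preview and the real conversion as two separate commands keeps the review step explicit; the only difference between them is the --notest option.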

CAVEATS

Always perform a dry-run first (the default behavior) to preview changes, as filename alterations can be irreversible.
It is highly recommended to create a backup of your data before running convmv with the --notest option.
Remember, convmv only changes filenames, not the content of the files themselves.
Filenames with severely corrupted or completely unknown original encodings might not be fully recoverable.
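
One way to follow the backup advice above is to copy the tree before any renaming happens. The sketch below is only an illustration: the directory and file names are hypothetical, and it assumes a cp that supports the -a option (as GNU and BSD cp do).

```shell
#!/bin/sh
# Hypothetical source tree; replace with the directory you plan to convert.
src_dir=$(mktemp -d)
: > "$src_dir/report.txt"

# Copy the whole tree, preserving attributes, before touching any names.
backup_dir=$(mktemp -d)
cp -a "$src_dir/." "$backup_dir/"

if command -v convmv >/dev/null 2>&1; then
    # Only now apply the renames; the untouched copy stays in $backup_dir.
    convmv -f iso-8859-1 -t utf-8 -r --notest "$src_dir"
else
    echo "convmv not found; skipping conversion" >&2
fi

echo "backup kept in: $backup_dir"
```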

DEFAULT DRY-RUN MODE

By default, convmv operates in a 'dry-run' mode, meaning it will only show you which filenames would be converted without making any actual changes. This is a crucial safety feature to verify the intended outcome before committing changes. To apply the changes, you must explicitly use the --notest option.

ENCODING DETECTION AND SPECIFICITY

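convmv does not guess the source encoding: you specify both the source (-f) and target (-t) encodings, so the result is only as accurate as the -f value you supply. What it does detect, by default, is filenames that are already valid UTF-8; when converting to UTF-8 it skips those to avoid double encoding. The --nosmart option disables that check, and --fixdouble is intended for repairing names that were already double-encoded.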
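
Before choosing a value for -f, it can help to see which names are not valid UTF-8 in the first place. The helpers below are a hypothetical sketch and not part of convmv: they assume an iconv implementation that exits with a non-zero status on invalid input (as glibc's does), and the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the argument is valid UTF-8.
# Relies on iconv failing when it meets an invalid or incomplete sequence.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: print entries of a directory whose names are not
# valid UTF-8. The glob skips dotfiles; this is a sketch, not a full scan.
list_non_utf8_names() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue
        name=${path##*/}
        is_valid_utf8 "$name" || printf '%s\n' "$path"
    done
}
```

Calling list_non_utf8_names on a directory prints the candidates for conversion; names that pass the check are most likely already UTF-8 and should not be converted again.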

HISTORY

convmv was developed to address character encoding mismatches in filenames, a problem that commonly appears when files are moved between operating systems or locales. As UTF-8 became the predominant encoding in modern Linux distributions, tools like convmv became essential for repairing garbled or incorrectly displayed filenames that originated on systems using older encodings such as ISO-8859-1 or the various Windows code pages.

SEE ALSO

iconv(1), mv(1), ls(1), locale(1)
