LinuxCommandLibrary

convmv

Convert file names from one encoding to another

TLDR

Test filename encoding conversion (don't actually change the filename)

$ convmv -f [from_encoding] -t [to_encoding] [input_file]

Convert filename encoding and rename the file to the new encoding
$ convmv -f [from_encoding] -t [to_encoding] --notest [input_file]

SYNOPSIS

convmv [OPTION]... -f FROM-ENC -t TO-ENC [FILE...]

PARAMETERS

-f ENCODING
    Current (source) filename encoding (e.g., iso-8859-1, cp1252)

-t ENCODING
    Target filename encoding (e.g., utf-8)

-r
    Recurse into subdirectories

--notest
    Actually rename the files (default: only print what would be done)

--list
    List all available encodings

--nosmart
    Convert even names that already look like valid UTF-8 (by default these are skipped when converting to UTF-8)

-i
    Interactive mode: ask before each rename

--qfrom, --qto
    Be quieter about the "from" or "to" name of each rename

-h, --help
    Show help

DESCRIPTION

convmv is a command-line utility that converts the character encoding of filenames, either for individual files or recursively through directory trees. It is typically used to repair mojibake (garbled text caused by mismatched encodings), such as ISO-8859-1 filenames on a UTF-8 system or vice versa. Common scenarios include migrating Windows-created files to Linux and restoring old backups.
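To make the mismatch concrete, the following sketch re-encodes a single name from ISO-8859-1 to UTF-8 with iconv(1). It only illustrates the per-name byte transformation and is not how convmv is implemented; the sample name is an assumption chosen for illustration.

```shell
# "café.txt" as ISO-8859-1 bytes: é is the single byte 0xE9 (octal 351).
latin1_name=$(printf 'caf\351.txt')

# Re-encode the name to UTF-8, where é becomes the two bytes 0xC3 0xA9.
utf8_name=$(printf '%s' "$latin1_name" | iconv -f iso-8859-1 -t utf-8)

# Prints café.txt on a UTF-8 terminal.
printf '%s\n' "$utf8_name"
```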

By default, convmv runs in a safe preview mode: it prints the renames it would perform without changing anything. Pass --notest to perform the renames. Specify the source encoding with -f (e.g., iso-8859-1, cp1252) and the target with -t (e.g., utf-8). It relies on Perl's Encode module for charset support.
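The preview-then-apply workflow can be wrapped in a small shell function. This is a minimal sketch: fix_names and the APPLY variable are hypothetical names introduced here, not part of convmv, and the function assumes convmv is on the PATH.

```shell
# fix_names FROM TO DIR
# Hypothetical helper: always previews, and renames only when APPLY=1.
fix_names() {
    from=$1 to=$2 dir=$3

    # Without --notest, convmv only prints what it would rename.
    convmv -f "$from" -t "$to" -r "$dir" || return 1

    # Perform the renames only when the caller explicitly asks for it.
    if [ "${APPLY:-0}" = 1 ]; then
        convmv -f "$from" -t "$to" -r --notest "$dir"
    fi
}
```

Running `fix_names iso-8859-1 utf-8 /path/to/dir` shows the planned renames; prefixing the call with `APPLY=1` applies them after the preview has been checked.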

When converting to UTF-8, convmv skips names that already appear to be valid UTF-8 (override with --nosmart). Other options include interactive prompting (-i), listing the available encodings (--list), and recursion (-r). Symlinks are handled as well: when a symlink's target is renamed, the link is updated to point to the new name.
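The idea behind skipping correctly encoded names can be approximated in shell by checking whether a name decodes cleanly as UTF-8. This is only a sketch of the concept using iconv(1); is_valid_utf8 is a hypothetical helper, and convmv's own detection logic may differ.

```shell
# Hypothetical helper: succeeds when its argument decodes cleanly as UTF-8.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f utf-8 -t utf-8 >/dev/null 2>&1
}

good=$(printf 'caf\303\251.txt')   # é as the UTF-8 bytes 0xC3 0xA9
bad=$(printf 'caf\351.txt')        # é as the ISO-8859-1 byte 0xE9

for name in "$good" "$bad"; do
    if is_valid_utf8 "$name"; then
        echo "already valid UTF-8: a candidate for skipping"
    else
        echo "not valid UTF-8: a candidate for conversion"
    fi
done
```

Note that plain ASCII names also pass this check, since ASCII is a subset of UTF-8.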

convmv changes only names and leaves file contents untouched, which makes it useful for cleaning up legacy data whose filenames are not valid UTF-8.

CAVEATS

convmv only renames files; it does not convert file contents. Always run in preview mode first and check the output, since name collisions or characters that cannot be represented in the target encoding may lead to data loss. Back up important data before applying changes.

COMMON EXAMPLE

Preview: convmv -f windows-1252 -t utf-8 -r /path/to/dir
Apply: add --notest after verifying the output.

CHARSET TIPS

Use convmv --list to list the available encodings. Common choices: iso-8859-1, cp1252, utf8, koi8-r. Perl Encode aliases also work (e.g., UTF-8).

HISTORY

Written by Bjoern Jacke as a Perl script in the early 2000s. It builds on Perl's Encode module for charset support. Sporadically maintained; widely used on Unix-like systems for fixing filename encodings.

SEE ALSO

iconv(1), recode(1), rename(1)
