LinuxCommandLibrary

dos2unix

Convert DOS line endings to Unix

TLDR

Change the line endings of a file

$ dos2unix [path/to/file]

Create a copy with Unix-style line endings
$ dos2unix [[-n|--newfile]] [path/to/file] [path/to/new_file]

Display file information
$ dos2unix [[-i|--info]] [path/to/file]

Keep/add/remove Byte Order Mark
$ dos2unix --[keep-bom|add-bom|remove-bom] [path/to/file]

SYNOPSIS

dos2unix [OPTIONS] [FILE ...]
If no FILE is specified, dos2unix reads from standard input and writes to standard output.

PARAMETERS

-h, --help
    Displays a help message with command usage and options.

-k, --keepdate
    Keeps the original modification date of the converted file.

-q, --quiet
    Suppresses all warning messages and output to standard error.

-V, --version
    Prints the version information of the dos2unix utility.

-c, --convmode MODE
    Sets the conversion mode. MODE is one of ascii (the default; converts CRLF to LF), 7bit (like ascii, but also replaces 8-bit non-ASCII characters with spaces), iso (converts between a DOS code page and ISO-8859-1), or mac (converts CR to LF).

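-n, --newfile INFILE OUTFILE ...
    New-file mode: converts INFILE and writes the result to OUTFILE, leaving INFILE unchanged. File names must be given in pairs.

-o, --oldfile FILE ...
    Old-file mode (the default): converts FILE in place, overwriting it.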

-s, --safe
    Skips binary files (the default). Use -f, --force to convert them anyway. To keep the original of a text file, use -n or copy the file first.

--
    Marks the end of options, useful when file names might begin with a hyphen (-).
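
As a rough illustration of two conversion options, the sketch below uses -n (--newfile, as in the TLDR section) to write a converted copy and -c mac to convert CR-only line endings. The temporary directory and the file names are made up for the example.

```shell
# Throwaway directory; the file names are illustrative only.
tmpdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$tmpdir/dos.txt"   # CRLF line endings
printf 'alpha\rbeta\r'     > "$tmpdir/mac.txt"   # CR-only line endings

# -n: write the converted text to a new file and leave dos.txt untouched.
dos2unix -n "$tmpdir/dos.txt" "$tmpdir/dos-converted.txt"

# -c mac: convert CR to LF in place.
dos2unix -c mac "$tmpdir/mac.txt"

# dos.txt keeps its CR bytes; dos-converted.txt is two bytes shorter.
wc -c "$tmpdir/dos.txt" "$tmpdir/dos-converted.txt"

# mac.txt now shows \n where it previously had \r.
od -c "$tmpdir/mac.txt"
```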

DESCRIPTION

The dos2unix command converts plain text files from DOS/Windows line endings (CRLF) to the Unix line ending (LF). In Mac mode it converts classic Mac OS line endings (CR) instead.

Historically, different operating systems have used distinct character sequences to denote the end of a line in text files. DOS/Windows systems use a carriage return followed by a line feed (CRLF - \r\n), while Macintosh systems (pre-OSX) used only a carriage return (CR - \r). Unix-like systems, including Linux, exclusively use a line feed (LF - \n).
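
To make the three conventions concrete, the following sketch writes the same two-line text with each line ending and dumps the bytes with od. It uses only standard tools; the temporary directory and file names are illustrative.

```shell
# Work in a throwaway directory (file names here are illustrative).
tmpdir=$(mktemp -d)

# Same two lines, three line-ending conventions.
printf 'one\r\ntwo\r\n' > "$tmpdir/dos.txt"    # CRLF (DOS/Windows)
printf 'one\rtwo\r'     > "$tmpdir/mac.txt"    # CR   (classic Mac OS)
printf 'one\ntwo\n'     > "$tmpdir/unix.txt"   # LF   (Unix)

# od -c prints each byte; CR and LF show up as \r and \n.
for f in dos mac unix; do
    echo "== $f.txt"
    od -c "$tmpdir/$f.txt"
done

# The CRLF file is two bytes longer than the others (one extra byte per line).
wc -c "$tmpdir/dos.txt" "$tmpdir/mac.txt" "$tmpdir/unix.txt"
```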

When text files created on DOS/Windows or classic Mac OS are transferred to a Unix environment, the differing line endings can cause problems, such as visible stray characters (^M for CR) or scripts that fail to run. dos2unix resolves this by replacing each CRLF with a single LF; lone CR characters are converted only in Mac mode (-c mac). It converts files in-place by default and can also write to a new output file.
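
One way to see the effect, sketched with a made-up script file: count the lines that contain a carriage return before and after an in-place conversion.

```shell
# Illustrative file with CRLF endings, as a Windows editor might produce.
tmpdir=$(mktemp -d)
printf '#!/bin/sh\r\necho hello\r\n' > "$tmpdir/hello.sh"

# A literal carriage return to search for.
cr=$(printf '\r')

# Count lines that contain a CR before conversion.
grep -c "$cr" "$tmpdir/hello.sh"

# Convert in place, then count again; no line should contain a CR now.
dos2unix "$tmpdir/hello.sh"
grep -c "$cr" "$tmpdir/hello.sh"
```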

CAVEATS

dos2unix is designed for text files. By default it detects binary files (e.g., images, executables, compressed archives) and skips them; forcing conversion with -f (--force) can corrupt such files and make them unusable. Only force conversion when you are sure the input is plain text.
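
Before converting a directory of mixed content, the -i (--info) option shown in the TLDR section can report how dos2unix classifies each file. The sketch below uses two made-up files, one of which contains a NUL byte so that it should be treated as binary and, in the default safe mode, left alone.

```shell
tmpdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmpdir/notes.txt"
printf 'BIN\000\001\r\ndata\r\n'  > "$tmpdir/blob.bin"   # NUL byte: not plain text

# Report line-break counts, BOM status, and a text/binary verdict per file.
dos2unix -i "$tmpdir/notes.txt" "$tmpdir/blob.bin"

# Keep a reference copy, then attempt a normal conversion of both files.
cp "$tmpdir/blob.bin" "$tmpdir/blob.orig"
dos2unix "$tmpdir/notes.txt" "$tmpdir/blob.bin"

# cmp succeeds silently when the binary file was left untouched.
cmp "$tmpdir/blob.bin" "$tmpdir/blob.orig" && echo "blob.bin unchanged"
```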

While it can handle UTF-8 with or without a Byte Order Mark (BOM), dos2unix is not a general-purpose character encoding converter. Its main function is line ending normalization. For full encoding conversions, other tools like iconv should be used.

By default, dos2unix converts files in-place, overwriting the original. For critical data, either copy the file first or use the -n (newfile) option to write the result to a separate file.
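
A cautious workflow, sketched here with hypothetical file names, keeps a copy of the original next to the converted file so that the change can be reviewed or undone.

```shell
tmpdir=$(mktemp -d)
printf 'key=value\r\nother=1\r\n' > "$tmpdir/settings.conf"

# Keep a backup; cp -p preserves the timestamp and permissions of the original.
cp -p "$tmpdir/settings.conf" "$tmpdir/settings.conf.bak"

# Convert in place; -k keeps the original modification date on the result.
dos2unix -k "$tmpdir/settings.conf"

# cmp exits non-zero when the files differ, which is expected after conversion.
cmp -s "$tmpdir/settings.conf" "$tmpdir/settings.conf.bak" || echo "converted; backup kept"

# To undo, copy the backup over the converted file:
# cp -p "$tmpdir/settings.conf.bak" "$tmpdir/settings.conf"
```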

STANDARD INPUT/OUTPUT USAGE

When no FILE argument is provided, dos2unix reads from standard input (stdin) and writes to standard output (stdout). This allows it to be integrated into pipelines, for example:
cat input.txt | dos2unix > output.txt

PROCESSING MULTIPLE FILES

You can provide multiple file paths as arguments to dos2unix to convert them all in a single command. Wildcards can also be used, for example:
dos2unix *.txt
This will convert all files ending with .txt in the current directory, in-place.
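
A shell wildcard only matches files in one directory. To reach subdirectories as well, one common approach (not specific to dos2unix) is to let find collect the files. The directory layout below is invented for the example, and grep -r is assumed to be available, as it is on typical Linux systems.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/docs/sub"
printf 'a\r\n' > "$tmpdir/docs/top.txt"
printf 'b\r\n' > "$tmpdir/docs/sub/nested.txt"
printf 'c\r\n' > "$tmpdir/docs/sub/skip.log"   # not matched by the pattern

# -exec ... {} + passes many file names to a single dos2unix invocation.
find "$tmpdir/docs" -type f -name '*.txt' -exec dos2unix {} +

# List files that still contain a carriage return; only skip.log should remain.
cr=$(printf '\r')
grep -rl "$cr" "$tmpdir/docs"
```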

HISTORY

The dos2unix utility, along with its counterpart unix2dos, has been a staple in Unix-like operating systems for decades. Its existence reflects the historical divergence in text file conventions across different computing platforms, particularly between Unix/Linux and Microsoft DOS/Windows.

Originally developed to address cross-platform compatibility issues in file transfers and scripting, it became an essential tool for developers and system administrators working in heterogeneous environments. The command's simplicity and singular focus on line ending conversion have ensured its enduring relevance, making it a widely adopted and stable component of most Linux distributions.

SEE ALSO

unix2dos(1), file(1), tr(1), sed(1), awk(1), iconv(1)
