xml-c14n
Canonicalize XML documents
TLDR
View documentation for the original command
SYNOPSIS
xml-c14n [OPTIONS] [FILE]
Reads XML from FILE or standard input, outputs canonicalized XML.
PARAMETERS
-o, --output file
Specifies the output file for the canonicalized XML. If not specified, output goes to standard output.
-i, --in-place
Performs canonicalization on the input file directly, overwriting its content. Use with caution.
-e, --exc-c14n
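Uses Exclusive XML Canonicalization 1.0, which omits namespace declarations that are not visibly used, so that a canonicalized fragment stays the same when it is moved into a different enclosing document.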
-C, --with-comments
Keeps comments in the canonicalized output. Without this option, comments are removed.
-I, --id id
Specifies the ID of the root element to be canonicalized. Only that element and its descendants are processed.
-P, --prefix-ns
Used with exclusive canonicalization, prefixes namespace attributes with 'xml' if necessary.
--version
Displays version information and exits.
--help
Displays a help message with usage information and exits.
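The effect of --with-comments and --id can be illustrated without the tool itself. The following is a minimal sketch using Python's standard library, not xml-c14n's implementation. It rests on two assumptions: Python's canonicalize() implements C14N 2.0 rather than the 1.0 variants described here (the results agree for this simple, namespace-free input), and the hypothetical helper canonicalize_by_id treats a plain attribute named "id" as the ID, whereas a real tool may rely on DTD-declared IDs or xml:id.

```python
import copy
import xml.etree.ElementTree as ET

# Comments are dropped unless explicitly requested (compare --with-comments).
COMMENTED = '<r><!-- note --><a/></r>'
without_comments = ET.canonicalize(COMMENTED)
with_comments = ET.canonicalize(COMMENTED, with_comments=True)

SAMPLE = '<doc><head id="h">x</head><body id="b" z="1" a="2"><p/></body>trailing</doc>'


def canonicalize_by_id(xml_text, wanted_id, id_attr="id"):
    """Canonicalize only the element whose id_attr equals wanted_id.

    Hypothetical helper that approximates the idea behind --id.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.get(id_attr) == wanted_id:
            # Copy the subtree and drop the text that follows its end tag,
            # so that only the element and its descendants are serialized.
            subtree = copy.deepcopy(elem)
            subtree.tail = None
            return ET.canonicalize(ET.tostring(subtree, encoding="unicode"))
    raise KeyError(wanted_id)


print(without_comments)                 # <r><a></a></r>
print(with_comments)                    # <r><!-- note --><a></a></r>
print(canonicalize_by_id(SAMPLE, "b"))  # <body a="2" id="b" z="1"><p></p></body>
```

Only the selected element and its descendants appear in the last result; the sibling element and the trailing text are not part of the output.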
DESCRIPTION
xml-c14n is a command-line tool for canonicalizing XML documents. XML canonicalization (C14N) transforms an XML document into a single physical representation, a byte stream, that is the same no matter which XML parser or environment produced the original serialization. This consistency is crucial for applications like digital signatures, where the digest is computed over bytes and a purely syntactic difference, such as attribute order or whitespace inside a tag, would otherwise invalidate the signature.
The tool reads an XML document from standard input or a specified file and writes its canonical form to standard output or an output file. It supports both standard XML Canonicalization 1.0 and Exclusive XML Canonicalization 1.0. The exclusive form is often used in SOAP and WS-Security contexts: it omits namespace declarations inherited from ancestor elements unless they are visibly used, so a signed fragment stays stable when it is embedded in a different envelope.
By default, xml-c14n removes comments, replaces character and entity references with their values, sorts attributes, and normalizes whitespace inside tags, giving a deterministic byte-for-byte representation of the document's logical content. Whitespace in element content is significant and is preserved.
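The core idea, that differently serialized but logically equivalent documents collapse to the same bytes, can be shown with a short sketch. It uses Python's standard-library canonicalize(), which implements C14N 2.0 rather than the 1.0 variants that xml-c14n is described as supporting. For this simple input the results coincide, but treat it as an illustration of the concept, not of the tool's exact output.

```python
from xml.etree.ElementTree import canonicalize

# Same logical document, different attribute order, quoting,
# whitespace inside the start tag, and empty-element syntax.
doc_a = '<root b="2" a=\'1\'><item/></root>'
doc_b = '<root   a="1"\n      b="2"><item></item></root>'

# Whitespace in element content is data, so this is a different document.
doc_c = '<root a="1" b="2"> <item/></root>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
canon_c = canonicalize(doc_c)

print(canon_a)             # <root a="1" b="2"><item></item></root>
print(canon_a == canon_b)  # True
print(canon_a == canon_c)  # False
```

A signature computed over canon_a would therefore also verify against doc_b, but not against doc_c.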
CAVEATS
Canonicalization interacts with DTDs and external entities: default attribute values and entity replacement text come from the DTD, so a document whose DTD or external references cannot be resolved may produce unexpected results or errors. The --in-place option should be used with extreme caution: it overwrites the original file, so an error during processing can cause data loss.
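One common way to reduce the data-loss risk of in-place rewriting is to write the result to a temporary file first and replace the original only after canonicalization has succeeded. The sketch below shows that pattern with Python's standard library. The function name canonicalize_in_place is hypothetical, and nothing here describes how xml-c14n itself implements --in-place.

```python
import os
import tempfile
from xml.etree.ElementTree import canonicalize


def canonicalize_in_place(path):
    """Rewrite path in canonical form without risking the original.

    Hypothetical helper: output goes to a temporary file in the same
    directory, which replaces the original only if parsing succeeded.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # newline="" keeps line feeds as-is on every platform.
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
            canonicalize(from_file=path, out=tmp)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        os.unlink(tmp_path)  # leave the original untouched on failure
        raise
```

If the input is not well-formed, the parser raises an exception, the temporary file is removed, and the original file keeps its content.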
CANONICALIZATION STANDARDS
xml-c14n implements Canonical XML 1.0 by default. The --exc-c14n option switches it to Exclusive XML Canonicalization 1.0. These standards define precise rules for transforming an XML document into a canonical form, ensuring that logically equivalent documents produce identical byte sequences regardless of syntactic differences such as attribute order, quoting style, or whitespace inside tags. Whitespace in element content is treated as data and is kept.
INPUT AND OUTPUT
The command reads XML from the specified FILE argument. If no file is provided, it reads from standard input (stdin), allowing it to be piped from other commands. The canonicalized output is sent to standard output (stdout) unless an output file is specified using the --output option.
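The same input and output contract (a named file or stdin in, a named file or stdout out) can be mirrored in a few lines. This is a minimal sketch in Python, assuming the standard-library canonicalize() (C14N 2.0) as a stand-in for the tool. The function canonicalize_stream and its parameters are hypothetical names chosen for illustration.

```python
import sys
from xml.etree.ElementTree import canonicalize


def canonicalize_stream(input_path=None, output_path=None,
                        stdin=None, stdout=None):
    """Read from input_path or stdin; write to output_path or stdout."""
    if input_path is not None:
        source = input_path
    else:
        source = stdin if stdin is not None else sys.stdin

    if output_path is not None:
        with open(output_path, "w", encoding="utf-8", newline="") as out:
            canonicalize(from_file=source, out=out)
    else:
        target = stdout if stdout is not None else sys.stdout
        canonicalize(from_file=source, out=target)
```

Calling canonicalize_stream() with no arguments behaves like a filter in a pipeline, while passing input_path and output_path corresponds to naming FILE and using --output.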
HISTORY
The concept of XML Canonicalization was developed by the World Wide Web Consortium (W3C) to address the need for a consistent byte-stream representation of XML documents, primarily for digital signatures. The first recommendation, Canonical XML 1.0, was published in 2001. Exclusive XML Canonicalization 1.0 followed in 2002 to address specific needs in Web Services. The xml-c14n command is part of the libxml2 utilities, a widely used XML toolkit in the Linux ecosystem, reflecting its importance in XML processing workflows.