xmlto
Convert XML to other formats
TLDR
Convert a DocBook XML document to PDF format
Convert a DocBook XML document to HTML format and store the resulting files in a separate directory
Convert a DocBook XML document to a single HTML file
Specify a stylesheet to use while converting a DocBook XML document
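The four tasks above correspond to command lines like the ones sketched below. The names document.xml, stylesheet.xsl, and path/to/html_files are placeholders, not values from this page. The small show helper only prints each command so the block runs without xmlto installed; remove show to perform the conversions.

```shell
# Placeholder names (assumptions): adjust to your project.
doc=document.xml
xsl=stylesheet.xsl
outdir=path/to/html_files

# Print a command instead of running it; remove "show" to execute.
show() { printf '%s\n' "$*"; }

# Convert a DocBook XML document to PDF.
show xmlto pdf "$doc"

# Convert to HTML and store the resulting files in a separate directory.
show xmlto -o "$outdir" html "$doc"

# Convert to a single HTML file.
show xmlto html-nochunks "$doc"

# Use a specific stylesheet for the conversion (HTML chosen as an example).
show xmlto -x "$xsl" html "$doc"
```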
SYNOPSIS
xmlto [OPTIONS] FORMAT XML_FILE
PARAMETERS
FORMAT
The target output format for the conversion. Examples include html, html-nochunks, man, pdf, fo, txt, dvi, ps, and epub. The formats available depend on the detected type of the input document and on the backend tools installed.
XML_FILE
The path to the input XML document to be converted.
-o directory
Place the output file(s) in the specified directory instead of the current working directory.
-x stylesheet
Use the given XSL stylesheet for the transformation instead of the one xmlto would have selected.
-m fragment
Use the given XSL fragment to modify the stylesheet that xmlto selects, without replacing it entirely.
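-s style, --style=style
Specify a style to apply (e.g., for DocBook stylesheets). The upstream xmlto manual page does not list this option; its documented ways to influence styling include -x and --stringparam paramname=paramvalue, so confirm with xmlto --help before relying on -s.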
-v
Enable verbose output, showing more detail about the conversion process and the commands executed. Repeat the flag (-vv) for even more detail.
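-q, --quiet
Suppress most output messages during conversion. The upstream xmlto manual page does not list this option, so confirm with xmlto --help before relying on it.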
--skip-validation
Skip DTD validation of the input XML document.
--with-fop
Explicitly tell xmlto to use Apache FOP for XSL-FO processing to PDF.
--with-dblatex
Explicitly tell xmlto to use dblatex for DocBook to PDF/PS conversions.
--help
Display a help message and exit.
--version
Display version information and exit.
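As the synopsis indicates, options are given before FORMAT and XML_FILE. The sketch below combines several of the options listed above in one invocation. The file and directory names are placeholders, and the command is printed rather than executed so the block runs without xmlto installed.

```shell
doc=document.xml   # placeholder input file
outdir=build/pdf   # placeholder output directory

# Print a command instead of running it; remove "show" to execute.
show() { printf '%s\n' "$*"; }

# Verbose PDF build through Apache FOP, skipping validation,
# with the result written to build/pdf.
show xmlto -v --skip-validation --with-fop -o "$outdir" pdf "$doc"
```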
DESCRIPTION
xmlto is a command-line utility designed to simplify the conversion of XML documents into various output formats, such as HTML, PDF, man pages, and plain text. It acts as a front-end to XSLT processors like xsltproc, automating the complex task of applying the correct XSL stylesheets for a given XML document type (e.g., DocBook) and desired output format.
xmlto determines the type of the input document from its root element and selects an XSL stylesheet that matches that type and the requested output format. It then invokes the XSLT processor and, where the format requires it, an external formatter (such as Apache FOP for producing PDF from XSL-FO, or dblatex for DocBook to PDF/PS). This makes xmlto a common component of documentation build pipelines, especially for projects that use DocBook XML.
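To make the wrapper's role concrete, the sketch below prints roughly what a manual single-file HTML conversion involves without xmlto: a validation pass followed by an XSLT pass. The stylesheet URI is the conventional DocBook XSL location, and the flags xmlto passes internally may differ; both are assumptions for illustration, as is the file name document.xml.

```shell
doc=document.xml   # placeholder input file

# Assumed stylesheet location: the conventional DocBook XSL URI, normally
# resolved to a local copy through the system XML catalog.
xsl=http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl

# Print a command instead of running it; remove "show" to execute.
show() { printf '%s\n' "$*"; }

# Step 1: validate the document against its DTD
# (the step that --skip-validation turns off).
show xmllint --noout --valid "$doc"

# Step 2: apply the stylesheet with the XSLT processor.
show xsltproc --nonet -o document.html "$xsl" "$doc"

# The comparable xmlto call, which picks the stylesheet on its own.
show xmlto html-nochunks "$doc"
```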
CAVEATS
xmlto requires an XSLT processor (commonly xsltproc) to be installed on the system. Some output formats need additional backend tools: PDF and PS output needs a formatter such as Apache FOP or dblatex, and txt output needs a text-mode browser such as w3m, lynx, or links. If a required tool is missing, xmlto cannot produce that format and typically exits with an error about the missing dependency. Each format has its own chain of tools, so check the dependencies for the specific conversion you need.
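A quick way to see which pieces are present is to probe PATH before starting a conversion. This is a minimal sketch; the tool list is an assumption based on the programs named in this page (plus xmllint, which is commonly used for validation), and the executable names may differ on your distribution.

```shell
# Report whether a program is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: found\n' "$1"
    else
        printf '%s: missing\n' "$1"
    fi
}

# Assumed tool names; extend or trim the list for your conversion chain.
for tool in xmlto xsltproc xmllint fop dblatex; do
    check_tool "$tool"
done
```

A "missing" line for fop or dblatex only matters if you plan to produce the formats that rely on that tool.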
DOCBOOK INTEGRATION
xmlto is widely used in the DocBook ecosystem. It converts DocBook XML source files into formats including chunked HTML, single-file HTML, PDF, EPUB, man pages, and plain text with a single command, which shortens the DocBook toolchain for technical writers and developers.
CUSTOMIZATION WITH XSL STYLESHEETS
xmlto selects a suitable stylesheet automatically, but the choice can be overridden. The -x option supplies a custom XSL stylesheet, allowing transformations tailored to specific formatting or content requirements.
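A common pattern for a custom stylesheet is a small customization layer that imports the stock DocBook stylesheet and overrides a few parameters. The sketch below writes such a file and prints the matching xmlto command. The file name custom.xsl, the input document.xml, and the choice of the section.autolabel parameter are illustrative assumptions; the import URI is the conventional DocBook XSL location and may need adjusting on your system.

```shell
# Write a minimal customization layer (hypothetical file name custom.xsl).
cat > custom.xsl <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Start from the stock DocBook HTML stylesheet. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>
  <!-- Override one parameter: number sections automatically. -->
  <xsl:param name="section.autolabel" select="1"/>
</xsl:stylesheet>
EOF

# Use it in place of the stylesheet xmlto would pick.
# Printed rather than run, so the sketch works without xmlto installed.
printf '%s\n' "xmlto -x custom.xsl html-nochunks document.xml"
```

Because the layer imports the full stock stylesheet, everything not overridden keeps its default behavior.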
SUPPORTED OUTPUT FORMATS
xmlto supports numerous output formats. Common ones include html (chunked HTML, split across multiple files), html-nochunks (a single HTML file), fo (XSL-FO intermediate format), pdf, man, txt, epub, dvi, and ps. The formats actually available depend on the XSLT stylesheets and rendering engines present on the system.
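Because the format is just a positional argument, building several outputs from one source is a simple loop. The sketch below prints one command per format, each with its own output directory. The build/ directory layout and document.xml are placeholders, and the selection of formats is only an example.

```shell
doc=document.xml   # placeholder input file

# Print one xmlto command per format, each writing to its own directory.
# Commands are printed rather than executed so the block runs without xmlto.
build_all() {
    for fmt in html html-nochunks pdf man; do
        printf 'xmlto -o build/%s %s %s\n' "$fmt" "$fmt" "$doc"
    done
}

build_all
```

Separate output directories keep the many files produced by chunked html from mixing with the single-file outputs.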
HISTORY
xmlto was developed as a convenient wrapper around XSLT processors and related tools, primarily to streamline the process of transforming DocBook XML documents into various publishing formats. Its aim was to abstract away the intricate command-line arguments and specific stylesheet paths often required for complex XSLT transformations, thereby making XML-based documentation pipelines more accessible and easier to manage.