LinuxCommandLibrary

xmlstarlet

Query, transform, and validate XML documents

TLDR

Format an XML document and print to stdout

$ xmlstarlet format [path/to/file.xml]

Format an XML document piped from stdin
$ [cat path/to/file.xml] | xmlstarlet format

Print all nodes that match a given XPath
$ xmlstarlet select --template --copy-of [xpath] [path/to/file.xml]

Insert an attribute into all matching nodes and print to stdout (source file is unchanged)
$ xmlstarlet edit --insert [xpath] --type attr --name [attribute_name] --value [attribute_value] [path/to/file.xml]

Update the value of all matching nodes in place (source file is changed)
$ xmlstarlet edit --inplace --update [xpath] --value [new_value] [file.xml]

Delete all matching nodes in place (source file is changed)
$ xmlstarlet edit --inplace --delete [xpath] [file.xml]

Escape or unescape special XML characters in a given string
$ xmlstarlet [un]escape [string]

List a given directory as XML (omit argument to list current directory)
$ xmlstarlet ls [path/to/directory]
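To make the commands above concrete, here is a minimal sketch run against a small, made-up catalog.xml. The file, its element names, and the shell variable names are assumptions for illustration only. It uses the short option forms (sel -t, -m, -v, -n and ed -u, -i, -t, -n, -v), which correspond to the long forms shown above.

```shell
# Create a throwaway sample document (hypothetical data).
workdir=$(mktemp -d)
cat > "$workdir/catalog.xml" <<'EOF'
<?xml version="1.0"?>
<catalog>
  <book id="b1"><title>Dune</title><price>9.99</price></book>
  <book id="b2"><title>Emma</title><price>4.50</price></book>
</catalog>
EOF

# select: for each matching book, print its title followed by a newline.
titles=$(xmlstarlet sel -t -m '/catalog/book' -v 'title' -n "$workdir/catalog.xml")
printf '%s\n' "$titles"

# edit --update: the edited document goes to stdout, so pipe it into
# select to read the new value back.
new_price=$(xmlstarlet ed -u '/catalog/book[@id="b2"]/price' -v '5.00' "$workdir/catalog.xml" |
  xmlstarlet sel -t -v '/catalog/book[@id="b2"]/price')

# Without --inplace the source file keeps its original value.
old_price=$(xmlstarlet sel -t -v '/catalog/book[@id="b2"]/price' "$workdir/catalog.xml")

# edit --insert: add a lang attribute to every book, then count the matches.
tagged=$(xmlstarlet ed -i '/catalog/book' -t attr -n lang -v en "$workdir/catalog.xml" |
  xmlstarlet sel -t -v 'count(/catalog/book[@lang="en"])')

printf 'new=%s old=%s tagged=%s\n' "$new_price" "$old_price" "$tagged"
```

Piping the output of ed into sel is a convenient way to check an edit before committing to --inplace.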

SYNOPSIS

xmlstarlet [global_options] <command> [command_options] [arguments]

Common command forms:
xmlstarlet sel [options] [file...]
xmlstarlet ed [options] [file...]
xmlstarlet tr [options] stylesheet [file...]
xmlstarlet val [options] [file...]
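
As an illustration of the tr and val forms, the following sketch applies a tiny XSLT 1.0 stylesheet and then checks well-formedness. The file names, stylesheet contents, and variable names are invented for this example.

```shell
# Hypothetical input document and stylesheet, created in a temp directory.
workdir=$(mktemp -d)
cat > "$workdir/catalog.xml" <<'EOF'
<catalog>
  <book><title>Dune</title></book>
  <book><title>Emma</title></book>
</catalog>
EOF
cat > "$workdir/titles.xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="catalog/book">
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

# tr form: the stylesheet comes first, then the input file(s).
listing=$(xmlstarlet tr "$workdir/titles.xsl" "$workdir/catalog.xml")
printf '%s\n' "$listing"

# val form: -w checks well-formedness only; the exit status carries the result.
if xmlstarlet val -w "$workdir/catalog.xml" >/dev/null; then
  wellformed=yes
else
  wellformed=no
fi
printf 'well-formed: %s\n' "$wellformed"
```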

PARAMETERS

-q, --quiet
    Suppresses error output.

--doc-namespace
    Extracts namespace bindings from the input document (default).

--no-doc-namespace
    Does not extract namespace bindings from the input document.

--version
    Prints the xmlstarlet version information.

--help
    Displays the main help message for xmlstarlet, including a list of subcommands.

Most parsing and output options belong to individual subcommands rather than to xmlstarlet itself. For example, --net (allow fetching DTDs or entities over the network) is accepted by several subcommands, including sel and tr; -R, --recover (try to recover from parse errors) by fo; and -B, --noblanks (drop ignorable whitespace) by sel. Run xmlstarlet <command> --help for the options of a given subcommand.
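
Option availability differs between subcommands and between versions, so the help text of the installed build is the safest reference. A minimal sketch for checking it from a script (the variable names are illustrative; output is merged with 2>&1 because the stream used for help text is not assumed here):

```shell
# Print the version of the installed toolkit.
version_text=$(xmlstarlet --version 2>&1)
printf '%s\n' "$version_text"

# Capture the help text of a single subcommand (here: sel).
sel_help=$(xmlstarlet sel --help 2>&1)

# Show only the help lines that mention a particular option, if any.
printf '%s\n' "$sel_help" | grep -- '--net'
```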

DESCRIPTION

xmlstarlet is an open-source command-line toolkit for processing XML documents. It covers validation (well-formedness, DTD, XML Schema, RelaxNG), XSLT transformations, XPath queries for selecting nodes, and editing operations such as inserting, deleting, updating, or renaming elements and attributes, optionally in place.

Each task is exposed as a subcommand, which makes xmlstarlet easy to script and well suited to automating XML manipulation in shell scripts. It serves as a lightweight alternative or complement to full XML processing libraries, giving quick command-line access to core XML functionality.
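
As a sketch of that kind of scripting, the loop below applies the same in-place update to several files and then reads the values back. The config files, their element names, and the variable names are hypothetical.

```shell
# Create three throwaway config files (hypothetical content).
workdir=$(mktemp -d)
for name in a b c; do
  printf '<config><version>1</version></config>\n' > "$workdir/$name.xml"
done

# Apply the same in-place update to every file and report any failure.
for file in "$workdir"/*.xml; do
  xmlstarlet ed -L -u '/config/version' -v '2' "$file" ||
    echo "failed: $file" >&2
done

# Read the updated values back, one line per file.
versions=$(for file in "$workdir"/*.xml; do
  xmlstarlet sel -t -v '/config/version' -n "$file"
done)
printf '%s\n' "$versions"
```

Because -L (--inplace) overwrites the source files, running the same edit without it first and inspecting stdout is a cheap safeguard.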

CAVEATS

xmlstarlet generally requires well-formed XML input; malformed documents cause parse errors. The fo subcommand accepts -R, --recover to salvage what it can from such input.
XPath support is limited to XPath 1.0 (and XSLT to 1.0), so functions introduced in XPath 2.0 and later, such as lower-case() or matches(), are unavailable.
Documents are generally parsed into memory as a whole, so very large XML files can be memory-intensive. Error messages for complex operations can be terse and may require careful debugging.
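
Two of these caveats can be handled in a script, as the following sketch suggests: checking well-formedness up front with val -w, and substituting the XPath 1.0 translate() function for the missing lower-case(). The sample files and variable names are made up for illustration.

```shell
# One well-formed and one deliberately broken document (hypothetical data).
workdir=$(mktemp -d)
printf '<names><name>ALICE</name></names>\n' > "$workdir/good.xml"
printf '<names><name>ALICE</names>\n' > "$workdir/bad.xml"

# val -w reports well-formedness through its exit status.
if xmlstarlet val -w "$workdir/bad.xml" >/dev/null 2>&1; then
  bad_status=ok
else
  bad_status=malformed
fi
printf 'bad.xml: %s\n' "$bad_status"

# XPath 1.0 has no lower-case(); translate() maps characters one by one.
upper='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
lower='abcdefghijklmnopqrstuvwxyz'
lowered=$(xmlstarlet sel -t -v "translate(/names/name, '$upper', '$lower')" "$workdir/good.xml")
printf '%s\n' "$lowered"
```

The translate() workaround only covers the characters listed in its arguments, so it is a partial substitute for ASCII text rather than a general case-folding function.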

KEY SUBCOMMANDS OVERVIEW

xmlstarlet operates through a series of dedicated subcommands, each designed for a specific category of XML operation:
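sel (or select): Selects or queries XML nodes using XPath expressions.
ed (or edit): Edits XML documents by inserting, deleting, updating, or renaming nodes and attributes.
tr (or transform): Performs XSLT transformations on XML documents using provided stylesheets.
val (or validate): Validates XML documents for well-formedness or against DTDs, XML Schemas, or RelaxNG schemas.
fo (or format): Formats XML documents for readability, adding indentation and line breaks.
el (or elements): Displays the element structure of an XML document.
c14n (or canonic): Canonicalizes XML documents according to the XML Canonicalization (C14N) specification.
ls (or list): Lists a directory as XML.
esc (or escape): Escapes special characters within text for use in XML.
unesc (or unescape): Unescapes XML entities back to their original characters.
pyx (or xmln): Converts XML into the line-oriented PYX format.
p2x (or depyx): Converts PYX back into XML.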
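
A short sketch of two of the smaller subcommands follows: el to inspect a document's structure and esc/unesc as a round trip. The sample document, the sample string, and the variable names are assumptions for this example.

```shell
# Hypothetical sample document.
workdir=$(mktemp -d)
cat > "$workdir/catalog.xml" <<'EOF'
<catalog>
  <book><title>Dune</title></book>
  <book><title>Emma</title></book>
</catalog>
EOF

# el -u prints each distinct element path once, which gives a quick
# overview of the structure before writing XPath expressions.
paths=$(xmlstarlet el -u "$workdir/catalog.xml")
printf '%s\n' "$paths"

# esc turns markup-significant characters into entities; unesc reverses it.
original='Tom & Jerry <3'
escaped=$(xmlstarlet esc "$original")
restored=$(xmlstarlet unesc "$escaped")
printf '%s\n%s\n' "$escaped" "$restored"
```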

HISTORY

xmlstarlet was originally written by Mikhail Grushinskiy, with initial releases in the early 2000s. It was designed as a lightweight, scriptable command-line interface to the libxml2 and libxslt libraries. It has since become a stable and widely used tool on Unix-like systems for XML manipulation and automation.

SEE ALSO

xmllint(1), xsltproc(1), grep(1), sed(1), awk(1), jq(1)
