nokogiri
Ruby HTML/XML parser CLI
TLDR
SYNOPSIS
nokogiri [options] [fileorurl]
DESCRIPTION
nokogiri is the command-line front-end for the Nokogiri Ruby gem, a fast HTML/XML parser backed by libxml2 and libxslt. The CLI parses a file, URL, or stdin into a Nokogiri::HTML::Document or Nokogiri::XML::Document (bound as doc). It then either drops you into an IRB session or runs the Ruby snippet supplied with -e, so you can query the document with CSS selectors (doc.css) or XPath (doc.xpath).
PARAMETERS
FILEORURL
HTML/XML file path or URL to parse. If absent, the document is read from stdin.
-e CODE
Execute Ruby CODE against the parsed document (which is bound to doc).
--type TYPE
Document type: xml or html. Defaults to autodetection by content type / extension.
-C FILE
Load a custom Ruby initialization file. Default: ~/.nokogirirc.
-E, --encoding ENCODING
Read input using the named character encoding (e.g. UTF-8, ISO-8859-1).
--rng URIORPATH
Validate the document against the given RelaxNG schema.
-v, --version
Show the Nokogiri version.
-?, --help
Display help.
CAVEATS
Requires Ruby and the nokogiri gem (`gem install nokogiri`). The modern CLI has no -i interactive flag: running nokogiri file on a TTY drops into IRB by default, and -e runs a snippet non-interactively. Fetching URLs uses open-uri, so HTTPS sites need OpenSSL support in the underlying Ruby build.
HISTORY
Nokogiri (Japanese for "saw") was created by Aaron Patterson and Mike Dalessio in 2008 as a faster, libxml2-backed alternative to Hpricot. It is one of the most-installed Ruby gems and ships a small CLI for ad-hoc parsing and validation.
