roff2x

Convert roff/man pages to other formats

SYNOPSIS

roff2x [OPTIONS] [FILE...]

PARAMETERS

-format <format>
    Specifies the output format. Common values include docbook, linuxdoc, and html. If omitted, generic XML is produced.

-output <file>
    Writes the converted output to the specified file instead of standard output (stdout).

-tag <tag>
    Defines the top-level tag name for generic XML/SGML output formats.

-noname
    Suppresses the man tag in the output.

-novalid
    Suppresses DOCTYPE and PUBLIC declarations. Useful when the output is not validated against a DTD, or when validation is handled elsewhere.

-nohead
    Omits the HEAD element from the generated output.

-nofoot
    Omits the FOOT element from the generated output.

-nofrag
    Suppresses fragment markers (often emitted as <br> tags) that split the output into smaller chunks.

-nohtml
    Suppresses HTML-specific tags in generic output, so that the result contains only XML/SGML markup.

-warn
    Prints warning messages during conversion, which helps locate problems in the input.

-debug
    Prints detailed information about each conversion step for troubleshooting.

FILE...
    One or more input groff source files to be converted. If no files are specified, roff2x reads from standard input (stdin).

DESCRIPTION

roff2x is a command-line utility that converts documents written for the groff (GNU roff) typesetting system into XML or SGML output formats. It works on groff source files such as man pages and other technical documentation.

Its main purpose is to migrate legacy roff documentation to structured markup such as DocBook XML or LinuxDoc SGML, so that it can be processed by other tools, integrated into larger documentation frameworks, or published on the web.

roff2x can also produce HTML, but groff -Thtml is often preferred for direct HTML generation. Input that relies on groff's preprocessors is handled as well; see INPUT PREPROCESSOR INTEGRATION.
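For the HTML case, groff can be run directly with its HTML output device. The following is a minimal sketch, assuming groff and its HTML device are installed; the file name hello.1 and the page contents are made up for illustration.

```shell
# Illustrative input: a minimal man page (name and text are made up).
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello.1" <<'EOF'
.TH HELLO 1
.SH NAME
hello \- print a greeting
.SH SYNOPSIS
.B hello
EOF

# -man loads the man macro package; -Thtml selects groff's HTML output device.
if command -v groff >/dev/null 2>&1; then
    groff -man -Thtml "$tmpdir/hello.1" > "$tmpdir/hello.html" \
        || echo "groff HTML device unavailable" >&2
else
    echo "groff not found; skipping" >&2
fi
```

Some distributions split the HTML device into a separate package from the base groff install, which is why the sketch reports a failure instead of assuming success.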

CAVEATS

roff2x is part of the groff package and may not be pre-installed on minimal Linux distributions. The quality and structure of the output depend heavily on which macros the input file uses and how consistently it uses them.

SUPPORTED OUTPUT FORMATS

Beyond generic XML/SGML, roff2x explicitly supports specific document type definitions (DTDs) like DocBook XML and LinuxDoc SGML. The choice of format significantly influences the structure and semantic tags in the generated output.

INPUT PREPROCESSOR INTEGRATION

roff2x accepts input that uses groff preprocessors: tables (tbl), mathematical equations (eqn), and pictures (pic). Some systems install these with a g prefix, as gtbl, geqn, and gpic. The preprocessed output is converted into the target XML/SGML structure.
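How roff2x invokes the preprocessors is not described here. As a sketch of the same preprocessing step, the example below uses groff itself, whose -t option runs tbl before formatting. It assumes groff is installed; the file name and table contents are made up for illustration.

```shell
# Illustrative input: a two-column table in tbl syntax (contents are made up).
workdir=$(mktemp -d)
cat > "$workdir/table.roff" <<'EOF'
.TS
tab(;);
l l.
Format;Kind
DocBook;XML
LinuxDoc;SGML
.TE
EOF

# -t runs tbl first; -e (eqn) and -p (pic) do the same for .EQ/.EN and .PS/.PE blocks.
if command -v groff >/dev/null 2>&1; then
    groff -t -Tascii "$workdir/table.roff" > "$workdir/table.txt" \
        || echo "groff could not format the table" >&2
else
    echo "groff not found; skipping" >&2
fi
```

Without -t, the .TS/.TE block reaches the formatter unprocessed and the table is not laid out, so any converter downstream sees raw tbl source instead of formatted rows.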

HISTORY

roff2x is part of the GNU groff project, a re-implementation of the classic troff typesetting system. It was developed to bridge traditional roff documentation and structured markup standards, reflecting the shift from print-oriented formatting to machine-readable document formats.

SEE ALSO

groff(1), troff(1), nroff(1), man(1), gtbl(1), geqn(1), gpic(1)
