LinuxCommandLibrary

pandoc

Convert documents from one format to another

TLDR

Convert a Markdown file to PDF using pdflatex (the formats are determined by file extensions)

$ pandoc [path/to/input.md] [[-o|--output]] [path/to/output.pdf]

Convert a Markdown file to PDF using the specified PDF engine
$ pandoc [path/to/input.md] --pdf-engine [tectonic|weasyprint|typst|...] [[-o|--output]] [path/to/output.pdf]

Convert to a standalone file with the appropriate headers/footers (for LaTeX, HTML, etc.)
$ pandoc [path/to/input.md] [[-s|--standalone]] [[-o|--output]] [path/to/output.html]

Manually specify formats (overriding automatic format detection using the filename extension, or when there is no extension)
$ pandoc [[-f|--from]] [docx|...] [path/to/input] [[-t|--to]] [pdf|...] [[-o|--output]] [path/to/output]

Transform a document using a Lua filter script
$ pandoc [path/to/input] [[-L|--lua-filter]] [path/to/filter.lua] [[-o|--output]] [path/to/output]

List all supported input formats
$ pandoc --list-input-formats

List all supported output formats
$ pandoc --list-output-formats

SYNOPSIS

pandoc [OPTIONS] [INPUTFILE...]

Reads from INPUTFILE(s) or standard input if no files are given, and writes to standard output or a specified output file.
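
As a rough illustration of the standard input/output behavior described above, the sketch below pipes Markdown into pandoc and captures what it writes to standard output. The sample text, the temporary directory, and the `fragment.html` file name are made up for the example.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# No INPUTFILE is given, so pandoc reads standard input.
# No -o is given, so the result goes to standard output (redirected to a file here).
printf '%s\n' '# Title' '' 'Some *emphasized* text.' |
  pandoc --from=markdown --to=html > "$workdir/fragment.html" &&
  cat "$workdir/fragment.html"
```

Without `--standalone`, the result is an HTML fragment rather than a complete document.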

PARAMETERS

-f FORMAT, --from=FORMAT
    Specify the input format of the document.

-t FORMAT, --to=FORMAT
    Specify the output format for conversion.

-o FILE, --output=FILE
    Write the converted output to the specified file.

-s, --standalone
    Produce a complete, standalone document with appropriate headers.

--template=FILE
    Use a custom template file for output generation.

-V KEY[=VAL], --variable=KEY[=VAL]
    Set a template variable (KEY) to an optional value (VAL).

-C, --citeproc
    Enable processing of citations and bibliographies.

--bibliography=FILE
    Use the specified file for bibliographic data.

--toc, --table-of-contents
    Include a table of contents in the output.

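-L FILE, --lua-filter=FILE
    Apply a Lua filter script to transform the document's AST.

-F PROGRAM, --filter=PROGRAM
    Run an external program as a filter; it receives the document's AST as JSON on standard input and writes the modified AST to standard output.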

-c URL, --css=URL
    Link to a CSS stylesheet (file path or URL) in HTML output.

--mathjax[=URL], --katex[=URL], --mathml
    Choose how math is rendered in HTML output: MathJax, KaTeX, or MathML.

-D FORMAT, --print-default-template=FORMAT
    Print the default template for a given output format.

--list-input-formats
    Display a list of all supported input formats.

--list-output-formats
    Display a list of all supported output formats.

--extract-media=DIR
    Extract embedded media files (images, videos) to a specified directory.

FORMAT+raw_html, FORMAT-raw_html
    Control raw HTML handling by enabling (+) or disabling (-) the raw_html extension on the format name given to -f or -t, for example -f markdown-raw_html.

--data-dir=DIR
    Specify the user data directory for templates, filters, etc.

-v, --version
    Show program version number and exit.

-h, --help
    Display a help message and exit.
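
Several of the options above are usually combined. The following is a minimal sketch of citation processing with `--citeproc` and `--bibliography`, assuming a pandoc release recent enough to include `--citeproc`. The file names, the citation key `knuth1984`, and the sample sentence are placeholders chosen for the example.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# Placeholder bibliography with a single BibTeX entry.
cat > "$workdir/refs.bib" <<'EOF'
@book{knuth1984,
  author    = {Knuth, Donald E.},
  title     = {The {TeX}book},
  publisher = {Addison-Wesley},
  year      = {1984}
}
EOF

# Placeholder source document that cites the entry with [@key] syntax.
cat > "$workdir/paper.md" <<'EOF'
---
title: Citation demo
---

TeX is described in detail elsewhere [@knuth1984].
EOF

# --citeproc resolves the citation and appends a reference list;
# --standalone wraps the result in a complete HTML document.
pandoc "$workdir/paper.md" --standalone --citeproc \
  --bibliography="$workdir/refs.bib" --output="$workdir/paper.html" &&
  echo "wrote $workdir/paper.html"
```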

DESCRIPTION

Pandoc is a command-line tool for converting documents between markup formats, including Markdown, reStructuredText, HTML, LaTeX, DocBook, EPUB, ODT, and DOCX. It parses the input into an intermediate representation (an abstract syntax tree) and then writes that tree in the target format, which lets it carry document structure, metadata, citations, footnotes, and mathematical equations across formats. Written in Haskell, it is commonly used to build static websites, generate presentations, produce e-books, and automate document-processing workflows.

CAVEATS

External Dependencies: PDF output requires an external PDF engine (pdflatex from a LaTeX distribution by default, or another engine selected with --pdf-engine).
Conversion Fidelity: Converting between very different formats can lose styling or features that the target format, or Pandoc's intermediate representation, does not support.
Complexity for Advanced Use: Achieving highly customized outputs often requires familiarity with Markdown syntax, templates, and filter scripting (Lua or external programs).

FILTERS

Filters are Pandoc's main extensibility mechanism. A filter transforms the document's intermediate representation (the abstract syntax tree, or AST) between parsing and writing. It can be a Lua script run by Pandoc's built-in interpreter (--lua-filter) or an external program in any language that reads the AST as JSON on standard input and writes the modified AST to standard output (--filter).
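
To make the idea concrete, here is a small sketch of a Lua filter applied from the shell. The filter file name `smallcaps.lua` and the sample sentence are made up for the example; the filter turns every strongly emphasized span into small caps.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# A Lua filter defines functions named after AST element types.
# This one is called for every Strong element and returns a SmallCaps
# element with the same content.
cat > "$workdir/smallcaps.lua" <<'EOF'
function Strong(elem)
  return pandoc.SmallCaps(elem.content)
end
EOF

# Convert a one-line Markdown document, applying the filter between
# parsing and writing.
printf '%s\n' 'Some **bold** text.' |
  pandoc --from=markdown --to=html --lua-filter="$workdir/smallcaps.lua" \
    > "$workdir/filtered.html" &&
  cat "$workdir/filtered.html"
```

To inspect the AST that a filter receives, the same input can be converted with `--to=json` or `--to=native`.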

TEMPLATES

For standalone output (such as HTML, LaTeX, or EPUB), Pandoc uses a template to control the overall structure and appearance of the document. The default template for a format can be printed with -D FORMAT, edited, and supplied again with --template=FILE to customize layout and styling beyond what the conversion alone provides.
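
The sketch below shows one way a custom template might be used. The template file `minimal.html`, the variable values, and the sample text are invented for the example; `$title$`, `$author$`, and `$body$` are placeholders that pandoc fills in when it renders the template.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# Minimal hand-written template. The quoted 'EOF' keeps the shell from
# expanding the $...$ placeholders, which are meant for pandoc.
cat > "$workdir/minimal.html" <<'EOF'
<!DOCTYPE html>
<html>
<head><title>$title$</title></head>
<body>
$if(author)$<p class="author">$author$</p>$endif$
$body$
</body>
</html>
EOF

# Render a one-line document through the template, setting the two
# template variables on the command line.
printf '%s\n' 'Hello from a *custom* template.' |
  pandoc --from=markdown --to=html \
    --template="$workdir/minimal.html" \
    -V title="Template demo" -V author="A. Writer" \
    --output="$workdir/page.html" &&
  cat "$workdir/page.html"
```

A common starting point is the default template itself, for example `pandoc -D html > default.html`, edited as needed.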

PANDOC MARKDOWN

While supporting many Markdown flavors, Pandoc has its own extended version, which includes features like definition lists, tables, footnotes, citations, and math, making it a rich and expressive format for writing academic and technical documents.
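
As an illustration of a few of these extensions, the sketch below writes a short sample document (its content and the `sample.md` file name are made up) containing a definition list, a pipe table, inline math, and a footnote, then converts it to HTML.

```shell
# Scratch directory so the example does not touch existing files.
workdir=$(mktemp -d)

# Sample document using Pandoc Markdown features that plain Markdown lacks.
cat > "$workdir/sample.md" <<'EOF'
Term
:   Definition of the term.[^note]

| Format | Extension |
|--------|-----------|
| HTML   | .html     |

Inline math: $e^{i\pi} + 1 = 0$.

[^note]: Footnote text.
EOF

# --mathml renders the math as MathML so no external script is needed.
pandoc "$workdir/sample.md" --from=markdown --to=html --mathml \
  --output="$workdir/sample.html" &&
  cat "$workdir/sample.html"
```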

HISTORY

Pandoc was created by John MacFarlane, with its initial public release in 2006. Developed entirely in the functional programming language Haskell, it has significantly evolved over the years, adding support for a vast and ever-growing number of input and output formats. Its robustness, extensibility via filters and templates, and comprehensive format support have made it a widely adopted standard for document conversion in academic, publishing, and open-source communities. It continues to be actively maintained and developed.

SEE ALSO

latex(1): Typesetting system often used by Pandoc for PDF generation.
markdown(7): The Markdown format; Pandoc implements several of its variants.
odt2txt(1): Converts OpenDocument Text to plain text.
html2text(1): Converts HTML to plain text.
