pandoc

Convert documents from one format to another

TLDR

Convert a Markdown file to PDF using pdflatex (the formats are determined by file extensions)

$ pandoc [path/to/input.md] [[-o|--output]] [path/to/output.pdf]

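Convert the output from another command to PDF, using a specific PDF engine

$ [command] | pandoc [[-f|--from]] [input_format] --pdf-engine [tectonic|weasyprint|typst|pdflatex] [[-o|--output]] [path/to/output.pdf]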

Convert to a standalone file with the appropriate headers/footers (for LaTeX, HTML, etc.)

$ pandoc [path/to/input.md] [[-s|--standalone]] [[-o|--output]] [path/to/output.html]

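Manually specify formats (overriding automatic format detection from the filename extension, or when there is no extension)

$ pandoc [[-f|--from]] [input_format] [path/to/input] [[-t|--to]] [output_format] [[-o|--output]] [path/to/output]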

Transform a document using a Lua script (see the pandoc Lua filters documentation for more information)

$ pandoc [path/to/input] [[-L|--lua-filter]] [path/to/filter.lua] [[-o|--output]] [path/to/output]

Convert a remote HTML file to markdown and print the result to stdout

$ pandoc [[-f|--from]] html [[-t|--to]] markdown [https://example.com]

List all supported input formats

$ pandoc --list-input-formats

List all supported output formats

$ pandoc --list-output-formats

SYNOPSIS

pandoc [OPTIONS] [INPUTFILE...]

Reads from INPUTFILE(s) or standard input if no files are given, and writes to standard output or a specified output file.
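
The input and output behavior described above can be sketched as follows. This is a minimal illustration that assumes pandoc is on the PATH; the file names part1.md, part2.md, and combined.html are hypothetical.

```shell
# No input file: pandoc reads Markdown from stdin and writes HTML to stdout.
printf '# Title\n\nSome text.\n' | pandoc --from=markdown --to=html

# Several input files are concatenated before parsing (hypothetical file names).
printf '# Part one\n' > part1.md
printf '# Part two\n' > part2.md
pandoc part1.md part2.md --output=combined.html
```

Without --standalone, both commands produce an HTML fragment rather than a complete page.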

-f FORMAT, --from=FORMAT
    Specify the input format of the document.

-t FORMAT, --to=FORMAT
    Specify the output format for conversion.

-o FILE, --output=FILE
    Write the converted output to the specified file.

-s, --standalone
    Produce a complete, standalone document with an appropriate header and footer, rather than a fragment.

--template=FILE
    Use a custom template file for output generation.

-V KEY[=VAL], --variable=KEY[=VAL]
    Set a template variable (KEY) to an optional value (VAL). If VAL is omitted, the variable is set to true.

-C, --citeproc
    Enable processing of citations and bibliographies.

--bibliography=FILE
    Use the specified file for bibliographic data.

--toc, --table-of-contents
    Include a table of contents in the output.

-L FILE, --lua-filter=FILE
    Apply a Lua filter script to transform the document's AST.

-F PROGRAM, --filter=PROGRAM
    Run an external program as a filter on the document's AST.

-c URL, --css=URL
    Link to a CSS stylesheet for HTML output.

--mathjax[=URL], --katex[=URL], --mathml
    Render mathematical equations in HTML output using the chosen method.

-D FORMAT, --print-default-template=FORMAT
    Print the default template for a given output format.

--list-input-formats
    Display a list of all supported input formats.

--list-output-formats
    Display a list of all supported output formats.

--extract-media=DIR
    Extract embedded media files (images, videos) to a specified directory.

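--from=FORMAT[+raw_html|-raw_html], --to=FORMAT[+raw_html|-raw_html]
    Control how raw HTML is handled by enabling (+) or disabling (-) the raw_html extension on the input or output format, for formats that support it.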

--data-dir=DIR
    Specify the user data directory for templates, filters, etc.

-v, --version
    Show program version number and exit.

-h, --help
    Display a help message and exit.

DESCRIPTION

Pandoc is a command-line tool for converting documents between markup formats. It is often described as a Swiss Army knife for markup: it reads and writes a wide range of formats, including Markdown, reStructuredText, HTML, LaTeX, DocBook, EPUB, ODT, and DOCX. This makes it useful to writers, academics, publishers, and developers whose workflows span several formats. Pandoc parses each input into an intermediate representation and renders the output from it, which lets it carry document structure, metadata, citations, footnotes, and mathematical equations across formats. It is written in Haskell and is commonly used to build static websites, generate presentations, produce e-books, and automate document processing.

CAVEATS

External Dependencies: Some output formats need external tools. PDF output, for example, requires a separate PDF engine such as a LaTeX distribution.
Conversion Fidelity: Pandoc's intermediate representation is less expressive than some of the formats it converts between, so a conversion can lose styling or features that the representation or the target format does not support.
Complexity for Advanced Use: Highly customized output often requires familiarity with Markdown syntax, templates, and filter scripting (Lua or external programs).

FILTERS

Filters are Pandoc's main extensibility mechanism. Users can write Lua scripts, or external programs in any language that read and write the JSON form of the document, to transform the intermediate representation (the Abstract Syntax Tree, or AST) between parsing and rendering. This allows transformations that the built-in options do not cover.
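
As a rough illustration of the filter mechanism, the sketch below writes a one-function Lua filter and applies it during a conversion. It assumes pandoc 2.0 or later (for Lua filter support) is on the PATH; the file name emph-to-strong.lua and the choice of transformation are hypothetical.

```shell
# Write a hypothetical one-function Lua filter to the current directory.
cat > emph-to-strong.lua <<'EOF'
-- Replace every Emph inline with a Strong inline holding the same content.
function Emph(el)
  return pandoc.Strong(el.content)
end
EOF

# Apply the filter while converting Markdown on stdin to an HTML fragment.
printf 'Some *emphasized* text.\n' |
  pandoc --from=markdown --to=html --lua-filter=emph-to-strong.lua
```

Pandoc calls a filter function once for each AST element whose type matches the function's name, so the emphasized span should come out as strong text in the HTML.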

TEMPLATES

For standalone document formats (like HTML, LaTeX, EPUB), Pandoc uses templates to control the overall structure and appearance of the output. Users can replace the default template with a customized one to control layout and styling beyond what format conversion alone provides.
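
One possible template workflow is sketched below: save the default template, edit it, then pass it back with --template. It assumes pandoc is on the PATH; the file names custom.html, input.md, and output.html and the title text are hypothetical.

```shell
# Save pandoc's built-in HTML template as a starting point (hypothetical file name).
pandoc --print-default-template=html > custom.html

# Create a small input document to render.
printf '# Hello\n\nBody text.\n' > input.md

# After editing custom.html, render a standalone page with it.
# --metadata sets the document title; --variable sets a template variable.
pandoc input.md --standalone --template=custom.html \
  --metadata title="Demo page" --variable lang=en \
  --output=output.html
```

Starting from the default template keeps every variable reference the writer expects, so an edited copy is less likely to drop content than a template written from scratch.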

PANDOC MARKDOWN

Pandoc supports several Markdown flavors and also defines its own extended dialect. Pandoc's Markdown adds definition lists, tables, footnotes, citations, and math, which makes it well suited to academic and technical writing.
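
A few of these extensions can be tried with the short sketch below, which writes a sample document and converts it to HTML. It assumes pandoc is on the PATH; the file name sample.md and its contents are hypothetical.

```shell
# Write a sample document that uses several Pandoc Markdown extensions.
cat > sample.md <<'EOF'
Term
:   A definition list item.

| Format | Extension |
|--------|-----------|
| HTML   | .html     |

Inline math $e^{i\pi} + 1 = 0$ and a footnote.[^1]

[^1]: The footnote text.
EOF

# Convert to an HTML fragment, rendering the math as MathML.
pandoc sample.md --from=markdown --to=html --mathml
```

The output should contain a definition list, a table, a MathML element, and a footnote section. Citations are omitted here because they also need --citeproc and a bibliography file.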

HISTORY

Pandoc was created by John MacFarlane and first publicly released in 2006. Written in Haskell, it has steadily added input and output formats over the years. Its extensibility through filters and templates and its broad format support have led to wide adoption for document conversion in academic, publishing, and open-source communities. It remains actively maintained and developed.