pdftocairo

Convert PDF documents to image and vector formats using the Cairo renderer

TLDR

Convert a PDF file to JPEG

$ pdftocairo [path/to/file.pdf] -jpeg

Convert to PDF expanding the output to fill the paper
$ pdftocairo [path/to/file.pdf] [output.pdf] -pdf -expand

Convert to SVG specifying the first/last page to convert
$ pdftocairo [path/to/file.pdf] [output.svg] -svg -f [first_page] -l [last_page]

Convert to PNG at 200 PPI (for image formats the output name is a root; the page number and .png are appended)
$ pdftocairo [path/to/file.pdf] [output_root] -png -r 200

Convert to grayscale TIFF setting paper size to A3
$ pdftocairo [path/to/file.pdf] -tiff -gray -paper A3

Convert to PNG with the crop area's top-left corner at the given x and y pixel coordinates
$ pdftocairo [path/to/file.pdf] -png -x [x_pixels] -y [y_pixels]

SYNOPSIS

pdftocairo [options] <PDF-file> [<output-file>]

PARAMETERS

-f <int>
    Specifies the first page to convert.

-l <int>
    Specifies the last page to convert.

-scale-to-x <int>
    Scales each page horizontally to fit in <int> pixels (image output only). To keep the aspect ratio, also pass -scale-to-y -1.

-scale-to-y <int>
    Scales each page vertically to fit in <int> pixels (image output only). To keep the aspect ratio, also pass -scale-to-x -1.

-scale-to <int>
    Scales the output image so that its largest dimension (width or height) is <int> pixels.

-r <fp>
    Specifies the X and Y resolution, in DPI. Default is 150 DPI for image output.

-rx <fp>
    Specifies the X resolution, in DPI.

-ry <fp>
    Specifies the Y resolution, in DPI.

-cropbox
    Use the PDF's crop box rather than the media box to determine the page size.

-mono
    Generate a monochrome (1-bit) image output (PNG, TIFF).

-gray
    Generate a grayscale image output (PNG, JPEG, TIFF).

-jpeg
    Generate JPEG image output.

-jpegopt <string>
    Set JPEG options as a comma-separated list of <opt>=<val> pairs, e.g., quality=80,progressive=y,optimize=y.

-png
    Generate PNG image output.

-tiff
    Generate TIFF image output.

-tiffcompression <string>
    Set TIFF compression to one of none, packbits, jpeg, lzw, or deflate.

-ps
    Generate PostScript output.

-eps
    Generate Encapsulated PostScript (EPS) output. An EPS file holds a single page, so use -f and -l to select one page of a multi-page PDF.

-svg
    Generate Scalable Vector Graphics (SVG) output.

-transp
    Use a transparent page color instead of white (PNG and TIFF only).

-q
    Don't print any messages or errors.

-v
    Print copyright and version information.
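
The resolution options map page dimensions, given in PDF points (1/72 inch), to pixels. The following is a rough sketch of that arithmetic, not pdftocairo's own code: page_pixels is a hypothetical helper, and pdftocairo's rounding may differ by a pixel.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdftocairo): estimate the pixel length
# of one page side as points * ppi / 72.
page_pixels() {
    points=$1   # page width or height in PDF points
    ppi=$2      # resolution as passed to -r, -rx, or -ry
    # awk does the floating-point math; adding 0.5 rounds to the nearest pixel.
    awk -v p="$points" -v r="$ppi" 'BEGIN { printf "%d\n", p * r / 72 + 0.5 }'
}

# An A4 page is roughly 595 x 842 points; estimate its size at the default 150 PPI.
page_pixels 595 150
page_pixels 842 150
```

Doubling the value passed to -r doubles each side and therefore quadruples the pixel count, which is worth keeping in mind for large page ranges.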

DESCRIPTION

pdftocairo is a utility that converts Portable Document Format (PDF) files into various high-quality image and vector formats, leveraging the Cairo graphics library for rendering. It is part of the poppler-utils package, which provides a suite of command-line tools for working with PDF documents.

pdftocairo renders through Cairo, which provides anti-aliased text and graphics. It supports PNG, JPEG, and TIFF for raster output, and PDF, PostScript (PS), Encapsulated PostScript (EPS), and Scalable Vector Graphics (SVG) for vector output. Exactly one output format option must be given.

Users can control resolution, page range, cropping, image compression, and output color mode (monochrome, grayscale, color). Typical uses include generating thumbnails, preparing pages for web display, and producing vector graphics from PDF sources.
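
As an illustration of combining these controls, the sketch below renders the first page of a PDF as a single PNG thumbnail. The make_thumbnail function and the PDFTOCAIRO override variable are hypothetical conveniences of this example, not pdftocairo features; the flags themselves (-png, -singlefile, -f, -l, -scale-to) are standard pdftocairo options.

```shell
#!/bin/sh
# Assumption of this sketch: the binary can be overridden through PDFTOCAIRO.
PDFTOCAIRO=${PDFTOCAIRO:-pdftocairo}

# Hypothetical helper: render page 1 of a PDF as a PNG thumbnail.
make_thumbnail() {
    input=$1        # path to the source PDF
    out_root=$2     # output name without extension; -singlefile yields "$out_root.png"
    size=${3:-256}  # longest side of the thumbnail, in pixels

    "$PDFTOCAIRO" -png -singlefile -f 1 -l 1 -scale-to "$size" "$input" "$out_root"
}

# Run only when an input file is given on the command line.
if [ "$#" -ge 1 ]; then
    make_thumbnail "$1" "${2:-thumbnail}" "${3:-256}"
fi
```

Saved as a script (the name thumb.sh is arbitrary), it could be called as: sh thumb.sh report.pdf cover 320, which would be expected to write cover.png.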

CAVEATS

While pdftocairo offers high-quality output, its rendering speed might be slower for very complex or large PDF documents compared to less feature-rich converters. For PostScript/EPS output, ensuring fonts are available on the system or properly embedded in the PDF is crucial for accurate rendering, as font substitution might occur otherwise.

Not all PDF features can be represented natively in every output format. In vector output, regions that cannot be expressed natively (for example, translucency in PostScript) are rasterized at the resolution set by the resolution options.

OUTPUT FILE NAMING

When the <output-file> argument is omitted, pdftocairo derives the output name from the input PDF file name. For image formats (PNG, JPEG, TIFF), <output-file> is treated as a root name: one file is written per page, with the page number and file type appended (e.g., document-1.png, document-2.png; page numbers are zero-padded in longer documents). With -singlefile, or for vector formats (PDF, PS, EPS, SVG), <output-file> is the full file name and a single file is written. If <output-file> is - (dash), output goes to standard output (stdout); for image formats this is only valid together with -singlefile.
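
The naming scheme for per-page image output can be sketched as follows. expected_name is a hypothetical helper, and the padding rule it encodes (zero-padding to the number of digits in the total page count) is an assumption that should be checked against the installed Poppler version.

```shell
#!/bin/sh
# Hypothetical helper: predict the name pdftocairo gives one page of image output.
# Assumption: page numbers are zero-padded to the digit count of the total page count.
expected_name() {
    root=$1    # output root (or the PDF name without ".pdf")
    page=$2    # page number, starting at 1
    total=$3   # total number of pages in the PDF
    ext=$4     # file type suffix, e.g. png
    width=${#total}   # number of digits in the page count
    printf "%s-%0${width}d.%s\n" "$root" "$page" "$ext"
}

expected_name document 3 12 png   # page 3 of a 12-page PDF
expected_name document 3 5 png    # page 3 of a 5-page PDF
```

A script that post-processes the generated images can use such a helper to locate a given page instead of globbing, provided the assumption holds for the Poppler build in use.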

CAIRO LIBRARY INTEGRATION

The name pdftocairo refers to the Cairo graphics library, a 2D graphics library with multiple output backends (image, PDF, PostScript, SVG). pdftocairo uses Poppler's Cairo output device, which is how it produces both anti-aliased raster images and vector output from the same renderer.

HISTORY

pdftocairo is part of Poppler, a PDF rendering library that began as a fork of Xpdf. It was added after Poppler gained a Cairo-based rendering backend.

Before pdftocairo, pdftoppm (which uses Poppler's Splash renderer) was the usual tool for raster conversion. pdftocairo added Cairo-based rendering and vector output (PDF, PS, EPS, SVG) in a single tool, and it is now shipped with Poppler's utilities on most Linux distributions.

SEE ALSO

pdftoppm(1), pdftops(1), pdftotext(1), pdfimages(1), pdfinfo(1)
