LinuxCommandLibrary

pdfseparate

Extract individual pages from PDF documents

TLDR

Extract every page of a PDF file into its own single-page PDF file

$ pdfseparate [path/to/source_filename.pdf] [path/to/destination_filename-%d.pdf]

Specify the first page to extract
$ pdfseparate -f [3] [path/to/source_filename.pdf] [path/to/destination_filename-%d.pdf]

Specify the last page to extract
$ pdfseparate -l [10] [path/to/source_filename.pdf] [path/to/destination_filename-%d.pdf]

SYNOPSIS

pdfseparate [-f <int>] [-l <int>] [-v] [-h] input.pdf output-pattern.pdf

PARAMETERS

input.pdf
    The path to the source PDF file from which pages will be extracted.

output-pattern.pdf
    The naming pattern for the output PDF files. The pattern should contain a %d placeholder (or a printf-style variant such as %03d), which is replaced by the number of each extracted page. For example, chapter_%d.pdf produces chapter_1.pdf, chapter_2.pdf, and so on.
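
To see which names a pattern yields before running an extraction, the substitution can be previewed with the shell's own printf. This is a minimal sketch: preview_names is a hypothetical helper, not part of Poppler, and it only formats names without calling pdfseparate.

```shell
# preview_names PATTERN FIRST LAST
# Print the filename that PATTERN yields for each page from FIRST to LAST.
# Hypothetical helper: it only formats names and never calls pdfseparate.
preview_names() {
  pattern=$1
  page=$2
  last=$3
  while [ "$page" -le "$last" ]; do
    # The shell's printf expands %d and %03d the same way C printf does.
    printf "$pattern\n" "$page"
    page=$((page + 1))
  done
}

preview_names 'chapter_%d.pdf' 9 10
preview_names 'chapter_%03d.pdf' 9 10
```

The first call prints chapter_9.pdf and chapter_10.pdf; the second prints chapter_009.pdf and chapter_010.pdf. The zero-padded form keeps the files in page order when they are sorted by name.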

-f <int>
    Specifies the first page to extract. Pages are numbered starting from 1. If omitted, extraction starts from the first page.

-l <int>
    Specifies the last page to extract. Pages are numbered starting from 1. If omitted, extraction continues to the last page of the document.

-v
    Prints copyright and version information about the pdfseparate utility.

-h
    Displays a brief usage message and exits.

DESCRIPTION

pdfseparate is a command-line utility from the Poppler PDF rendering library that extracts individual pages, or a contiguous range of pages, from a Portable Document Format (PDF) file. It takes an input PDF and an output filename pattern, and writes each extracted page to its own single-page PDF. The pattern determines the output names: pdfseparate replaces the %d placeholder with the page's number in the source document, so the pattern chapter_%d.pdf yields chapter_1.pdf, chapter_2.pdf, and so on. Typical uses are splitting a large document into per-page files, pulling the pages of a chapter out of a book, or isolating specific forms from a compilation for further processing or distribution.

CAVEATS

The output filename pattern should include the %d placeholder (or a printf-style variant such as %03d). Without it, every page would map to the same filename, and pdfseparate reports an error when more than one page is being extracted.
This command extracts one contiguous page range per invocation. To extract non-contiguous pages, run pdfseparate several times with different -f and -l values, or use a tool with richer page selection such as qpdf or pdftk.
Make sure the output directory for the separated PDF files exists and that you have write permission to it before running the command.
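
The last two caveats can be handled together in a small wrapper. The following is a sketch under stated assumptions: extract_pages and the page_%d.pdf naming are hypothetical choices for illustration. It creates the output directory, then calls pdfseparate once per requested page with -f and -l set to the same number.

```shell
# extract_pages SRC OUTDIR PAGE...
# Extract an arbitrary, possibly non-contiguous, list of pages from SRC.
# Hypothetical helper: the name and the page_%d.pdf pattern are assumptions.
extract_pages() {
  src=$1
  outdir=$2
  shift 2
  # Create the output directory first so the writes do not fail.
  mkdir -p "$outdir" || return 1
  for page in "$@"; do
    # -f and -l set to the same number select exactly one page;
    # %d is replaced by that page's number in the source document.
    pdfseparate -f "$page" -l "$page" "$src" "$outdir/page_%d.pdf" || return 1
  done
}

# Usage: extract_pages report.pdf out 2 5 9
```

With the usage line shown in the comment, pages 2, 5, and 9 of report.pdf would be written to out/page_2.pdf, out/page_5.pdf, and out/page_9.pdf. The function stops at the first failing invocation.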

USAGE EXAMPLES

1. Extract all pages from a PDF:
pdfseparate input.pdf output_page_%d.pdf
This will create files named output_page_1.pdf, output_page_2.pdf, and so on, for every page in input.pdf.

2. Extract pages from 5 to 10:
pdfseparate -f 5 -l 10 document.pdf doc_page_%d.pdf
This will extract pages 5 through 10 from document.pdf, naming them doc_page_5.pdf to doc_page_10.pdf.
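
After an extraction like example 2, it can be useful to confirm that every expected file was written. A minimal sketch follows; check_outputs is a hypothetical helper that rebuilds each expected name from the pattern and tests whether the file exists.

```shell
# check_outputs PATTERN FIRST LAST
# Report any expected output file that is missing after an extraction.
# Hypothetical helper: returns non-zero if at least one file is absent.
check_outputs() {
  pattern=$1
  page=$2
  last=$3
  missing=0
  while [ "$page" -le "$last" ]; do
    # Rebuild the name pdfseparate would have used for this page.
    file=$(printf "$pattern" "$page")
    if [ ! -f "$file" ]; then
      echo "missing: $file" >&2
      missing=1
    fi
    page=$((page + 1))
  done
  return "$missing"
}

# Usage after example 2: check_outputs 'doc_page_%d.pdf' 5 10
```

The function prints one line per missing file on standard error, so it can be chained after pdfseparate with && in a script.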

HISTORY

pdfseparate is part of the Poppler utilities, the command-line tools distributed with Poppler, a free software PDF rendering library. Poppler began as a fork of the Xpdf code base and is widely packaged on Linux and other Unix-like operating systems, where its utilities provide command-line access to tasks such as page extraction.

SEE ALSO

pdfunite(1), pdftk(1), qpdf(1)
