LinuxCommandLibrary

pdftk

Manipulate PDF documents

TLDR

Extract pages 1-3, 5 and 6-10 from a PDF file and save them as another one

$ pdftk [input.pdf] cat [1-3 5 6-10] output [output.pdf]

Merge (concatenate) a list of PDF files and save the result as another one
$ pdftk [file1.pdf file2.pdf ...] cat output [output.pdf]

Split each page of a PDF file into a separate file, with a given filename output pattern
$ pdftk [input.pdf] burst output [out_%d.pdf]

Rotate all pages by 180 degrees clockwise
$ pdftk [input.pdf] cat [1-endsouth] output [output.pdf]

Rotate third page by 90 degrees clockwise and leave others unchanged
$ pdftk [input.pdf] cat [1-2 3east 4-end] output [output.pdf]

SYNOPSIS

pdftk input_pdf_files operation [output output_file] [options]

Examples:
Merge: pdftk doc1.pdf doc2.pdf cat output combined.pdf
Split: pdftk input.pdf burst
Rotate: pdftk in.pdf cat 1-endsouth output out.pdf
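
Input files can also be given uppercase handles (A=, B=, ...) that prefix the page ranges passed to cat, which is how pages from several inputs are combined in a chosen order. A minimal sketch follows; the function name merge_selected, its argument order, and the page ranges are illustrative only.

```shell
# Hypothetical helper: take pages 1-3 of the first file and everything
# from page 2 onward of the second, then write them to one output file.
merge_selected() {
    first=$1
    second=$2
    out=$3
    # A= and B= are input handles; A1-3 and B2-end are page ranges on them.
    pdftk A="$first" B="$second" cat A1-3 B2-end output "$out"
}

# Usage (file names are placeholders):
#   merge_selected report.pdf appendix.pdf combined.pdf
```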

PARAMETERS

cat page_ranges
    Concatenates (merges) multiple PDF documents or selected pages/page ranges into a single output file.

burst
    Splits a single input PDF into individual pages, creating separate PDF files for each page.

fill_form data_file
    Fills interactive PDF forms using data from an FDF or XFDF file.

dump_data_fields
    Lists all form fields found in a PDF document, including their names and types, to standard output.

background bg_pdf
    Adds a PDF file as a background layer to the input PDF pages.

stamp stamp_pdf
    Adds a PDF file as a stamp (overlay) layer on top of the input PDF pages.

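encrypt_40bit | encrypt_128bit
    Output option that sets the encryption strength of the output PDF. It is combined with owner_pw password and/or user_pw password and, optionally, allow permissions (e.g., Printing, CopyContents). 128-bit is the default strength when a password is given.

input_pw password
    Supplies the password needed to read an encrypted input PDF. pdftk has no separate decrypt operation: writing the output without any encryption options produces a decrypted copy.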

update_info data_file
    Updates the metadata (e.g., Author, Title, Subject) and bookmarks of a PDF from a data file that uses the same syntax as the output of dump_data.

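rotate page_ranges
    Rotates only the specified pages of a single input PDF and leaves all other pages unchanged. A rotation keyword is appended to each page range (e.g., 1-endeast). north (0), east (90), south (180), and west (270) set an absolute rotation in degrees; left (-90), right (+90), and down (+180) rotate relative to the page's current orientation.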

output filename
    Specifies the name of the output PDF file. This is mandatory for most operations.
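
The form and encryption entries can be strung together in a script. The sketch below wraps four invocations in shell functions; the function names and argument order are illustrative, not part of pdftk. It assumes the pdftk convention that encryption is requested through output options (encrypt_128bit, owner_pw, user_pw, allow) and that an encrypted input is opened with input_pw.

```shell
# List the form fields of a PDF on standard output.
list_fields() {
    pdftk "$1" dump_data_fields
}

# Fill a form from an FDF or XFDF data file and write the result.
# Arguments: form PDF, data file, output PDF.
fill_form_from() {
    form=$1; data=$2; out=$3
    pdftk "$form" fill_form "$data" output "$out"
}

# Encrypt with 128-bit strength and allow printing only.
# Arguments: input PDF, output PDF, owner password, user password.
protect() {
    pdftk "$1" output "$2" encrypt_128bit owner_pw "$3" user_pw "$4" allow printing
}

# Write an unencrypted copy by supplying the password for the input.
# Arguments: input PDF, output PDF, owner password.
unprotect() {
    pdftk "$1" input_pw "$3" output "$2"
}
```

Passwords given as arguments are visible in the process list; pdftk also accepts the keyword PROMPT in place of a password to ask for it interactively.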

DESCRIPTION

pdftk (PDF Toolkit) is a command-line utility for manipulating PDF documents. It can merge multiple PDF files into one, split a PDF into individual pages or page ranges, rotate pages, apply a background (watermark) or stamp, encrypt and decrypt PDFs, fill PDF forms with FDF/XFDF data, and report form field data.

Designed for scripting and automation, pdftk operates directly on PDF data structures, making it a fast and efficient tool for batch processing PDF files without requiring a graphical interface. While highly functional, its development has largely stalled, and newer alternatives like qpdf or pypdf might be preferred on modern systems. Nevertheless, pdftk remains a go-to tool for many complex PDF manipulation tasks via the command line.

CAVEATS

The original C++ pdftk has not been actively maintained upstream since around 2012-2013.
It was built with GCJ, which was later removed from GCC, so installing it on modern Linux distributions can require older dependencies or manual compilation; many distributions package the pdftk-java port instead.
Error messages can be cryptic, making debugging difficult.
It does not render PDFs; it only manipulates their internal structure.
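
Because error output can be hard to interpret, scripts may want to capture it next to the arguments that produced it. The wrapper below is a minimal sketch, assuming pdftk signals failure through a non-zero exit status; run_pdftk is a hypothetical name and mktemp is assumed to be available.

```shell
# Hypothetical wrapper: run pdftk, keep its stderr, and on failure print
# the exit status, the arguments, and the captured messages.
run_pdftk() {
    errfile=$(mktemp) || return 1
    if pdftk "$@" 2>"$errfile"; then
        rm -f "$errfile"
        return 0
    else
        status=$?
        echo "pdftk failed (exit $status) for arguments: $*" >&2
        cat "$errfile" >&2
        rm -f "$errfile"
        return "$status"
    fi
}

# Usage (file names are placeholders):
#   run_pdftk input.pdf cat 1-3 output first_pages.pdf || exit 1
```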

DEPENDENCY

The functionality of pdftk is largely based on the iText Java library for its core PDF manipulation capabilities. The pdftk-java variant explicitly relies on a Java Runtime Environment (JRE).
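
A script can check for the needed commands before calling pdftk. The sketch below is generic; check_deps is a hypothetical helper, and looking for a java command on PATH is an assumption that may not match every pdftk-java packaging.

```shell
# Hypothetical helper: report which of the named commands are on PATH.
# Returns 0 if all were found, 1 if any is missing.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_deps pdftk java || echo "install the missing commands first" >&2
```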

COMMAND-LINE INTERFACE (CLI)

pdftk's primary strength lies in its CLI nature, making it exceptionally well-suited for scripting, batch processing, and integration into automated workflows without the need for a graphical user interface.
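
As an illustration of batch use, the sketch below applies the 1-endsouth rotation from the TLDR section to every PDF in a directory. The function name rotate_all and the directory layout are assumptions made for the example.

```shell
# Hypothetical batch helper: apply the south (180 degree) rotation to every
# PDF in a source directory, writing results to a separate output directory.
rotate_all() {
    src_dir=$1
    dst_dir=$2
    mkdir -p "$dst_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$f" ] || continue
        pdftk "$f" cat 1-endsouth output "$dst_dir/$(basename "$f")" || return 1
    done
}

# Usage (directory names are placeholders):
#   rotate_all ./scans ./rotated
```

Writing to a separate directory avoids using the same file as both input and output.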

HISTORY

pdftk was originally developed by Sid Steward in C++ and released around 2002, quickly gaining popularity for its powerful command-line capabilities. Later, a Java port, pdftk-java, was created, providing similar functionality. The original C++ project ceased active development around 2012-2013, leading to its unmaintained status in many distribution repositories. Despite this, it remains a widely used tool for complex PDF manipulation tasks, though more modern and actively maintained alternatives like qpdf are increasingly recommended.

SEE ALSO

qpdf(1), mutool(1), pdfinfo(1), pdftotext(1), gs(1)
