pdf-parser
analyzes PDF file structure
TLDR
Parse PDF structure
SYNOPSIS
pdf-parser [-s search] [-o id] [-t type] [-f] [options] file
DESCRIPTION
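pdf-parser parses a PDF file into its fundamental elements (objects, streams, cross-reference tables, trailers) without rendering the document. It is used for malware analysis and forensics.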
Object enumeration shows all PDF objects. Each object's type and contents are displayed.
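As a rough illustration of what object enumeration involves, the following Python sketch scans raw bytes for "N G obj ... endobj" spans and reports each object's /Type, if any. It is not pdf-parser's own code: the enumerate_objects helper and the SAMPLE bytes are hypothetical, and a regex scan like this ignores object streams and malformed files.

```python
import re

# Matches "<id> <generation> obj ... endobj" in raw PDF bytes.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# Picks the name that follows /Type inside an object body.
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")


def enumerate_objects(data: bytes):
    """Yield (object_id, generation, type_name or None, body) tuples."""
    for m in OBJ_RE.finditer(data):
        body = m.group(3).strip()
        t = TYPE_RE.search(body)
        type_name = t.group(1).decode() if t else None
        yield int(m.group(1)), int(m.group(2)), type_name, body


# Hypothetical, hand-written fragment; not a complete, valid PDF.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
    b"3 0 obj\n(plain string object)\nendobj\n"
)

if __name__ == "__main__":
    for obj_id, gen, type_name, _body in enumerate_objects(SAMPLE):
        print(obj_id, gen, type_name)
```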
Searching finds embedded scripts, URLs, or suspicious content. The -s option searches indirect objects but not the contents of streams. JavaScript and launch actions are common malware vectors.
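The kind of keyword triage described above can be sketched in Python as follows. This is an assumption-laden illustration, not how pdf-parser implements -s: the SUSPICIOUS list, the find_suspicious helper, and the SAMPLE bytes are all hypothetical, and the scan does not decode names obfuscated with #xx hex escapes.

```python
import re

# Names often checked during triage; this list is an assumption, not pdf-parser's.
SUSPICIOUS = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/URI")

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def find_suspicious(data: bytes):
    """Map object id -> sorted list of suspicious names found in that object."""
    hits = {}
    for m in OBJ_RE.finditer(data):
        body = m.group(3)
        found = sorted(
            name.decode()
            for name in SUSPICIOUS
            # Reject a trailing alphanumeric so /JS does not match /JSON.
            if re.search(re.escape(name) + rb"(?![A-Za-z0-9])", body)
        )
        if found:
            hits[int(m.group(1))] = found
    return hits


# Hypothetical fragment with an auto-run JavaScript action.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page >>\nendobj\n"
)

if __name__ == "__main__":
    for obj_id, names in find_suspicious(SAMPLE).items():
        print(obj_id, names)
```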
Stream extraction dumps compressed or encoded data. With -f, streams are passed through their decoding filters: FlateDecode, ASCIIHexDecode, ASCII85Decode, LZWDecode, and RunLengthDecode.
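To show what applying a FlateDecode filter amounts to, here is a minimal Python sketch that locates a stream by its /Length entry and inflates it with zlib. The extract_flate_streams helper and the SAMPLE object are hypothetical; the sketch assumes /Length is a direct integer and the dictionary has no nested dictionaries, which real files do not guarantee.

```python
import re
import zlib

# Finds "/Length <n> ... >> stream<EOL>" and leaves the match end at the first data byte.
# Assumes /Length is a direct integer (not an indirect reference like "5 0 R").
STREAM_RE = re.compile(rb"/Length\s+(\d+)[^>]*>>\s*stream\r?\n")


def extract_flate_streams(data: bytes):
    """Return the decompressed payload of every stream that zlib can inflate."""
    out = []
    for m in STREAM_RE.finditer(data):
        start = m.end()
        raw = data[start:start + int(m.group(1))]
        try:
            out.append(zlib.decompress(raw))
        except zlib.error:
            # Not FlateDecode, or corrupted; a fuller tool would inspect /Filter.
            continue
    return out


# Build a hypothetical stream object around a compressed page-content snippet.
PAYLOAD = b"BT /F1 12 Tf (Hello) Tj ET"
COMPRESSED = zlib.compress(PAYLOAD)
SAMPLE = (
    b"4 0 obj\n<< /Length " + str(len(COMPRESSED)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + COMPRESSED
    + b"\nendstream\nendobj\n"
)

if __name__ == "__main__":
    for decoded in extract_flate_streams(SAMPLE):
        print(decoded)
```

Reading exactly /Length bytes, rather than searching for the endstream keyword, avoids cutting the data short when the compressed bytes happen to contain line-ending characters.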
Statistics (-a) summarize the object types present in the document, which helps spot files with unusual structures quickly.
Reference following traces object relationships. The -r option selects the objects that refer to a given object ID, which helps reconstruct how the document's objects are connected.
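The idea behind reference following can be sketched in Python by collecting the "N G R" tokens inside each object. This is only an illustration in the spirit of -r, not pdf-parser's implementation; build_reference_map, referenced_by, and the SAMPLE bytes are hypothetical names and data, and generation numbers are ignored.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# An indirect reference looks like "<id> <generation> R".
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def build_reference_map(data: bytes):
    """Map each object id to the sorted list of object ids it references."""
    return {
        int(m.group(1)): sorted({int(r.group(1)) for r in REF_RE.finditer(m.group(3))})
        for m in OBJ_RE.finditer(data)
    }


def referenced_by(data: bytes, target: int):
    """Return the ids of objects that contain a reference to `target`."""
    ref_map = build_reference_map(data)
    return sorted(src for src, refs in ref_map.items() if target in refs)


# Hypothetical fragment: catalog -> pages -> page, plus an action object.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 4 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
    b"4 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
)

if __name__ == "__main__":
    print(build_reference_map(SAMPLE))
    print(referenced_by(SAMPLE, 4))
```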
PARAMETERS
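-s STRING
Search for a string in indirect objects (stream contents are not searched).
-o ID
Select object by ID.
-t TYPE
Select objects of the given type.
-f
Pass stream objects through their filters.
-d FILE
Dump stream content to FILE.
-a
Display statistics for the PDF document.
-w
Raw output for data and filters.
-r N
Select objects that reference object N.
-c
Display the content of objects without streams, or with streams that have no filters.
-v
Verbose output; displays malformed PDF elements.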
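When pdf-parser is driven from a script, the flags above can be assembled programmatically. The following Python sketch builds an argument list using only options listed in this section. The build_command and run helpers are hypothetical, and the executable name "pdf-parser" is an assumption (some installations may name the script pdf-parser.py).

```python
import subprocess


def build_command(pdf_path, search=None, object_id=None, obj_type=None,
                  apply_filters=False, raw=False, dump_to=None):
    """Build an argv list for pdf-parser from the flags listed above."""
    cmd = ["pdf-parser"]  # assumed executable name
    if search is not None:
        cmd += ["-s", search]
    if object_id is not None:
        cmd += ["-o", str(object_id)]
    if obj_type is not None:
        cmd += ["-t", obj_type]
    if apply_filters:
        cmd.append("-f")
    if raw:
        cmd.append("-w")
    if dump_to is not None:
        cmd += ["-d", dump_to]
    cmd.append(pdf_path)
    return cmd


def run(cmd):
    """Run the command and return its stdout as text (needs pdf-parser installed)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


if __name__ == "__main__":
    # Decode object 4's stream and write the raw result to a file.
    print(" ".join(build_command("sample.pdf", object_id=4, apply_filters=True,
                                 raw=True, dump_to="obj4.bin")))
```

Passing the arguments as a list, rather than a shell string, keeps untrusted file names from being interpreted by the shell.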
CAVEATS
Malicious PDFs may crash the parser. Output can be very large. Not all PDF features are supported.
HISTORY
pdf-parser was created by Didier Stevens for PDF malware analysis. It's part of his toolkit for analyzing suspicious documents and is widely used in incident response.
