unoconv
Convert documents between office formats
SYNOPSIS
unoconv [OPTIONS] FILE...
PARAMETERS
-f, --format <output_format>
Specifies the desired output format (e.g., pdf, doc, odt, html).
Use unoconv --show to see available formats.
-o, --output <output_file_or_directory>
Defines the output file path or a directory for multiple conversions. By default, output is in the same directory as input.
-e, --export <option=value>
Passes export-specific options to LibreOffice (e.g., 'PageRange=1-2' for PDF export, 'FilterOptions=9,99,76,1,0' for CSV). Can be used multiple times.
-i, --import <option=value>
Passes import-specific options to LibreOffice (e.g., 'FilterName=HTML (StarWriter)' for importing HTML). Can be used multiple times.
-p, --port <port_number>
Specifies the port for the LibreOffice listener (default is 2002).
-l, --listener
Starts an unoconv listener service that processes requests from other unoconv instances. Useful for a dedicated conversion server.
-s, --server <host>
Specifies the server address (default is 127.0.0.1) that a listener binds to or that a client connects to, rather than a local LibreOffice process. Set the port separately with -p.
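-n, --no-launch
Fails if no LibreOffice listener is found instead of launching one (by default, unoconv starts its own instance when no listener is available). A listener started separately stays alive between conversions, which improves performance for multiple tasks.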
-c, --connection <connection_string>
Directly specifies the LibreOffice connection string (e.g., 'socket,host=localhost,port=2002;urp;StarOffice.ComponentContext').
-v, --verbose
Enables verbose output, showing more details about the conversion process and debugging information.
-h, --help
Displays a help message with command options and exits.
--stdout
Writes the converted output directly to standard output instead of a file. Useful for piping output.
--base-directory <path>
Specifies the base directory for output files when converting multiple documents, preserving input directory structure.
-t, --template <template_file>
Imports styles from the given LibreOffice template (e.g., an .ott file) and applies them to the document before conversion.
--show
Lists the output formats that can be used with -f.
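The connection string accepted by -c follows the pattern shown in the example above. A minimal sketch of a helper that assembles it from a host and port; the function name uno_connection_string is hypothetical, and the defaults simply mirror the default address and port used elsewhere on this page:

```shell
#!/bin/sh
# Build a UNO connection string for use with `unoconv -c`.
# uno_connection_string is an illustrative helper, not part of unoconv.
uno_connection_string() {
    host=${1:-127.0.0.1}   # assumed default: local listener
    port=${2:-2002}        # default listener port
    printf 'socket,host=%s,port=%s;urp;StarOffice.ComponentContext\n' "$host" "$port"
}

# Possible usage (requires unoconv and a reachable listener):
#   unoconv -c "$(uno_connection_string localhost 2002)" -f pdf document.docx
```

Keeping the string in one place avoids quoting mistakes, since the semicolons must be protected from the shell.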
DESCRIPTION
unoconv is a command-line utility that converts between any document formats LibreOffice or OpenOffice can read and write. Written in Python, it is a wrapper that talks to a LibreOffice or OpenOffice instance (typically in headless mode) and lets that instance perform the conversion. This allows conversions such as DOCX to PDF, ODT to HTML, or XLS to CSV to be automated without a graphical user interface, which suits server environments, scripts, and batch jobs. Supported formats cover word processing documents, spreadsheets, presentations, and drawings.
CAVEATS
unoconv relies heavily on a functioning LibreOffice or OpenOffice installation. For server-side usage, it's crucial to run LibreOffice in headless mode (e.g., libreoffice --headless --accept='socket,host=127.0.0.1,port=2002;urp;' --nodefault --nofirststartwizard --nolockcheck --nologo --norestore).
Without a persistent listener, each unoconv invocation launches its own LibreOffice instance, which is slow and resource-intensive. For repeated conversions, start a listener once (unoconv --listener, or a headless LibreOffice process) and let later invocations connect to it.
Export filters and their options can be complex and are sometimes undocumented, so expect some experimentation. Ensure the LibreOffice packages for all desired conversion types are installed (e.g., libreoffice-writer, libreoffice-calc).
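As an illustration of how such filter options are put together, the CSV FilterOptions value starts with the field separator and text delimiter given as ASCII codes, followed by the character set (76 is UTF-8) and the first line number. A small sketch that derives the codes from literal characters; the function name csv_filter_options is hypothetical, and only the first four tokens are set:

```shell
#!/bin/sh
# Build a CSV FilterOptions value for `unoconv -e` from literal characters.
# csv_filter_options is an illustrative helper, not part of unoconv.
csv_filter_options() {
    sep=$1
    quote=$2
    [ -n "$quote" ] || quote='"'   # assumed default text delimiter: double quote
    # A leading single quote makes printf print the character's numeric code.
    # 76 selects UTF-8; 1 starts at the first line.
    printf 'FilterOptions=%d,%d,76,1\n' "'$sep" "'$quote"
}

# Possible usage (requires unoconv):
#   unoconv -f csv -e "$(csv_filter_options ';')" spreadsheet.ods
```

For a semicolon separator this yields the code 59, and for a comma 44.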
COMMON USAGE EXAMPLES
Convert a DOCX file to PDF:
unoconv -f pdf document.docx
Convert an ODT file to HTML in a specific output directory:
unoconv -f html -o /tmp/output/ document.odt
Convert a spreadsheet to CSV, specifying a semicolon (ASCII 59) as field separator, a double quote (ASCII 34) as text delimiter, and UTF-8 (76) as character set:
unoconv -f csv -e 'FilterOptions=59,34,76,1' spreadsheet.ods
Convert multiple files to PDF in a single invocation, so one LibreOffice instance handles all of them:
unoconv -f pdf file1.doc file2.ppt
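The examples above can be combined into a small batch job. The following is a sketch only: the function name convert_dir_to_pdf, the UNOCONV variable (used so the command can be swapped out for testing), and the list of extensions are all assumptions made for illustration.

```shell
#!/bin/sh
# Convert the .docx and .odt files in one directory to PDF in another.
# UNOCONV can be overridden, e.g. UNOCONV=echo for a dry run.
UNOCONV="${UNOCONV:-unoconv}"

convert_dir_to_pdf() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1   # -o treats an existing directory as the output directory
    failed=0
    for f in "$src_dir"/*.docx "$src_dir"/*.odt; do
        [ -e "$f" ] || continue       # skip patterns that matched nothing
        "$UNOCONV" -f pdf -o "$out_dir" "$f" || failed=$((failed + 1))
    done
    [ "$failed" -eq 0 ]               # succeed only if every conversion succeeded
}

# Possible usage (requires unoconv):
#   convert_dir_to_pdf ./incoming ./pdf
```

Converting one file per invocation makes failures easy to attribute to a specific input, at the cost of more process startups; with a listener already running, that cost is small.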
STARTING A HEADLESS LIBREOFFICE LISTENER
For optimal performance and reliability, it is highly recommended to start a LibreOffice instance in headless (server) mode once, and let unoconv connect to it for subsequent conversions. This avoids the overhead of launching LibreOffice for each conversion request.
Command to start a listener (often run in a background process or service manager):
libreoffice --headless --accept='socket,host=127.0.0.1,port=2002;urp;' --nodefault --nofirststartwizard --nolockcheck --nologo --norestore &
unoconv then connects to this listener automatically on the default port (2002). If the listener runs on a different port, specify it with -p.
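A freshly started listener may need a few seconds before it accepts connections, so scripts that start it and convert immediately can race. A minimal sketch of a wait loop; the function name wait_for_listener and the PROBE variable are hypothetical, and the default probe assumes a netcat that supports -z:

```shell
#!/bin/sh
# Wait until something accepts connections on host:port, or give up.
# PROBE is an assumption: `nc -z` is available in most netcat variants.
PROBE="${PROBE:-nc -z}"

wait_for_listener() {
    host=$1
    port=$2
    tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        # $PROBE is left unquoted on purpose so "nc -z" splits into command and flag.
        $PROBE "$host" "$port" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1   # pause between attempts, not after the last one
    done
    return 1
}

# Possible usage (requires LibreOffice and unoconv):
#   libreoffice --headless --accept='socket,host=127.0.0.1,port=2002;urp;' --norestore &
#   wait_for_listener 127.0.0.1 2002 30 && unoconv -n -f pdf document.docx
```

Passing -n in the usage line makes unoconv fail rather than silently launch a second LibreOffice instance if the listener is not reachable.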
HISTORY
unoconv was developed as a Python script that exposes document conversion on the command line through the Universal Network Objects (UNO) API of OpenOffice.org (later LibreOffice). It was created to simplify automated conversion in headless environments, where documents must be processed server-side without a graphical interface. Thanks to the format support it inherits from LibreOffice, it has since been widely used in web services, document management systems, and batch processing workflows.
SEE ALSO
libreoffice(1), pandoc(1), convert(1), soffice(1)


