LinuxCommandLibrary

cutycapt

Convert a web page into an image or document

SYNOPSIS

cutycapt [OPTIONS...] --url=URL --out=FILE

PARAMETERS

--url=URL
    Mandatory. The URL of the web page to capture.

--out=FILE
    Mandatory. The output file name and path. The extension determines the output format (e.g., .png, .jpg, .pdf, .svg).

--min-width=WIDTH
    Sets the minimum width of the rendered page in pixels (default is 800).

--min-height=HEIGHT
    Sets the minimum height of the rendered page in pixels (default is 600).

--max-wait=MILLISECONDS
    Maximum time to wait for the page to finish loading, in milliseconds (default is 90000; 0 waits indefinitely).

--delay=MILLISECONDS
    Time to wait after the page has loaded successfully and before capturing, which gives JavaScript time to run (default is 0ms).

--zoom-factor=FACTOR
    Sets the zoom level of the page (e.g., 0.5 for 50%, 2.0 for 200%).

--user-agent=STRING
    Sets the User-Agent string sent with HTTP requests.

--javascript=on|off
    Enables or disables JavaScript execution (default is on).

--plugins=on|off
    Enables or disables browser plugins. The default is not fixed by CutyCapt and depends on the Qt build.

--http-proxy=URL
    Specifies the address of an HTTP proxy server to use (default is none).

--header=NAME:VALUE
    Adds a custom HTTP request header. Can be used multiple times; some headers cannot be overridden.

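--base-url=URL
    Sets the base URL for relative links on the page. This option is not listed in upstream CutyCapt's --help output, so confirm with cutycapt --help that your build supports it.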

--out-format=FORMAT
    Forces a specific output format (e.g., png, jpeg, pdf, svg). Overrides the format inferred from the file extension of --out.

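--disable-scrollbars
    Hides the scrollbars in the captured image. This option is not listed in upstream CutyCapt's --help output, so confirm with cutycapt --help that your build supports it.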
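
As a minimal sketch of how the options above combine, the following POSIX shell function captures one page as an image with a larger minimum viewport and a short post-load delay. The function name capture_page and the CUTYCAPT override variable are assumptions made for this illustration and are not part of CutyCapt; the viewport and timing values are arbitrary.

```shell
#!/bin/sh
# capture_page URL FILE
# Capture URL into FILE; the format follows FILE's extension.
# CUTYCAPT is a hypothetical override for the binary name or path, since
# some installations name the executable CutyCapt instead of cutycapt.
capture_page() {
    url=$1
    out=$2
    "${CUTYCAPT:-cutycapt}" \
        --url="$url" \
        --out="$out" \
        --min-width=1280 \
        --min-height=800 \
        --delay=2000 \
        --max-wait=30000
}
```

A call such as capture_page https://example.com example.png would then write example.png in the current directory, assuming cutycapt is installed and a display is available.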

DESCRIPTION

CutyCapt is a cross-platform command-line utility that captures the QtWebKit rendering of a web page (more recent ports might use QtWebEngine, which is based on Chromium) and saves it in a variety of vector and bitmap formats, including SVG, PDF, PS, PNG, JPEG, TIFF, GIF, and BMP, or as HTML.

It provides a robust way to generate screenshots or PDF documents of web content directly from the command line, making it suitable for scripting, automated website thumbnail generation, archiving, or testing web page rendering across different environments. Its core strength lies in its ability to render web content accurately, including JavaScript, CSS, and dynamic content, similar to how a modern web browser would display it.

CAVEATS

CutyCapt relies on QtWebKit or QtWebEngine, which means its rendering capabilities are tied to the version of Qt and WebKit/Chromium it was compiled against. This can lead to rendering differences compared to the latest versions of modern browsers. It can be resource-intensive, especially for complex pages or when capturing many pages.

For highly dynamic single-page applications or advanced browser automation, headless Chrome/Firefox (e.g., via Puppeteer or Playwright) are often more powerful and actively maintained alternatives.

INTEGRATION WITH SCRIPTING

CutyCapt's command-line nature makes it ideal for integration into shell scripts, cron jobs, or server-side applications for automated tasks like creating website thumbnails, generating reports, or archiving web content regularly.
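
A batch job of the kind described above could be sketched as follows. The helper names url_to_name and capture_list, the CUTYCAPT override variable, the one-URL-per-line list format, and the 30-second --max-wait value are all assumptions made for this example rather than anything CutyCapt prescribes.

```shell
#!/bin/sh
# url_to_name URL
# Turn a URL into a safe file name: drop the scheme, then replace every
# character outside A-Z, a-z, 0-9, dot, underscore and hyphen with "_".
url_to_name() {
    printf '%s' "$1" | sed -e 's|^[a-zA-Z]*://||' -e 's|[^A-Za-z0-9._-]|_|g'
}

# capture_list FILE DIR
# Capture every URL listed in FILE (one per line; blank lines and lines
# starting with "#" are skipped) as a PNG inside DIR.
capture_list() {
    list=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    while IFS= read -r url; do
        case $url in
            ''|'#'*) continue ;;
        esac
        name=$(url_to_name "$url")
        # stdin is redirected so the capture command cannot consume the list.
        "${CUTYCAPT:-cutycapt}" --url="$url" --out="$outdir/$name.png" \
            --max-wait=30000 </dev/null \
            || printf 'capture failed: %s\n' "$url" >&2
    done < "$list"
}
```

Calling capture_list urls.txt thumbnails from a cron job would refresh the captures on a schedule. A failed capture is reported on standard error and the loop continues with the next URL.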

DEPENDENCIES

It typically requires the Qt framework and its WebKit or WebEngine modules to be installed on the system where it runs. Availability might depend on the specific Linux distribution and its package repositories.
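
Because availability varies between distributions, a script may want to check for the executable before relying on it. The following is a small sketch; the function name require_cmd and the wording of the hint are assumptions made for this example.

```shell
#!/bin/sh
# require_cmd NAME
# Return 0 if NAME can be found on PATH; otherwise print a hint to
# standard error and return 1.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s not found; check your distribution packages\n' "$1" >&2
    return 1
}
```

A script could then start with require_cmd cutycapt || exit 1 so that it stops early with a clear message when the tool is missing.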

HISTORY

CutyCapt was written by Björn Höhrmann. It emerged at a time when command-line web rendering tools were less common and less sophisticated. Its development largely mirrored the evolution of QtWebKit, providing a solution for server-side web page rendering and screenshot generation.

While newer headless browser solutions have gained prominence, CutyCapt remains a viable, lightweight option for specific use cases, particularly where Qt dependencies are already present or a simpler, standalone executable is preferred.

SEE ALSO

wkhtmltopdf(1) - Converts HTML to PDF and images using WebKit.
phantomjs(1) - Headless, scriptable WebKit browser (deprecated, but a similar concept).
puppeteer (Node.js library) - High-level API to control headless Chrome/Chromium.
playwright (Node.js, Python, Java, .NET) - Cross-browser automation library.
