LinuxCommandLibrary

wget2

Download files from the web

TLDR

Download the contents of a URL to a file (unlike wget, wget2 uses multiple threads by default)

$ wget2 [https://example.com/foo]

Limit the number of threads used for downloading (default is 5 threads)
$ wget2 --max-threads [10] [https://example.com/foo]

Download a single web page and all its resources (scripts, stylesheets, images, etc.)
$ wget2 [[-p|--page-requisites]] [[-k|--convert-links]] [https://example.com/somepage.html]

Mirror a website, but do not ascend to the parent directory (does not download embedded page elements)
$ wget2 [[-m|--mirror]] [[-np|--no-parent]] [https://example.com/somepath/]

Limit the download speed and the number of connection retries
$ wget2 --limit-rate [300k] [[-t|--tries]] [100] [https://example.com/somepath/]

Continue an incomplete download (behavior is consistent with wget)
$ wget2 [[-c|--continue]] [https://example.com]

Download all URLs stored in a text file to a specific directory
$ wget2 [[-P|--directory-prefix]] [path/to/directory] [[-i|--input-file]] [URLs.txt]

Download a file from an HTTP server using Basic Auth (also works for HTTPS)
$ wget2 --user [username] --password [password] [https://example.com]

SYNOPSIS

wget2 [OPTION]... [URL]...
wget2 --input-file=FILE [OPTION]...

PARAMETERS

--continue / -c
    Resumes a partially downloaded file from where it left off.

--output-document=FILE / -O FILE
    Writes all downloaded documents to the specified FILE instead of saving them under their original filenames.

--directory-prefix=PREFIX / -P PREFIX
    Sets PREFIX as the directory where all retrieved files and directories will be saved.

--input-file=FILE / -i FILE
    Reads URLs to download from a local or external FILE, with one URL per line.

--recursive / -r
    Turns on recursive retrieving, following links within documents to download an entire website.

--level=NUMBER / -l NUMBER
    Specifies the maximum recursion depth for recursive downloads (e.g., -l 1 follows links one level deep from the starting page).

--user-agent=AGENTSTRING / -U AGENTSTRING
    Identifies wget2 as AGENTSTRING to the web server, mimicking a specific browser or client.

--no-check-certificate
    Disables server certificate validation. Use with caution as it can expose you to security risks.

--quiet / -q
    Suppresses most of wget2's output, making it run silently.

--verbose / -v
    Turns on verbose output, displaying detailed information about the download process (default behavior).

--version / -V
    Displays the version information for wget2 and exits.

--help / -h
    Displays a summary of command-line options and exits.
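The parameters above combine freely. As a sketch, a shallow recursive download into a target directory might look like this (the URL, directory name, and user-agent string are placeholders, not defaults):

```shell
# Recursively fetch a site two levels deep, quietly,
# saving everything under ./mirror (placeholder URL and paths).
wget2 --recursive --level=2 \
      --directory-prefix=mirror \
      --user-agent="Mozilla/5.0 (X11; Linux x86_64)" \
      --quiet \
      https://example.com/docs/
```

Because --quiet suppresses progress output, check the exit status (0 on success) or the contents of the mirror directory to confirm the download.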

DESCRIPTION

wget2 is a free, non-interactive network utility for downloading files from the Web. It is a modern successor to the classic wget, re-engineered from the ground up for improved performance, enhanced security, and robust support for contemporary internet protocols. wget2 efficiently retrieves files over HTTP/1.1, HTTP/2, and HTTPS; unlike wget, it does not support FTP.

Key features include highly parallelized downloads that leverage multi-core processors, support for modern compression algorithms such as zstd and brotli, and Metalink support. It is markedly faster than its predecessor for large-scale downloads or for retrieving many small files. Like wget, it can work in the background, retrieve content recursively, resume broken downloads, and operate through proxies, making it a powerful and versatile tool for web scraping, site mirroring, and general file retrieval.

CAVEATS

wget2 is not as widely pre-installed on Linux distributions as wget, often requiring manual installation. While its syntax is largely compatible with wget, some options may behave differently, and new features might require specific syntax. As an actively developed project, its features and stability may evolve.

PERFORMANCE BENEFITS

wget2 is engineered for high performance, utilizing multiple connections for a single file download and parallelizing multiple file downloads concurrently. This design makes it significantly faster than its predecessor in scenarios with high-bandwidth connections or when downloading a large number of files.
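A typical way to exploit this parallelism, assuming a plain-text list of URLs (the file name, thread count, and directory here are illustrative):

```shell
# Fetch many files concurrently; --max-threads caps the number of
# parallel download threads (wget2's default is 5).
wget2 --max-threads=8 --input-file=URLs.txt --directory-prefix=downloads
```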

MODERN PROTOCOL SUPPORT

It includes native support for modern web technologies such as HTTP/2, TLS 1.3, and advanced compression algorithms like Brotli and zstd, as well as the Metalink download description format, providing a more comprehensive and up-to-date downloading experience.
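Which of these protocols and codecs are available depends on the libraries wget2 was compiled against. The version output lists the build's compiled-in features; the exact format varies by build and version:

```shell
# Print version information, including compiled-in library
# and feature support (e.g., HTTP/2, TLS, compression codecs).
wget2 --version
```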

HISTORY

wget2 emerged as a successor project to the widely used wget utility, aiming to modernize its codebase and capabilities. Development began around 2014-2015, driven by the need for better performance, native support for newer protocols such as HTTP/2, and security improvements that were difficult to integrate into the aging wget codebase. It was rewritten in C around the libwget library, with a focus on parallelism, multi-threading, and efficient handling of high-speed network connections. While wget remains ubiquitous, wget2 offers a more robust and significantly faster alternative for contemporary downloading tasks, especially where performance and modern protocol support matter.

SEE ALSO

wget(1), curl(1), aria2c(1)
