
wget2

Download files from the web

TLDR

Download the contents of a URL to a file, using multiple threads by default (unlike wget, which is single-threaded by default)
$ wget2 [https://example.com/foo]

Limit the number of threads used for downloading (default is 5 threads)
$ wget2 --max-threads [10] [https://example.com/foo]

Download a single web page and all its resources (scripts, stylesheets, images, etc.)
$ wget2 [[-p|--page-requisites]] [[-k|--convert-links]] [https://example.com/somepage.html]

Mirror a website, but do not ascend to the parent directory (does not download embedded page elements)
$ wget2 [[-m|--mirror]] [[-np|--no-parent]] [https://example.com/somepath/]

Limit the download speed and the number of connection retries
$ wget2 --limit-rate [300k] [[-t|--tries]] [100] [https://example.com/somepath/]

Continue an incomplete download (behavior is consistent with wget)
$ wget2 [[-c|--continue]] [https://example.com]

Download all URLs stored in a text file to a specific directory
$ wget2 [[-P|--directory-prefix]] [path/to/directory] [[-i|--input-file]] [URLs.txt]

Download a file from an HTTP server using Basic Auth (also works for HTTPS)
$ wget2 --user [username] --password [password] [https://example.com]

SYNOPSIS

wget2 [options] [URL]...

PARAMETERS

-h, --help
    Display help message and exit.

-V, --version
    Show program version and exit.

-q, --quiet
    Turn off wget2's output.

-v, --verbose
    Be verbose (default).

-o logfile, --output-file logfile
    Log all messages to logfile.

-a logfile, --append-output logfile
    Append to logfile.

-d, --debug
    Turn on debug output.

-i file, --input-file file
    Download URLs from file.

-O file, --output-document file
    Write documents to file.

--limit-rate rate
    Limit download rate to rate.

-w seconds, --wait seconds
    Wait seconds between retrievals.

-c, --continue
    Resume getting a partially downloaded file.

-N, --timestamping
    Only retrieve newer files.
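    For example (illustrative URL), re-download a file only if the remote copy is newer than the local one:
    wget2 -N https://example.com/data.csv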

--spider
    Just check if the URL exists without downloading.
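    For example (illustrative URL), check that a file exists on the server without saving it:
    wget2 --spider https://example.com/file.iso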

-r, --recursive
    Turn on recursive retrieving.

-l depth, --level depth
    Maximum recursion depth (inf or 0 for infinite).

-A accept_list, --accept accept_list
    Comma-separated list of accepted extensions.
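    Illustrative use with recursion, keeping only PDF and ZIP files (depth and URL are placeholders):
    wget2 -r -l 2 -A pdf,zip https://example.com/files/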

--user-agent string
    Identify as string.

--header string
    Add a custom header.
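    For example (header name and value are placeholders):
    wget2 --header "X-Example: value" https://example.com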

DESCRIPTION

wget2 is a command-line utility for retrieving files over HTTP and HTTPS (unlike wget, it does not support FTP).
It is designed as the successor to GNU wget, offering improved performance, support for modern protocols such as HTTP/2, and enhanced features. wget2 is well suited to downloading large files, mirroring websites, and performing recursive downloads. It provides robust error handling, automatic retries, and support for resuming interrupted downloads. Its multi-threaded architecture makes downloads faster than with wget, especially when many files are retrieved at once.
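
For instance, the options described above can be combined to mirror part of a site at a limited rate (URL and rate value are illustrative):
wget2 --mirror --no-parent --limit-rate 500k https://example.com/docs/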

CAVEATS

wget2 is still under active development, so some features may be incomplete or buggy.
Compatibility with all web servers and protocols is not guaranteed. Some advanced features may require specific system libraries or configurations.

EXAMPLES

Download a single file:
wget2 https://example.com/file.txt

Download a file and save it with a different name:
wget2 -O my_file.txt https://example.com/file.txt

Download multiple files from a list:
wget2 -i urls.txt
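
Resume a partially downloaded file (illustrative file name):
wget2 -c https://example.com/large_file.iso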

HISTORY

wget2 emerged as an attempt to address the limitations of the original wget, particularly in the context of modern web technologies.
Development focused on improving download speeds, incorporating support for newer protocols like HTTP/2 and HTTP/3, and providing a more extensible architecture. The project aims to provide a modern and performant alternative for web content retrieval.

SEE ALSO

wget(1), curl(1)
