whatweb

Identify technologies used by a website

TLDR

Scan websites/targets for web technologies

$ whatweb [website1 website2 ...]

Read targets/websites from a file
$ whatweb [[-i|--input-file]] [targets_file]

Scan a website/target in verbose mode
$ whatweb [[-v|--verbose]] [example.com]

Run an aggressive scan on a website
$ whatweb [[-a|--aggression]] 3 [example.com]

Scan a network and suppress errors
$ whatweb --no-errors [192.168.0.0/24]

List plugins
$ whatweb [[-l|--list-plugins]]

List plugin details
$ whatweb [[-I|--info-plugins]] [plugin_name]
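
Several of the options above can be combined in one invocation. The following is a minimal sketch, not an official recipe: it writes a targets file and shows the whatweb command that would scan it. The hostnames and the results.json filename are placeholders, and `--log-json` is recalled from whatweb's help output rather than listed on this page, so confirm it with `whatweb --help`. By default the script only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch: scan a list of targets read from a file, suppressing errors.
# The hostnames are placeholders; scan only systems you are authorized to test.
targets_file=$(mktemp)
printf '%s\n' 'example.com' 'example.org' > "$targets_file"

# --log-json is an assumption (not listed on this page); check `whatweb --help`.
set -- whatweb --no-errors --input-file "$targets_file" --log-json results.json

if [ "${DRY_RUN:-1}" = "0" ] && command -v whatweb >/dev/null 2>&1; then
    "$@"                     # actually run the scan
else
    echo "would run: $*"     # default: only show the command
fi
```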

SYNOPSIS

whatweb [options] target [target ...]

PARAMETERS

-h, --help
    Show help message and exit.

-v, --verbose
    Verbose output, including plugin descriptions. Use twice for debugging.

-i, --input-file
    Read targets from file, one URL per line.

--log-verbose FILE, --log-xml FILE, --log-json FILE
    Write verbose, XML, or JSON output to FILE. See whatweb --help for the other --log-* formats.

--log-brief FILE
    Write brief, one-line-per-target output (URL and plugins found) to FILE.

-t, --max-threads
    Number of simultaneous threads. Default: 25.

-U, --user-agent
    Set a custom User-Agent string.

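-p, --plugins
    Comma-separated list of plugins to use. Default: all. Prefix a plugin name with - to exclude it.

-l, --list-plugins
    List available plugins.

-I, --info-plugins
    Show detailed information about plugins, optionally filtered by a search term.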

-a, --aggression
    Aggression level: 1 (stealthy), 3 (aggressive), or 4 (heavy). Default: 1.

--url-prefix, --url-suffix
    Add a prefix or suffix to each target URL.

--version
    Show version information and exit.

--color
    Control color output (auto, always, never). Default: auto.

--debug
    Raise errors in plugins instead of suppressing them.

--proxy
    Use a proxy, given as hostname[:port]. Default port: 8080.

--proxy-user
    Proxy credentials, given as username:password.

--open-timeout
    Connection (open) timeout in seconds. Default: 15.

--read-timeout
    Read timeout in seconds. Default: 30.

--wait
    Seconds to wait between connections. Useful with a single thread.

DESCRIPTION

whatweb is a command-line tool primarily used for identifying technologies and components employed by a website. It sends HTTP requests to a target URL and analyzes the responses, searching for various fingerprints like headers, cookies, JavaScript code, specific file paths, and content patterns. This allows whatweb to determine what web server software is being used (e.g., Apache, Nginx), what content management system (CMS) is installed (e.g., WordPress, Drupal, Joomla), which analytics platforms are present (e.g., Google Analytics), and many other details about the website's underlying architecture.
It uses a plugin-based architecture, so its detection capabilities can be extended or modified to recognize specific technologies without extensive knowledge of its internals.
The tool is particularly useful for security researchers, penetration testers, and developers who need to quickly assess the technologies used by a website for security audits, vulnerability assessments, or general information gathering.
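
To make the fingerprinting idea concrete, here is a deliberately crude sketch in shell. It is not how whatweb is implemented (its plugins are written in Ruby and are far more thorough). `fingerprint_headers` is a hypothetical helper that reads raw HTTP response headers on standard input and prints a guess for each pattern it recognizes. The sample headers are canned, so nothing is sent over the network.

```shell
#!/bin/sh
# Illustrative only: a crude header fingerprint check, NOT whatweb's code.
# fingerprint_headers (hypothetical) reads raw response headers on stdin.
fingerprint_headers() {
    headers=$(tr -d '\r')    # strip CR from CRLF line endings
    # Server header, e.g. "Server: nginx/1.24.0"
    printf '%s\n' "$headers" | sed -n 's/^[Ss]erver:[[:space:]]*/server=/p'
    # X-Powered-By header, e.g. "X-Powered-By: PHP/8.2.0"
    printf '%s\n' "$headers" | sed -n 's/^[Xx]-[Pp]owered-[Bb]y:[[:space:]]*/powered-by=/p'
    # A PHPSESSID cookie is a common hint that PHP is in use
    if printf '%s\n' "$headers" | grep -qi '^set-cookie:.*PHPSESSID'; then
        echo 'cookie=PHPSESSID (suggests PHP)'
    fi
    return 0
}

# Canned example response headers.
sample='HTTP/1.1 200 OK
Server: nginx/1.24.0
X-Powered-By: PHP/8.2.0
Set-Cookie: PHPSESSID=abc123; path=/'

printf '%s\n' "$sample" | fingerprint_headers
```

Real headers could be fed in with something like `curl -sI https://example.com | fingerprint_headers`. As the CAVEATS section notes, such headers can be removed or spoofed by the site operator.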

CAVEATS

The accuracy of whatweb depends on the quality of its plugins and the availability of identifiable fingerprints. Some websites may be configured to obfuscate their technologies, making accurate identification difficult. Also, excessive usage can be perceived as aggressive scanning and may lead to IP blocking.
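
One way to lower the risk of being blocked is to slow the scan down. The sketch below only builds and prints a low-rate command line; `polite_scan_cmd` is a hypothetical helper, and the `--wait` option (seconds between connections) is recalled from whatweb's help output rather than taken from this page, so verify it with `whatweb --help` before relying on it.

```shell
#!/bin/sh
# Sketch: build a low-rate whatweb command line.
# polite_scan_cmd is a hypothetical helper; it prints the command, it does not run it.
# --wait is an assumption; confirm it with `whatweb --help`.
polite_scan_cmd() {
    target=$1
    wait_seconds=${2:-2}     # default pause of 2 seconds between connections
    printf 'whatweb --max-threads 1 --wait %s --aggression 1 %s\n' \
        "$wait_seconds" "$target"
}

polite_scan_cmd example.com 5
```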

PLUGIN STRUCTURE

whatweb relies on plugins to identify technologies.
Plugins are written in Ruby and match regular expressions, HTTP headers, and other patterns against the response. You can add custom plugins to detect indicators specific to a web application. Plugins typically reside in /usr/share/whatweb/plugins/, though the path depends on how whatweb was installed.

AGGRESSION LEVELS

Aggression levels control the number of requests made to the target server.
Higher aggression levels increase the accuracy of detection but also increase the load on the target server and the risk of being detected. Level 1 is the default and least intrusive.
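
As a small sketch of how the levels might be selected in a script: the helper below maps a readable name to the numeric level passed to `--aggression`. The name-to-number mapping (1 stealthy, 3 aggressive, 4 heavy) is recalled from whatweb's help output, and `aggression_level` is a hypothetical helper, not part of whatweb. The loop only prints the commands, so the same target can then be scanned at two levels and the results compared.

```shell
#!/bin/sh
# Hypothetical helper: map a readable name to a whatweb aggression level.
# Mapping recalled from whatweb's help text; verify with `whatweb --help`.
aggression_level() {
    case "$1" in
        stealthy)   echo 1 ;;
        aggressive) echo 3 ;;
        heavy)      echo 4 ;;
        *)          echo "unknown aggression name: $1" >&2; return 1 ;;
    esac
}

target="example.com"   # placeholder; scan only hosts you are authorized to test
for name in stealthy aggressive; do
    level=$(aggression_level "$name")
    echo "whatweb --aggression $level $target"
done
```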

HISTORY

whatweb was developed as an open-source project for identifying web technologies. It continues to receive new plugins and features, and is commonly used by security professionals for web application penetration testing and security assessments.

SEE ALSO

nmap(1), curl(1), wget(1)
