theharvester
Gather email addresses, subdomains, and names
TLDR
Gather information on a domain using Google
Gather information on a domain using multiple sources
Change the limit of results to work with
Save the output to two files in XML and HTML format
Display help
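The first four tasks above can be sketched with the -d, -b, -l and -f options documented under PARAMETERS. This is a dry run that only prints the command lines. The domain example.com, the file name report, the comma-separated source list, and -f taking a file name are all assumptions; check them against your installed version. The help invocation is left out because this page does not list a help flag.

```shell
#!/bin/sh
# Dry-run sketch: prints theharvester command lines instead of running them.
DOMAIN="example.com"   # placeholder target (assumption)

# harvest_cmd SOURCE [EXTRA_FLAGS...]
# Prints one command line using -d and -b plus any extra flags.
harvest_cmd() {
    src="$1"
    shift
    set -- theharvester -d "$DOMAIN" -b "$src" "$@"
    echo "$*"
}

harvest_cmd google               # gather information using Google
harvest_cmd google,bing,crtsh    # several sources (comma list is an assumption)
harvest_cmd google -l 200        # limit the number of results to work with
harvest_cmd google -f report     # save output (assumes -f takes a file name)
```

Copy a printed line into a terminal to run it against a domain you are authorized to assess.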
SYNOPSIS
theharvester -d domain.com -l limit -b source
PARAMETERS
-d domain.com
Domain to search or company name.
-l limit
Limit the number of results to work with (integer).
-b source
Data source: baidu, bing, bingapi, censys, crtsh, dnsdumpster, dogpile, google, google-certificates, googlecse, googleplus, hunter, intelx, linkedin, otx, pentesttools, projectdiscovery, qwant, securitytrails, shodan, threatcrowd, twitter, virustotal, yahoo, yandex. Use all to query every source.
-s start
Start with result number X (integer).
-v
Verify host names via DNS resolution and search for virtual hosts.
-n
Do a DNS reverse lookup on all ranges discovered.
-c
Perform a DNS brute force for the domain.
-t
Perform a DNS TLD expansion discovery.
-p
Perform port scanning on discovered hosts.
-g
Run Google dorking module.
-m file
Save the unified output to an XML and JSON file.
-h
Use SHODAN database to query discovered hosts.
-w file
Save only the hosts found to a file.
-e dns server
Use this DNS server for lookups.
-f
Save the results to an HTML and an XML file.
-u file
Verify host name via DNS resolution and search for virtual hosts.
-z
Do not use DNS resolution.
-ddns
Use passive DNS resolution.
-a
Query all data sources.
DESCRIPTION
TheHarvester is a Python-based open-source intelligence (OSINT) tool used for gathering email addresses, subdomains, hostnames, employee names, open ports and banners from different public sources like search engines, DNS servers, and SHODAN.
It is used by penetration testers and security professionals to gather information about a target organization before performing penetration tests or security assessments. The tool can help map out the attack surface of an organization by discovering publicly exposed assets.
During the information gathering phase of a penetration test, the tool helps identify potential attack vectors. Queries against search engines and other third-party sources are passive: they never touch the target's infrastructure, which minimizes the risk of detection. Options such as DNS brute force (-c) and port scanning (-p) are active and do contact the target.
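To keep a run passive, one option is a small pre-flight check that rejects arguments containing the flags PARAMETERS describes as probing the target. A minimal sketch follows. The function name is_passive is hypothetical, and treating -c, -p, -n, -t and -v as the active set is a judgment call, not something the tool defines.

```shell
#!/bin/sh
# is_passive ARGS...
# Returns 0 when none of the arguments is an active-probing flag, 1 otherwise.
# Active set (assumption): -c brute force, -p port scan, -n reverse lookup,
# -t TLD expansion, -v DNS verification.
is_passive() {
    for arg in "$@"; do
        case "$arg" in
            -c|-p|-n|-t|-v) return 1 ;;
        esac
    done
    return 0
}

if is_passive -d example.com -b bing; then
    echo "passive: safe to run without touching the target"
fi

if ! is_passive -d example.com -b bing -p; then
    echo "active flag present: confirm authorization first"
fi
```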
TheHarvester is widely used and considered a valuable asset in OSINT and reconnaissance activities.
CAVEATS
The effectiveness of the tool depends on the availability and accuracy of the data in the public sources it queries. Some sources require API keys or enforce rate limits. Usage must comply with each data source's terms of service.
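One way to stay within rate limits is to query sources one at a time with a pause between runs. The sketch below defaults to a dry run that only prints the commands. The domain, the chosen sources, the limit of 100, and the 5-second delay are placeholder assumptions; the flags are the -d, -l and -b options from PARAMETERS.

```shell
#!/bin/sh
# Sketch: run one source per invocation, pausing between real runs.
DOMAIN="example.com"       # placeholder target (assumption)
SOURCES="bing crtsh otx"   # names taken from the -b list
DELAY=5                    # seconds between real runs (assumption)
DRY_RUN=1                  # 1 = print commands only, 0 = execute them

run_sources() {
    for src in $SOURCES; do
        if [ "$DRY_RUN" -eq 1 ]; then
            # Dry run: show what would be executed, no pause needed.
            echo "theharvester -d $DOMAIN -l 100 -b $src"
        else
            theharvester -d "$DOMAIN" -l 100 -b "$src"
            sleep "$DELAY"
        fi
    done
}

run_sources
```

Set DRY_RUN=0 only after confirming that the sources you list accept your request volume and, where needed, that their API keys are configured.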
OUTPUT
TheHarvester reports employee names, email addresses, host names, and subdomains. Output can be saved in HTML, XML, or JSON.
ETHICAL CONSIDERATIONS
It's essential to use TheHarvester ethically and legally. Always obtain proper authorization before performing reconnaissance activities on a target. Misuse of this tool could lead to legal repercussions.
HISTORY
TheHarvester was developed by Christian Martorella (edge-security) and is actively maintained. It became popular in the cybersecurity community for its ease of use and its effectiveness in gathering information about target organizations, and it is frequently updated to incorporate new data sources.