gau

Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, Common Crawl, and URLScan

TLDR

Fetch all URLs of a domain from AlienVault's Open Threat Exchange, the Wayback Machine, Common Crawl, and URLScan

$ gau [example.com]

Fetch URLs of multiple domains
$ gau [domain1 domain2 ...]

Fetch all URLs of several domains from an input file, running multiple threads
$ gau --threads [4] < [path/to/domains.txt]

Write [o]utput results to a file
$ gau [example.com] --o [path/to/found_urls.txt]

Search for URLs from only one specific provider
$ gau --providers [wayback|commoncrawl|otx|urlscan] [example.com]

Search for URLs from multiple providers
$ gau --providers [wayback,otx,...] [example.com]

Search for URLs within a specific date range
$ gau --from [YYYYMM] --to [YYYYMM] [example.com]

SYNOPSIS

gau [--blacklist extensions] [--fc status-codes] [--fp] [--from YYYYMM] [--ft mime-types] [--json] [--mc status-codes] [--mt mime-types] [--o file] [--providers providers] [--proxy url] [--retries N] [--subs] [--threads N] [--timeout seconds] [--to YYYYMM] [--verbose] [--version] [domains...]

PARAMETERS

--blacklist extensions
    Comma-separated list of file extensions to skip (e.g. ttf,woff,svg,png)

--fc status-codes
    Comma-separated list of HTTP status codes to filter from results

--fp
    Remove different parameters of the same endpoint

--from YYYYMM
    Fetch URLs from this date onward

--ft mime-types
    Comma-separated list of MIME types to filter from results

--json
    Output results as JSON

--mc status-codes
    Comma-separated list of HTTP status codes to match

--mt mime-types
    Comma-separated list of MIME types to match

--o file
    Write output to a file instead of stdout

--providers providers
    Comma-separated providers to query: wayback,commoncrawl,otx,urlscan (default: all)

--proxy url
    HTTP proxy to route requests through

--retries N
    Number of retries for the HTTP client

--subs
    Include subdomains of the target domain in results

--threads N
    Number of worker threads to spawn

--timeout seconds
    Timeout for the HTTP client, in seconds

--to YYYYMM
    Fetch URLs up to this date

--verbose
    Show verbose output

--version
    Show version information

DESCRIPTION

gau (getallurls) is a command-line tool designed for security researchers, bug bounty hunters, and OSINT practitioners. It retrieves historical and indexed URLs associated with one or more domains from multiple passive sources: AlienVault's Open Threat Exchange (OTX), the Internet Archive's Wayback Machine, Common Crawl, and URLScan.

By querying these vast datasets, gau quickly assembles a comprehensive list of URLs without active scanning, helping identify subdomains, paths, parameters, and endpoints that may reveal vulnerabilities or assets. Output includes full URLs, which can be piped to tools like httpx for live checking or gf for pattern matching.
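
For instance, the output can be checked for live hosts or grepped for interesting patterns (assuming httpx and tomnomnom's gf, with its example xss pattern, are installed):

$ gau example.com | httpx -silent
$ gau example.com | gf xss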

It is written in Go for speed and concurrency, supporting multi-threading and provider selection. It is well suited to the reconnaissance phase of penetration testing and to subdomain takeover detection. Note: gau requires internet access and may hit provider rate limits under heavy use.

CAVEATS

Heavy usage may trigger provider rate limits; large domains can produce massive outputs (millions of URLs); installation requires a Go toolchain or a pre-built binary; there is no built-in deduplication.
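
Since results are not deduplicated, a common workaround is to sort the output before further processing:

$ gau example.com | sort -u > unique_urls.txt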

INSTALLATION

go install github.com/lc/gau/v2/cmd/gau@latest
Or download binaries from GitHub releases.
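
After installing, a quick sanity check confirms the binary is on your PATH:

$ gau --version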

EXAMPLE USAGE

gau example.com | httpx -silent
gau --providers wayback,commoncrawl --threads 50 --subs target.com
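
For machine-readable output over a date window (the dates here are placeholders):

gau --json --from 202101 --to 202112 example.com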

HISTORY

Developed by Corben Leo (@lc) as a faster alternative to scripting Wayback/OTX queries by hand. Open-sourced on GitHub, it gained popularity in bug bounty communities through recon frameworks such as ReconFTW. Frequent updates have improved provider integration and performance.

SEE ALSO

curl(1), wget(1), httpx(1)
