httpflow
Capture and reconstruct HTTP traffic
TLDR
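Capture traffic on all interfaces
httpflow -i any
Use a BPF-style capture filter to narrow the results
httpflow host example.com or host example.org
Use a regex to filter requests by URL
httpflow -u 'regular_expression'
Read packets from a binary PCAP file
httpflow -r out.cap
Write the output to a directory
httpflow -w path/to/directory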
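Before starting a capture with a URL regex, it can help to try the pattern against a few sample URLs. The sketch below uses a hypothetical helper, url_matches, built on grep -E. It assumes POSIX extended regex syntax, which may differ in places from the dialect httpflow itself uses, so treat it as a rough check only.

```shell
#!/bin/sh
# url_matches PATTERN URL
# Hypothetical helper (not part of httpflow): exits 0 when URL matches
# PATTERN under grep's extended regex syntax, 1 otherwise.
url_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Example: try a pattern against two sample URLs.
for url in 'http://example.com/user/42' 'http://example.com/user/abc'; do
    if url_matches '/user/[0-9]+' "$url"; then
        echo "match:    $url"
    else
        echo "no match: $url"
    fi
done
```

Once the pattern behaves as expected, the same string can be passed to the URL filter option.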
SYNOPSIS
httpflow [-i interface | -r pcap-file] [-u url-filter] [-w output-path] [expression]
Note: If no interface is specified, httpflow captures on the default interface. Use -i any to listen on all interfaces.
PARAMETERS
Note: The upstream usage message (httpflow -h) documents -i, -r, -u, -w, and a trailing filter expression. Check any other option listed here against the installed version before relying on it.
-i <interface>
Specifies the network interface to listen on (e.g., eth0, wlan0), with the same meaning as the tcpdump interface argument. Use any to listen on all interfaces.
-p <port>
Specifies the HTTP port(s) to monitor. Multiple ports can be specified separated by commas (e.g., 80,8080). Defaults typically include standard HTTP/HTTPS ports (80, 443).
[expression]
A BPF (Berkeley Packet Filter) expression given as the trailing argument, using the same syntax as tcpdump (see pcap-filter(7)). Only packets matching the expression are processed by httpflow (e.g., host 192.168.1.1).
-r <file>
Reads raw packet data from a pcap file (generated by tools like tcpdump or Wireshark) instead of a live interface. If the file is given as -, packets are read from standard input.
-u <url-filter>
Restricts the output to requests whose URL matches the given regular expression.
-w <output-path>
Writes the captured HTTP requests and responses to the given directory instead of printing them to the terminal, so they can be reviewed later.
-t <timeout>
Sets a timeout for connection inactivity in seconds. Connections inactive for longer than this period will be closed and removed from tracking.
-s
Displays summary statistics of the captured HTTP flows, often including total requests, errors, and throughput.
-q
Operates in quiet mode, suppressing verbose output.
-v
Increases verbosity of output, showing more detailed information about each HTTP transaction.
-L
Lists all available network interfaces that httpflow can listen on.
--help
Displays a help message with available options and usage information.
--version
Displays the version information of the httpflow utility.
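Port restrictions can also be expressed directly in the BPF filter, which is useful when a dedicated port option is not available in the installed version. The sketch below defines a hypothetical helper, ports_to_bpf, that turns a comma-separated port list into a tcpdump-style expression. It is not part of httpflow, and how the expression is passed (as the trailing argument in the upstream usage message) should be checked against your version.

```shell
#!/bin/sh
# ports_to_bpf "80,8080"  ->  tcp port 80 or tcp port 8080
# Hypothetical helper, not part of httpflow. Rejects empty or
# non-numeric entries with exit status 1.
ports_to_bpf() {
    expr=""
    old_ifs=$IFS
    IFS=,
    for port in $1; do
        case $port in
            ''|*[!0-9]*)
                IFS=$old_ifs
                echo "invalid port: $port" >&2
                return 1
                ;;
        esac
        if [ -z "$expr" ]; then
            expr="tcp port $port"
        else
            expr="$expr or tcp port $port"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$expr"
}

# Example: build a combined filter string (printed here, not executed).
echo "($(ports_to_bpf 80,8080)) and host 192.168.1.100"
```

The parentheses matter when the port list is combined with other conditions, because and binds more tightly than or in BPF expressions. Quote the whole expression when passing it on a command line so the shell does not interpret the parentheses.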
DESCRIPTION
httpflow is a command-line utility that captures network traffic and reconstructs it at the HTTP application layer. Unlike general-purpose packet sniffers such as tcpdump, httpflow parses HTTP requests and responses, providing a high-level view of web application communication.
It displays details such as HTTP methods (GET, POST), URLs, status codes (200 OK, 404 Not Found), content types, and timing information, including request-to-response latency. This makes httpflow useful for debugging web applications, identifying performance bottlenecks, verifying API interactions, and understanding client-server communication patterns.
It can operate on live network interfaces or process pre-recorded pcap files, offering flexibility for both real-time diagnostics and post-mortem analysis. Its focus on HTTP semantics simplifies troubleshooting by presenting relevant application-level data without requiring deep packet inspection skills.
CAVEATS
Requires root privileges or appropriate capabilities (e.g., CAP_NET_RAW, CAP_NET_ADMIN) to capture packets from network interfaces.
httpflow parses unencrypted HTTP traffic only. For HTTPS (TLS) traffic it sees just the encrypted stream. Inspecting HTTPS content typically requires a decrypting proxy such as sslsplit, or a TLS key log (`SSLKEYLOGFILE`) loaded into a tool like Wireshark; httpflow has no native support for decryption.
High traffic volumes can generate a significant amount of output, potentially overwhelming the terminal or consuming considerable resources.
It might not correctly parse highly fragmented or malformed HTTP packets.
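The privilege caveat above applies to live capture. Reading an existing pcap file needs only read access to that file. The sketch below shows one way a wrapper script might decide whether to prefix sudo. The helper names are hypothetical, the wrapper only prints the command it would run, and it does not detect file capabilities granted with setcap.

```shell
#!/bin/sh
# is_live_capture ARGS...
# Hypothetical helper: exits 0 unless the arguments contain -r,
# i.e. unless packets are read from a file instead of an interface.
is_live_capture() {
    for arg in "$@"; do
        if [ "$arg" = "-r" ]; then
            return 1
        fi
    done
    return 0
}

# httpflow_cmd ARGS...
# Prints the command line to run, prefixed with sudo only for a live
# capture started by a non-root user. Nothing is executed.
httpflow_cmd() {
    if is_live_capture "$@" && [ "$(id -u)" -ne 0 ]; then
        printf 'sudo httpflow %s\n' "$*"
    else
        printf 'httpflow %s\n' "$*"
    fi
}

httpflow_cmd -r capture.pcap
httpflow_cmd -i eth0
```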
TYPICAL USAGE
To start monitoring HTTP traffic on a specific interface, use:
sudo httpflow -i eth0
To monitor HTTP traffic on port 8080 for a specific host, pass a filter expression:
sudo httpflow -i eth0 'tcp port 8080 and host 192.168.1.100'
To analyze a previously captured pcap file:
httpflow -r capture.pcap
To capture live traffic and write the HTTP requests and responses to a directory (created beforehand) for later review:
sudo httpflow -i eth0 -w ./dump
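When a raw pcap file is wanted for later replay, one option is to record with tcpdump and then feed the file to httpflow -r. The sketch below wraps that in a hypothetical function, record_then_replay. The tcpdump flags used (-i, -c, -w) are standard, while the interface name, file name, packet count, and the port 80 filter are illustrative assumptions. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# record_then_replay IFACE FILE COUNT
# Hypothetical helper: record COUNT packets of port-80 traffic with
# tcpdump (needs capture privileges), then replay FILE through httpflow.
# With DRY_RUN=1 the two commands are printed instead of executed.
record_then_replay() {
    iface=$1
    file=$2
    count=$3
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'tcpdump -i %s -c %s -w %s tcp port 80\n' "$iface" "$count" "$file"
        printf 'httpflow -r %s\n' "$file"
        return 0
    fi
    tcpdump -i "$iface" -c "$count" -w "$file" tcp port 80 &&
        httpflow -r "$file"
}

# Show what would be run without touching the network.
DRY_RUN=1 record_then_replay eth0 capture.pcap 1000
```

Keeping the raw capture separate means the same file can be replayed with different URL or BPF filters without capturing again.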
OUTPUT INTERPRETATION
The output of httpflow typically shows each HTTP request-response pair. For each flow, you'll often see:
Source and Destination: IP addresses and ports.
HTTP Method: (e.g., GET, POST).
Requested URL: The path.
Status Code: (e.g., 200 OK, 404 Not Found, 500 Internal Server Error).
Content Type: Of the response.
Timing Information: Total duration of the request-response cycle, and sometimes individual timings.
This structured output helps quickly identify slow requests, failed requests, or unexpected server behavior.
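To spot failed requests in a long capture, the output can be saved to a text file and summarized. The sketch below counts responses per status code with awk. It assumes the saved output contains plain HTTP status lines of the form HTTP/1.1 200 OK with no color escape codes, which should be verified against the actual output of your version. The function name count_status_codes is hypothetical.

```shell
#!/bin/sh
# count_status_codes [FILE...]
# Hypothetical helper: reads saved httpflow output (files or stdin) and
# prints "<status code> <count>" for every HTTP status line found,
# sorted by status code.
count_status_codes() {
    awk '$1 ~ /^HTTP\/[0-9.]+$/ && $2 ~ /^[0-9][0-9][0-9]$/ { n[$2]++ }
         END { for (code in n) print code, n[code] }' "$@" | sort
}

# Example with a small hand-written sample (not real httpflow output).
printf '%s\n' \
    'GET /a HTTP/1.1' \
    'HTTP/1.1 200 OK' \
    'GET /b HTTP/1.1' \
    'HTTP/1.1 404 Not Found' \
    'HTTP/1.1 200 OK' | count_status_codes
```

Request lines are skipped because their first field is the method, not the protocol version.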
HISTORY
httpflow is an open-source project (six-ddc/httpflow on GitHub) built on libpcap. It is newer and less widely known than tools like tcpdump, and it reflects a growing need for application-layer visibility in network diagnostics. It serves as a simpler, more focused alternative for HTTP-specific analysis, bridging the gap between low-level packet inspection and high-level web server logs. Its design prioritizes ease of use for web developers and network administrators troubleshooting HTTP-based services.