perf
Profile Linux kernel and user space performance
TLDR
Display basic performance counter stats for a command
Display system-wide real-time performance counter profile
Run a command and record its profile into perf.data
Record the profile of an existing process into perf.data
Read perf.data (created by perf record) and display the profile
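The tasks listed above map onto the stat, top, record, and report subcommands. A minimal sketch as shell wrappers follows; the function names and the PERF override variable are illustrative assumptions, not part of perf itself.

```shell
PERF=${PERF:-perf}   # assumption: set PERF=echo to preview the arguments without running perf

perf_stat_cmd()   { "$PERF" stat "$@"; }       # basic counter stats for a command
perf_live()       { "$PERF" top; }             # system-wide real-time profile
perf_record_cmd() { "$PERF" record "$@"; }     # run a command, record into perf.data
perf_record_pid() { "$PERF" record -p "$1"; }  # record an existing process by PID
perf_show()       { "$PERF" report; }          # read perf.data and display the profile
```

For example, `perf_stat_cmd ls` runs `perf stat ls`, and `perf_record_pid 1234` attaches to the (hypothetical) process 1234 until interrupted.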
SYNOPSIS
perf command [options]
PARAMETERS
record
Records performance data based on specified events and options.
stat
Runs a command and gathers summary statistics about its execution.
top
Displays a continuously updated profile of the most expensive functions, similar to the 'top' command, but based on sampled performance events rather than process CPU usage.
report
Generates a report from a perf.data file created by the 'record' command.
annotate
Annotates source code or assembly with performance data.
list
Lists available performance events.
-a
System-wide collection from all CPUs.
-p pid
Monitors a specific process by its process ID.
-e event
Specifies the performance event to monitor (e.g., cycles, cache-misses).
-g
Enables call-graph recording.
-o file
Specifies the output file for recorded data.
--call-graph mode
Specifies the call graph recording mode (e.g., fp, dwarf).
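The options above are usually combined on a single subcommand. The following sketch shows a few plausible combinations; the function names, the PERF override variable, and the argument values passed by the caller are illustrative assumptions.

```shell
PERF=${PERF:-perf}   # assumption: set PERF=echo to print the arguments instead of running perf

# Sample CPU cycles with call graphs from a running process.
# $1 = PID to attach to, $2 = output file.
record_pid_callgraph() {
    "$PERF" record -e cycles -g -p "$1" -o "$2"
}

# Count cache misses on all CPUs while `sleep` runs for $1 seconds.
stat_system_wide() {
    "$PERF" stat -a -e cache-misses sleep "$1"
}

# Record a command using DWARF-based unwinding, which can help when
# binaries were built without frame pointers.
# $1 = output file, remaining arguments = command to run.
record_dwarf() {
    out=$1; shift
    "$PERF" record --call-graph dwarf -o "$out" -- "$@"
}
```

A file written with -o can be read back later with `perf report -i file`.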
DESCRIPTION
perf is the performance analysis tool maintained in the Linux kernel source tree and built on the kernel's perf_events subsystem.
It can analyze both applications and the operating system itself. perf either counts events or samples them: it can total occurrences of events such as CPU cycles, cache misses, branch mispredictions, and system calls, or it can periodically record where the program was executing when an event fired. Aggregating these samples shows where time is being spent and which code paths are bottlenecks. This information can be used to optimize code, identify performance regressions, and understand how applications interact with the underlying hardware.
perf supports several modes of operation, including sampling-based CPU profiling, tracing, and hardware event counting. It can generate detailed reports, call graphs, and annotated source or assembly listings to help developers pinpoint performance issues. It is a command-line tool, and many of its functions require root privileges or specific capabilities.
CAVEATS
Many functions require root privileges or a more permissive `kernel.perf_event_paranoid` sysctl setting. Recording adds overhead, which can change the performance of the application being measured. Interpreting the results requires a solid understanding of hardware architecture and system behavior.
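Before profiling as an unprivileged user, it can help to check the current restriction level. The sketch below only reads the value; the variable names are illustrative. Lower values are more permissive, and the exact meaning of each level depends on the kernel and distribution.

```shell
# Read the current restriction level from procfs, if the kernel exposes it.
paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    echo "kernel.perf_event_paranoid = $level"
else
    level=unknown
    echo "perf_event_paranoid is not available on this system"
fi

# To relax the setting until the next reboot (requires root), one option is:
#   sudo sysctl kernel.perf_event_paranoid=1
```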
EVENT SELECTION
Choosing the right events to monitor is crucial for effective performance analysis. The `perf list` command shows the events available on the current system, which can be broadly categorized into hardware events, software events, and tracepoint events. Hardware events come from CPU performance counters and cover operations such as cycles, instructions, cache accesses, and branches. Software events are counters maintained by the kernel, such as context switches and page faults. Tracepoint events are static instrumentation points placed in kernel code, allowing specific kernel activity (for example, scheduler or system call entry points) to be monitored.
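As a sketch of the workflow described above, `perf list` accepts a category filter, and the chosen event names can then be passed to `perf stat` as a comma-separated list. The function names and the PERF override variable are illustrative assumptions.

```shell
PERF=${PERF:-perf}   # assumption: set PERF=echo to print the arguments instead of running perf

# List one category of events. $1 = hw, sw, or tracepoint.
list_events() {
    "$PERF" list "$1"
}

# Count a comma-separated set of events while a command runs.
# $1 = event list (e.g. cycles,instructions), remaining arguments = command.
count_events() {
    events=$1; shift
    "$PERF" stat -e "$events" -- "$@"
}
```

For example, `list_events hw` followed by `count_events cycles,instructions ./myapp` (where ./myapp is a placeholder program) lists the hardware events and then counts two of them.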
DATA INTERPRETATION
Interpreting perf data requires an understanding of both the system architecture and the monitored application. Raw counts mean little on their own; a high count for an event such as cache misses or branch mispredictions indicates a bottleneck only when it is large relative to the total work done (for example, misses per cache reference). The `perf report` and `perf annotate` commands show which functions and instructions contribute to the identified bottlenecks. Correlating the results with source code is essential for identifying optimization opportunities.
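Event counts are often easier to judge as ratios than as raw numbers. The sketch below computes two common ones, instructions per cycle and the cache miss ratio; the `ratio` helper is hypothetical and the counts are illustrative values, not measurements.

```shell
# ratio NUM DEN: print NUM/DEN to three decimals, or "n/a" if DEN is zero.
ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN { if (d == 0) print "n/a"; else printf "%.3f\n", n / d }'
}

# Illustrative counts only, as they might be copied from `perf stat` output.
instructions=1500000000
cycles=1000000000
cache_refs=40000000
cache_misses=6000000

echo "instructions per cycle: $(ratio "$instructions" "$cycles")"
echo "cache miss ratio: $(ratio "$cache_misses" "$cache_refs")"
```

With these sample numbers the script reports 1.500 instructions per cycle and a miss ratio of 0.150, that is, 15% of cache references missed.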
HISTORY
perf was merged into the mainline Linux kernel in version 2.6.31 (2009), initially under the name Performance Counters for Linux, to provide a more unified and efficient mechanism for performance monitoring than earlier tools such as OProfile. Its development has been ongoing, with continuous improvements in event support, analysis capabilities, and user interface. Initially, its use was mostly confined to kernel developers, but it has since become more widely adopted by application developers and system administrators for general performance tuning.