LinuxCommandLibrary

perf

Profile Linux kernel and user space performance

TLDR

Display basic performance counter stats for a command

$ perf stat [gcc hello.c]

Display system-wide real-time performance counter profile
$ sudo perf top

Run a command and record its profile into perf.data
$ sudo perf record [command]

Record the profile of an existing process into perf.data
$ sudo perf record [-p|--pid] [pid]

Read perf.data (created by perf record) and display the profile
$ sudo perf report

SYNOPSIS

perf [options] subcommand [subcommand_options] [arguments]

PARAMETERS

record
    Records performance counter data into perf.data, which can then be analyzed by perf report.

report
    Reads and displays performance counter data from perf.data, providing various views like call graphs and hot spots.

stat
    Runs a command and collects performance counter statistics for its execution, offering a summary of events.

top
    Provides a dynamic, real-time view of the top system performance events, similar to top for processes.

list
    Lists all available performance events (hardware, software, tracepoints, etc.) that can be monitored.

probe
    Defines new dynamic tracepoints (kprobes or uprobes) for custom function entry/exit tracing.

annotate
    Annotates disassembled code (interleaved with source lines when debug information is available) with performance data, showing the share of samples per instruction.

kmem
    Analyzes kernel memory events, such as allocations and frees.
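
A common way to combine these subcommands is to record a command with call stacks and then print a text report. The following is a minimal sketch, assuming perf is installed and the caller has sufficient privileges; the function name profile_with_callgraph and the 99 Hz sampling rate are illustrative choices, not part of perf itself.

```shell
# Record a command with call stacks, then print a plain-text report.
# profile_with_callgraph is a hypothetical helper name; 99 Hz is an arbitrary rate.
profile_with_callgraph() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found in PATH" >&2
        return 1
    fi
    # -F sets the sampling frequency and -g records call graphs;
    # "--" separates perf's own options from the profiled command.
    perf record -F 99 -g -- "$@" || return 1
    # --stdio prints the report as text instead of opening the interactive UI.
    perf report --stdio
}

# Example invocation (writes perf.data in the current directory):
#   profile_with_callgraph sleep 1
```

Recording with -g makes perf.data larger, so it is usually worth enabling only when the call-graph view is needed.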

DESCRIPTION

perf is a command-line utility for performance analysis on Linux. It uses the kernel's perf_events subsystem to read the CPU's performance monitoring unit (PMU) counters and to collect data from tracepoints, kprobes, and uprobes.

It can profile CPU cycles, cache hits and misses, branch mispredictions, page faults, and system calls, in both user-space applications and the kernel. Developers and system administrators use it to locate performance bottlenecks, optimize code, and understand system resource utilization.

CAVEATS

Many perf operations, especially system-wide profiling and kernel-level tracing, require root privileges or a sufficiently permissive kernel.perf_event_paranoid setting.

While powerful, perf can introduce some overhead to the system being monitored, particularly when tracing high-frequency events or using many counters. The features and accuracy of perf can also depend on the specific kernel version and hardware architecture.
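
Whether a given operation needs root depends largely on the kernel.perf_event_paranoid sysctl. The sketch below reads the current level and prints a rough interpretation. The descriptions are a summary of the kernel's sysctl documentation, not perf output; describe_paranoid is a hypothetical helper name, and some distributions may define additional, stricter levels that this sketch simply treats like level 2.

```shell
# Summarize what a perf_event_paranoid level means for unprivileged users.
# Restrictions are cumulative: each level keeps those of the levels below it.
describe_paranoid() {
    level=$1
    if [ "$level" -le -1 ]; then
        echo "almost all events allowed"
    elif [ "$level" -eq 0 ]; then
        echo "raw and ftrace function tracepoints disallowed"
    elif [ "$level" -eq 1 ]; then
        echo "CPU-wide event access also disallowed"
    else
        echo "kernel profiling also disallowed"
    fi
}

paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    current=$(cat "$paranoid_file")
    echo "perf_event_paranoid=$current: $(describe_paranoid "$current")"
else
    echo "perf_event_paranoid is not readable on this system"
fi
```

An administrator can change the level with sysctl (for example, sudo sysctl kernel.perf_event_paranoid=1), which trades some isolation for the ability to profile without root.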

EVENT TYPES

perf can monitor a wide range of event types:

  • Hardware events: CPU cycles, cache references/misses, branch instructions/mispredictions, etc., provided by the PMU.
  • Software events: Context switches, CPU migrations, page faults, CPU clock, etc., provided by the kernel.
  • Tracepoints: Static markers in the kernel code, used for tracing specific kernel operations.
  • Kprobes: Dynamic instrumentation points that can be placed on almost any kernel function.
  • Uprobes: Similar to kprobes, but for user-space functions, enabling dynamic tracing of applications.
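
Events from these categories are selected with the -e option, which takes a comma-separated list of names as shown by perf list. The following sketch builds such a list and wraps perf stat; join_events and count_events are hypothetical helper names, and the chosen events are only examples that may not all be supported on every machine (for instance inside some virtual machines).

```shell
# Join event names into the comma-separated list that the -e option expects.
join_events() {
    # In a subshell, "$*" joins the arguments using the first character of IFS.
    (IFS=,; printf '%s\n' "$*")
}

# Count the given events (first argument, comma-separated) while running a command.
count_events() {
    events=$1
    shift
    perf stat -e "$events" -- "$@"
}

# Two hardware events plus two software events.
events=$(join_events cycles instructions context-switches page-faults)
echo "$events"

# Example invocation, if perf is installed:
#   count_events "$events" sleep 1
# Tracepoints use a subsystem:name form, for example sched:sched_switch,
# and usually need elevated privileges.
```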

OUTPUT DATA FILE

When using perf record, the collected performance data is saved by default to a file named perf.data in the current directory; the -o option selects a different path. This binary file contains the recorded samples, including timestamps and, when requested, call stacks. It can then be analyzed offline with perf report.
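
To keep several profiles side by side, the data file can be named explicitly. This is a minimal sketch, assuming perf is installed; record_to_file and the file name baseline.data are hypothetical.

```shell
# Record a command into a named data file and report from that same file.
record_to_file() {
    data_file=$1
    shift
    # -o selects the output file for perf record (default: perf.data).
    perf record -o "$data_file" -- "$@" || return 1
    # -i selects the input file for perf report (default: perf.data).
    perf report -i "$data_file" --stdio
}

# Example invocation, if perf is installed:
#   record_to_file baseline.data sleep 1
```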

HISTORY

perf was developed primarily by Ingo Molnar and other kernel developers, and it was integrated into the Linux kernel source tree in 2009. It emerged as a unified and highly capable interface for various performance monitoring capabilities previously scattered or difficult to access. Its design allowed it to supersede older, less integrated tools like OProfile for many use cases, thanks to its tighter integration with the kernel and broader event support, including tracepoints and dynamic probes.

SEE ALSO

strace(1), ltrace(1), oprofile(1), valgrind(1), ftrace(5), dtrace(1M)
