gprof
Profile program execution time and function calls
TLDR
Compile a binary to the default a.out with profiling information and run it to produce gmon.out
Run gprof on the default a.out and gmon.out to obtain profile output
Run gprof on a named binary
Suppress the descriptions of the profile fields
Display routines that have zero usage
SYNOPSIS
`gprof` [options] [executable-file [profile-file ...]]
Example: gprof my_program gmon.out
Example: gprof -b -p my_program
PARAMETERS
-b
Don't print the verbose descriptions. This is useful for scripts that want to parse the output of `gprof`.
-p
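Print the flat profile, which shows the time spent in each function and how often it was called. When no output option is given, `gprof` prints both the flat profile and the call graph.
-q
Print the call graph, which shows parent/child relationships between functions and the time accumulated in each function and its descendants. When no output option is given, `gprof` prints both the flat profile and the call graph.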
-s
Summarize the information in the specified profile files into a single gmon.sum file. This is useful for combining profiling data from multiple runs.
-l
Enable line-by-line profiling. `gprof` attempts to discover what parts of a function are responsible for the time spent in it, reporting time per source line.
-z
Display functions that have zero usage (i.e., they were never called or had no time attributed to them). By default, such functions are suppressed.
-A
Print annotated source code. This option displays source code segments, annotated with execution counts and times, for functions that appear in the profile.
-e function_name
Do not print information about function_name (and its children) in the call graph. The function is still listed as a child of any function that calls it, but its index is shown as [not printed]. This option affects the call graph, not the flat profile. GNU `gprof` documents -e, -E, -f, and -F as deprecated in favor of symspec arguments to newer options such as -p and -q.
-f function_name
Limit the call graph to function_name and its children (and their children). This is the inverse of -e.
-E function_name
Like -e, but time spent in function_name (and in children not called from anywhere else) is also excluded when computing the percentages of time for the call graph.
-F function_name
Like -f, but only time spent in function_name and its children is used to compute total time and percentages of time for the call graph. This is the inverse of -E.
DESCRIPTION
`gprof` analyzes the execution profile of programs compiled with the -pg option. It reads the profiling data that the program writes at runtime (gmon.out by default) and presents it in human-readable form, showing where the program spends its time and which functions call which others. This helps locate performance bottlenecks. `gprof` provides two main views: a flat profile, showing the time spent in each function and the number of times it was called; and a call graph, showing the caller/callee relationships between functions and how much time each function and its descendants account for. It is commonly used with C, C++, and Fortran programs.
CAVEATS
`gprof` combines two mechanisms: code compiled with -pg records call counts and caller/callee arcs on every function entry, while execution time is estimated by sampling the program counter at regular intervals. The instrumentation adds runtime overhead that can alter program behavior and skew results, and the sampled times are statistical, so their accuracy depends on the sampling frequency and on how long the program runs. The program must be both compiled and linked with -pg. `gprof` is generally not suitable for profiling multithreaded applications accurately, as its call graph analysis and time accounting can be misleading in concurrent environments. Furthermore, it typically profiles only code within the main executable, not shared libraries, unless those are specifically compiled and linked for profiling.
PROFILING WORKFLOW
To use `gprof`, first compile and link the program with the GCC -pg option (e.g., `gcc -pg -o my_program my_program.c`). This option instruments the executable to record profiling data during execution. Then run the program normally (e.g., `./my_program`). When the program exits normally, by returning from main or calling exit, it writes a profile file named gmon.out (by default) in the current directory, overwriting any existing file of that name. Finally, run `gprof` on the executable and the profile file (e.g., `gprof my_program gmon.out`) to analyze the collected data.
OUTPUT INTERPRETATION
`gprof` output has two main sections: the flat profile and the call graph. The flat profile lists each function with the percentage of total execution time spent in it (% time), cumulative seconds, self seconds, the number of calls, the average self and total time per call, and the function name. This helps identify the functions that consume the most CPU time. The call graph shows how functions call each other. Each entry has an index, the percentage of total time spent in the function and its descendants (% time), the time spent in the function itself (self), the time spent in its descendants (children), the call counts (called), and the name; within an entry, the function's callers are listed above it and its callees below it. Analyzing the call graph shows how time propagates through the call chain and which callers are responsible for an expensive function.
HISTORY
Execution profiling on Unix predates `gprof`: the earlier prof(1) tool reported only a flat profile. The original gprof was written at the University of California, Berkeley by Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick, who described it in a 1982 paper, and it was distributed with BSD Unix. GNU `gprof`, the version shipped in GNU Binutils, was written by Jay Fenlason as a compatible implementation. While newer profiling tools such as perf and Valgrind have emerged, `gprof` remains widely used for basic function-level profiling of single-threaded applications because of its simplicity and its direct integration with GCC's -pg option.
SEE ALSO
gcc(1): The GNU Compiler Collection, used to compile programs with the -pg flag for `gprof`.
perf(1): A Linux profiling tool for performance monitoring, often more powerful and flexible than `gprof` for system-wide and multi-threaded profiling.
valgrind(1): A suite of dynamic analysis tools, including callgrind for detailed call-graph and cache profiling, often used as an alternative to `gprof`.