Linux Performance Monitoring and Tuning with perf, eBPF and ftrace

Performance analysis and tuning are critical skills for Linux system administrators, DevOps engineers, and performance engineers. Understanding where bottlenecks occur and how to optimize system behavior requires deep knowledge of Linux performance tools. This comprehensive guide explores three powerful performance analysis frameworks: perf, eBPF (Extended Berkeley Packet Filter), and ftrace, demonstrating how to diagnose and resolve performance issues in production systems.

Understanding Linux Performance Analysis

Performance analysis in Linux involves understanding multiple subsystems: CPU, memory, disk I/O, network, and application behavior. The key to effective performance tuning is identifying bottlenecks through methodical observation and measurement.

Performance Analysis Methodology

Effective performance analysis follows a systematic approach:

  1. Define the problem: Establish clear performance goals and metrics
  2. Measure current performance: Gather baseline metrics
  3. Identify bottlenecks: Determine limiting factors
  4. Hypothesize causes: Form theories about performance issues
  5. Test hypotheses: Use tools to validate or refute theories
  6. Implement solutions: Apply optimizations
  7. Verify improvements: Measure impact of changes
  8. Repeat: Continue iterative improvement

Common Performance Bottlenecks

Understanding typical bottleneck patterns helps focus analysis efforts:

  • CPU saturation: All CPU cores fully utilized, tasks waiting for CPU time
  • Memory pressure: Insufficient RAM, excessive swapping
  • Disk I/O bottleneck: Storage subsystem cannot keep up with demand
  • Network saturation: Network bandwidth exhausted or high latency
  • Lock contention: Threads waiting for locks, reducing parallelism
  • Context switching overhead: Excessive thread switching degrading performance
  • Cache misses: Poor CPU cache utilization reducing efficiency
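
As a quick first pass, each of the bottleneck classes above maps to a first-look command; the tools named here (perf and the BCC eBPF tools) are all covered in detail later in this guide:

## CPU saturation: which functions are hottest right now?
sudo perf top

## Memory pressure: page cache hit ratio
sudo cachestat-bpfcc 1

## Disk I/O: latency distribution over 5 seconds
sudo biolatency-bpfcc 5 1

## Scheduler pressure: run queue latency
sudo runqlat-bpfcc 5 1

## Network: TCP retransmissions
sudo tcpretrans-bpfcc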

Introduction to perf: The Performance Analysis Framework

perf is a powerful performance analysis tool built into the Linux kernel. It provides hardware and software event sampling, tracing, and profiling capabilities.

Installing perf

Installation varies by distribution:

## Debian/Ubuntu
sudo apt update
sudo apt install linux-tools-common linux-tools-generic linux-tools-$(uname -r)

## RHEL/CentOS/Fedora
sudo dnf install perf

## Arch Linux
sudo pacman -S perf

Verify installation:

perf --version

Basic perf Usage

System-wide CPU profiling:

## Record system-wide for 10 seconds
sudo perf record -a -g sleep 10

## View the recorded data
sudo perf report

## Record specific process
sudo perf record -p <PID> -g sleep 10

The -g flag enables call graph recording, providing stack traces that show function call relationships.
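
If the profiled binary was built without frame pointers, the default call graphs can come out truncated; perf can instead unwind stacks from DWARF debug information (larger perf.data files, but more complete stacks):

## DWARF-based stack unwinding instead of frame pointers
sudo perf record --call-graph dwarf -p <PID> -- sleep 10
sudo perf report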

Real-time performance monitoring:

## Top-like interface showing hottest functions
sudo perf top

## Monitor specific CPU
sudo perf top -C 0

## Monitor specific process
sudo perf top -p <PID>

Performance Counter Statistics

perf stat provides high-level performance counter statistics:

## Measure command execution
perf stat ./my-application

## Detailed counter statistics
perf stat -d ./my-application

## Custom event selection
perf stat -e cycles,instructions,cache-references,cache-misses ./my-application

## System-wide statistics for duration
sudo perf stat -a sleep 10

Example output interpretation:

Performance counter stats for './my-application':

         1,234.56 msec task-clock                #    0.995 CPUs utilized
               123      context-switches          #    0.100 K/sec
                12      cpu-migrations            #    0.010 K/sec
             1,234      page-faults               #    1.000 K/sec
     4,567,890,123      cycles                    #    3.700 GHz
     6,789,012,345      instructions              #    1.49  insn per cycle
     1,234,567,890      branches                  #  1000.000 M/sec
        12,345,678      branch-misses             #    1.00% of all branches

Key metrics to understand:

  • Cycles: CPU clock cycles consumed
  • Instructions: Number of instructions executed
  • IPC (Instructions Per Cycle): Efficiency metric; higher is better. Modern cores typically sustain 1-4, and values well below 1 usually indicate stalls (often memory-bound code)
  • Cache references/misses: Memory access patterns
  • Branch misses: Branch prediction failures
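
To feed these counters into scripts or dashboards, perf stat can emit machine-readable output with -x; a minimal sketch that derives the cache miss ratio with awk (the file name and the awk post-processing are illustrative, not part of perf):

## CSV-style output (perf stat writes counters to stderr)
perf stat -x, -e cache-references,cache-misses ./my-application 2> counters.csv

## Derive the miss ratio from the two counter lines
awk -F, '/cache-references/ {refs=$1} /cache-misses/ {miss=$1} END {printf "miss ratio: %.2f%%\n", 100*miss/refs}' counters.csv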

CPU Flame Graphs with perf

Flame graphs visualize stack traces, making it easy to identify hot code paths:

## Record with call stacks
sudo perf record -F 99 -a -g -- sleep 30

Clone the FlameGraph tools (the commands below assume the clone sits in the current directory):

git clone https://github.com/brendangregg/FlameGraph

## Generate flame graph using the FlameGraph scripts
sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flamegraph.svg

Interpret flame graphs:

  • Width: Represents time spent in function (wider = more time)
  • Height: Call stack depth (top of stack is deepest)
  • Color: Typically random, sometimes indicates library/module
  • Plateaus: Functions consuming significant CPU time
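
To compare two recordings, for example before and after an optimization, the FlameGraph repository also includes difffolded.pl for differential flame graphs; the perf.data file names below are illustrative:

## Collapse both recordings, then render the difference
sudo perf script -i perf.data.before | ./FlameGraph/stackcollapse-perf.pl > before.folded
sudo perf script -i perf.data.after | ./FlameGraph/stackcollapse-perf.pl > after.folded
./FlameGraph/difffolded.pl before.folded after.folded | ./FlameGraph/flamegraph.pl > diff.svg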

Hardware Event Monitoring

Monitor hardware-level events:

## List available hardware events
perf list hardware

## Monitor cache events
sudo perf stat -e cache-references,cache-misses,L1-dcache-loads,L1-dcache-load-misses ./my-app

## Monitor branch prediction
sudo perf stat -e branches,branch-misses ./my-app

## Memory access patterns
sudo perf stat -e dTLB-loads,dTLB-load-misses,iTLB-loads,iTLB-load-misses ./my-app

Tracepoint Analysis

perf can trace kernel and userspace tracepoints:

## List available tracepoints
perf list tracepoint

## Trace system calls
sudo perf trace -p <PID>

## Record specific tracepoints
sudo perf record -e 'syscalls:sys_enter_*' -a sleep 5
sudo perf script

## Trace scheduling events
sudo perf record -e 'sched:*' -a sleep 5

eBPF: Dynamic Kernel Instrumentation

eBPF (Extended Berkeley Packet Filter) allows running sandboxed programs in the kernel without modifying kernel source code or loading kernel modules. It’s revolutionized Linux observability and performance analysis.

Understanding eBPF Architecture

eBPF programs are:

  • Written in restricted C
  • Compiled to eBPF bytecode
  • Verified by kernel for safety
  • JIT-compiled to native code
  • Attached to kernel events (kprobes, uprobes, tracepoints, etc.)
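
To see which attach points exist on a given kernel, bpftrace (introduced later in this guide) can list them by pattern:

## List matching tracepoints and kprobes
sudo bpftrace -l 'tracepoint:syscalls:sys_enter_*'
sudo bpftrace -l 'kprobe:tcp_*'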

Installing BCC Tools

BCC (BPF Compiler Collection) provides high-level tools for eBPF:

## Debian/Ubuntu
sudo apt update
sudo apt install bpfcc-tools linux-headers-$(uname -r)

## RHEL/CentOS/Fedora
sudo dnf install bcc-tools kernel-devel

## Arch Linux
sudo pacman -S bcc bcc-tools

Verify installation:

ls /usr/share/bcc/tools/     # Fedora/RHEL and source builds
ls /usr/sbin/*-bpfcc         # Debian/Ubuntu (tools carry a -bpfcc suffix)

Essential BCC Tools

execsnoop: Trace new process execution:

sudo execsnoop-bpfcc

Shows every new process, useful for understanding system activity and detecting anomalies.

opensnoop: Trace file opens:

sudo opensnoop-bpfcc
sudo opensnoop-bpfcc -p <PID>
sudo opensnoop-bpfcc -n nginx

Identifies which files processes are accessing.

tcpconnect/tcpaccept: Trace TCP connections:

## Outbound connections
sudo tcpconnect-bpfcc

## Inbound connections
sudo tcpaccept-bpfcc

Essential for understanding network activity and debugging connectivity issues.

ext4slower: Trace slow ext4 filesystem operations:

## Show operations slower than 10ms
sudo ext4slower-bpfcc 10

Identifies I/O bottlenecks in filesystem operations.

biolatency: Block I/O latency histogram:

sudo biolatency-bpfcc

## Sample for 10 seconds with histogram
sudo biolatency-bpfcc 10 1

Provides distribution of I/O latencies, revealing storage performance characteristics.

cachestat: Page cache statistics:

sudo cachestat-bpfcc 1

Shows page cache hit ratio, indicating memory caching effectiveness.

funccount: Count function calls:

## Count kernel function calls
sudo funccount-bpfcc 'vfs_*'

## Count user-space function calls
sudo funccount-bpfcc 'c:malloc'

Useful for understanding function call frequency and hot paths.

trace: Trace arbitrary kernel and user functions:

## Trace kernel function with arguments
sudo trace-bpfcc 'do_sys_open "%s", arg2'

## Trace user function in library
sudo trace-bpfcc 'c:malloc "size = %d", arg1'

## Conditional tracing
sudo trace-bpfcc 'do_sys_open (arg3 & 0x40) "O_CREAT flag used"'

profile: CPU profiler using sampling:

## System-wide CPU profiling
sudo profile-bpfcc -F 49 -f 30

## Profile specific process
sudo profile-bpfcc -p <PID> 30

Creates frequency counts of stack traces, similar to perf but using eBPF.
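
The folded output from -f plugs directly into the FlameGraph scripts, producing a CPU flame graph without an intermediate perf.data file; a sketch assuming the FlameGraph clone from earlier sits in the current directory (output file names are illustrative):

## eBPF-based CPU flame graph
sudo profile-bpfcc -F 49 -f 30 > profile.folded
./FlameGraph/flamegraph.pl profile.folded > profile-flame.svg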

Advanced BCC Tools

offcputime: Analyze off-CPU time (blocked tasks):

sudo offcputime-bpfcc 30

Shows time spent blocked (I/O wait, lock contention, etc.) rather than executing.
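
The same flame graph approach applies here: with -f (folded output), blocked stacks can be rendered as an off-CPU flame graph, where width represents blocked time rather than CPU time (file names are illustrative):

## Folded off-CPU stacks for one process over 30 seconds
sudo offcputime-bpfcc -f -p <PID> 30 > offcpu.folded
./FlameGraph/flamegraph.pl --color=io --countname=us offcpu.folded > offcpu.svg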

wakeuptime: Analyze thread wake-up sources:

sudo wakeuptime-bpfcc 30

Identifies what’s waking up threads, useful for investigating scheduling overhead.

llcstat: LLC (Last Level Cache) statistics:

sudo llcstat-bpfcc

Monitors LLC cache hit/miss rates per process.

tcpretrans: Trace TCP retransmissions:

sudo tcpretrans-bpfcc

Network performance issues often manifest as retransmissions.

runqlat: Run queue latency histogram:

sudo runqlat-bpfcc 10 1

Shows time tasks spend waiting in the CPU run queue before being scheduled.

Writing Custom eBPF Programs with BCC

Simple example counting clone() system calls per process:

#!/usr/bin/env python3
from time import sleep
from bcc import BPF

## eBPF program
prog = """
#include <uapi/linux/ptrace.h>

BPF_HASH(counts, u32);

int count_syscalls(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 *count, zero = 0;
    
    count = counts.lookup_or_try_init(&pid, &zero);
    if (count) {
        (*count)++;
    }
    return 0;
}
"""

## Load and attach (get_syscall_fnname resolves the arch-specific syscall symbol)
b = BPF(text=prog)
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="count_syscalls")

print("Tracing clone() calls... Hit Ctrl-C to end")
try:
    sleep(30)
except KeyboardInterrupt:
    pass

## Print results
print("\nSyscall counts by PID:")
for k, v in b["counts"].items():
    print(f"PID {k.value}: {v.value} syscalls")

This demonstrates eBPF’s power to dynamically instrument the kernel with custom logic.

bpftrace: High-Level eBPF Scripting

bpftrace provides a high-level language for eBPF:

## Install bpftrace
sudo apt install bpftrace  # Debian/Ubuntu
sudo dnf install bpftrace  # Fedora

## One-liners
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { @[comm] = count(); }'
sudo bpftrace -e 'kprobe:vfs_read { @bytes = hist(arg2); }'

## Trace TCP connections
sudo bpftrace -e 'kprobe:tcp_connect { printf("%s connecting\n", comm); }'

## Profile user stacks
sudo bpftrace -e 'profile:hz:99 /pid == 1234/ { @[ustack] = count(); }'
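
bpftrace also supports BEGIN/END blocks and interval probes for periodic output; a small sketch printing the system-wide syscall rate once per second:

## Syscalls per second, printed every second
sudo bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @syscalls = count(); } interval:s:1 { print(@syscalls); clear(@syscalls); }'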

bpftrace scripts can be saved to files for reuse:

## tcp-accept-lat.bt
#!/usr/bin/env bpftrace

kprobe:inet_csk_accept {
    @start[tid] = nsecs;
}

kretprobe:inet_csk_accept /@start[tid]/ {
    $dur = nsecs - @start[tid];
    @accept_lat_us = hist($dur / 1000);
    delete(@start[tid]);
}

END {
    clear(@start);
}

Run with:

sudo bpftrace tcp-accept-lat.bt

ftrace: Function Tracer Built Into the Kernel

ftrace is a tracing framework built directly into the Linux kernel, providing powerful tracing capabilities without requiring additional userspace tools.

Enabling and Using ftrace

ftrace is controlled through files in /sys/kernel/debug/tracing/ (requires debugfs mounted); on recent kernels the same interface is also exposed via tracefs at /sys/kernel/tracing:

## Ensure debugfs is mounted (or use tracefs at /sys/kernel/tracing on newer kernels)
sudo mount -t debugfs none /sys/kernel/debug

## Change to the tracing directory (the echo commands below need a root shell, e.g. sudo -i)
cd /sys/kernel/debug/tracing

Basic ftrace Usage

List available tracers:

cat available_tracers

Common tracers include:

  • function: Traces all kernel functions
  • function_graph: Shows call graph with entry/exit
  • nop: No tracing (default)
  • irqsoff: Traces interrupt-disabled sections
  • preemptoff: Traces preemption-disabled sections

Enable function tracing:

## Set current tracer
echo function > current_tracer

## Start tracing
echo 1 > tracing_on

## Run workload
sleep 1

## Stop tracing
echo 0 > tracing_on

## View trace
cat trace | head -50

## Clear trace buffer
echo > trace
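
The trace file is a snapshot of the ring buffer; to consume events as a continuous stream instead (the buffer is drained as it is read), use trace_pipe:

## Stream trace events live (Ctrl-C to stop)
cat trace_pipe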

Function Graph Tracer

Shows function call hierarchy:

echo function_graph > current_tracer
echo 1 > tracing_on
sleep 1
echo 0 > tracing_on
cat trace | head -100

Output shows entry/exit of functions with duration:

 1)   0.123 us    |  mutex_unlock();
 1)   1.456 us    |  _cond_resched();
 1)               |  __alloc_pages_nodemask() {
 1)   0.234 us    |    get_page_from_freelist();
 1)   2.345 us    |  }

Filtering Functions

Set function filter:

## Trace only specific functions
echo do_sys_open > set_ftrace_filter
echo vfs_read >> set_ftrace_filter

## Use wildcards
echo 'tcp_*' > set_ftrace_filter

## Exclude functions
echo '!kfree' >> set_ftrace_filter

## Clear filter
echo > set_ftrace_filter
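
The function_graph tracer has a separate filter, set_graph_function, which limits graphing to a chosen entry function and everything it calls; a sketch using do_sys_open as the entry point:

## Graph only do_sys_open and its callees
echo do_sys_open > set_graph_function
echo function_graph > current_tracer
echo 1 > tracing_on
sleep 1
echo 0 > tracing_on
cat trace | head -50

## Clear the graph filter
echo > set_graph_function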

Event Tracing

ftrace provides access to kernel tracepoints:

## List available events
cat available_events

## Enable specific event
echo 1 > events/sched/sched_switch/enable

## Enable all events in subsystem
echo 1 > events/syscalls/enable

## View events
cat trace

## Disable events
echo 0 > events/enable
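
On kernels built with CONFIG_HIST_TRIGGERS, events can also be aggregated in-kernel through histogram triggers, avoiding the need to post-process a large raw trace; a sketch counting context switches per outgoing PID:

## Aggregate sched_switch events by prev_pid
echo 'hist:keys=prev_pid' > events/sched/sched_switch/trigger

## Read the in-kernel histogram
cat events/sched/sched_switch/hist

## Remove the trigger
echo '!hist:keys=prev_pid' > events/sched/sched_switch/trigger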

Function Profiling

Profile function execution time:

## Enable profiling
echo 1 > function_profile_enabled

## Run workload
sleep 5

## View results
cat trace_stat/function*

## Disable profiling
echo 0 > function_profile_enabled

Filtering with ftrace

PID filtering:

## Trace only specific PID
echo 1234 > set_ftrace_pid

## Trace multiple PIDs
echo 1234 > set_ftrace_pid
echo 5678 >> set_ftrace_pid

Event filtering:

## Filter events by condition
echo 'common_pid == 1234' > events/sched/sched_switch/filter
echo 'count > 1024' > events/syscalls/sys_enter_write/filter

trace-cmd: Front-end for ftrace

trace-cmd provides easier ftrace usage:

## Install trace-cmd
sudo apt install trace-cmd

## Record function trace
sudo trace-cmd record -p function -l do_sys_open sleep 1

## View recording
sudo trace-cmd report

## Record with function graph
sudo trace-cmd record -p function_graph sleep 1

## Record specific events
sudo trace-cmd record -e sched -e syscalls sleep 5
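
trace-cmd can also scope a recording to a single process, either an existing PID with -P or a freshly launched command with -F (the ./my-application binary is a placeholder):

## Trace scheduling events for an existing PID only
sudo trace-cmd record -e sched -P <PID> sleep 5

## Launch a command and trace only that process
sudo trace-cmd record -p function_graph -F ./my-application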

KernelShark: GUI for ftrace

KernelShark visualizes trace data:

## Install KernelShark
sudo apt install kernelshark

## Record trace
sudo trace-cmd record -e all sleep 5

## Open in GUI
kernelshark trace.dat

Practical Performance Analysis Scenarios

Scenario 1: High CPU Usage

Problem: Application consuming excessive CPU.

Analysis approach:

## 1. Identify hot functions with perf
sudo perf record -F 99 -p <PID> -g -- sleep 30
sudo perf report --stdio

## 2. Check CPU cache efficiency
sudo perf stat -e cache-references,cache-misses,instructions,cycles -p <PID> sleep 10

## 3. Profile with eBPF
sudo profile-bpfcc -p <PID> 30

## 4. Examine call frequency of a suspect function (here: libc malloc)
sudo funccount-bpfcc -p <PID> -d 10 'c:malloc'

Scenario 2: Slow I/O Performance

Problem: Application experiencing slow disk I/O.

Analysis approach:

## 1. Check I/O latency distribution
sudo biolatency-bpfcc 5 1

## 2. Identify slow operations
sudo ext4slower-bpfcc 10

## 3. Track file opens
sudo opensnoop-bpfcc -p <PID>

## 4. Monitor I/O patterns with ftrace
sudo trace-cmd record -e block sleep 10
sudo trace-cmd report

Scenario 3: Network Performance Issues

Problem: Network throughput lower than expected.

Analysis approach:

## 1. Monitor TCP connections
sudo tcpconnect-bpfcc
sudo tcpaccept-bpfcc

## 2. Check for retransmissions
sudo tcpretrans-bpfcc

## 3. Trace network syscalls with perf
sudo perf trace -e 'syscalls:sys_enter_send*,syscalls:sys_enter_recv*' -p <PID>

## 4. Profile the application's on-CPU time (user and kernel stacks)
sudo profile-bpfcc -p <PID> 30

Scenario 4: Lock Contention

Problem: Application experiencing lock contention.

Analysis approach:

## 1. Analyze off-CPU time
sudo offcputime-bpfcc -p <PID> 30

## 2. Check futex operations with perf
sudo perf trace -e 'syscalls:sys_enter_futex' -p <PID>

## 3. Function-level contention analysis
sudo funclatency-bpfcc -p <PID> c:pthread_mutex_lock

## 4. Stack traces of blocked threads
sudo bpftrace -e 'kprobe:finish_task_switch /pid == <PID>/ { @[kstack, ustack] = count(); }'

Performance Tuning Best Practices

Measurement and Baselines

  1. Establish baselines: Measure normal performance before issues occur
  2. Consistent methodology: Use same tools and metrics for comparison
  3. Document findings: Record observations and analysis
  4. Reproduce issues: Verify problems are consistent before optimization

Optimization Strategy

  1. Measure first: Never optimize without measurement
  2. Focus on bottlenecks: Optimize the slowest component first
  3. Change one thing: Isolate impact of individual optimizations
  4. Verify improvements: Measure after each change
  5. Consider trade-offs: Balance performance against complexity and maintainability

Tool Selection Guidelines

Use perf when:

  • CPU profiling and optimization
  • Hardware counter analysis needed
  • System-wide performance analysis
  • Detailed call graphs required

Use eBPF/BCC when:

  • Dynamic tracing without system restart
  • Minimal overhead required
  • Custom metrics needed
  • Production system analysis with safety guarantees

Use ftrace when:

  • Kernel function-level tracing needed
  • No external tools available
  • Understanding kernel code paths
  • Low-level kernel debugging

Conclusion

Linux provides an exceptionally powerful suite of performance analysis tools. perf offers comprehensive CPU profiling and hardware event monitoring. eBPF enables safe, dynamic kernel instrumentation with minimal overhead. ftrace provides built-in kernel function tracing capabilities.

Mastering these tools requires practice and understanding of Linux internals, but the investment pays dividends in production troubleshooting and optimization. Modern performance analysis is no longer about guessing—these tools provide concrete data about system behavior, enabling evidence-based optimization decisions.

The key to effective performance analysis is methodical investigation: form hypotheses, gather data, test theories, and verify results. With perf, eBPF, and ftrace in your toolkit, you have the instrumentation needed to understand and optimize complex system behavior at every level from hardware to application.


Thank you for reading! If you have any feedback or comments, please send them to [email protected].