A Comprehensive Summary of Linux Performance Analysis Tools


I compiled this article out of interest in the Linux operating system and a strong desire to understand its internals. It can also serve as an index for testing your basic knowledge. The article touches every part of a system: without solid knowledge of computer systems, networking and operating systems, it is impossible to fully master the tools covered here. It is also part of a long-term series on system performance analysis and optimization.

This document is compiled from the Linux performance tuning tools material maintained by Brendan Gregg, a Linux expert and senior performance architect at Netflix, together with collected articles on Linux system performance optimization. It mainly explains the principles and performance testing tools covered in his blog.

Performance analysis tools

First, let’s look at a picture:

[Figure: Linux performance analysis tools diagram shared by Brendan Gregg]

The figure above is a performance analysis tools map shared by Brendan Gregg. Help for every tool in it is available through man. The following sections briefly introduce general usage:

Vmstat — virtual memory statistics

vmstat (virtual memory statistics) is a common Linux tool for monitoring memory. It reports the overall state of the operating system's virtual memory, processes and CPU activity.

General usage: vmstat interval times samples every interval seconds, times times in total. If times is omitted, data is collected until the user stops it manually. For example:

[Figure: sample vmstat output at a 5-second interval]

You can use Ctrl + C to stop vmstat collecting data.

The first line shows averages since the system booted. From the second line on, each line shows what happened during the most recent 5-second interval. The column headers describe each field, as follows:

  • procs: the r column shows how many processes are waiting for CPU, and the b column shows how many processes are in uninterruptible sleep (waiting for IO).
  • memory: the swpd column shows how much memory has been swapped out to disk, and the remaining columns show how much memory is free (unused), how much is used as buffers, and how much is used as operating system cache (KiB by default).
  • swap: shows swap activity: how much memory is swapped in from disk (si) and swapped out to disk (so) per second.
  • io: shows how many blocks are read from (bi) and written to (bo) block devices per second, usually reflecting hard disk I/O.
  • system: shows the number of interrupts (in) and context switches (cs) per second.
  • cpu: shows the percentage of all CPU time spent on various activities, including running user code (us, non-kernel), running system code (sy, kernel), idle (id) and waiting for IO (wa).

Symptoms of insufficient memory: free memory drops sharply, and reclaiming buffers and cache does not help. swpd grows heavily, swapping (si, so) is frequent, disk reads and writes (bi, bo) increase, interrupts (in) increase, context switches (cs) increase, the number of processes waiting for IO (b) increases, and a large share of CPU time is spent waiting for IO (wa).
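As a rough illustration of the symptoms listed above, the following sketch filters vmstat output for samples that show swap traffic or a high share of IO wait. The function name flag_memory_pressure and the default 20% wa threshold are assumptions made for this example, and the column positions assume the classic 17-column vmstat layout; check the header on your system before relying on them.

```shell
# Flag vmstat samples that show swapping or heavy IO wait.
# Reads vmstat output on stdin. Assumed column order:
# r b swpd free buff cache si so bi bo in cs us sy id wa st
flag_memory_pressure() {
    awk -v wa_limit="${1:-20}" '
        $1 !~ /^[0-9]+$/ { next }     # skip the two header lines
        ++n == 1         { next }     # skip the averages-since-boot line
        $7 > 0 || $8 > 0 || $16 > wa_limit {
            printf "sample %d: si=%s so=%s b=%s wa=%s\n", n - 1, $7, $8, $2, $16
        }'
}

# Typical use (assumes vmstat is installed):
#   vmstat 5 12 | flag_memory_pressure 20
```

A single flagged sample proves little; sustained non-zero si/so together with a growing b column is the pattern worth investigating.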

Iostat — used to report CPU statistics

iostat reports central processing unit (CPU) statistics and input/output statistics for the whole system, adapters, TTY devices, disks and CD-ROMs. By default, it displays the same CPU usage information as vmstat. Use the following command to display extended device statistics:

[Figure: sample iostat extended device statistics output]

The first report shows averages since system boot. Each later report shows averages over the preceding interval, with one line per device.

Common abbreviations in Linux disk IO metrics: rq is request, r is read, w is write, qu is queue, sz is size, a is average, tm is time, and svc is service.

  • rrqm/s and wrqm/s: read and write requests merged per second. "Merged" means that the operating system takes several logical requests from the queue and combines them into one request to the actual disk.
  • r/s and w/s: the number of read and write requests sent to the device per second.
  • rsec/s and wsec/s: the number of sectors read and written per second.
  • avgrq-sz: the average request size, in sectors.
  • avgqu-sz: the average number of requests waiting in the device queue.
  • await: the average time spent per IO request, including time in the queue.
  • svctm: the actual request (service) time.
  • %util: the percentage of time during which at least one request was active.
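To make the definitions above concrete, here is a minimal sketch that derives a few of these figures from two snapshots of /proc/diskstats taken a known number of seconds apart. The function name disk_rates and its argument order are assumptions for this example, and the arithmetic follows the field meanings documented for /proc/diskstats rather than iostat's actual source code.

```shell
# Derive iostat-style figures for one device from two /proc/diskstats snapshots.
# usage: disk_rates BEFORE_FILE AFTER_FILE DEVICE INTERVAL_SECONDS
# Fields used: $4 reads, $5 reads merged, $7 ms reading,
#              $8 writes, $9 writes merged, $11 ms writing, $13 ms doing IO
disk_rates() {
    awk -v dev="$3" -v secs="$4" '
        $3 != dev { next }
        NR == FNR { for (i = 4; i <= 13; i++) before[i] = $i; next }
        {
            reads   = $4 - before[4];  writes = $8 - before[8]
            rmerge  = $5 - before[5];  wmerge = $9 - before[9]
            wait_ms = ($7 - before[7]) + ($11 - before[11])
            busy_ms = $13 - before[13]
            ios     = reads + writes
            printf "r/s=%.1f w/s=%.1f rrqm/s=%.1f wrqm/s=%.1f await=%.1f %%util=%.1f\n",
                reads / secs, writes / secs, rmerge / secs, wmerge / secs,
                (ios ? wait_ms / ios : 0), 100 * busy_ms / (secs * 1000)
        }' "$1" "$2"
}

# Typical use:
#   cat /proc/diskstats > before; sleep 5; cat /proc/diskstats > after
#   disk_rates before after sda 5
```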

Dstat — system monitoring tool

dstat displays CPU usage, disk IO, network packets sent and received, and paging activity. The output is colored and readable, and it is more detailed and intuitive than the output of vmstat and iostat. You can run the command with no arguments or pass specific options.

For example: dstat -cdlmnpsy

[Figure: sample dstat output]

Iotop — Linux process real-time monitoring tool

The iotop command displays disk IO specifically. It is a top-like tool for monitoring disk I/O usage: its interface resembles top and it shows which process generates the IO load, including PID, user, I/O, process and other related information.

It can also be used non-interactively:

iotop -bod interval shows the I/O of each process. pidstat can be used as well: pidstat -d interval.

Pidstat — monitoring system resources

pidstat is mainly used to monitor the system resources consumed by all processes or by specified processes, such as CPU, memory, device IO, task switching and threads.

Usage:

# Report I/O usage
pidstat -d interval
# Report CPU usage
pidstat -u interval
# Report memory usage
pidstat -r interval


top

The summary area of top displays five aspects of system performance information:

  • Load: current time, number of logged-in users, system load averages;
  • Processes: running, sleeping, stopped, zombie;
  • CPU: user mode, kernel mode, nice, idle, waiting for IO, interrupts, etc.;
  • Memory: total, used, free (from the system's perspective), buffers, cache;
  • Swap: total, used, free.

The task area displays by default: process ID, effective user, process priority, nice value, virtual memory used by the process, physical memory and shared memory, process state, CPU utilization, memory utilization, cumulative CPU time, and the process command line.


htop

htop is an interactive process viewer for Linux. It is a text-mode application (for the console or an X terminal) and requires ncurses.

Htop allows users to operate interactively, supports color themes, can scroll horizontally or vertically through the process list, and supports mouse operation.

Compared with top, htop has the following advantages:

  • You can scroll through the list of processes horizontally or vertically to see all processes and the complete command line.
  • On startup, it is faster than top.
  • There is no need to enter the process number when killing the process.
  • Htop supports mouse operation.


mpstat

mpstat is short for multiprocessor statistics. It is a real-time system monitoring tool that reports CPU-related statistics, which are stored in the /proc/stat file. On a multi-CPU system, it can show not only the average state of all CPUs but also the state of a specific CPU. Common usage:

mpstat -P ALL interval times
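Since these statistics come from /proc/stat, a per-CPU busy percentage can be sketched by comparing two snapshots of that file. The function name cpu_busy is an assumption for this example, and so is the choice to count both idle and iowait time as not busy; mpstat itself reports those as separate columns.

```shell
# Per-CPU busy percentage from two /proc/stat snapshots.
# usage: cpu_busy BEFORE_FILE AFTER_FILE
# cpu line fields: name user nice system idle iowait irq softirq steal ...
cpu_busy() {
    awk '
        $1 !~ /^cpu/ { next }
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i    # user through steal
            idle = $5 + $6                           # idle + iowait
        }
        NR == FNR { t0[$1] = total; i0[$1] = idle; next }
        {
            dt = total - t0[$1]
            busy = dt ? 100 * (dt - (idle - i0[$1])) / dt : 0
            printf "%s %.1f\n", ($1 == "cpu" ? "all" : $1), busy
        }' "$1" "$2"
}

# Typical use:
#   cat /proc/stat > before; sleep 5; cat /proc/stat > after
#   cpu_busy before after
```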


netstat

netstat displays statistics for the IP, TCP, UDP and ICMP protocols. It is generally used to check the network connections on each port of the machine.

Common usage:

netstat -npl    # Check whether the port you want to open is already open.
netstat -rn     # Print the routing table.
netstat -in     # Show interface information: the MTU of each interface, input packets, input errors, output packets, output errors, collisions and the length of the current output queue.

PS — displays the status of the current process

ps has too many options to list here. Refer to man ps for details.

Common usage:

# Find processes by name (hsserver and hundsun are example names)
ps aux | grep hsserver
ps -ef | grep hundsun
# One way to kill a program
ps aux | grep mysqld | grep -v grep | awk '{print $2}' | xargs kill -9
# Kill zombie processes (a zombie is already dead and only disappears once its parent reaps it or exits)
ps -eal | awk '{if ($2 == "Z") {print $4}}' | xargs kill -9
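Because a zombie only goes away once its parent collects it, it is often more useful to list each zombie together with its parent PID than to signal the zombie itself. The sketch below does that by filtering ps output. The function name list_zombies is an assumption for this example, and it expects the four columns requested in the usage comment.

```shell
# List zombie processes with their parent PIDs.
# Expects input with the columns: PID PPID STAT COMMAND
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ {
        print "zombie pid=" $1 " parent=" $2 " cmd=" $4
    }'
}

# Typical use:
#   ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID shows which process is failing to reap its children; fixing or restarting that process is what clears the zombies.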


strace

strace traces the system calls made and the signals received during program execution, which helps analyze abnormal conditions encountered while running a program or command.

For example, to see which configuration files mysqld tries to load on Linux, you can run the following command (on 64-bit systems the relevant call may be named stat or newfstatat rather than stat64):

strace -e stat64 mysqld --print-defaults > /dev/null


uptime

uptime prints how long the system has been running and the system load averages. The last three numbers in its output are the system load averages over the past 1, 5 and 15 minutes.
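For scripting, the three load averages can be pulled out of the uptime line with a small filter. This is a sketch that assumes the usual Linux wording "load average: " in the output; the function name load_averages is made up for this example.

```shell
# Print the three load averages found in a line of uptime output on stdin.
load_averages() {
    awk -F'load average: ' 'NF > 1 {
        split($2, avg, /, */)
        printf "1min=%s 5min=%s 15min=%s\n", avg[1], avg[2], avg[3]
    }'
}

# Typical use:
#   uptime | load_averages
```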


lsof

lsof (list open files) is a tool that lists the files currently open on the system. With lsof you can inspect this list for system diagnosis and troubleshooting. Common usage:

# See which processes are using a file system
lsof /boot
# See which process is using a port
lsof -i :3306
# See which files a user has open
lsof -u username
# See which files a process has open
lsof -p 4838
# See network connections to a remote host
lsof -i @


perf

perf is a system performance tuning tool built into the Linux kernel. Its advantage is its tight integration with the kernel: it is among the first tools to support new features added to the kernel. It can show hotspot functions and the cache miss rate, helping developers optimize program performance.

The basic principle behind performance tuning tools such as perf and oprofile is to sample the monitored target. In the simplest case, sampling is driven by the tick interrupt: a sampling point fires in the tick interrupt handler and records the program's current context. If a program spends 90% of its time in function foo(), about 90% of the sampling points should land in the context of foo(). Any single run is subject to chance, but the inference becomes reliable as long as the sampling frequency is high enough and sampling runs long enough. Tick-triggered sampling therefore reveals which parts of a program consume the most time, so that analysis can focus on them.

Summary: with the commonly used performance commands above and the performance analysis tools diagram at the beginning of this article, you can get an initial sense of which tools (commands) apply to which aspect of performance during analysis.
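The statistical argument can be tried out with a small simulation. It does not use perf at all: it draws random sampling points for a hypothetical program that spends a given fraction of its time in foo() and reports the percentage of samples that landed there. The function name simulate_sampling and its parameters are assumptions for this sketch.

```shell
# Simulate tick-based sampling of a program that spends a fixed share
# of its time in foo(). Prints the percentage of samples that hit foo().
# usage: simulate_sampling SAMPLES SHARE SEED
simulate_sampling() {
    awk -v n="$1" -v share="$2" -v seed="$3" 'BEGIN {
        srand(seed)
        for (i = 0; i < n; i++)
            if (rand() < share) hits++     # this sample landed inside foo()
        printf "%.1f\n", 100 * hits / n
    }'
}

# Typical use: compare a short run with a long one
#   simulate_sampling 20 0.9 1
#   simulate_sampling 20000 0.9 1
```

With only a handful of samples the estimate can be several points away from 90, while a long run settles close to it, which is the reason sampling frequency and duration matter.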

Common performance testing tools

Once you are proficient with the performance analysis commands in the previous part, the next step is a set of performance testing tools. Here is a brief overview of several of them:

perf_events

A performance diagnostic tool released and maintained together with the Linux kernel source and developed by the kernel community. perf can be used for performance statistics and analysis of applications as well as of kernel code.

eBPF tools

Performance tracing tools built on bcc. eBPF maps can be used by custom eBPF programs, which are widely used in kernel tuning and can also read user-level asynchronous code. Importantly, this external data can be managed from user space: the key-value map data is created, added to, deleted from and otherwise manipulated in user space through the bpf system call.

perf-tools

A Linux performance analysis and tuning toolset based on perf_events (perf) and ftrace. perf-tools has few library dependencies and is easy to use. It supports Linux kernel 3.2 and above.

bcc(BPF Compiler Collection)

A performance analysis toolkit that uses eBPF. It is a toolkit for creating efficient kernel tracing and manipulation programs, and it includes several useful tools and examples. It uses extended BPF (Berkeley Packet Filter), formally known as eBPF, a new feature first added in Linux 3.15. Much of what bcc uses requires Linux 4.1 and above.

ktap

A new scripting dynamic tracing tool for Linux that lets users trace the Linux kernel dynamically. ktap is designed for interoperability, allowing users to tune, gain operational insights into, troubleshoot and extend kernels and applications. It is similar to SystemTap on Linux and DTrace on Solaris.

Flame Graphs

Flame graphs are a visualization of profiled software, generated from perf, SystemTap and ktap data, which allows the most frequent code paths to be identified quickly and accurately. It can be generated by using the development source code in

Linux observability tools | Linux performance observation tool

[Figure: Linux observability tools diagram]

The first basic tools to learn are as follows:

uptime, top (htop), mpstat, iostat, vmstat, free, ping, nicstat, dstat.

The advanced commands are as follows:

sar, netstat, pidstat, strace, tcpdump, blktrace, iotop, slabtop, sysctl, /proc.

Linux benchmarking tools | Linux performance evaluation tool

[Figure: Linux benchmarking tools diagram]

These are performance benchmarking tools. Each module can be tested with the corresponding tool. For a deeper understanding, refer to the attached documents below.

Linux tuning tools | Linux Performance Tuning Tools

[Figure: Linux tuning tools diagram]

These are performance tuning tools, which work mainly at the level of the Linux kernel source. For a deeper understanding, refer to the attached documents below.

Linux observability SAR | Linux performance observation tool

[Figure: Linux observability sar diagram]

sar (System Activity Reporter) is one of the most comprehensive system performance analysis tools on Linux. It can report system activity from many angles, including file reads and writes, system call usage, disk I/O, CPU efficiency, memory usage, process activity and IPC-related activity.

Common usage of sar:

sar [options] [-A] [-o file] t [n]

Where:

t          # The sampling interval; n is the number of samples, with a default of 1.
-o file    # Store the results in a file in binary format; file is the file name.
options    # Command line options.

