Performance diagnostics for Linux in 60,000 milliseconds

Time: 2021-09-23

When you find a system performance problem on a Linux server, what system indicators will you check in the first minute?
Netflix runs a large EC2 fleet on AWS and has a variety of performance analysis and monitoring tools. For example, we use Atlas to monitor the whole platform and Vector to analyze the performance of EC2 instances in real time. These tools solve most problems for us, but sometimes we still have to log in to a machine and use standard Linux performance tools to locate the issue.


In this article, the Netflix performance engineering team introduces the standard Linux command-line tools we use to analyze and locate problems within the first 60 seconds. In those 60 seconds, the following 10 commands give you a high-level view of the system and of the resource usage of the running processes. Among these indicators, we look first at errors and saturation, then at utilization. Errors and saturation are the easiest to interpret: saturation means the load on a resource (CPU, memory, disk) exceeds what it can handle, which shows up as a request queue building up or request wait times growing.

uptime
dmesg | tail
vmstat 1
mpstat -P ALL 1
pidstat 1
iostat -xz 1
free -m
sar -n DEV 1
sar -n TCP,ETCP 1
top

Some of these commands depend on the sysstat package. Working through them will also make you familiar with a general method for analyzing system performance problems: the USE method, which examines every resource (CPU, memory, disk, and so on) from three angles: utilization, saturation, and errors. As you analyze, keep track of the resources you have already ruled out; this narrows the scope and gives a clearer direction for the next step.
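If you run this checklist regularly, a small wrapper script saves typing. The sketch below is only a convenience, not part of the original method: it runs the ten commands in order, assumes the sysstat package is installed, and caps each sampling tool at five one-second intervals so everything exits on its own.

#!/bin/sh
# Minimal 60-second checklist sketch (assumes sysstat is installed).
# Each sampling tool is limited to 5 one-second intervals so it exits by itself.
uptime
dmesg | tail
vmstat 1 5
mpstat -P ALL 1 5
pidstat 1 5
iostat -xz 1 5
free -m
sar -n DEV 1 5
sar -n TCP,ETCP 1 5
top -b -n 1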

The following sections describe each command, using data from our production environment as examples. For a fuller description of these commands, check their man pages.

1. uptime

$ uptime
 23:51:26 up 21:31,  1 user,  load average: 30.02, 26.43, 19.02

This command gives a quick view of the system load averages, which you can read as a rough count of tasks waiting to run. On Linux this includes tasks that want to run on, or are running on, a CPU, as well as tasks blocked on I/O (uninterruptible sleep). It gives a general sense of the overall state of the system, but you still need other tools to learn more.

The three values are exponentially damped moving averages over 1-, 5- and 15-minute windows, which you can loosely read as the average load over each window. Comparing them shows how the load has changed over time. For example, if you are checking a problem and find the 1-minute value much lower than the 15-minute value, you may have logged in too late and missed the event.

In the example above, the 1-minute load average is about 30, up from 19 over the last 15 minutes, so the load has been rising. There are many possible causes; perhaps there are not enough CPUs. vmstat or mpstat, described below, can confirm where the time is going.
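A load average only means something relative to the CPU count. A quick, hedged way to put the two numbers side by side (the 32-CPU figure below is from the sample machine; yours will differ):

nproc                 # number of online CPUs (32 on the machine in the example)
cat /proc/loadavg     # first three fields: 1-, 5- and 15-minute load averages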

2. dmesg | tail

$ dmesg | tail
[1880957.563150] perl invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
[...]
[1880957.563400] Out of memory: Kill process 18694 (perl) score 246 or sacrifice child
[1880957.563408] Killed process 18694 (perl) total-vm:1972392kB, anon-rss:1953348kB, file-rss:0kB
[2320864.954447] TCP: Possible SYN flooding on port 7001. Dropping request.  Check SNMP counters.

This command shows the most recent system log messages. Here we are looking for errors that could cause performance problems. The example above includes the OOM killer firing and TCP dropping requests.

Don’t skip this step! dmesg is always worth checking.
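If the log is noisy, dmesg can be filtered a little. A small sketch, assuming a util-linux dmesg recent enough to support these options:

# Human-readable timestamps, warnings and errors only
dmesg -T --level=err,warn | tail -n 20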

3. vmstat 1

$ vmstat 1
procs ---------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
34  0    0 200889792  73708 591828    0    0     0     5    6   10 96  1  3  0  0
32  0    0 200889920  73708 591860    0    0     0   592 13284 4282 98  1  1  0  0
32  0    0 200890112  73708 591860    0    0     0     0  9501 2154 99  1  0  0  0
32  0    0 200889568  73712 591856    0    0     0    48 11900 2459 99  0  0  0  0
32  0    0 200890208  73712 591860    0    0     0     0 15898 4840 98  1  1  0  0
^C

vmstat shows statistics about virtual memory and the CPUs. In the example above, the 1 on the command line means print a summary every second. In this version of vmstat the first line shows averages since boot, so we can ignore it for now.

Indicators to view:

  • r: number of tasks in the runnable state, both running and waiting to run. This is a better indicator of CPU saturation than the load average because it does not include tasks waiting on I/O. When r exceeds the number of CPUs, the system is saturated.
  • free: free memory, in kilobytes.
  • si, so: pages swapped in and out per second. If these are non-zero, the system is running out of memory.
  • us, sy, id, wa, st: breakdown of CPU time, averaged across all CPUs: user time, system (kernel) time, idle time, I/O wait, and stolen time (time taken by other tenants or the hypervisor in a virtualized environment).

Adding user time (us) and system time (sy) tells you whether the CPUs are busy. A high I/O wait (wa) points to disk as the bottleneck; note that the CPUs are idle during this time, because the tasks are blocked waiting on disk I/O. You can treat I/O wait as another form of CPU idle, one that explains why the CPUs are idle.

Handling I/O necessarily consumes system time (sy). If system time is consistently high, say over 20%, it is worth digging further: the kernel may be handling I/O inefficiently.

In the example above, almost all CPU time is spent in user mode, so the application code is what is using the CPUs. CPU utilization (us + sy) is above 90%, which is not necessarily a problem; check saturation by comparing r with the number of CPUs.
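To compare r with the CPU count without eyeballing the columns, a rough one-liner (the awk field matches the vmstat layout shown above; remember the first data row is the since-boot average):

echo "CPUs: $(nproc)"
vmstat 1 5 | awk 'NR > 2 { print "runnable tasks:", $1 }'   # column 1 is r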

4. mpstat -P ALL 1

$ mpstat -P ALL 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015  _x86_64_  (32 CPU)

07:38:49 PM  CPU   %usr  %nice   %sys %iowait   %irq  %soft  %steal  %guest  %gnice  %idle
07:38:50 PM  all  98.47   0.00   0.75    0.00   0.00   0.00    0.00    0.00    0.00   0.78
07:38:50 PM    0  96.04   0.00   2.97    0.00   0.00   0.00    0.00    0.00    0.00   0.99
07:38:50 PM    1  97.00   0.00   1.00    0.00   0.00   0.00    0.00    0.00    0.00   2.00
07:38:50 PM    2  98.00   0.00   1.00    0.00   0.00   0.00    0.00    0.00    0.00   1.00
07:38:50 PM    3  96.97   0.00   0.00    0.00   0.00   0.00    0.00    0.00    0.00   3.03
[...]

This command prints the CPU time breakdown per CPU, which you can use to check whether the load is balanced across CPUs. A single hot CPU while the others are idle is evidence of a single-threaded application.
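If one CPU is hot while the rest are idle, per-thread statistics can confirm whether a single thread is responsible. A hedged follow-up using the thread view from the same sysstat package:

# Per-thread CPU usage for active tasks
pidstat -t 1 5
# Or limit to one process of interest (<PID> is a placeholder)
# pidstat -t -p <PID> 1 5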

5. pidstat 1

$ pidstat 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015    _x86_64_    (32 CPU)

07:41:02 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
07:41:03 PM     0         9    0.00    0.94    0.00    0.94     1  rcuos/0
07:41:03 PM     0      4214    5.66    5.66    0.00   11.32    15  mesos-slave
07:41:03 PM     0      4354    0.94    0.94    0.00    1.89     8  java
07:41:03 PM     0      6521 1596.23    1.89    0.00 1598.11    27  java
07:41:03 PM     0      6564 1571.70    7.55    0.00 1579.25    28  java
07:41:03 PM 60004     60154    0.94    4.72    0.00    5.66     9  pidstat

07:41:03 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
07:41:04 PM     0      4214    6.00    2.00    0.00    8.00    15  mesos-slave
07:41:04 PM     0      6521 1590.00    1.00    0.00 1591.00    27  java
07:41:04 PM     0      6564 1573.00   10.00    0.00 1583.00    28  java
07:41:04 PM   108      6718    1.00    0.00    0.00    1.00     0  snmp-pass
07:41:04 PM 60004     60154    1.00    4.00    0.00    5.00     9  pidstat
^C

pidstat is similar to top's per-process view, except that it prints a rolling summary every interval instead of clearing the screen. That makes it easy to watch a process's behavior over time, and also to copy and paste what you saw into your investigation notes.

The example above shows two Java processes responsible for most of the CPU consumption. The %CPU column here is summed across all CPUs; 1591%, for example, means that Java process is consuming almost 16 CPUs.
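The conversion from a summed %CPU to a number of CPUs is just a division by 100; a trivial sanity check using the figure from the sample output:

awk 'BEGIN { printf "%.2f CPUs\n", 1591 / 100 }'   # ~15.91 CPUs on this 32-CPU host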

6. iostat -xz 1

$ iostat -xz 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015  x86_64 (32 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          73.96    0.00    3.73    0.03    0.06   22.21

Device:   rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvda        0.00     0.23    0.21    0.18     4.52     2.08    34.37     0.00    9.98   13.80    5.42   2.44   0.09
xvdb        0.01     0.00    1.02    8.94   127.97   598.53   145.79     0.00    0.43    1.78    0.28   0.25   0.25
xvdc        0.01     0.00    1.02    8.86   127.79   595.94   146.50     0.00    0.45    1.82    0.30   0.27   0.26
dm-0        0.00     0.00    0.69    2.32    10.47    31.69    28.01     0.01    3.23    0.71    3.98   0.13   0.04
dm-1        0.00     0.00    0.00    0.94     0.01     3.78     8.00     0.33  345.84    0.04  346.81   0.01   0.00
dm-2        0.00     0.00    0.09    0.07     1.35     0.36    22.50     0.00    2.55    0.23

iostat is an important tool for understanding the current load and performance of block devices (disks). Key columns:

  • r/s, w/s, rkB/s, wkB/s: reads per second, writes per second, kilobytes read per second, and kilobytes written per second delivered to the device. These describe the workload; a performance problem may simply be due to excessive load.
  • await: average time for I/O requests issued to the device, including both queueing time and service time. Values larger than the expected norm suggest the device is saturated or has a problem.
  • avgqu-sz: average number of requests queued for the device. A queue length greater than 1 can indicate saturation.
  • %util: device utilization, the share of each second the device spent doing work. Utilization above about 60% typically leads to poor performance (which should also be visible in await), although it varies by device. Values close to 100% mean the disk is saturated.

If the block device is a logical device fronting many physical disks, 100% utilization may only mean that some I/O is being processed 100% of the time; the physical disks behind it may be far from saturated and able to handle much more load.

Also bear in mind that poor disk I/O performance does not necessarily mean the application will suffer. Applications have many ways to perform I/O asynchronously without blocking on it, and techniques such as read-ahead and write buffering can hide I/O latency.
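iostat also accepts a device list and an interval count, which is convenient when you only care about one disk. A minimal sketch, reusing xvda from the sample output above as a stand-in for your own device name:

# Five one-second samples for a single device
iostat -xz xvda 1 5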

7. free -m

$ free -m
             total       used       free     shared    buffers     cached
Mem:        245998      24545     221453         83         59        541
-/+ buffers/cache:      23944     222053
Swap:            0          0          0

The two rightmost columns show:

  • buffers: the buffer cache, used for block device I/O.
  • cached: the page cache, used by file systems.

We just want to check that these are not near zero; sizes close to zero can lead to higher disk I/O (confirm with iostat) and worse performance. The example above looks fine: both values are many megabytes.

Compared with the first line, -/+ buffers/cache gives a more accurate picture of memory usage. Linux uses otherwise-idle memory for caches and hands it back as soon as applications need it, so memory used for caches is effectively free memory. To explain this, someone even built a website: http://www.linuxatemyram.com/.
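On kernels 3.14 and newer, /proc/meminfo also exposes MemAvailable, the kernel's own estimate of memory available to applications, which is roughly what the -/+ buffers/cache line is approximating (newer versions of free show it as an "available" column). A quick check:

grep -E 'MemTotal|MemFree|MemAvailable' /proc/meminfo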

This can be confusing if you use ZFS: ZFS has its own file system cache that does not show up in free -m, so the system may appear to have little free memory while ZFS actually holds a large, reclaimable cache.

8. sar -n DEV 1

$ sar -n DEV 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015     _x86_64_    (32 CPU)

12:16:48 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
12:16:49 AM      eth0  18763.00   5032.00  20686.42    478.30      0.00      0.00      0.00      0.00
12:16:49 AM        lo     14.00     14.00      1.36      1.36      0.00      0.00      0.00      0.00
12:16:49 AM   docker0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

12:16:49 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
12:16:50 AM      eth0  19763.00   5101.00  21999.10    482.56      0.00      0.00      0.00      0.00
12:16:50 AM        lo     20.00     20.00      3.25      3.25      0.00      0.00      0.00      0.00
12:16:50 AM   docker0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
^C

This tool checks network interface throughput: rxkB/s and txkB/s measure the load and show whether a traffic limit is being approached. In the example above, the receive throughput on eth0 reaches about 22 MBytes/s, roughly 176 Mbit/s, well under a 1 Gbit/s limit.

This version also reports %ifutil, the interface utilization, which we also measure with Brendan Gregg's nicstat tool. As with nicstat, this value is hard to get right, and in the example above it does not appear to be working (it stays at 0.00).
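To know what 100% would even mean for an interface, it helps to check the link speed you are comparing against. A hedged sketch (eth0 is the interface from the sample; sysfs reports the speed in Mbit/s for physical NICs and may return -1 or an error for virtual devices):

cat /sys/class/net/eth0/speed   # link speed in Mbit/s, to compare with sar's rxkB/s and txkB/s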

9. sar -n TCP,ETCP 1

$ sar -n TCP,ETCP 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015    _x86_64_    (32 CPU)

12:17:19 AM  active/s passive/s    iseg/s    oseg/s
12:17:20 AM      1.00      0.00  10233.00  18846.00

12:17:19 AM  atmptf/s  estres/s retrans/s isegerr/s   orsts/s
12:17:20 AM      0.00      0.00      0.00      0.00      0.00

12:17:20 AM  active/s passive/s    iseg/s    oseg/s
12:17:21 AM      1.00      0.00   8359.00   6039.00

12:17:20 AM  atmptf/s  estres/s retrans/s isegerr/s   orsts/s
12:17:21 AM      0.00      0.00      0.00      0.00      0.00
^C

This is a summary of some important TCP indicators, including:

  • active/s: TCP connections actively opened per second by the local host, i.e. opened via the connect() system call.
  • passive/s: TCP connections passively opened per second, i.e. accepted by the local host via accept().
  • retrans/s: TCP retransmissions per second.
  • The active and passive counts are a rough measure of server load: connections accepted (passive) and downstream connections opened (active). You can loosely think of active as outbound and passive as inbound, but this is not strictly accurate (consider, for example, a localhost-to-localhost connection).

Retransmissions are a sign of a network or server problem: the network may be unreliable, or the server may be overloaded and dropping packets. The example above shows only one new connection per second.
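If retrans/s is non-zero and you want more detail, the kernel's cumulative protocol counters are worth a look. A hedged follow-up; netstat comes from net-tools and nstat from iproute2, so which one is installed varies by distribution:

netstat -s | grep -i retrans      # cumulative retransmission counters since boot
# nstat -az TcpRetransSegs        # iproute2 alternative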

10. top

$ top
top - 00:15:40 up 21:56,  1 user,  load average: 31.09, 29.87, 29.92
Tasks: 871 total,   1 running, 868 sleeping,   0 stopped,   2 zombie
%Cpu(s): 96.8 us,  0.4 sy,  0.0 ni,  2.7 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  25190241+total, 24921688 used, 22698073+free,    60448 buffers
KiB Swap:        0 total,        0 used,        0 free.   554208 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20248 root      20   0  0.227t 0.012t  18748 S  3090  5.2  29812:58 java
 4213 root      20   0 2722544  64640  44232 S  23.5  0.0 233:35.37 mesos-slave
66128 titancl+  20   0   24344   2332   1172 R   1.0  0.0   0:00.07 top
 5235 root      20   0 38.227g 547004  49996 S   0.7  0.2   2:02.74 java
 4299 root      20   0 20.015g 2.682g  16836 S   0.3  1.1  33:14.42 java
    1 root      20   0   33620   2920   1496 S   0.0  0.0   0:03.82 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.02 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:05.35 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0       0      0      0 S   0.0  0.0   0:06.94 kworker/u256:0
    8 root      20   0       0      0      0 S   0.0  0.0   2:38.05 rcu_sched

The top command covers many of the metrics described earlier. It is useful as a cross-check: if the numbers look very different from what the earlier commands showed, the system load is probably changing.

The disadvantage of top is that it is hard to see how these metrics behave over time; vmstat and pidstat, which produce rolling output, are better for that. Evidence of intermittent problems can also be lost if you do not pause the output quickly enough (Ctrl-S to pause, Ctrl-Q to continue), because the screen is cleared.
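If the screen clearing is the problem, top's batch mode produces plain scrolling output that can be redirected to a file, much like the other tools here. A small sketch:

top -b -n 1 | head -n 20   # one batch-mode snapshot; raise -n for repeated snapshots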

The above is the 60,000-millisecond Linux performance diagnosis shared by the Liangxu tutorial network.

Test source code: import { Component } from ‘@angular/core’; import { createSelector } from ‘@ngrx/store’; export interface State { counter1: number; counter2: number; } export const selectCounter1 = (state: State) => state.counter1; export const selectCounter2 = (state: State) => state.counter2; export const selectTotal = createSelector( selectCounter1, selectCounter2, (counter1, counter2) => counter1 + counter2 ); // […]