The operating system's load reflects how heavily applications are using system resources, which makes it a good starting point for finding optimization bottlenecks.
The system load average is the average number of processes that are in the runnable or uninterruptible state.
- Runnable: the process is either running on a CPU or ready and waiting for the scheduler to give it one.
- Uninterruptible: the process is blocked, typically waiting for I/O to complete.
On Linux, the uptime command is the usual way to check the load (the w and top commands report it as well):
    $ uptime
    16:33:56 up 69 days, 5:10, 1 user, load average: 0.14, 0.24, 0.29
The fields in this output are:
- 16:33:56: the current time
- up 69 days, 5:10: the system has been running for 69 days, 5 hours and 10 minutes
- 1 user: one user is currently logged in
- load average: 0.14, 0.24, 0.29: the system load average over the past 1, 5 and 15 minutes
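
As a minimal sketch, the three load averages can be pulled out of an uptime line with a short Python function. The name `parse_load_average` is an illustrative choice and not part of any tool. The sketch assumes the decimal point is a period, which may not hold in every locale.

```python
import re

def parse_load_average(uptime_line):
    """Return the 1-, 5- and 15-minute load averages from an uptime line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

# Sample line copied from the uptime output above.
sample = "16:33:56 up 69 days, 5:10, 1 user, load average: 0.14, 0.24, 0.29"
one_min, five_min, fifteen_min = parse_load_average(sample)
print(one_min, five_min, fifteen_min)
```

On a live Linux machine the same three numbers are also available without parsing, through the standard library call `os.getloadavg()`.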
Average load analysis
View the number of logical CPU cores:
    $ grep 'model name' /proc/cpuinfo | wc -l
    1
The output shows a single logical CPU core. Taking one core as an example, assume the CPU can handle at most 100 processes per minute:
- Load = 0: no process needs the CPU
- Load = 0.5: the CPU handles 50 processes
- Load = 1: the CPU handles 100 processes. The CPU is fully used, but the system still runs smoothly
- Load = 1.5: the CPU handles 100 processes and another 50 are queued waiting for it. The CPU is overloaded
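
The arithmetic behind this list can be written out as a small sketch. The function name `split_load` and the `capacity` parameter are hypothetical. The figure of 100 processes per minute is only the illustrative assumption used in the example, not a real kernel limit.

```python
def split_load(load, capacity=100):
    """Split a single-core load value into processes served and processes queued.

    capacity is the hypothetical "100 processes per minute" from the example.
    """
    demand = round(load * capacity)      # processes that want the CPU
    served = min(demand, capacity)       # the CPU cannot serve more than its capacity
    queued = max(demand - capacity, 0)   # the rest have to wait
    return served, queued

for load in (0, 0.5, 1, 1.5):
    served, queued = split_load(load)
    print("load=%s served=%d queued=%d" % (load, served, queued))
```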
For the system to run smoothly, the load should not exceed 1.0 per core, so that no process has to wait and every process is served immediately.
1.0 is therefore the key threshold: above it, the system is no longer in its best state. As a rule of thumb, 0.7 is an ideal value.
The healthy range also scales with the number of CPU cores. With 2 cores, the threshold is 2.0, and so on.
The 15-minute load average is generally used to evaluate the system's load.
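
A rough sketch of this rule divides the load by the number of cores and compares the result with the 0.7 and 1.0 thresholds mentioned above. The function name `assess_load` and the labels "healthy", "busy" and "overloaded" are illustrative choices, not terms used by any tool.

```python
def assess_load(load, cores, ideal=0.7, limit=1.0):
    """Classify a load average after normalizing it by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    per_core = load / cores
    if per_core <= ideal:
        return "healthy"      # at or below the rule-of-thumb ideal value
    if per_core <= limit:
        return "busy"         # the CPU is close to fully used
    return "overloaded"       # processes are queuing for the CPU

# 15-minute load from the uptime example, on the single-core machine.
print(assess_load(0.29, cores=1))
# The same per-core rule applied to a two-core machine.
print(assess_load(1.5, cores=2))
```

On a live system the core count could come from `os.cpu_count()`, which reports logical CPUs.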
2. w command
    $ w
    17:47:40 up 69 days, 6:24, 1 user, load average: 0.46, 0.26, 0.25
    USER     TTY      FROM           LOGIN@   IDLE   JCPU   PCPU WHAT
    lvinkim  pts/0    18.104.22.168  15:55    0.00s  0.02s  0.00s w
Line 1: same as the uptime output.
Line 2 onward: the list of currently logged-in users.
3. top command
    $ top
    top - 17:51:23 up 69 days, 6:28, 1 user, load average: 0.31, 0.30, 0.26
    Tasks: 99 total, 1 running, 98 sleeping, 0 stopped, 0 zombie
    Cpu(s): 2.3%us, 0.2%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 1922244k total, 1737480k used, 184764k free, 208576k buffers
    Swap: 0k total, 0k used, 0k free, 466732k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        1 root      20   0 19232 1004  708 S  0.0  0.1   0:01.17 init
        2 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kthreadd
    ...
Line 1: same as the uptime output.
Line 2: process counts.
- Tasks: 99 total: there are 99 processes in total
- 1 running: 1 process is using the CPU
- 98 sleeping: 98 processes are sleeping
- 0 stopped: no processes are stopped
- 0 zombie: there are no zombie processes
Line 3: CPU usage.
- us (user): share of CPU time used by user processes that have not been niced
- sy (system): share of CPU time used by the kernel and kernel processes
- ni (nice): share of CPU time used by user processes whose priority has been changed with nice
- id (idle): share of time the CPU is idle. If the system is slow while this value is high, CPU load is not the cause
- wa (iowait): share of time the CPU spends waiting for I/O operations. It helps when troubleshooting disk I/O problems and is usually read together with id
- hi (hardware IRQ): share of CPU time spent handling hardware interrupts
- si (software interrupts): share of CPU time spent handling software interrupts
- st (steal): share of CPU time taken away from this virtual machine by the hypervisor to run other tasks
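
To work with these fields programmatically, the CPU line can be turned into a dictionary. The sketch below is one possible approach. `parse_cpu_line` is a hypothetical helper, and the regular expression assumes each value is followed by a two-letter field name, as in the sample output above.

```python
import re

def parse_cpu_line(line):
    """Map each two-letter field name of a top CPU line to its percentage."""
    # Matches "2.3%us" as well as the spaced form "2.3 us".
    pairs = re.findall(r"([\d.]+)\s*%?\s*([a-z]{2})\b", line)
    return {name: float(value) for value, name in pairs}

# CPU line copied from the top output above.
cpu_line = "Cpu(s): 2.3%us, 0.2%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st"
cpu = parse_cpu_line(cpu_line)
print(cpu["us"], cpu["id"], cpu["wa"])
```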
Some situations to watch for:
- High us and low wa: the system is slow because processes are using a lot of CPU. This usually comes with a low id, meaning the CPU has very little idle time.
- Low wa and high id: a CPU bottleneck can be ruled out.
- High wa: the CPU spends much of its time waiting for I/O, so check swap usage first. Swap space lives on disk and is far slower than memory, so performance drops sharply once memory runs out and the system starts swapping. For this reason, disabling swap is generally recommended on servers with high performance requirements. If memory is sufficient and wa is still high, find out which process is generating the heavy I/O.
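
These three cases can be sketched as a simple decision function. The function name `diagnose` and its labels are illustrative. Every threshold is a hypothetical default chosen for this example, because the text gives no numeric cut-offs for "high" or "low".

```python
def diagnose(us, wa, idle, us_high=70.0, wa_high=20.0, wa_low=5.0, idle_high=70.0):
    """Return a rough label for the us / wa / id percentages from top.

    All thresholds are hypothetical defaults, not values defined by top.
    """
    if wa >= wa_high:
        # Check swap usage first, then look for I/O-heavy processes.
        return "io-wait"
    if us >= us_high and wa <= wa_low:
        # Processes are using most of the CPU.
        return "cpu-bound"
    if wa <= wa_low and idle >= idle_high:
        # The CPU is mostly idle, so the slowdown comes from elsewhere.
        return "cpu-not-bottleneck"
    return "inconclusive"

# Values taken from the sample top output (us=2.3, wa=0.0, id=97.4).
print(diagnose(us=2.3, wa=0.0, idle=97.4))
```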
Other load patterns can be judged case by case in practice.
4. iostat command
The iostat command shows I/O usage for the system's disks and partitions.
    $ iostat
    Linux 2.6.32-573.22.1.el6.x86_64 (sgs02) 01/20/2017 _x86_64_ (1 CPU)

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               2.29    0.00    0.25    0.04    0.00   97.41

    Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
    vda               1.15         3.48        21.88   21016084  131997520
Some noteworthy I/O indicators:
- Device: the disk name
- tps: number of I/O transfer requests per second
- Blk_read/s: number of blocks read per second. According to the iostat man page, a block here is a 512-byte sector on kernel 2.4 and later, which is not the filesystem block size reported by tune2fs
- Blk_wrtn/s: number of blocks written per second
- Blk_read: total number of blocks read
- Blk_wrtn: total number of blocks written
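
As a sketch of how these columns might be used, the device row can be parsed and its block counts converted to KiB. `parse_iostat_device` and `blocks_to_kib` are hypothetical helper names. The 512-byte block size is an assumption taken from the iostat man page's statement that a block equals a sector.

```python
SECTOR_BYTES = 512  # assumption: iostat blocks are 512-byte sectors

def parse_iostat_device(line):
    """Parse one device row of the default iostat report into a dictionary."""
    fields = line.split()
    return {
        "device": fields[0],
        "tps": float(fields[1]),
        "blk_read_s": float(fields[2]),
        "blk_wrtn_s": float(fields[3]),
        "blk_read": int(fields[4]),
        "blk_wrtn": int(fields[5]),
    }

def blocks_to_kib(blocks):
    """Convert a block count, or a blocks-per-second rate, to KiB."""
    return blocks * SECTOR_BYTES / 1024

# Device row copied from the iostat output above.
row = parse_iostat_device("vda 1.15 3.48 21.88 21016084 131997520")
print(row["device"], blocks_to_kib(row["blk_read_s"]), blocks_to_kib(row["blk_wrtn_s"]))
```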
5. iotop command
The iotop command is similar to top, but it shows the I/O activity of each process. It is very useful for locating processes that perform heavy I/O.
    # iotop
    Total DISK READ: 0.00 B/s | Total DISK WRITE: 774.52 K/s
      TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
      272 be/3 root        0.00 B/s    0.00 B/s  0.00 %  4.86 % [jbd2/vda1-8]
     9072 be/4 mysql       0.00 B/s  268.71 K/s  0.00 %  0.00 % mysqld
     5058 be/4 lvinkim     0.00 B/s    3.95 K/s  0.00 %  0.00 % php-fpm: pool www
        1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
The output shows the read and write rates of each task.
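
To find the heaviest writer automatically, the task rows can be parsed and sorted. This is only a sketch. `parse_iotop_line` is a hypothetical helper, and it assumes the column layout of the sample above, with rates expressed in B/s, K/s, M/s or G/s.

```python
UNIT_BYTES = {"B/s": 1, "K/s": 1024, "M/s": 1024 ** 2, "G/s": 1024 ** 3}

def parse_iotop_line(line):
    """Parse one iotop task row, converting both rates to bytes per second."""
    # 11 splits leave the command, which may contain spaces, as the last part.
    parts = line.split(None, 11)
    return {
        "tid": int(parts[0]),
        "user": parts[2],
        "read_bps": float(parts[3]) * UNIT_BYTES[parts[4]],
        "write_bps": float(parts[5]) * UNIT_BYTES[parts[6]],
        "command": parts[11].strip(),
    }

# Task rows copied from the iotop output above.
rows = [
    "272 be/3 root 0.00 B/s 0.00 B/s 0.00 % 4.86 % [jbd2/vda1-8]",
    "9072 be/4 mysql 0.00 B/s 268.71 K/s 0.00 % 0.00 % mysqld",
    "5058 be/4 lvinkim 0.00 B/s 3.95 K/s 0.00 % 0.00 % php-fpm: pool www",
    "1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init",
]
tasks = [parse_iotop_line(row) for row in rows]
heaviest = max(tasks, key=lambda task: task["write_bps"])
print(heaviest["command"], heaviest["write_bps"])
```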
6. sysstat tool
When a past high-load incident is detected or reported, it is often necessary to replay historical monitoring data. This is what the sar command is for. sar also comes from the sysstat toolkit and records the system's CPU load, I/O status and memory usage so that historical data can be reviewed later.
The sysstat configuration file is /etc/sysconfig/sysstat, and historical logs are stored in /var/log/sa.
Statistics are recorded every 10 minutes, and the statistics file is rotated at 23:59 every day. Both schedules are configured in /etc/cron.d/sysstat.
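
To replay a specific day, the matching log file has to be located first. The sketch below assumes sysstat's default naming, where each daily file is called sa followed by the two-digit day of the month. `sa_file_for` is a hypothetical helper, and a system with a customized sysstat configuration may name its files differently.

```python
import datetime

def sa_file_for(day, log_dir="/var/log/sa"):
    """Return the expected sysstat data file for a date (assumes saDD naming)."""
    return "%s/sa%02d" % (log_dir, day.day)

# The date shown in the sample outputs of this article.
path = sa_file_for(datetime.date(2017, 1, 20))
print(path)
```

The resulting path can then be passed to sar with its `-f` option to read that day's data.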
7. sar command
Use the sar command to view the day's CPU usage:
    $ sar
    Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

    10:50:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
    11:00:01 AM     all      0.45      0.00      0.22      0.40      0.00     98.93
    Average:        all      0.45      0.00      0.22      0.40      0.00     98.93
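
When looking for a past load spike, it can help to turn the per-interval rows into data and filter them. The following is a minimal sketch. `parse_sar_cpu` is a hypothetical helper that keeps only the "all" rows and skips the header and the Average line, assuming the column order shown above.

```python
def parse_sar_cpu(text):
    """Parse the per-interval 'all' rows of a sar CPU report."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if "all" not in fields or fields[0] == "Average:":
            continue  # skip the banner, the header, blank lines and the summary
        start = fields.index("all") + 1
        user, nice, system, iowait, steal, idle = (
            float(value) for value in fields[start:start + 6]
        )
        rows.append({
            "time": " ".join(fields[:start - 1]),
            "user": user, "nice": nice, "system": system,
            "iowait": iowait, "steal": steal, "idle": idle,
        })
    return rows

# Report body copied from the sar output above.
report = """Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

10:50:01 AM CPU %user %nice %system %iowait %steal %idle
11:00:01 AM all 0.45 0.00 0.22 0.40 0.00 98.93
Average: all 0.45 0.00 0.22 0.40 0.00 98.93
"""
intervals = parse_sar_cpu(report)
# Intervals where the CPU was less than 50% idle (hypothetical threshold).
busy = [row for row in intervals if row["idle"] < 50.0]
print(len(intervals), len(busy))
```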
Use the sar command to view the day's memory usage:
    $ sar -r
    Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

    10:50:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
    11:00:01 AM     41292    459180     91.75     44072    164620    822392    164.32
    Average:        41292    459180     91.75     44072    164620    822392    164.32
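
A %memused of 91.75 can look alarming, so it is worth checking how much of it is buffers and page cache. In the sample row, kbmemfree plus kbmemused gives the total, and kbmemused divided by that total reproduces 91.75, which suggests that kbmemused includes buffers and cache in this sysstat version. The sketch below relies on that assumption, and `memory_summary` is a hypothetical helper name.

```python
def memory_summary(kbmemfree, kbmemused, kbbuffers, kbcached):
    """Recompute memory usage with and without buffers and page cache.

    Assumes kbmemused includes kbbuffers and kbcached, as the sample row suggests.
    """
    total = kbmemfree + kbmemused
    used_by_apps = kbmemused - kbbuffers - kbcached
    return {
        "total_kb": total,
        "memused_pct": round(100.0 * kbmemused / total, 2),
        "apps_pct": round(100.0 * used_by_apps / total, 2),
    }

# Values copied from the 11:00:01 AM row of the sar -r output above.
summary = memory_summary(
    kbmemfree=41292, kbmemused=459180, kbbuffers=44072, kbcached=164620
)
print(summary)
```

Under this assumption, only about half of the memory in the sample is held by applications, and the rest of the "used" figure is cache that the kernel can reclaim.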
Use the sar command to view the day's I/O statistics:
    $ sar -b
    Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

    10:50:01 AM       tps      rtps      wtps   bread/s   bwrtn/s
    11:00:01 AM      3.31      2.14      1.17     37.18     16.84
    Average:         3.31      2.14      1.17     37.18     16.84
For more sar usage, see man sar.
This work is licensed under a CC license. Reprints must credit the author and link to this article.