Analysis of the concept of load average in Linux systems


1、 What is load average?
On Linux, the uptime, w, top and similar commands all print the system load average. So what is the system load average?
The system load average is defined as the average number of processes in the run queue over a specific time interval. A process is in the run queue if it meets the following conditions:
– it is not waiting for the results of I / O operations
– it does not actively enter the wait state (i.e. it does not call ‘wait’)
– it is not stopped (e.g. waiting for termination)
For example:


  # uptime
  7:51pm up 2 days, 5:43, 2 users, load average: 8.13, 5.90, 4.94

The last part of the command output shows the average number of processes in the run queue over the past 1, 5, and 15 minutes.
As a rule of thumb, as long as the number of active processes per CPU is no greater than 3, system performance is good. If the number of tasks per CPU is greater than 5, the performance of the machine is seriously affected. In the example above, if the system has two CPUs, the current number of tasks per CPU is 8.13/2 = 4.065, which falls between the two thresholds, so the performance of the system is acceptable.
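
The rule of thumb above can be written as a small calculation. The following is a minimal Python sketch: the function name rate_load and the label strings are illustrative assumptions, and only the thresholds 3 and 5 come from the text.

```python
def rate_load(load_average, cpu_count):
    """Classify a load average using the per-CPU thresholds quoted above."""
    per_cpu = load_average / cpu_count
    if per_cpu <= 3:
        label = "good"
    elif per_cpu <= 5:
        label = "acceptable"
    else:
        label = "seriously affected"
    return per_cpu, label


# The uptime example from the text: 1-minute load 8.13 on a two-CPU machine.
per_cpu, label = rate_load(8.13, 2)
print(f"{per_cpu:.3f} tasks per CPU: {label}")
```

The same function can be applied to the 5- and 15-minute figures to see whether the load is a short spike or a sustained condition.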

2、 Algorithm of load average
The figures above are obtained by sampling the number of active processes every 5 seconds and folding each sample into an exponentially weighted running average. If a load figure divided by the number of CPUs is higher than 5, the system is overloaded. The algorithm (from the Linux 2.4 kernel source) is as follows:

File: include/linux/sched.h:


#define FSHIFT 11 /* nr of bits of precision */
#define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
#define LOAD_FREQ (5*HZ) /* 5 sec intervals */
#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point, 2048/pow(exp(1), 5.0/60) */
#define EXP_5 2014 /* 1/exp(5sec/5min), 2048/pow(exp(1), 5.0/300) */
#define EXP_15 2037 /* 1/exp(5sec/15min), 2048/pow(exp(1), 5.0/900) */
#define CALC_LOAD(load,exp,n) \
load *= exp; \
load += n*(FIXED_1-exp); \
load >>= FSHIFT;
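
To see where the EXP_* constants come from and what CALC_LOAD does, the macro can be mimicked outside the kernel. The sketch below is a Python approximation, not kernel code: the helper names exp_constant and calc_load_step are made up for illustration, and the scenario (a constant 2 active tasks) is an assumption. As in the kernel, the task count passed to the update step is a fixed-point value, i.e. already multiplied by FIXED_1.

```python
import math

FSHIFT = 11                 # bits of fractional precision
FIXED_1 = 1 << FSHIFT       # 1.0 in fixed-point (2048)


def exp_constant(window_seconds, interval_seconds=5):
    """Fixed-point decay factor exp(-interval/window), as in EXP_1/EXP_5/EXP_15."""
    return round(FIXED_1 * math.exp(-interval_seconds / window_seconds))


def calc_load_step(load, exp, active_fixed):
    """One CALC_LOAD update; all arguments are fixed-point integers."""
    load *= exp
    load += active_fixed * (FIXED_1 - exp)
    return load >> FSHIFT


# Reproduce the three constants from sched.h.
constants = [exp_constant(w) for w in (60, 300, 900)]
print(constants)   # prints [1884, 2014, 2037]

# Assumed scenario: 2 active tasks, sampled every 5 s for one minute.
load = 0
for _ in range(12):
    load = calc_load_step(load, constants[0], 2 * FIXED_1)
print(load / FIXED_1)
```

After one minute of constant load the 1-minute figure has covered only a little under two thirds of the distance to 2, because the average decays exponentially rather than being a plain arithmetic mean over the window.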

File: kernel/timer.c:


unsigned long avenrun[3];
static inline void calc_load(unsigned long ticks)
{
    unsigned long active_tasks; /* fixed-point */
    static int count = LOAD_FREQ;
    count -= ticks;
    if (count < 0) {
        count += LOAD_FREQ;
        active_tasks = count_active_tasks();
        CALC_LOAD(avenrun[0], EXP_1, active_tasks);
        CALC_LOAD(avenrun[1], EXP_5, active_tasks);
        CALC_LOAD(avenrun[2], EXP_15, active_tasks);
    }
}

File: fs/proc/proc_misc.c:


#define LOAD_INT(x) ((x) >> FSHIFT)
#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1-1)) * 100)
static int loadavg_read_proc(char *page, char **start, off_t off,
                             int count, int *eof, void *data)
{
    int a, b, c;
    int len;
    a = avenrun[0] + (FIXED_1/200);
    b = avenrun[1] + (FIXED_1/200);
    c = avenrun[2] + (FIXED_1/200);
    len = sprintf(page, "%d.%02d %d.%02d %d.%02d %ld/%d %d\n",
                  LOAD_INT(a), LOAD_FRAC(a),
                  LOAD_INT(b), LOAD_FRAC(b),
                  LOAD_INT(c), LOAD_FRAC(c),
                  nr_running(), nr_threads, last_pid);
    return proc_calc_metrics(page, start, off, count, eof, len);
}
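
The LOAD_INT and LOAD_FRAC macros split a fixed-point value into its integer part and two decimal digits, and adding FIXED_1/200 beforehand rounds to the nearest hundredth instead of truncating. Here is a Python sketch of the same arithmetic; the function name format_load is an assumption for illustration, and the input values are chosen by hand.

```python
FSHIFT = 11
FIXED_1 = 1 << FSHIFT


def load_int(x):
    """Integer part of a fixed-point value (LOAD_INT)."""
    return x >> FSHIFT


def load_frac(x):
    """First two decimal digits of a fixed-point value (LOAD_FRAC)."""
    return load_int((x & (FIXED_1 - 1)) * 100)


def format_load(avenrun_value):
    """Render a fixed-point load value in the d.dd form used by /proc/loadavg."""
    x = avenrun_value + FIXED_1 // 200   # add roughly 0.005 before truncating
    return "%d.%02d" % (load_int(x), load_frac(x))


print(format_load(FIXED_1))        # prints 1.00
print(format_load(FIXED_1 // 2))   # prints 0.50
```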

3、 /proc/loadavg
The /proc file system is a virtual file system that takes up no disk space. It reflects the state of the operating system currently running in memory. By viewing the files under /proc you can learn the running status of the system. To check the system load average, use the command "cat /proc/loadavg". The output looks like this:
0.27 0.36 0.37 4/83 4828
The first three numbers are the average number of processes in the run queue over the last 1, 5, and 15 minutes (some people take them for a percentage of system load, but they are not; values of 200 or more can sometimes be seen). Of the remaining two fields, the first is a fraction whose numerator is the number of currently running processes and whose denominator is the total number of processes; the second is the ID of the most recently created process.
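
The five fields can be pulled apart with a few lines of code. The following is a minimal Python sketch; the names LoadAvg and parse_loadavg are hypothetical, and the sample string is hard-coded so that the example does not depend on the machine it runs on.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    load1: float
    load5: float
    load15: float
    running: int
    total: int
    last_pid: int


def parse_loadavg(line):
    """Parse one line in the /proc/loadavg format described above."""
    one, five, fifteen, tasks, last_pid = line.split()
    running, total = tasks.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(running), int(total), int(last_pid))


# Sample line in the /proc/loadavg format; on a Linux machine the same
# string could be read with open("/proc/loadavg").read().
info = parse_loadavg("0.27 0.36 0.37 4/83 4828")
print(info.load1, info.running, info.total, info.last_pid)
```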

4、 Common commands for viewing system average load


cat /proc/loadavg

Name: uptime
Permission: all users
Usage: uptime [-V]
Note: without additional parameters, uptime provides the user with the following information:
the current time, how long the system has been up, the number of users currently logged in, and the system load over the last one, five, and fifteen minutes.
Parameter: -V displays version information.
Example: uptime
The results are as follows


10:41am up 5 days, 10 min, 1 users, load average: 0.00, 0.00, 1.99

Name: w
Function description: displays information about the users currently logged in to the system.
Syntax: w [-fhlsuV] [user name]
Supplementary note: run this command to see who is currently logged in to the system and which programs they are running. Running w on its own displays all users; you can also specify a user name to display information about that user only.
-f turns on or off the display of where the user logged in from.
-h does not display the header line.
-l uses the detailed (long) list format, which is the default.
-s uses the short list format, which omits the login time and the CPU time consumed by terminal jobs and programs.
-u ignores the user name when working out the current process and its CPU time.
-V displays version information.

Name: top
Function description: displays and manages running processes.
Syntax: top [bciqsS] [d <interval seconds>] [n <number of updates>]
Supplementary note: the top command displays the processes currently running on the system and lets you manage them with hotkeys through its interactive interface.
b uses batch mode.
c shows the full command line of each process, including command name, path and arguments.
d <interval seconds> sets the interval, in seconds, at which top refreshes the process status.
i ignores idle or zombie processes.
n <number of updates> sets how many times the monitoring information is updated.
q refreshes the process status continuously, without any delay.
s uses secure mode, which disables the potentially dangerous interactive commands.
S uses cumulative mode; its effect is similar to the "S" parameter of the ps command.

Name: tload
Function description: displays the system load status.
Syntax: tload [-V] [-d <interval seconds>] [-s <scale size>] [terminal number]
Additional note: the tload command draws a simple graph of the system load in text mode using ASCII characters. If no terminal number is given, the load is displayed on the terminal where tload was run.
-d <interval seconds> sets the interval, in seconds, at which tload samples the system load.
-s <scale size> sets the vertical scale of the graph.
-V displays version information.

5、 System load average – advanced interpretation
To better understand the system load, we use the traffic flow analogy.

1. Single-core CPU – single lane – values between 0.00 and 1.00 are normal

The traffic controller tells drivers what to do: if there is a lot of traffic ahead, a driver has to wait; if the road ahead is clear, the driver can go straight through.


A value between 0.00 and 1.00 means that the road is in good condition: there is no congestion and vehicles can pass through unhindered.

A value of 1.00 means that the road is still working normally but is exactly at capacity, so any additional traffic will cause congestion. At this point the system has no spare resources and the administrator needs to optimize.

A value above 1.00 means that the road is congested. At 2.00 there are twice as many vehicles as the bridge can carry, and the excess vehicles are waiting. You have to look into this.

2. Multi-core CPU – multiple lanes – load / number of CPU cores between 0.00 and 1.00 is normal

On a multi-core CPU, the full-load figure is "1.00 × number of CPU cores": 2.00 for a dual-core CPU and 4.00 for a quad-core CPU.

3. Safe system average load

The author considers a load below 0.7 per core to be safe; above 0.7 the system needs to be optimized.

4. Which number should we look at, 1 minute, 5 minutes or 15 minutes?

The author thinks it is better to look at the 5-minute and 15-minute figures, that is, the last two numbers.

5. How do I find out how many cores my CPU has?

Use the following command to get the number of CPU cores directly:


grep 'model name' /proc/cpuinfo | wc -l


Take the number of CPU cores N, look at the last two load figures, and divide each by N. If the result is below 0.7, there is nothing to worry about.
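
The check just described can also be scripted. The following is a rough Python sketch: the function names and the sample figures are assumptions made for illustration, and 0.7 is the author's threshold from above. os.cpu_count() counts logical processors, which is also what counting 'model name' lines in /proc/cpuinfo does, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_per_core(load_averages, cores):
    """Divide each load figure by the number of CPU cores."""
    return tuple(load / cores for load in load_averages)


def is_carefree(load_averages, cores, threshold=0.7):
    """True if the 5- and 15-minute figures per core are below the threshold."""
    _, five, fifteen = load_per_core(load_averages, cores)
    return five < threshold and fifteen < threshold


def check_this_machine():
    """Apply the check to the local machine (Unix only)."""
    cores = os.cpu_count() or 1   # cpu_count() may return None
    return is_carefree(os.getloadavg(), cores)


# Worked example with assumed figures for a 4-core machine.
print(is_carefree((3.10, 2.40, 1.90), 4))   # prints True
```

Calling check_this_machine() on a Linux host applies the same rule to the live values.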
