[20200312] Do not set net.ipv4.tcp_tw_recycle=1.txt



–//Yesterday I carefully read two blog posts.
–//Chinese translation:

–//They say not to set net.ipv4.tcp_tw_recycle=1. The TIME_WAIT state is not a serious problem and can usually be ignored.
–//I read them carefully, and there are still many details I don't understand. I started looking into this mainly because I didn't understand why our production RAC environment had so many state=TIME_WAIT connections.
–//In hindsight the effort was superfluous and cost me time, but I still feel I learned a lot.

–//I now have a general idea of the causes, so here are some conclusions.

1. For an ordinary database application that exits normally, state=TIME_WAIT occurs on the client side. This was my initial confusion: the server should not have so many
  state=TIME_WAIT connections. My question really begins here ^_^.


TCP state diagram

Only the end closing the connection first will reach the TIME-WAIT state. The other end will follow a path which usually
permits to quickly get rid of the connection.
–//That is, only the end that closes the connection first enters TIME-WAIT; the other end is normally able to get rid of the connection quickly.

2. We open many short-lived connections through dblinks. In that case the database server acts as the client, so many state=TIME_WAIT connections accumulate on it.
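To tell the two cases in items 1 and 2 apart on a given host, it can help to split the TIME-WAIT sockets by role. The following is only a sketch: the function name `tw_role_summary` is made up for illustration, and the listener port (1521 in the usage line) is an assumption about the site. It reads `ss -tan` output on standard input.

```shell
#!/bin/bash
# tw_role_summary LISTENER_PORT   (hypothetical helper, not from the original post)
# Reads `ss -tan` output on stdin and prints two counters:
#   as_server: TIME-WAIT sockets whose LOCAL port is LISTENER_PORT
#   as_client: TIME-WAIT sockets whose PEER port is LISTENER_PORT
#              (for example, outgoing dblink connections)
tw_role_summary() {
    awk -v port="$1" '
        $1 == "TIME-WAIT" {
            # The port is the last ":"-separated piece (also works for IPv6).
            nl = split($4, l, ":"); np = split($5, p, ":")
            if (l[nl] == port)      server++
            else if (p[np] == port) client++
        }
        END { printf "as_server=%d as_client=%d\n", server, client }'
}
```

Typical use would be `ss -tan | tw_role_summary 1521`. A high `as_client` count on a database server points at outgoing short-lived connections such as dblinks.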

3. Exadata ships with net.ipv4.tcp_timestamps = 0, so even if I set net.ipv4.tcp_tw_recycle = 1 it has no effect. This was the second puzzle I ran into
during testing. I assume Oracle has its own reasons for this setting, and did not choose it specifically to guard against DBAs setting net.ipv4.tcp_tw_recycle = 1.

–//This is the setting in Exadata's /etc/sysctl.conf. I don't know what the preceding "# 12650500" means; is it something internal to Oracle?
# 12650500
net.ipv4.tcp_timestamps = 0
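Before changing anything, it is worth checking what the running kernel actually has. A minimal sketch, assuming the usual /proc/sys layout; the function name `show_tw_settings` is hypothetical. Note that tcp_tw_recycle was removed in Linux 4.12, so on newer kernels the file simply does not exist.

```shell
#!/bin/bash
# show_tw_settings [PROC_DIR]   (hypothetical helper)
# Prints tcp_timestamps and tcp_tw_recycle as found under PROC_DIR
# (default: /proc/sys/net/ipv4). A missing file is reported as "absent",
# which is what you get for tcp_tw_recycle on Linux 4.12 and later.
show_tw_settings() {
    local dir="${1:-/proc/sys/net/ipv4}" name
    for name in tcp_timestamps tcp_tw_recycle; do
        if [ -r "$dir/$name" ]; then
            printf 'net.ipv4.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'net.ipv4.%s = absent\n' "$name"
        fi
    done
}
```

Calling `show_tw_settings` with no argument reads the live values; the optional directory argument only exists so the function can be tried against a copy.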

–//Again: do not set net.ipv4.tcp_tw_recycle = 1 to deal with a large number of state=TIME_WAIT connections.

4. There is another reason for the growth of state=TIME_WAIT on the server: a client that connects to the RAC database through the SCAN IP is redirected to a VIP
to log in, and this generates a large number of state=TIME_WAIT connections on the server.

5. The front-end clients are basically XP machines. In my tests, Microsoft's TCP timestamps are off by default, at least on Windows 7 and Windows Server 2003;
I don't know about XP.
So even if the server sets net.ipv4.tcp_timestamps = 1 and net.ipv4.tcp_tw_recycle = 1, a large number of state=TIME_WAIT connections still appear on the server side.
On Linux machines, however, the default is net.ipv4.tcp_timestamps = 1.

–//Again: do not set net.ipv4.tcp_tw_recycle = 1 to deal with a large number of state=TIME_WAIT connections.
–//Note: I made another mistake while testing this. I thought TIME_WAIT could not be recycled quickly for machines on a different network segment. In fact, my test machine ran Windows 7,
–//where TCP timestamps are off by default.

6. There is one feasible way to reduce the number of state=TIME_WAIT connections: on some middleware servers, connect to the database through the old VIP addresses instead of the SCAN IP.
This is a practical, low-cost improvement, and it only applies to RAC environments.

–//In other words, outside a RAC environment you may not see a large number of state=TIME_WAIT connections.

7. If the network path goes over the public network or through NAT, net.ipv4.tcp_tw_recycle = 1 must not be set. Some of our applications go over the public network.
–//When I set it, a small number of clients (only two, in fact) really could not connect to the server. It even happened to one machine on the intranet, and I don't know why. In the end I restored the original settings.

8. Memory and CPU consumption is small. See the tests at: https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux

9. state=TIME_WAIT has little impact on the application. A connection is identified by four elements (source IP, source port, destination IP, destination port). Oracle recommends the kernel parameter net.ipv4.ip_local_port_range = 1024 65000,
which gives about 64000 usable ports. With a 60-second TIME_WAIT, the impact is negligible as long as you stay below roughly 60000 / 60 = 1000 new connections per second.

I don't think we will ever reach that number; at least I have never seen it. Our database peaks at 200 connections per second.
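The arithmetic in item 9 can be redone for any port range. This is a rough sketch: the function name `max_conn_rate` is made up, and it assumes every new connection goes from one source IP to the same destination IP and port, which is the worst case for port exhaustion.

```shell
#!/bin/bash
# max_conn_rate LOW HIGH [TIME_WAIT_SECONDS]   (hypothetical helper)
# Prints the number of ports in [LOW, HIGH] and the sustained rate of new
# connections per second (integer division) at which every port would be
# held in TIME_WAIT at the same time.
max_conn_rate() {
    local low="$1" high="$2" tw="${3:-60}"
    local ports=$(( high - low + 1 ))
    echo "ports=$ports rate=$(( ports / tw ))"
}

max_conn_rate 1024 65000   # prints: ports=63977 rate=1066
```

At a peak of 200 connections per second, about 200 * 60 = 12000 ports would be in TIME_WAIT at once, well under the 63977 available.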

10. The lifetime of the TIME_WAIT state is hard-coded in Linux, in the net/tcp.h header file. The many links that suggest changing it through net.ipv4.tcp_fin_timeout are wrong:
–//It cannot be changed unless you modify the header file and recompile the kernel!!
#define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT
                                  * state, about 60 seconds */
#define TCP_FIN_TIMEOUT TCP_TIMEWAIT_LEN
                                 /* BSD style FIN_WAIT2 deadlock breaker.
                                  * It used to be 3min, new value is 60sec,
                                  * to combine FIN-WAIT-2 timeout with
                                  * TIME-WAIT timer.
                                  */

11. On Windows, the lifetime of the TIME_WAIT state can be modified:
–//Under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, add a DWORD value named TcpTimedWaitDelay.
–//The value is in seconds. The minimum is 30 seconds; it cannot be set lower.


12. Setting TCP timestamps on Windows:
–//According to https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc938205(v=technet.10)?redirectedfrom=MSDN
–//Note: the "1323" in the value name (Tcp1323Opts) should refer to RFC 1323.


Data type  Range            Default value
REG_DWORD  0 | 1 | 2 | 3    3

Value  Meaning
0 (00) Timestamps and window scaling are disabled.
1 (01) Window scaling is enabled.
2 (10) Timestamps are enabled.
3 (11) Timestamps and window scaling are enabled.

–//According to this description, Microsoft should enable TCP timestamps by default. However, I found that this is not the case on Windows 7 and Windows Server 2003. To enable them, change the value to 2 or 3;
–//I only tested the case of changing it to 3.


13. Finally, the test method: run tnsping server_ip count > /dev/null, which is simple and fast. state=TIME_WAIT then appears on the server side, and you can observe it with
netstat -ntop or ss -nto.

# seq 10000 | xargs -IQ bash -c "ss -tano state time-wait | ts.awk ; sleep 1"
# seq 10000 | xargs -IQ bash -c "netstat -nto | grep TIME_WAIT | ts.awk ; sleep 1"

$ cat $(which ts.awk)
#!/bin/bash
gawk '{ print strftime("[%Y-%m-%d %H:%M:%S]"), $0 }'

Server net.ipv4.tcp_tw_recycle=1, net.ipv4.tcp_timestamps=1, client net.ipv4.tcp_timestamps=1: TIME_WAIT on the server disappears quickly.
Server net.ipv4.tcp_tw_recycle=1, net.ipv4.tcp_timestamps=1, client net.ipv4.tcp_timestamps=0: TIME_WAIT on the server takes 60 seconds to disappear.
Server net.ipv4.tcp_tw_recycle=1, net.ipv4.tcp_timestamps=0: TIME_WAIT on the server takes 60 seconds to disappear.
Server net.ipv4.tcp_tw_recycle=0, net.ipv4.tcp_timestamps=n (n = 1 or 0): TIME_WAIT on the server takes 60 seconds to disappear.

–//You can test this yourself. I won't post the test results, since that is a bit cumbersome.
–//This also shows that if most clients are Windows machines with TCP timestamps turned off, setting
–//net.ipv4.tcp_tw_recycle=1 and net.ipv4.tcp_timestamps=1 on the server side is useless as well.
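The four observations in item 13 reduce to a single rule: TIME_WAIT is recycled quickly only when tcp_tw_recycle and timestamps are enabled on the server and timestamps are also enabled on the client. The sketch below merely restates those test results; the function name `tw_fast_recycle` is hypothetical and it does not query any real system.

```shell
#!/bin/bash
# tw_fast_recycle SERVER_TW_RECYCLE SERVER_TIMESTAMPS CLIENT_TIMESTAMPS
# (hypothetical helper summarizing the observations in this post)
# Prints "fast" only when all three settings are 1; in every other
# combination TIME_WAIT stays for the full 60 seconds, so it prints "60s".
tw_fast_recycle() {
    if [ "$1" = 1 ] && [ "$2" = 1 ] && [ "$3" = 1 ]; then
        echo fast
    else
        echo 60s
    fi
}

tw_fast_recycle 1 1 0   # Windows client with timestamps off: prints 60s
```

Even in the single "fast" case, the warning about NAT and public-network clients in item 7 still applies.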

14. An example demonstrating a server that generates a large number of TIME_WAIT connections:
–//I will write that up separately, otherwise this post gets too long.

–//Finally, a reminder: do not set net.ipv4.tcp_tw_recycle=1 to deal with a large number of state=TIME_WAIT connections. Important things are said three times; a large number of Chinese-language links recommend exactly this.
–//It can cause network failures that are very hard to track down, unless the application runs in a purely internal network.
