The author of this article is Ma Yingying, a web front-end development engineer at Netease smart enterprise. The content is revised from time to time to improve its quality.
In a complete instant messaging (IM) application, websocket is a key link. It provides a full-duplex communication mechanism for web-based instant messaging applications. However, to improve the immediacy and reliability of messages in practical scenarios such as IM, we need to overcome the instability of websocket and its underlying TCP connection in complex networks. Developers of instant messaging usually need to design a complete scheme for keeping the connection alive, probing whether it is still alive, and reconnecting after disconnection.
In terms of disconnection and reconnection, the response speed of reconnection seriously affects the “immediacy” and user experience of the upper application. Imagine that, after the network comes back, WeChat took a full minute to sense the recovery of the socket connection and resume sending and receiving chat messages. Wouldn’t that be maddening?
Therefore, how to perceive network changes quickly in a complex network environment, and quickly restore the usability of websocket, becomes particularly important. Based on the author’s development practice, this article shares how websocket can achieve fast disconnection and reconnection under different network conditions.
*Readers: This article is suitable for developers who have practical experience developing the underlying network layer of IM, or who have a deep understanding of how it is implemented. If you know little about the underlying network, it is recommended to skip this article and first read the basics listed in the appendix at the end.
*Content comments: This article is not long, but it is practical, down-to-earth and easy to follow. It is recommended to read it in detail. Although websocket is discussed here, the ideas can be extended to similar technologies based on the TCP protocol.
This article has also been published on the official account “instant messaging technology circle”.
The link on the official account is: http://www.52im.net/thread-3098-1-1.html
2. Preparatory knowledge
The content shared in this article is a summary of practice. If you are still unfamiliar with instant messaging on the web, please first read: “novice’s post: the most complete explanation of the principle of instant messaging technology on the web side” and “inventory of instant messaging technology on the Web: short polling, comet, websocket, SSE”.
Due to limited space, this article does not go into the technical details of websocket. If you are interested, please study them systematically:
- Quick start for beginners: a brief tutorial on websocket
- Detailed explanation of websocket (1): preliminary understanding of websocket Technology
- Detailed explanation of websocket (2): technical principles, code demonstration and application cases
- Detailed explanation of websocket (3): details of websocket communication protocol
- Detailed explanation of websocket (4): the relationship between HTTP and websocket (Part 1)
- Detailed explanation of websocket (5): the relationship between HTTP and websocket (Part 2)
- Detailed explanation of websocket (6): the relationship between websocket and socket
3. Get to know websocket quickly
Websocket was born in 2008 and became an international standard in 2011. All browsers now support it (see “quick start for beginners: a brief tutorial on websocket”). It is a new application layer protocol, a true full-duplex communication protocol designed specifically for web clients and servers. We can understand the websocket protocol by analogy with the HTTP protocol.
(picture quoted from “detailed explanation of websocket (4): the relationship between HTTP and websocket (Part 1)”)
Their differences are as follows:
- 1) The protocol identifier (URL scheme) of HTTP is http and that of websocket is ws;
- 2) HTTP requests can only be initiated by the client, and the server cannot actively push messages to the client, but with websocket it can;
- 3) Websocket is not restricted by the same-origin policy, so the client can communicate with a server in a different origin.
Their similarities are as follows:
- 1) Both are application layer communication protocols;
- 2) Both use 80 and 443 as their default ports;
- 3) Both can be used for communication between browser and server;
- 4) Both are based on TCP protocol.
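As a small illustration of the scheme and default-port points above, the sketch below parses websocket URLs with the standard `URL` class, which reports an empty `port` when the port equals the scheme's default. The host `example.com` and the path are placeholders, and `wss` is the TLS-secured counterpart of `ws`, just as `https` is for `http`.

```typescript
// ws:// is the plain scheme (default port 80); wss:// runs over TLS (default port 443).
const plain = new URL("ws://example.com:80/chat");
const secure = new URL("wss://example.com:443/chat");
const custom = new URL("ws://example.com:8080/chat");

// The URL parser drops a port that equals the scheme's default,
// so only the non-default 8080 survives as an explicit port.
console.log(plain.protocol, JSON.stringify(plain.port));   // ws: ""
console.log(secure.protocol, JSON.stringify(secure.port)); // wss: ""
console.log(custom.protocol, JSON.stringify(custom.port)); // ws: "8080"
```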
The relationship between the two and TCP:
(image quoted from “quick start for beginners: a brief tutorial on websocket”)
For the relationship between HTTP and websocket, please read:
- Detailed explanation of websocket (4): the relationship between HTTP and websocket (Part 1)
- Detailed explanation of websocket (5): the relationship between HTTP and websocket (Part 2)
For the relationship between websocket and socket, please read: detailed explanation of websocket (6): the relationship between websocket and socket
4. Breaking down the websocket reconnection process
First consider a question: when do you need to reconnect?
The most obvious case is that the websocket connection is broken. In order to send and receive messages, we need to initiate a connection again.
But in many scenarios, even if the websocket connection is not disconnected, it is actually unusable.
For example, in the following scenarios:
- 1) The device switches networks;
- 2) An intermediate router on the link crashes (as is well known, there are many routing devices on the network path corresponding to a socket connection);
- 3) The egress at the front of the link is unavailable (for example, with home WiFi the local network connection is normal, but the operator’s broadband has been suspended because the bill is in arrears);
- 4) The server load is too high to respond.
In these scenarios the websocket is not disconnected, but the upper layer cannot send and receive data normally.
Therefore, before reconnecting, we need a mechanism that senses whether the connection and the service are available, and senses it quickly, so that we can recover quickly from the unavailable state.
Once the connection is found to be unavailable, you can abandon and disconnect the old connection, and then initiate a new one. These two steps seem simple, but doing them quickly is not so easy.
First: disconnecting the old connection. How can the client disconnect quickly? According to the protocol, the client must negotiate with the server to close the websocket. But when the client cannot reach the server and cannot negotiate, how can it disconnect and recover quickly?
Second: quickly initiating the new connection. “Fast” here does not mean initiating a connection immediately, which would have an unpredictable impact on the server. When reconnecting, a backoff algorithm is usually used, and the reconnection is initiated after a delay. But how do we trade off the reconnection interval against performance cost? How do we quickly initiate a connection at the “right point in time”?
With these questions in mind, let’s take a closer look at the three processes.
5. Fast reconnection key 1: quickly sensing when reconnection is needed
Scenarios that require reconnection can be subdivided into three categories:
- 1) The connection has definitely been disconnected;
- 2) The connection is not broken, but it is not available;
- 3) The service on the opposite end of the connection is not available.
For the first scenario: it is very simple. The connection is already disconnected, so it must be reconnected.
For the latter two: whether it is the connection or the service that is unavailable, the effect on the upper application is the same: instant messages can no longer be sent and received.
5.2 Actively detecting network availability with heartbeat packets
Therefore, from the above point of view, a simple and crude way to sense when reconnection is needed is a heartbeat timeout: send a heartbeat packet, and if no reply is received from the server within a certain period of time, the service is considered unavailable, as shown on the left side of the figure below (this method is the most direct).
To sense it faster, the only option is to send heartbeat packets more often. However, a heartbeat that is too fast consumes too much traffic and power on mobile devices. Therefore, this method alone cannot achieve fast sensing; it is best used as a fallback mechanism for checking the connection and the service.
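The heartbeat timeout described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the class name `HeartbeatProbe`, the `Timers` interface and the 5-second default timeout are all assumptions. The timer functions are injectable only so that the logic can be exercised without real waiting.

```typescript
// Hypothetical timer abstraction; defaults to the real setTimeout/clearTimeout.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const systemTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

class HeartbeatProbe {
  private waiting = false;          // true while a heartbeat reply is outstanding
  private handle: unknown = undefined;
  private sendPing: () => void;
  private onUnavailable: () => void;
  private timeoutMs: number;
  private timers: Timers;

  constructor(
    sendPing: () => void,           // sends one heartbeat packet over the connection
    onUnavailable: () => void,      // called when no reply arrives in time
    timeoutMs: number = 5000,       // assumed value, tune per application
    timers: Timers = systemTimers,
  ) {
    this.sendPing = sendPing;
    this.onUnavailable = onUnavailable;
    this.timeoutMs = timeoutMs;
    this.timers = timers;
  }

  // Send a heartbeat and start waiting for the reply.
  probe(): void {
    if (this.waiting) return;       // a heartbeat is already in flight
    this.waiting = true;
    this.sendPing();
    this.handle = this.timers.set(() => {
      this.waiting = false;
      this.onUnavailable();         // timed out: treat the service as unavailable
    }, this.timeoutMs);
  }

  // Call this when the server's heartbeat reply is received.
  onPong(): void {
    if (!this.waiting) return;
    this.timers.clear(this.handle);
    this.waiting = false;
  }
}
```

In a browser, `sendPing` would typically call `ws.send(...)` with an application-defined heartbeat message, since the browser WebSocket API does not expose protocol-level ping frames, and `onPong()` would be called from the `message` handler when the matching reply arrives.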
5.3 Passively monitoring network state changes
Besides checking the connection directly, the client can also detect whether the network itself is disconnected. The underlying TCP connection does not sharply perceive network changes at the application layer, so sometimes even if the network is disconnected for a short time, the websocket connection is not affected, and after the network is restored it can still communicate normally.
Therefore, when the network changes from disconnected to connected, the current connection can be checked immediately by sending a heartbeat packet. If the heartbeat reply from the server is received normally, the connection is still available. If no reply is received before the timeout, the connection needs to be re-established, as shown on the right side of the figure above. The advantage of this method is speed: it can sense whether the connection is available as soon as the network is restored and, if it is not, recover quickly. However, it only covers cases where the websocket becomes unavailable because of a network change visible to the application layer.
To compare the two schemes:
- 1) Sending heartbeat packets regularly is stable and covers all scenarios, but it is not real-time (detection has to wait for the fixed heartbeat interval);
- 2) Judging by the network state is fast and sensitive, with no need to wait for the heartbeat interval, but the scenarios it covers are limited.
Therefore, we can combine the two schemes:
- 1) The heartbeat packet is sent at a slow rate, such as once every 40s or 60s, which can be determined according to the application scenario;
- 2) Then, when the network state changes from offline to online, a heartbeat is sent immediately to detect whether the current connection is available. If it is not available, it is recovered immediately.
In this way, in most cases the communication of the upper-layer application can recover quickly from the unavailable state. For the small number of remaining scenarios, the timed heartbeat acts as a fallback, so recovery happens within one heartbeat cycle.
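The combined scheme can be sketched as below: a slow periodic heartbeat plus an immediate one whenever the network comes back. The function `watchConnection` and the `OnlineEventSource` interface are hypothetical names introduced for illustration. In a browser, the `online` event fired on `window` would be the natural source, and `probe` would be whatever function sends a heartbeat and handles its timeout.

```typescript
// Anything that can emit "online" events (hypothetical minimal interface).
interface OnlineEventSource {
  addEventListener(type: "online", listener: () => void): void;
  removeEventListener(type: "online", listener: () => void): void;
}

// Starts both detection paths and returns a function that stops them.
function watchConnection(
  probe: () => void,            // sends one heartbeat and handles its timeout
  network: OnlineEventSource,   // source of offline -> online notifications
  intervalMs: number = 40000,   // slow fallback rate, e.g. once every 40s
): () => void {
  // 1) Fallback: probe at a slow, fixed rate.
  const timer = setInterval(probe, intervalMs);

  // 2) Fast path: probe immediately when the network changes to online.
  const onOnline = () => probe();
  network.addEventListener("online", onOnline);

  return () => {
    clearInterval(timer);
    network.removeEventListener("online", onOnline);
  };
}
```

The returned stop function should be called when the connection is torn down, so that the interval timer and the event listener do not outlive it.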
6. Fast reconnection key 2: quickly disconnecting the old connection
Usually, before initiating the next connection, if the old connection still exists, you should disconnect it first.
The purpose of this is:
- 1) First, it releases the resources of the client and the server;
- 2) Second, it avoids sending and receiving data over the old connection by mistake.
We know that websocket transmits data over the TCP protocol, with the server and the client at the two ends of the connection. It is preferable for the TCP TIME_WAIT state to be held by the server side, so in most normal cases the server, not the client, should initiate the disconnection of the underlying TCP connection.
In other words:
- 1) When the websocket is to be closed, if the server receives the instruction to close it, the server should immediately initiate the TCP disconnection;
- 2) If the client receives the instruction to close the websocket, it should signal the server and then wait for the server to close the underlying TCP connection, or until a timeout.
When the client wants to disconnect the old websocket, there are two cases: the websocket connection is available, or it is not.
The details are as follows:
- 1) When the old connection is available, the client can send the disconnection signal to the server directly, and then the server initiates the disconnection;
- 2) When the old connection is not available, for example when the client switches WiFi, the client sends a disconnect signal but the server cannot receive it. The client can only wait for the timeout before the connection is considered closed.
Disconnecting by timeout takes a relatively long time. Is there any way to disconnect quickly?
The upper-layer application cannot change the protocol-level rule that the disconnection should be initiated by the server, so it can only work at the level of application logic. For example, the upper layer can use business logic to guarantee that the old connection is completely invalidated, simulate its disconnection, and then initiate a new connection to restore communication.
This method amounts to trying to disconnect the old connection and, if that is not possible, abandoning it directly so that the next step can start quickly. It therefore requires that the old connection is completely invalidated in the business logic:
- 1) Ensure that all data received from the old connection is discarded;
- 2) The old connection must not prevent the establishment of a new connection;
- 3) The new connection and the upper-layer business logic must not be affected when the old connection is eventually disconnected.
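One way to meet these three requirements is sketched below. It is an illustration under assumptions, not the author's code: `ConnectionManager` and `SocketLike` are hypothetical names, and `SocketLike` is a pared-down stand-in modelled on the handler properties of the browser `WebSocket` (a real project would use the actual `WebSocket` type). The old socket's handlers are detached, `close()` is attempted without waiting for the handshake, and any late event from a stale socket is ignored.

```typescript
// Pared-down stand-in for the browser WebSocket (hypothetical interface).
interface SocketLike {
  onopen: ((ev: unknown) => void) | null;
  onmessage: ((ev: { data: unknown }) => void) | null;
  onclose: ((ev: unknown) => void) | null;
  onerror: ((ev: unknown) => void) | null;
  close(): void;
}

class ConnectionManager {
  private current: SocketLike | null = null;
  private createSocket: () => SocketLike;
  private onMessage: (data: unknown) => void;

  constructor(createSocket: () => SocketLike, onMessage: (data: unknown) => void) {
    this.createSocket = createSocket;
    this.onMessage = onMessage;
  }

  // Abandon whatever connection exists and open a fresh one right away.
  connect(): void {
    this.abandonCurrent();
    const socket = this.createSocket();
    this.current = socket;
    socket.onmessage = (ev) => {
      if (socket !== this.current) return; // stale connection: discard its data
      this.onMessage(ev.data);
    };
    socket.onclose = () => {
      // A late close of an abandoned socket must not touch the new connection.
      if (socket === this.current) this.current = null;
    };
  }

  // "Simulated" disconnect: invalidate the old socket without waiting for the server.
  private abandonCurrent(): void {
    const old = this.current;
    if (old === null) return;
    this.current = null;
    old.onopen = null;
    old.onmessage = null;
    old.onclose = null;
    old.onerror = null;
    old.close(); // starts the closing handshake if the server is reachable; we do not wait
  }
}
```

The design choice here is that the identity of `current` decides which socket is live, so the old connection can take as long as it needs to time out without affecting the new one.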
7. Fast reconnection key 3: quickly initiating the new connection
Readers with IM development experience will know that when reconnecting for network reasons, you must never initiate a new connection immediately. Otherwise, when there is network jitter, all devices will connect to the server at the same moment. This is no different from a denial-of-service attack in which a hacker consumes network bandwidth by launching a large number of requests. It would be a disaster.
Therefore, when reconnecting, a backoff algorithm is usually used to initiate the reconnection after a delay, as shown in the flow chart on the left.
What if you want to connect quickly? The most direct way is to shorten the interval between retries. The shorter the interval, the faster communication is restored once the network is back. However, retrying too frequently seriously wastes performance, bandwidth and power.
How can we strike a better balance between the two?
- 1) A reasonable way is to increase the retry interval as the number of retries increases;
- 2) On the other hand, monitor network changes, and appropriately reduce the reconnection interval when the network state changes from offline to online.
The first scheme increases the reconnection interval as the number of retries grows; the second, shown on the right side of the figure above, shortens the interval when the network comes back online. Combining the two is more reasonable.
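A minimal sketch of this combination follows. The article only says that the interval should grow with the number of retries; the doubling rule, the 1s starting value and the 30s cap used here are assumptions chosen for illustration, and `ReconnectBackoff` is a hypothetical name.

```typescript
class ReconnectBackoff {
  private attempt = 0;
  private baseMs: number;
  private maxMs: number;

  constructor(baseMs: number = 1000, maxMs: number = 30000) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
  }

  // Delay before the next attempt: baseMs * 2^attempt, capped at maxMs.
  nextDelay(): number {
    const delay = Math.min(this.maxMs, this.baseMs * 2 ** this.attempt);
    this.attempt += 1;
    return delay;
  }

  // Call when the network changes from offline to online, or after a
  // successful connection, so that the next retry uses the shortest interval.
  reset(): void {
    this.attempt = 0;
  }
}

const backoff = new ReconnectBackoff();
const delays = [backoff.nextDelay(), backoff.nextDelay(), backoff.nextDelay()];
console.log(delays); // [1000, 2000, 4000]

backoff.reset();     // e.g. the network just came back online
console.log(backoff.nextDelay()); // 1000
```

In practice a small random jitter is often added to each delay so that many clients do not retry at exactly the same moment; it is left out here to keep the sketch deterministic.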
In addition, the interval can be adjusted according to how likely reconnection is to succeed, in combination with the business logic. For example, when the network is offline or the application is in the background, the reconnection interval can be lengthened, and when the network is normal it can be shortened, so as to speed up reconnection.
8. Summary of this paper
Finally, let’s sum up.
This article divides the websocket disconnection and reconnection logic into three steps:
- 1) Determine when reconnection is required;
- 2) Disconnect the old connection;
- 3) Initiate a new connection.
Then it analyzes how to quickly complete these three steps in different states of websocket and different network states.
The specific summary of the process is as follows:
- 1) First, detect whether the current connection is available by sending heartbeat packets regularly, and monitor network recovery events. After the network recovers, send a heartbeat immediately to quickly sense the current state and judge whether reconnection is needed;
- 2) Second, under normal circumstances the old connection is disconnected by the server. When the client loses contact with the server, the old connection is abandoned directly, and the upper layer simulates the disconnection to achieve a fast disconnect;
- 3) Finally, when a new connection is initiated, a backoff algorithm is used to delay the connection for a period of time. Meanwhile, to balance resource waste against reconnection speed, the reconnection interval can be increased when the network is offline, and reduced when the network is normal or changes from offline to online.
The above is my technology sharing about how to realize websocket fast reconnection. Please leave a message to discuss with me.
 RFC 6455 documentation
 Quick start for beginners: a brief tutorial on websocket
 Detailed explanation of websocket (4): the relationship between HTTP and websocket (Part 1)
 Detailed explanation of websocket (5): the relationship between HTTP and websocket (Part 2)
 Detailed explanation of websocket (6): the relationship between websocket and socket
Appendix: more information about instant messaging on the web
Beginner’s post: the most complete explanation of the principle of instant messaging on the web in history
An inventory of instant messaging technology on Web: short polling, comet, websocket, SSE
Detailed explanation of SSE Technology: a new HTML5 server push event technology
Detailed explanation of comet Technology: real time communication technology of Web terminal based on HTTP long connection
socket.io: practice and ideas of message push
LinkedIn’s web side instant messaging practice: realizing hundreds of thousands of long connections on a single machine
The development of Web instant messaging technology and the technical practice of websocket and Socket.io
Instant messaging security on Web: a detailed explanation of cross site websocket hijacking vulnerability (including sample code)
Open source framework pomelo practice: building high performance distributed IM chat server on Web
Using websocket and SSE technology to push messages on Web
Explain the evolution of Web Communication: from Ajax and jsonp to SSE and websocket
Why does the network layer framework of mobileimsdk web use Socket.io and not netty?
Integrating theory with practice: understanding the communication principle, protocol format and security of websocket from zero
How to use websocket to realize long connection (including complete source code) in wechat applet
Eight questions about websocket protocol: quick answers to popular websocket questions
Getting to know electron quickly: a new generation of web based cross platform desktop Technology
Understanding the evolution of front end technology
Web instant messaging basic knowledge make-up lesson: understand all cross domain problems!
Web instant messaging practice dry goods: how to make your websocket disconnected and reconnected faster?
(this article was published at: http://www.52im.net/thread-3098-1-1.html )