Lay a solid foundation

Time:2021-5-13

Preface

Countdown to Chinese New Year

Today is the last part of the network series. Network knowledge also comes up in interviews, so we must lay a solid foundation.

Here are 12 network questions for you.

Can you answer these questions?

I have summarized some questions that come up about networking. If you can answer them all, you can skip this article.

  • What protocols are used in the process of network communication?
  • TCP connection process: three handshakes and four waves, and why?
  • Common status codes.
  • Talk about the differences between TCP and UDP and the scenarios for each
  • Socket and WebSocket
  • Connection establishment process of HTTPS
  • Explain why a digital signature is authentic and reliable
  • Certificate chain security mechanism
  • The HTTPS establishment process is time-consuming, so how to optimize it?
  • Talk about the difference between HTTP and HTTPS
  • How does HTTP transmit pictures?
  • How to implement chunked transfer and resumable transfer (breakpoint continuation)?

The process of network communication, and which protocols are used along the way

I made an animation for this question before. You can go back to the previous article to have a look:

Network data was originally transmitted in this way (combined with animation analysis)

A brief summary:

Client side:

  • 1. Enter the web address in the browser.
  • 2. The browser parses the web address and generates the HTTP request message.
  • 3. The browser calls the system resolver, which sends a message to the DNS server to look up the IP address for the domain name.
  • 4. After getting the IP address, the browser hands it, together with the request message, to the TCP module of the operating system's protocol stack.
  • 5. The TCP module divides the data into packets and adds a TCP header to each one to form a TCP packet.
  • 6. The TCP header includes the sender's port number, the receiver's port number, and the packet's sequence number and ACK number.
  • 7. The TCP packet is then handed to the IP module.
  • 8. The IP module adds an IP header and a MAC header.
  • 9. The IP header contains the IP address and is used by the IP module; the MAC header contains the MAC address and is used by the data link layer.
  • 10. The IP module hands the whole packet to the network hardware, that is, the data link layer, such as Ethernet or WiFi.
  • 11. The NIC then converts the packet into electrical or optical signals and sends it out over a network cable or optical fiber; routers and other forwarding devices then deliver it to the receiver.

Server side:

  • 1. The signal arrives at the server's data link layer, such as Ethernet, where it is converted back into a packet (digital signal) and handed to the IP module.
  • 2. The IP module strips the MAC header and IP header and hands what follows, the TCP packet, to the TCP module.
  • 3. The TCP module parses the TCP header and then replies to the client to indicate that it has received the packet.
  • 4. After the TCP module has received all the packets of the message, it reassembles them into the complete message and hands it to the application layer, that is, the HTTP layer.
  • 5. After the HTTP layer receives a message, such as HTML data, the data is parsed and finally drawn on the browser page.
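
The layering described in the two lists can be sketched in a few lines of Python. This is only a toy model: the header layouts are simplified (a real TCP header has more fields, and the MAC header is left out), and the function names, port numbers and IP addresses are made up for illustration.

```python
import struct

def tcp_wrap(payload: bytes, src_port: int, dst_port: int, seq: int, ack: int) -> bytes:
    # Toy TCP header: two 16-bit ports, then 32-bit Seq and Ack numbers.
    return struct.pack("!HHII", src_port, dst_port, seq, ack) + payload

def ip_wrap(segment: bytes, src_ip: bytes, dst_ip: bytes) -> bytes:
    # Toy IP header: only the two 4-byte addresses.
    return src_ip + dst_ip + segment

def ip_unwrap(packet: bytes) -> bytes:
    # The receiving IP module strips its header and passes the rest up.
    return packet[8:]

def tcp_unwrap(segment: bytes):
    # The receiving TCP module parses its header and passes the data up.
    header = struct.unpack("!HHII", segment[:12])
    return header, segment[12:]

# Sender: HTTP message -> TCP packet -> IP packet (all values are examples).
request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
segment = tcp_wrap(request, 50000, 80, 100, 0)
packet = ip_wrap(segment, bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))

# Receiver: peel the headers off in the opposite order.
header, received = tcp_unwrap(ip_unwrap(packet))
```

Each layer only touches its own header, which is why the receiver can undo the sender's steps in reverse.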

TCP connection process, three handshakes and four waves, why?

Connection phase (three handshakes)

  • Create a socket. The server creates its socket when it starts, and the client creates one when it needs to access the server.
  • Then the client initiates the connection, which is done by calling the socket's connect method.
  • At this point, the client generates a TCP packet. The TCP header of this packet contains several important fields: SYN, ACK, Seq, Ack

SYN, the synchronize flag, is the handshake signal used when TCP/IP establishes a connection. A value of 1 marks a connection request.
ACK, the acknowledgement flag. A value of 1 marks an acknowledgement message.
Seq, the sequence number, numbers the data being sent.
Ack, the acknowledgement number, confirms which data has been received.

  • So the client generates such a packet, with the SYN control bit in the header set to 1 to request a connection, and Seq set to a random number as the initial sequence number, such as 100.
  • The server receives this message and learns that the client wants to connect (SYN=1) and what the initial sequence number of its data is (Seq=100).
  • The server also needs to generate a packet and send it to the client. Its TCP header contains: a SYN flag saying "I want to connect with you, too" (SYN=1); an acknowledgement of the client's packet, ACK=1 (Ack=Seq+1=101); and a sequence number randomly generated by the server (for example, Seq=200).
  • Finally, the client receives this message, which shows that the connection between the client and the server works, and sends a packet to confirm that it has received the server's packet. The header of this packet mainly carries ACK=1 (Ack=Seq+1=201).
  • At this point the connection is established, the three handshakes are over, and data can be transmitted normally. Every packet's TCP header carries Seq and Ack.
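
To make the numbers concrete, here is a minimal sketch that only reproduces the Seq and Ack bookkeeping of the three packets; it does not open a real connection. The function name and the tuple layout are made up for this example.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake packets as (sender, flags, Seq, Ack)."""
    return [
        # 1. Client asks to connect and announces its initial sequence number.
        ("client", "SYN", client_isn, 0),
        # 2. Server agrees, confirms the client's number and announces its own.
        ("server", "SYN+ACK", server_isn, client_isn + 1),
        # 3. Client confirms the server's number.
        ("client", "ACK", client_isn + 1, server_isn + 1),
    ]

for sender, flags, seq, ack in three_way_handshake(100, 200):
    print(f"{sender:6} {flags:7} Seq={seq} Ack={ack}")
```

With the initial numbers 100 and 200 used above, the second packet carries Ack=101 and the third carries Ack=201.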

Here's a question: why three handshakes?

The main reason is that both sides need to confirm that their message has been accurately conveyed.

A sends a message to B, and B returns a message to indicate that it has been received. This process confirms A's ability to communicate.
B sends a message to A, and A returns a message to indicate that it has been received. This process confirms B's ability to communicate.

In other words, four messages would ensure that both sides can send messages normally. The acknowledgement returned by B and the message sent by B can be merged into one message, which gives three handshakes.

Data transmission stage:

One thing changes in the data transmission phase: the Ack confirmation number is no longer Seq+1 but Seq + data length. For example:

  • Packet sent by A to B (Seq = 100, length = 1000 bytes)
  • Packet returned by B to A (Ack = 100 + 1000 = 1100)

This is the header information for one data transmission. Ack indicates which byte the next packet should start from, so it equals the Seq of the previous packet plus its length, and Seq equals the Ack of the previous packet.

Of course, TCP communication is bidirectional, so in practice every message carries both Seq and Ack:

  • Packet sent by A to B (Ack = 200, Seq = 100, length = 1000 bytes)
  • Packet returned by B to A (Ack = 100 + 1000 = 1100, Seq = Ack of the previous packet = 200, length = 500 bytes)
  • Packet sent by A to B (Seq = 1100, Ack = 200 + 500 = 700)
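
The same arithmetic as a small sketch. The `Segment` class and `reply_to` function are hypothetical names for this example; only the two rules from the text are modelled.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    seq: int
    ack: int
    length: int

def reply_to(received: Segment, reply_length: int) -> Segment:
    # Seq of the reply = Ack of the packet just received.
    # Ack of the reply = Seq of that packet + its data length.
    return Segment(seq=received.ack,
                   ack=received.seq + received.length,
                   length=reply_length)

a1 = Segment(seq=100, ack=200, length=1000)  # A -> B
b1 = reply_to(a1, 500)                       # B -> A: Seq=200, Ack=1100
a2 = reply_to(b1, 0)                         # A -> B: Seq=1100, Ack=700
```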

Disconnection phase (four waves)

As in the connection phase, the TCP header also has a special flag, FIN, for closing the connection.

  • When the client is ready to close the connection, it sends a TCP packet whose header includes FIN = 1 (meaning disconnect).

  • The server receives the message and replies with a packet whose header includes the Ack confirmation number. At this point, however, the server's normal business may not be finished; it still has to process its remaining data before it can close.

  • The client receives the message.

  • The server continues to process the data.

  • When the server has finished processing the data and is ready to close the connection, it sends a TCP packet to the client whose header includes FIN = 1 (meaning disconnect).

  • The client receives the message and replies with a packet to the server whose header includes the Ack confirmation number.

  • After receiving the message, the server finishes closing the connection.

  • After waiting for a period of time (2MSL), the client automatically enters the CLOSED state, which closes the connection on the client side.

MSL is the maximum segment lifetime: the longest time any message can exist on the network. Beyond this time, the message is discarded.

Here's a question: why do we need four waves?

A sends a disconnection message to B, and B returns a message to indicate that it has been received. This process ensures that A's side is closed successfully.
B sends a disconnection message to A, and A returns a message to indicate that it has been received. This process ensures that B's side is closed successfully.

In fact, the difference from the connection phase is that B's confirmation message and disconnection message are not merged here. When A wants to disconnect, B may still have data to process and send, so B has to wait for its normal business processing to finish before sending its own disconnect message.
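
The four waves can also be read as two small state machines, one for each side. The state names below are the standard TCP connection states; the event strings, table names and `run` helper are invented for this sketch.

```python
# Side that closes first (the client in the steps above).
ACTIVE_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

# Side that is told to close (the server in the steps above).
PASSIVE_CLOSE = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",  # may still send data here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def run(table, events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = table[(state, event)]
        path.append(state)
    return path

client_path = run(ACTIVE_CLOSE,
                  ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"])
server_path = run(PASSIVE_CLOSE,
                  ["recv FIN, send ACK", "send FIN", "recv ACK"])
```

CLOSE_WAIT is the gap that keeps the ACK and the FIN from being merged: the server stays there until its remaining data has been sent.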

Common status codes

  • 1XX - Informational. The server has received the request and needs the requester to continue the operation.
  • 2XX - Success. The request was received, understood and processed successfully.
  • 3XX - Redirection. Further action is required to complete the request.
  • 4XX - Client error. The request contains a syntax error or cannot be fulfilled.
  • 5XX - Server error. The server encountered an error while processing the request.

Common status codes:

200 OK - The client request succeeded
301 - The resource (web page, etc.) has been permanently moved to another URL
302 - Temporary redirect
400 Bad Request - The client request has a syntax error and cannot be understood by the server
404 - The requested resource does not exist, e.g. a bad URL.
500 - An unexpected error occurred inside the server.
503 Service Unavailable - The server cannot process the client's request at present. It may return to normal after a period of time.
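
As a quick way to check a code, Python's standard library already knows the official reason phrases. In this small sketch the `CLASSES` table and the `describe` function are our own helpers; `http.HTTPStatus` is from the standard library.

```python
from http import HTTPStatus

# First digit of the code -> class of response.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def describe(code: int) -> str:
    return f"{code} {HTTPStatus(code).phrase} ({CLASSES[code // 100]})"

for code in (200, 301, 302, 400, 404, 500, 503):
    print(describe(code))
```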

Talk about the differences between TCP and UDP and the scenarios for each

I'll start with two scenarios, which may make this easier to understand.

1) The first scenario: browsing the web (TCP scenario).

  • When we visit a web page, all the data must be displayed correctly. If a packet is lost along the way, it is retransmitted. Displaying only part of the web page is not acceptable.
  • Similarly, the content of a web page must arrive in order. For example, in a lottery I can't hand you the prize before the draw has happened. (Ensure the order of data)
  • Again, with such strict data requirements, the two sides need to establish a reliable connection. As mentioned above, it takes three handshakes before data transmission starts, and every packet sent needs an acknowledgement. (Connection oriented)
  • This connection transmits data as a byte stream. In other words, there is a pipe: you can send data however you like and receive data however you like, as long as it goes through this pipe.

So in scenarios that require accurate data, correct ordering, and stability and reliability, we need to use TCP.

2) The second scenario: playing games (UDP scenario).

The most important thing in a game is immediacy. Otherwise, a skill I cast at you would not register in time.

  • Therefore UDP needs to guarantee the immediacy of the data rather than guarantee that every packet is received correctly. Even if a packet is lost, it does not try to find out which one, because it needs to display the current packet at the current time. (Data correctness and data order are not guaranteed, and packet loss may occur.)
  • Similarly, for the sake of immediacy, UDP does not establish a connection: no three handshakes, and no acknowledgement of each packet. Whether or not you receive it, I just throw each packet at you as quickly as possible. (Connectionless)
  • Because there is no connection, there is no byte stream either; datagrams are simply sent one at a time, and the receiver accepts them one datagram at a time. (Datagram based)

If you are still a little confused, you can read this article (Adam and Eve), a very vivid metaphor:
https://www.zhihu.com/question/51388497?sort=created
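
The "byte stream" versus "datagram" difference can be seen without a network. In this rough sketch a local socket pair stands in for a real connection: `SOCK_STREAM` is the socket type TCP uses and `SOCK_DGRAM` is the type UDP uses. It assumes a Unix-like system, and unlike real UDP a local pair does not lose packets, so only the boundary behaviour is shown.

```python
import socket

# Datagram sockets: each recv returns exactly one datagram.
d_tx, d_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
d_tx.send(b"skill-1")
d_tx.send(b"skill-2")
datagrams = [d_rx.recv(1024), d_rx.recv(1024)]

# Stream sockets: bytes arrive in order, but message boundaries are not
# preserved, so the reader loops until it has all 13 bytes it expects.
s_tx, s_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
s_tx.send(b"<html>")
s_tx.send(b"</html>")
stream = b""
while len(stream) < 13:
    stream += s_rx.recv(1024)

for sock in (d_tx, d_rx, s_tx, s_rx):
    sock.close()
```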

Socket and WebSocket

Although the two have similar names, they are not concepts at the same level.

  • Socket. As mentioned above, when TCP establishes a connection, the socket API is called to set up the connection channel. So it is just an interface, a class.

  • WebSocket is at the same level as HTTP and is an application layer protocol. It was designed to solve the problem of long-lived communication and was introduced with the HTML5 specification. It is a full-duplex communication protocol based on TCP. Likewise, it needs TCP underneath to establish a connection, so it also needs a socket.

Background: after establishing a TCP connection, WebSocket needs an HTTP handshake. That is, the client sends a GET request to the server over HTTP to say that it wants to establish a WebSocket connection, so the server should get ready. This is done by adding the relevant parameters to the request headers. The server then acknowledges, switches the connection protocol to WebSocket, and the long-lived connection begins.

If we insist that there is a relationship between the two, it is that the WebSocket protocol also uses a TCP connection, and a TCP connection uses the Socket API.
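
The "relevant parameters" of the HTTP handshake are a handful of headers. The sketch below builds the client's GET request and computes the value the server sends back in `Sec-WebSocket-Accept`, following RFC 6455. The function names and the host and path arguments are placeholders; the header names and the GUID constant come from the RFC.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value from RFC 6455

def upgrade_request(host: str, path: str, key: str) -> str:
    # The client's handshake: an ordinary HTTP GET plus upgrade headers.
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )

def websocket_accept(key: str) -> str:
    # The server proves it understood the request: SHA-1 of key + GUID, Base64-encoded.
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
accept = websocket_accept(key)
```

The server replies with status 101 Switching Protocols and this accept value, after which both sides speak WebSocket over the same TCP connection.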

Connection establishment process of HTTPS

Having covered HTTP and TCP/IP, let's talk about HTTPS.

The last article talked about how HTTPS keeps data transmission secure, link: https://mp.weixin.qq.com/s/dbmwBVxHkvQ0fzWaSdtPYg

The main point is the use of digital certificates.

Now let's look at the whole HTTPS connection establishment (also called the TLS handshake process).

  • 1. The client sends the Client Hello message.

This message contains a random number (randomC), the cipher suites (key exchange algorithm, i.e. an asymmetric encryption algorithm, plus a symmetric encryption algorithm and a hash algorithm), and a Session ID (used for session resumption).

When the client wants to establish communication, the first message it sends after the TCP handshake is the Client Hello message. It mainly carries the contents above. The cipher suites tell the server which algorithms the client supports; the server compares them with the algorithms it supports itself to pick the best set supported by both sides.

  • 2. The server replies with three messages: Server Hello, Certificate and Server Hello Done.

The Server Hello message contains a random number (randomS), the cipher suite chosen after the comparison, and the Session ID (used for session resumption).

By now, both sides have two random numbers; we will see what they are for later. As just mentioned, the server has also settled on the three algorithms and sent its choice back to the client.

The Certificate message sends the digital certificate. I won't elaborate here.

The Server Hello Done message is an end marker, meaning the server has sent everything it needs to send.

  • 3. Symmetric key generation process

1) First, the client verifies the certificate: for example, the digital signature, certificate chain, certificate validity period and certificate status.
2) After the certificate is verified, the client generates a random number, the pre-master secret, encrypts it with the server's public key from the certificate and sends it; the server decrypts it with its private key after receiving it.
3) At this point, both the client and the server have three random numbers: randomC, randomS and the pre-master secret.
4) The client and the server then each generate the symmetric key from the three random numbers using a fixed algorithm.

  • 4. Generate session ID

This step corresponds to the Session ID in the two Hello messages at the beginning.

A session ID is generated for the session. If the session is later disconnected, the Session ID can be used to resume it without sending the certificate and generating the key again.

  • 5. Transmit data with the symmetric key

After getting the symmetric key, the two sides can use the symmetric key to encrypt and decrypt the data and communicate normally.
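
To illustrate why both sides end up with the same key, here is a toy derivation from the three random numbers. The HMAC-SHA256 construction is a stand-in chosen for this sketch and is not the key derivation that TLS actually uses; the function name is also made up.

```python
import hashlib
import hmac
import os

def derive_session_key(random_c: bytes, random_s: bytes, pre_master: bytes) -> bytes:
    # Toy "fixed algorithm": HMAC-SHA256 keyed by the pre-master secret over
    # both Hello randoms. Real TLS uses its own, more involved derivation.
    return hmac.new(pre_master, random_c + random_s, hashlib.sha256).digest()

random_c = os.urandom(32)    # sent in Client Hello (in the clear)
random_s = os.urandom(32)    # sent in Server Hello (in the clear)
pre_master = os.urandom(48)  # sent encrypted with the server's public key

# Each side runs the same function on the same three inputs.
client_key = derive_session_key(random_c, random_s, pre_master)
server_key = derive_session_key(random_c, random_s, pre_master)
```

An eavesdropper sees randomC and randomS but not the pre-master secret, so it cannot repeat the computation.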

Extension: why use an asymmetric encryption algorithm to negotiate the symmetric key?

First, network data transmission has high speed requirements. Provided that security is ensured, symmetric encryption is used for the data instead of the more time-consuming asymmetric encryption.
Second, with symmetric encryption, the security of the whole scheme depends on how the symmetric key is transmitted, so the more secure asymmetric encryption algorithm and the certificate chain mechanism are used to protect the data related to the symmetric key.

Please explain why a digital signature is authentic and reliable

A digital signature is the electronic signature mentioned above. Let's review briefly:

A digital signature is in fact one use of asymmetric encryption.

Its usage is as follows:

A uses its private key to encrypt the hash value of the data; the resulting ciphertext is called the signature. A then sends the signature and the data itself to B.

When B gets them, B decrypts the signature with the public key and compares the result with the hash value of the received data. If they are the same, the signature was indeed made by A, and only A could have made it, because only A has the private key.

The actual situation is as follows:

On the server side, the data we want to transfer (the public key) is hashed, and the hash value is signed with another private key; the signature is then sent along with the data (the public key).
The client then decrypts the signature with the other, matching public key. If the decrypted value is consistent with the hash value of the data (the public key), this proves that the source is correct and the data is not forged.

  • Reliable source. The digital signature can only be signed by the party with the private key, so its existence ensures that the source of the data is correct
  • The data is reliable. The hash value is fixed. If the hash value of the data decrypted by the signature is consistent with that of the data itself, it means that the data has not been modified.
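
The sign-then-verify flow fits in a few lines using textbook RSA. This is a toy for illustration only: the key is built from two small known primes and there is no padding, so it is not secure, and the function names are our own.

```python
import hashlib

# Toy RSA key from two known Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def digest(data: bytes) -> int:
    # Hash value of the data, reduced so it fits under the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    # "Encrypt the hash value with the private key."
    return pow(digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    # "Decrypt with the public key and compare with the hash of the data."
    return pow(signature, E, N) == digest(data)

message = b"server public key"
signature = sign(message)
```

`verify(message, signature)` succeeds, while any change to the message changes its hash and makes verification fail.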

Certificate chain security mechanism

A CA (certificate authority) is the organization that issues digital certificates. It is the authority responsible for issuing and managing digital certificates, and as a trusted third party in e-commerce transactions, it is responsible for verifying the legitimacy of public keys in the public key system.

In practice, the server submits its own public key and some server information to the CA, and the CA returns a digital certificate to the server. This certificate includes:

  • The public key of the server
  • The signature algorithm
  • Server information, including the host name, etc.
  • The signature of the certificate made with the CA's own private key

The server then passes the certificate to the client. How does the client verify it?

Careful readers will know that every client, whether a computer or a mobile phone, comes with its own system root certificates, which include those of the server certificate's issuing authority. So the public key in the system root certificate is used to decrypt the signature of the digital certificate, and the result is compared with the hash value of the data in the certificate. If they are the same, the source is correct and the data has not been modified.

Of course, a man in the middle can also apply for a certificate from a CA, but the certificate contains the server's host name (domain name, IP), so you can verify which host the certificate really comes from.

Extension:

In fact, there is another layer between the server certificate and the root certificate, called the intermediate certificate. We can open any web page and click the button in the upper left corner to see the details of the certificate:

You can see that a complete SSL/TLS certificate chain generally has three layers:

  • Layer 1: root certificate. This is the root certificate on the client. It is self-signed, that is, the signature is made with its own private key and verified with its own public key.
  • Layer 2: intermediate certificate. Generally, the root certificate does not sign server certificates directly, because that is more dangerous: if something goes wrong with a root certificate, dealing with the certificates it issued is very troublesome and the root certificate itself has to be changed. Therefore an intermediate certificate is usually introduced: the root certificate signs the intermediate certificate, and the intermediate certificate signs the server certificate, one layer at a time.
  • Layer 3: server certificate. This is the certificate for our server.
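
In application code these checks are normally left to the TLS library. As a small sketch, Python's standard `ssl` module creates a client context that trusts the system root certificates and by default requires both a valid chain and a matching host name. The variable names are our own.

```python
import ssl

# Client-side context that loads the system's trusted root certificates.
context = ssl.create_default_context()

# Defaults: the chain must validate up to a trusted root ...
verifies_chain = context.verify_mode == ssl.CERT_REQUIRED
# ... and the host name in the certificate must match the server we asked for.
checks_host_name = context.check_hostname

print(verifies_chain, checks_host_name)
```

Wrapping a connected socket with `context.wrap_socket(sock, server_hostname=...)` then runs the handshake and raises an error if either check fails.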

The HTTPS establishment process is time-consuming, so how to optimize it?

  • 1. Upgrade to HTTP/2

HTTP 2.0 was first tested in August 2013. On the open Internet, HTTP 2.0 is only used for https:// URLs, while http:// URLs continue to use HTTP/1. The purpose is to increase the use of encryption on the open Internet and provide strong protection against active attacks.

The main features of HTTP/2 are as follows:

  • Binary framing. Compared with text transmission, binary data is easier to parse and optimize.

  • Multiplexing. All communication with the same domain name is done over a single connection, and a single connection can carry any number of bidirectional data streams.

  • Header compression. HTTP/2 uses HPACK (a compression format designed specifically for HTTP/2 headers) to compress headers, which saves the network traffic they occupy.

  • 2. Use the Session ID

As mentioned earlier, to avoid repeating the whole connection process when reconnecting after a disconnection, the Session ID records the session, and it is then used to locate which session to reuse.
This removes the steps of sending the certificate and generating the key again.

  • 3. TLS False Start

This is an optimization scheme proposed by Google.

In the second phase of the TLS handshake, once the client has verified the certificate and sent the pre-master secret, it can send application data right away, such as a request for web page data.

After the server receives the pre-master secret, it generates the symmetric key, uses it directly to decrypt the application data, and sends the response to the client.

In effect, two steps are merged into one. The client does not wait for the server's confirmation before sending application data; instead it sends the application data together with the pre-master secret in the second phase, which shortens the handshake and reduces the time taken.

  • 4. OCSP Stapling

OCSP is an online query service for checking the revocation status of certificates.

One step of certificate verification is checking whether the certificate is still valid. We can have the server query OCSP first and send the result to the client together with the certificate, so the client does not need to check the certificate's status separately, which makes the TLS handshake more efficient. This feature is called OCSP Stapling.

Extension:

Setting the establishment process aside, what optimization points are there in the HTTPS transmission process as a whole?

You can look at this article: https://www.cnblogs.com/evan-blog/p/9898046.html

Talk about the difference between HTTP and HTTPS

After the long explanation above, the difference between the two should be very clear:

  • HTTP is the Hypertext Transfer Protocol, and information is transmitted in plaintext. HTTPS adds a layer of SSL/TLS encryption under the HTTP layer for secure transport, and uses CA certificates.
  • HTTP has no identity authentication, so the client cannot know the other party's real identity. HTTPS adds CA certificates to confirm the other party's identity.
  • The default port of HTTP is 80; that of HTTPS is 443.
  • HTTP is easy to attack or hijack because it transmits plaintext.

How to implement chunked transfer and resumable transfer (breakpoint continuation)?

Chunked transfer

Under normal circumstances, the server closes the connection after the data is sent.

So the Connection field in the request header is usually set to keep-alive, indicating that the connection should not be closed; it stays open until a message whose Connection field is close is sent.

There is another way to keep the TCP connection open, which is to transmit the data in chunks.

Chunked transfer means that the data sent by the server to the client can be divided into multiple parts for transmission.

Usage:

  • Set the message header Transfer-Encoding: chunked
  • Each chunk states its length
  • The body ends with a chunk of length 0

Purpose:

Let the client respond quickly and reduce the waiting time. Maintain a long connection.

However, chunked transfer exists only in HTTP/1.1. HTTP/2 supports multiplexing: a single connection can carry any number of bidirectional data streams, so chunked transfer is not needed.
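
On the wire, each chunk is its length in hexadecimal, a CRLF, the data, and another CRLF. Here is a minimal sketch of an encoder and decoder for such a body; the function names are our own, and chunk extensions and trailers are ignored.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked message body."""
    body = b""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            body += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return body + b"0\r\n\r\n"  # the final chunk of length 0

def decode_chunked(body: bytes) -> bytes:
    """Reassemble the data from a chunked message body."""
    data, pos = b"", 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)  # chunk length is hexadecimal
        if size == 0:
            return data
        start = line_end + 2
        data += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data

wire = encode_chunked([b"Hello", b", world!"])
```

The receiver can start processing the first chunk before the sender has even produced the second, which is where the quicker response comes from.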

Resumable transfer (breakpoint continuation)

This means the client wants to resume a download or upload from the point where it was last interrupted. Even if the transfer is interrupted by network problems, it can pick up where it left off, which ensures a good user experience.

Usage:

  • The client adds the Range field to the request header, indicating from which byte to which byte to download (Range: bytes=0-499)
  • The server adds Content-Range to the response header, indicating the range of data currently sent and the total file size (Content-Range: bytes 0-499/22400).
  • The ETag field uniquely identifies a version of the file.

Actual use process:

  • The first time the client requests the download, the server returns the file content and an ETag, with status code 200.
  • The second time, when the client requests a resume, it sends two headers (Range: bytes=200-499 and If-Range: the ETag).
  • The server then checks whether the ETag matches. If it does, that part of the data is returned (Content-Range: bytes 200-499/22400) with status code 206, indicating that this is the part of the data requested. Otherwise, the whole file is returned with status code 200.
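
The server-side decision can be sketched as one function. This is a simplified model, not a full implementation: it only understands the `bytes=start-end` form of Range, and the function name, the dict-based headers and the sample ETag are assumptions made for the example.

```python
import re

def serve(data: bytes, etag: str, headers: dict):
    """Return (status, response_headers, body) for a GET on `data`."""
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", headers.get("Range", ""))
    # Honour the range only if If-Range is absent or still matches the file.
    if match and headers.get("If-Range", etag) == etag:
        start, end = int(match.group(1)), int(match.group(2))
        if start >= len(data):
            # Requested range lies entirely beyond the end of the file.
            return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
        if start <= end:
            end = min(end, len(data) - 1)
            content_range = f"bytes {start}-{end}/{len(data)}"
            return 206, {"ETag": etag, "Content-Range": content_range}, data[start:end + 1]
    # No usable range, or the file changed: send the whole file.
    return 200, {"ETag": etag}, data

file_data = bytes(22400)
status, response_headers, body = serve(
    file_data, '"v1"', {"Range": "bytes=200-499", "If-Range": '"v1"'})
```

With a matching ETag this yields 206 and `Content-Range: bytes 200-499/22400`; with a stale ETag the same request falls through to a full 200 response.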

How does HTTP transmit pictures?

In fact, this question is about Content-Type. There are three approaches:

  • multipart/form-data

A form-type file transfer request. Setting Content-Type to multipart/form-data lets you send files in binary format.
It supports uploading multiple files and can also carry text parameters.

This is the most common practice.

  • image/png, image/jpeg

This approach sends the image directly as a binary stream; the server reads the stream and converts it back into a picture.

But this method has a disadvantage that it can only transfer one picture at a time.

  • application/x-www-form-urlencoded, text/plain

Another way is to convert the picture into a Base64 string and transfer it like an ordinary text parameter, setting the Content-Type to application/x-www-form-urlencoded, text/plain, or similar.
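
The three approaches differ only in the Content-Type header and in how the bytes are packed into the request body. A rough sketch follows, with made-up function names, field names and boundary string; a real client library would normally generate the boundary and the body for you.

```python
import base64
from urllib.parse import urlencode

def multipart_body(field: str, filename: str, image: bytes, boundary: str):
    # One form part holding the raw image bytes, closed by the end boundary.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: image/png\r\n\r\n"
    ).encode("ascii")
    body = head + image + f"\r\n--{boundary}--\r\n".encode("ascii")
    return {"Content-Type": f"multipart/form-data; boundary={boundary}"}, body

def raw_body(image: bytes):
    # The body is simply the image's binary stream.
    return {"Content-Type": "image/png"}, image

def base64_form_body(field: str, image: bytes):
    # The image becomes a Base64 string sent like any other text parameter.
    text = base64.b64encode(image).decode("ascii")
    body = urlencode({field: text}).encode("ascii")
    return {"Content-Type": "application/x-www-form-urlencoded"}, body

png = b"\x89PNG\r\n\x1a\n"  # just the 8-byte PNG signature, as sample data
form_headers, form_body = multipart_body("file", "a.png", png, "demo-boundary")
```

Note that Base64 makes the payload about one third larger than the binary forms.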

