Front-end engineers must understand what happens after a URL is entered in the browser

Time:2020-10-26

This is a classic interview question. It covers the basic networking knowledge that front-end developers should master, and I believe many front-end developers know the outline of the answer. An interviewer can dig deeper, though, and ask many follow-up questions that test both the depth of the candidate's knowledge and how flexibly they can apply it.

After a URL is entered in the browser, the request goes through the following stages:

  • DNS domain name resolution
  • Establish TCP connection
  • Send HTTP request
  • The server processes the request
  • The server returns the response result
  • Close TCP connection
  • The browser parses HTML and renders the layout

1. DNS domain name resolution
  • When we visit a website, we can use either the IP address of its host or its domain name. Most of the time we use the domain name, because a domain name is easier to remember than an IP address.
  • However, the TCP/IP protocols locate hosts by IP address, so we need a mechanism that converts a domain name into an IP address.
  • The DNS service does exactly this: it resolves domain names to IP addresses.
DNS resolution process
  • 1. The browser sends a query for www.baidu.com to the DNS server.
  • 2. The DNS server returns the IP address of www.baidu.com, 115.182.4X.18X.
  • 3. The browser then sends its request to the server at 115.182.4X.18X.
  • 4. The real server behind www.baidu.com is reached successfully.
DNS lookup priority

The local computer keeps a mapping between frequently used domain names and their IP addresses in the local hosts file. When a domain name is resolved, the hosts file is searched first for a mapped IP address.

  • 1. If the hosts file contains a mapping for the domain name, the IP address in the hosts file is used directly.
  • 2. If the hosts file has no mapping for the domain name, the local DNS server is queried.
  • 3. If the local DNS server has no mapping either, the query is passed on to the next-level DNS server, and so on up to the DNS root servers. As soon as a mapping is found, it is returned to the browser.
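
The lookup order above can be sketched in a few lines of Python. This is only a simplified model under my own assumptions: the function name resolve, the dictionaries standing in for the hosts file and the DNS servers, and the example.test names and addresses are all made up for illustration. A real resolver also deals with caching, TTLs and network queries.

```python
from typing import Dict, List, Optional, Tuple

def resolve(domain: str,
            hosts_file: Dict[str, str],
            dns_servers: List[Dict[str, str]]) -> Optional[Tuple[str, str]]:
    """Return (ip, source) following the priority described above, or None."""
    # 1. The local hosts file is consulted first.
    if domain in hosts_file:
        return hosts_file[domain], "hosts file"
    # 2./3. Otherwise ask the local DNS server (level 0), then each
    # higher-level server in turn, and stop at the first one that answers.
    for level, records in enumerate(dns_servers):
        if domain in records:
            return records[domain], f"DNS server level {level}"
    return None

# Hypothetical data: names and addresses are placeholders, not real records.
hosts = {"dev.example.test": "127.0.0.1"}
servers = [
    {"intranet.example.test": "192.0.2.10"},  # local DNS server
    {"www.example.test": "198.51.100.7"},     # next-level DNS server
]

print(resolve("dev.example.test", hosts, servers))
print(resolve("www.example.test", hosts, servers))
print(resolve("missing.example.test", hosts, servers))
```

The early return is the whole point: a name found in the hosts file never reaches a DNS server.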

2. Establish TCP connection

First, one piece of background: the TCP/IP protocol family.
The TCP/IP protocol family is a four-layer protocol system, namely:

  • 1. Application layer (HTTP)
  • 2. Transport layer (TCP)
  • 3. Network layer (IP)
  • 4. Link layer (network hardware)

My understanding is this: before we visit a website, we first need a network, that is, a cable or a router. Once that is in place, the link layer is OK. After step 1 above, we know the IP address that the domain name maps to and that it is reachable, so at this point the network layer is OK.
Next comes the transport layer, TCP. Strictly speaking, that statement is not rigorous: the transport layer includes not only TCP but also UDP. UDP is connectionless. Because it does not need to establish a connection, it is relatively efficient, but because nothing verifies the connection, reliability is not guaranteed. TCP is connection-oriented and has an acknowledgement mechanism, which is why it is so widely used. Its disadvantage is that the connection has to be established in advance, so it is less efficient. In this article, the transport layer refers to the TCP protocol.
To make sure both sides of the connection are reliable, TCP uses a three-way handshake when the two sides establish a connection.

Establishing a connection with the TCP three-way handshake
  • 1. The first handshake

The client sends a segment carrying the SYN flag to request a connection, then enters the SYN_SENT state and waits for the server's confirmation.

  • 2. The second handshake

After receiving the SYN segment, the server must send an ACK segment to acknowledge it. At the same time, it sends its own SYN request to the client. The server puts both into a single segment (the SYN + ACK segment) and sends it to the client. The server then enters the SYN_RECV state.

  • 3. The third handshake

After the client receives the SYN + ACK segment (request + acknowledgement), it sends a new ACK segment to the server. Once this segment has been sent and received, both the client and the server are in the ESTABLISHED (connection established) state, and the three-way handshake is complete.

Knowing what happens is not enough; we also need to know why. Why does establishing a TCP connection take three handshakes?

The root cause is that both sides must confirm that the sending and receiving capabilities of both the client and the server are OK.
After the first handshake, the server has received the SYN, so the server knows: the client can send, and the server itself can receive.
After the second handshake, the client has received the SYN+ACK, so the client knows: the client itself can send and receive, and the server can receive and send.
After the third handshake, the server has received the ACK, so the server knows: the server itself can send, and the client can receive. Together with what it learned from the first handshake, the server has now confirmed all four capabilities as well.
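
To make this reasoning concrete, here is a small bookkeeping sketch in Python. It does not open any sockets. It only records what each side can have verified after each handshake, and the names ALL_CHECKS and confirmed_after are my own, chosen for illustration.

```python
ALL_CHECKS = {"client can send", "client can receive",
              "server can send", "server can receive"}

def confirmed_after(handshakes):
    """Return (client_knows, server_knows) after the given number of handshakes."""
    client_knows, server_knows = set(), set()
    if handshakes >= 1:
        # The server received the SYN.
        server_knows |= {"client can send", "server can receive"}
    if handshakes >= 2:
        # The client received the SYN+ACK, which answers its own SYN.
        client_knows |= ALL_CHECKS
    if handshakes >= 3:
        # The server received the ACK, which answers its SYN+ACK.
        server_knows |= {"server can send", "client can receive"}
    return client_knows, server_knows

for n in (1, 2, 3):
    client_knows, server_knows = confirmed_after(n)
    print(n, "client done:", client_knows == ALL_CHECKS,
          "server done:", server_knows == ALL_CHECKS)
```

After two handshakes the client has confirmed everything, but the server still cannot know whether its own segments arrive. That gap is what the third handshake closes.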


3. Send HTTP request

Once the TCP connection is established, the HTTP request can be sent. To talk about HTTP requests, front-end developers need to know the basics of HTTP messages.

Components of an HTTP message
  • 1. Request message
    Request line: request method / page address / protocol and version
    Request headers: header key/value pairs
    Blank line: separates the request headers from the request body
    Request body: the message body, carrying the transmitted parameters, etc.
  • 2. Response message
    Response line: protocol and version / status code / status description
    Response headers: header key/value pairs
    Blank line: separates the response headers from the response body
    Response body: the message body, carrying the returned result

This is not very intuitive. We can use curl (a command-line network tool) to inspect the message structure of a request.
Open a terminal and run curl -v www.baidu.com to see the request and response messages.

The request message in that output leaves little room for notes, so it is shown again here with comments:

GET / HTTP/1.1 // request line (method / path / protocol and version)

// Request headers
Host: www.baidu.com
User-Agent: curl/7.54.0
Accept: */*
// Blank line

// The request body is empty because the request carries no parameters
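
The same four-part structure can be assembled and taken apart again in Python. This is a minimal sketch rather than a real HTTP library: build_request and parse_request are names I made up, and the parser only handles well-formed messages like the one above.

```python
CRLF = "\r\n"

def build_request(method, path, headers, body=""):
    """Assemble request line, headers, blank line and body into one message."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF + body

def parse_request(raw):
    """Split a message built above back into its parts."""
    head, _, body = raw.partition(CRLF + CRLF)  # the blank line is the separator
    request_line, *header_lines = head.split(CRLF)
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return {"method": method, "path": path, "version": version,
            "headers": headers, "body": body}

raw = build_request("GET", "/", {
    "Host": "www.baidu.com",
    "User-Agent": "curl/7.54.0",
    "Accept": "*/*",
})
parsed = parse_request(raw)
print(repr(raw))
print(parsed["method"], parsed["path"], parsed["version"])
```

On the wire, each line ends with CRLF, and the blank line is simply two CRLFs in a row. That is how the receiver knows where the headers stop and the body starts.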
Some common headers

Request headers and response headers are both message headers.
Common headers
  – 1. Accept: the media types the client can accept
    – Accept: text/html: the browser can accept text/html from the server, that is, what we usually call an HTML document. If the server cannot return text/html data, it should return a 406 (Not Acceptable) error
    – Accept: */*: the browser can handle all types
    – q=? can be appended to express a weight. The weight q ranges from 0 to 1, and the default weight is 1
  – 2. Accept-Encoding: the encodings the browser declares it can accept
    – Usually this states whether compression is supported and which compression methods are supported (gzip / deflate)
  – 3. Accept-Language: the languages the browser declares it can accept
    – Accept-Language: zh-cn,zh;q=0.7,en-us,en;q=0.3
    – Weights can be given here as well
  – 4. Connection: keep-alive
    – After a web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed (note that this is the TCP connection). The advantage is that repeating the three-way handshake and the four waves is avoided
    – Connection: close
    – After a web page is opened, the TCP connection used to carry HTTP data between the client and the server is closed
    – When the client sends another request, the TCP connection has to be re-established
  – 5. Host: the Internet host name and port number of the requested resource
    – It is usually extracted from the URL
    – www.xxx.com:8080
  – 6. Referer: the source of the request
    – Tells the server which link the request came from, so the server can use this information in its processing
  – 7. User-Agent: tells the server the name and version of the operating system and browser used by the client
    – The server can use User-Agent to determine the browser type and apply different compatibility handling
  – 8. Content-Type: the media type of the object in the message body
    – text/html: HTML format
    – text/plain: plain text format
    – text/xml: XML format
    – image/gif: GIF image format
    – image/jpeg: JPG image format
    – image/png: PNG image format
    – application/json: JSON data format
    – application/xml: XML data format
    – application/msword: Word document format
    – application/octet-stream: binary data stream (common in file downloads)
    – application/x-www-form-urlencoded: form submission
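
The q weights used by Accept and Accept-Language are easy to get wrong when reading a header by eye, so here is a small Python sketch that parses them. The function name parse_weighted is my own, and the parser is deliberately simple: it assumes well-formed input and ignores every parameter other than q.

```python
def parse_weighted(header_value):
    """Parse 'a,b;q=0.7' style values into (name, q) pairs, highest q first."""
    items = []
    for part in header_value.split(","):
        name, *params = [piece.strip() for piece in part.split(";")]
        q = 1.0  # the default weight when no q is given
        for param in params:
            if param.startswith("q="):
                q = float(param[2:])
        items.append((name, q))
    # sorted() is stable, so entries with equal weight keep their header order.
    return sorted(items, key=lambda item: item[1], reverse=True)

for name, q in parse_weighted("zh-cn,zh;q=0.7,en-us,en;q=0.3"):
    print(name, q)
```

Note that in the Accept-Language example above, zh-cn and en-us carry no q of their own, so both get the default weight 1 and rank ahead of zh (0.7) and en (0.3).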


4. The server processes the request

After receiving the client's request, the server starts processing it, for example by querying the database and handling resources. I won't go into detail here.


5. The server returns the response result

Where there is a request there is a response, even if it is only an error message. The response message has already been covered above, so let's talk about the status code.

Status code

A status code is a 3-digit code that represents the status of a web server's Hypertext Transfer Protocol response.

First digit of the status code

– 1xx: Informational - a provisional response: the request has been received and processing continues
– 2xx: Success - the request has been successfully received and handled
– 3xx: Redirection - further action is required to complete the request
– 4xx: Client error - the request has a syntax error or cannot be fulfilled
– 5xx: Server error - the server failed to fulfil a valid request
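
Because the class is determined by the first digit alone, client code can branch on it without knowing every individual code. A minimal Python sketch follows; the names STATUS_CLASSES and status_class are mine, for illustration only.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def status_class(code):
    """Map a 3-digit HTTP status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]

for code in (200, 206, 301, 304, 404, 503):
    print(code, status_class(code))
```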

Common status codes

– 200 OK: the client's request succeeded
– 202 Accepted: the request has been accepted but not yet processed
– 206 Partial Content: resumable transfer. The client sent a GET request with a Range header and the server fulfilled it (when the requested video or audio file is very large, the server returns only the requested range of the file)

– 301 Moved Permanently: the requested page has moved to a new URL (permanent redirect)
– 302 Found: the requested page has temporarily moved to a new URL (temporary redirect)
– 304 Not Modified: the cached copy is still usable. The client has a cached document and made a conditional request; the server tells the client that the cached document can still be used

– 400 Bad Request: the client's request has a syntax error and cannot be understood by the server
– 401 Unauthorized: the request is not authorized and user authentication is required. This status code must be used together with the WWW-Authenticate header field
– 403 Forbidden: the server understands the client's request but refuses to execute it
– 404 Not Found: the requested resource does not exist

– 500 Internal Server Error: an internal server error occurred and the request could not be completed
– 502 Bad Gateway: the server, acting as a gateway or proxy, received an invalid response from the upstream server
– 503 Service Unavailable: the request could not be completed because the server is temporarily overloaded or down; it may return to normal after a while


6. Close the TCP connection

Once the client and the server have finished the request and the response, either side can ask to close the TCP connection. A TCP connection is closed with the four waves.

Closing the connection with the TCP four waves

After the earlier three-way handshake, both the client and the server are in the ESTABLISHED state.

  • 1. The first wave

The client sends a segment carrying the FIN flag to request disconnection, then enters the FIN-WAIT-1 state (termination wait state 1) and waits for the server's confirmation.

  • 2. The second wave

After receiving the FIN segment, the server must send an ACK segment to acknowledge it, and the server enters the CLOSE-WAIT state. After the client receives the ACK segment, the client enters the FIN-WAIT-2 state (termination wait state 2). (At this point the TCP connection from the client to the server has been closed, but the TCP connection from the server to the client has not.)

  • 3. The third wave

Once the server has no more data to send to the client, it sends a FIN + ACK segment (close + acknowledgement) to the client. The server then enters the LAST-ACK state (last acknowledgement state).

  • 4. The fourth wave

The client sends an ACK to the server. The server enters the CLOSED state after receiving this acknowledgement. The client waits for 2MSL (twice the maximum segment lifetime) and then also enters the CLOSED state. At this point the four waves are complete, and both the client and the server have closed the TCP connection.
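
The state changes in the four waves can be replayed with a small transition table. This is a sketch that tracks states only and sends no real segments. The names CLOSE_TRANSITIONS and four_waves are my own, and TIME-WAIT is the standard TCP name for the state in which the client sits while it waits for 2MSL.

```python
# (current state, event) -> next state
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",  # 1st wave, sent by the client
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",  # server answers with ACK (2nd wave)
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",     # 3rd wave, sent by the server
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",    # client answers with ACK (4th wave)
    ("LAST-ACK", "recv ACK"): "CLOSED",
    ("TIME-WAIT", "2MSL timeout"): "CLOSED",
}

def four_waves():
    """Replay the close sequence and return (side, event, new state) per step."""
    state = {"client": "ESTABLISHED", "server": "ESTABLISHED"}
    steps = [
        ("client", "send FIN"),
        ("server", "recv FIN"),
        ("client", "recv ACK"),
        ("server", "send FIN"),
        ("client", "recv FIN"),
        ("server", "recv ACK"),
        ("client", "2MSL timeout"),
    ]
    history = []
    for side, event in steps:
        state[side] = CLOSE_TRANSITIONS[(state[side], event)]
        history.append((side, event, state[side]))
    return history

for side, event, new_state in four_waves():
    print(f"{side:6} {event:13} -> {new_state}")
```

The trace shows that the server reaches CLOSED one step before the client, which only gets there after the 2MSL timer runs out.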

A common interview question about the four waves is: why does the client wait for a while at the end before it closes?

Because the final acknowledgement sent in the last wave may never reach the server. If the server does not receive it, the server assumes that its close request did not reach the client and that this is why no acknowledgement came back, so it sends the close request again. So that the client can still receive and acknowledge such a retransmitted close request, it waits for a period of time (2MSL) before entering the CLOSED state.


7. The browser parses HTML and renders the layout

After receiving the response from the server, the browser starts parsing and rendering.
The WebKit engine's rendering process:

  • 1. The HTML is turned into the DOM Tree by the HTML Parser
  • 2. The CSS is turned into the CSS Tree by the CSS Parser, according to the CSS rules
  • 3. The DOM Tree and the CSS Tree are combined to form the Render Tree
  • 4. Layout computes the exact position of every DOM node to be displayed
  • 5. Paint draws the final page
Several related interview questions come up often.
What is repaint?

A repaint happens when an element's appearance changes but its layout does not (that is, its non-geometric properties change), for example a change to visibility, outline or background. When a repaint occurs, the browser validates the visibility property of all other nodes in the DOM tree.
Once the box sizes and the other properties, such as color and font size, have been determined, the browser draws the elements according to their characteristics, and the page content appears. This process is called repaint.

How is a repaint triggered?

By changing an element's non-geometric (appearance) properties, for example color or background-color.

What is reflow?

Every element in the DOM structure has its own box (model). The browser has to compute each box from the applicable styles and place the element where it belongs according to the result (this is what happens when an element's geometric properties change). This process is called reflow.

How is a reflow triggered?
  • 1. Adding, deleting or modifying DOM nodes causes a reflow or a repaint
  • 2. Moving a DOM node or animating it
  • 3. Modifying CSS styles
  • 4. Resizing the window (this does not arise on mobile) or scrolling can trigger a reflow
  • 5. Changing the page's default font
How can the performance impact of reflow be reduced?

Reflow is one of the key reasons DOM scripts run slowly. A reflow triggered on any node in the page causes its child nodes and ancestor nodes to be re-rendered.

  • 1. Don't modify DOM styles one at a time. Define a class in advance and change the element's className instead
  • 2. Take the DOM node offline before modifying it. For example: first set the node to display:none (one reflow), modify it 100 times, then set it back to display:block
  • 3. Don't read DOM node property values inside a loop and use them as loop variables
  • 4. As far as possible, avoid modifying DOM nodes whose changes affect a large area
  • 5. Use absolute / fixed positioning for animated elements
  • 6. Use expensive styles with care, for example: box-shadow | border-radius | transparency | transforms | CSS filters (performance killer)
  • 7. Don't use table layout: a small change may cause the whole table to be laid out again
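
The advice about not reading DOM properties inside a loop is easier to see with a toy model. The Python class below is not a browser API. ToyElement, interleaved and batched are names I made up, and the model encodes just one assumption: a write only marks the layout as dirty, while reading a geometric property forces the pending reflow to run.

```python
class ToyElement:
    """Toy stand-in for a DOM node: writes mark layout dirty, reads force a reflow."""

    def __init__(self, width=100):
        self._width = width
        self._dirty = False
        self.reflows = 0

    def set_width(self, value):
        self._width = value
        self._dirty = True  # layout is invalidated, not recomputed yet

    def get_width(self):
        if self._dirty:  # reading a geometric property needs a fresh layout
            self.reflows += 1
            self._dirty = False
        return self._width

def interleaved(el, n):
    # Reads the width inside the loop, so each read after a write forces a reflow.
    for _ in range(n):
        el.set_width(el.get_width() + 1)
    el.get_width()  # final read flushes the last pending write

def batched(el, n):
    # Reads once, does the arithmetic outside the "DOM", writes once.
    width = el.get_width()
    el.set_width(width + n)
    el.get_width()

a, b = ToyElement(), ToyElement()
interleaved(a, 100)
batched(b, 100)
print("interleaved reflows:", a.reflows)
print("batched reflows:", b.reflows)
```

Both versions end with the same width, but the interleaved loop pays for a reflow on almost every iteration. Real browsers are far more complex than this model, so treat the counts as an illustration of the pattern rather than a measurement.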
What is the relationship between reflow and repaint?

A reflow always causes a repaint, but a repaint does not necessarily cause a reflow.