Front-end interview questions (2020): HTTP and browser topics
The material in this article was collected from the Internet and combined with my own study notes and experience. If anything here infringes your rights, please contact me and I will remove it. Thank you!
1. HTTP/1.1 is the most widely used version of HTTP
- Comparison between HTTP/1.0 and HTTP/1.1
- HTTP/1.0 defines three request methods: GET, POST and HEAD. HTTP/1.1 adds five more: OPTIONS, PUT, DELETE, TRACE and CONNECT. PATCH is often listed with them, but it was defined later, in RFC 5789.
- Cache handling: HTTP/1.0 mainly uses the If-Modified-Since and Expires headers to decide whether a cached copy can be used. HTTP/1.1 introduces more cache-control mechanisms, such as Cache-Control, entity tags (ETag), If-Unmodified-Since, If-Match and If-None-Match.
- Bandwidth optimization and connection use: HTTP/1.0 wastes bandwidth in some cases. For example, the client may need only part of an object, but the server sends the whole object, and resuming an interrupted download is not supported. HTTP/1.1 introduces the Range request header, which lets the client request only part of a resource; the server answers with 206 (Partial Content). This lets developers make fuller use of bandwidth and connections.
- Error notification: HTTP/1.1 adds 24 error status codes. For example, 409 (Conflict) indicates that the request conflicts with the current state of the resource, and 410 (Gone) indicates that a resource on the server has been permanently removed.
- Host header handling: HTTP/1.0 assumes that each server is bound to a unique IP address, so the request message does not carry a host name. With the development of virtual hosting, one physical server can host multiple virtual hosts (multi-homed web servers) that share one IP address. HTTP/1.1 therefore requires every request to include a Host header; a request without one is rejected with an error (400 Bad Request).
- Persistent connections: HTTP/1.1 supports persistent connections and request pipelining. Multiple HTTP requests and responses can be sent over a single TCP connection, which reduces the cost and latency of opening and closing connections. Connections are persistent by default in HTTP/1.1 (the behavior HTTP/1.0 only offered through Connection: keep-alive), which largely removes the HTTP/1.0 drawback of creating a new connection for every request. The connection can be reused for the next request after the current one completes, avoiding another handshake.
- HTTP/2 vs. HTTP/1.x
- New binary format: HTTP/1.x parsing is text-based. Text protocols are inherently hard to parse robustly, because text comes in many forms and a parser has to handle many edge cases. A binary format only has to recognize well-defined combinations of 0s and 1s. For this reason HTTP/2 uses a binary format, which is both convenient to parse and robust.
- Multiplexing: connection sharing. Each request is given an ID, so a single connection can carry many requests at the same time. The pieces of different requests can be interleaved arbitrarily on the connection, and the receiver uses the ID to sort them back into the requests they belong to.
- Header compression: HTTP/2 uses the HPACK algorithm, designed specifically for header compression, to reduce the size of the headers being sent. Both sides of the connection keep a table of header fields, which avoids retransmitting repeated headers and shrinks what does have to be sent.
- Server push: the server can send the resources the client will need along with index.html, so the client does not have to request them separately. Because no extra request or connection setup is involved, pushing static resources can improve speed considerably. For example, my web page requests style.css. When the server returns style.css it can push style.js at the same time; when the client later needs style.js, it takes it straight from the cache and does not send another request.
- Comparison of HTTPS and HTTP
- HTTPS requires a certificate issued by a CA. Free certificates used to be rare, so this generally cost money (free CAs such as Let's Encrypt now exist).
- HTTP runs directly on top of TCP and sends everything in plaintext. HTTPS runs on top of SSL/TLS, which in turn runs on top of TCP, and everything it sends is encrypted.
- HTTP and HTTPS connect in different ways and use different default ports: 80 for HTTP and 443 for HTTPS.
- HTTPS effectively prevents traffic hijacking by network operators, which solves a major anti-hijacking problem.
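The binary framing and multiplexing described for HTTP/2 above can be sketched with the frame header layout from RFC 7540 (9 bytes: 24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus a 31-bit stream ID). This is a simplified illustration, not a working HTTP/2 implementation: it only handles DATA frames, and the helper names pack_frame, iter_frames and demultiplex are made up for the example.

```python
import struct
from collections import defaultdict

FRAME_HEADER_LEN = 9  # size of an HTTP/2 frame header in bytes
DATA_FRAME = 0x0      # frame type code for DATA frames


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame: 24-bit length, 8-bit type, 8-bit flags, 31-bit stream ID."""
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def iter_frames(data: bytes):
    """Yield (frame_type, flags, stream_id, payload) for each frame in a byte stream."""
    offset = 0
    while offset + FRAME_HEADER_LEN <= len(data):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", data, offset)
        length = (hi << 16) | lo
        stream_id &= 0x7FFFFFFF  # the top bit is reserved
        start = offset + FRAME_HEADER_LEN
        yield frame_type, flags, stream_id, data[start:start + length]
        offset = start + length


def demultiplex(data: bytes) -> dict:
    """Group DATA payloads by stream ID, that is, by the request they belong to."""
    streams = defaultdict(bytes)
    for frame_type, _flags, stream_id, payload in iter_frames(data):
        if frame_type == DATA_FRAME:
            streams[stream_id] += payload
    return dict(streams)


# Two streams (IDs 1 and 3) interleaved on a single connection.
wire = (pack_frame(DATA_FRAME, 0, 1, b"hel") + pack_frame(DATA_FRAME, 0, 3, b"wor")
        + pack_frame(DATA_FRAME, 0, 1, b"lo") + pack_frame(DATA_FRAME, 0x1, 3, b"ld"))
bodies = demultiplex(wire)
print(bodies)
```

Because every frame names its stream, the receiver can rebuild each body even though the frames arrive mixed together.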
2. Introduction to HTTPS
Before transmitting data, HTTPS requires a handshake between the client (browser) and the server (website). During the handshake, the two sides establish the key material used to encrypt the data that follows. TLS/SSL is more than a single encryption scheme: it combines asymmetric encryption, symmetric encryption and hash algorithms.
3. A simplified description of the HTTPS handshake:
1. The browser sends the set of encryption rules it supports to the website.
2. The website selects an encryption algorithm and a hash algorithm from that set, and sends its identity back to the browser in the form of a certificate. The certificate contains the website address, the public key used for encryption, the certificate authority and other information.
3. After obtaining the website certificate, the browser does the following:
A. Verifies the validity of the certificate (whether the issuing authority is legitimate, whether the website address in the certificate matches the address being visited, and so on). If the certificate is trusted, a small lock is displayed in the browser's address bar; otherwise, a warning that the certificate is not trusted is shown.
B. If the certificate is trusted, or the user accepts the untrusted certificate, the browser generates a random number to serve as the password and encrypts it with the public key from the certificate.
C. Computes a hash of the handshake message with the agreed hash algorithm, encrypts the message with the generated random number, and finally sends everything generated so far to the website.
4. After receiving the data from the browser, the website does the following:
A. Uses its private key to decrypt the information and recover the password, uses the password to decrypt the handshake message sent by the browser, and verifies that the hash matches the one sent by the browser.
B. Encrypts a handshake message with the password and sends it to the browser.
5. The browser decrypts the handshake message and computes its hash. If it matches the hash sent by the server, the handshake ends. From then on, all communication is encrypted with the symmetric encryption algorithm, using the random password generated by the browser as the key.
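The hash check in steps 3C to 5 can be illustrated with a toy sketch. This is not real TLS. The public-key encryption of step 3B is skipped, so both sides are simply assumed to hold the same secret. The "encrypt the hash with the random number" step is also replaced by an HMAC keyed with that secret, a substitution made only to keep the example inside the standard library. The message contents and the name finished_mac are invented for the example.

```python
import hashlib
import hmac
import secrets


def finished_mac(shared_secret: bytes, transcript: list) -> bytes:
    """HMAC-SHA256, keyed with the secret, over a hash of all handshake messages."""
    digest = hashlib.sha256(b"".join(transcript)).digest()
    return hmac.new(shared_secret, digest, hashlib.sha256).digest()


# Handshake messages both sides have seen (contents are made up).
transcript = [b"browser: supported encryption rules",
              b"website: chosen algorithms + certificate"]

# Step 3B: the browser picks a random secret. Assumption: it reached the
# server intact (in the real handshake it travels encrypted with the public key).
browser_secret = secrets.token_bytes(32)
server_secret = browser_secret

# Steps 3C and 4A: the browser sends its check value, the server recomputes it.
browser_mac = finished_mac(browser_secret, transcript)
server_accepts = hmac.compare_digest(browser_mac, finished_mac(server_secret, transcript))

# If someone altered a handshake message in transit, the values no longer match.
tampered = [b"browser: weak encryption rules only", transcript[1]]
tamper_detected = not hmac.compare_digest(browser_mac, finished_mac(server_secret, tampered))

print(server_accepts, tamper_detected)
```

The point of the check is that both sides prove they saw the same handshake messages and hold the same secret before any application data is sent.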
4. What is the mechanism of cookie and session? What’s the difference?
A session is built on top of a cookie. The cookie is stored in the client's browser, while the session is stored on the server. The cookie mechanism confirms a customer's identity by checking the “pass” the customer carries, whereas the session mechanism confirms it by checking the “customer details” kept on the server. A session is like a customer file that the program creates on the server: when a client visits, the server only needs to look it up in the table of customer files.
The difference between cookie and session:
- Storage location:
Cookies are stored on the client, in the browser's temporary folder; sessions are stored in the server's memory, and one session object serves one user's browser
- Security:
A cookie is stored on the client in plaintext, so its security is low, although its contents can be encrypted; a session is stored in the server's memory, so its security is better
- Life cycle (taking 20 minutes as an example)
A cookie's life cycle is cumulative: the timer starts when the cookie is created, and the cookie expires 20 minutes later no matter how often it is used;
A session's life cycle is measured in idle intervals: the timer starts when the session is created, and if the session is not accessed within 20 minutes, it is destroyed. If the session is accessed within those 20 minutes, for example at the 19th minute, its lifetime is counted again from that moment. Shutting down ends the session's life cycle, but has no effect on cookies
- Access scope
A cookie is shared by every window and tab of the browser that matches its domain and path; a session is exclusive to the one user browser that holds its session ID
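The two life cycles above (cumulative for a cookie, idle-interval for a session) can be sketched as follows. This is a minimal model with an explicit clock in seconds; the class names FixedLifetime and SlidingLifetime are hypothetical and do not correspond to any framework's API.

```python
class FixedLifetime:
    """Cookie-style expiry: the clock starts at creation and never resets."""

    def __init__(self, created_at: float, ttl: float):
        self.expires_at = created_at + ttl

    def is_alive(self, now: float) -> bool:
        return now < self.expires_at


class SlidingLifetime:
    """Session-style expiry: every access restarts the idle timer."""

    def __init__(self, created_at: float, ttl: float):
        self.ttl = ttl
        self.last_access = created_at

    def touch(self, now: float) -> bool:
        """Record an access; return False if the session had already expired."""
        if now - self.last_access >= self.ttl:
            return False
        self.last_access = now
        return True


TTL = 20 * 60  # 20 minutes, in seconds
cookie = FixedLifetime(created_at=0, ttl=TTL)
session = SlidingLifetime(created_at=0, ttl=TTL)

# Access both at minute 19, then look again at minute 30.
session_ok_at_19 = session.touch(19 * 60)
cookie_alive_at_30 = cookie.is_alive(30 * 60)
session_ok_at_30 = session.touch(30 * 60)

print(session_ok_at_19, cookie_alive_at_30, session_ok_at_30)
```

The access at minute 19 restarts the session's timer, so the session is still valid at minute 30, while the cookie created at the same moment has already expired.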
5. Browser storage
|Feature|cookie|localStorage|sessionStorage|indexedDB|
|---|---|---|---|---|
|Data lifecycle|Generally generated by the server; an expiration time can be set|Persists until it is cleared|Cleared when the page is closed|Persists until it is cleared|
|Data storage size|4 KB|5 MB|5 MB|Unlimited|
|Communication with server|Carried in the request header every time, which affects request performance|None|None|None|
Supplement: cookies were not originally designed for storage but for communicating with the server. If you need to read and write them, you have to wrap the API yourself.
localStorage comes with getItem and setItem methods, which are very convenient to use.
Notes on localStorage:
- localStorage can only store strings; storing and reading JSON data requires JSON.stringify() and JSON.parse()
- In a browser where setItem is disabled, wrap the call in try...catch to catch the exception
6. What happens when you enter a URL?
- DNS resolution (domain name to IP address; DNS queries normally use UDP, so there is no handshake): the browser resolves the URL to the IP address of the corresponding server (1. look in the browser's own DNS cache; 2. query the operating system's DNS cache; 3. query the router's DNS cache; 4. query the network operator's DNS cache; 5. perform a recursive lookup), and extracts the port number from the URL
- The browser establishes a TCP connection with the target server (three-way handshake)
- The browser sends an HTTP request message to the server
- The server returns an HTTP response message to the browser
- Browser rendering
- The TCP connection is closed (four-way teardown)
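The first step above (working out which host and port to connect to) can be sketched with Python's standard library. The function names connection_target and resolve are made up for the example, and resolve is shown but not called because it needs network access.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url: str) -> tuple:
    """Return the (host, port) that a TCP connection for this URL would use."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


def resolve(host: str, port: int) -> list:
    """Ask the system resolver for the host's IP addresses (requires network)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


print(connection_target("https://example.com/index.html"))
print(connection_target("http://example.com:8080/a?b=1"))
```

When the URL has no explicit port, the default for the protocol is used, which is why https://example.com/ connects to port 443.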
7. Browser rendering steps
- Parse the HTML into a DOM tree
- Parse the CSS into style rules
- Combine the two to generate the render tree
- Layout: compute the geometry of each node from the render tree
- Painting: draw the whole page from the computed information
While parsing a document, if the browser encounters a script tag, it stops parsing the document and immediately parses and runs the script (JS may change the DOM and CSS, so parsing ahead could be wasted work).
If the script is external, the browser waits for it to download and run before it continues parsing the document. The “defer” and “async” attributes on the script tag relax this: both let the script download without blocking parsing. When a script runs, the changes it makes to the DOM and CSS are applied to the DOM tree and the style rules
8. What is the same-origin policy, and what does it restrict?
The same-origin policy is a convention, and it is the browser's core and most basic security feature. Without it, the browser is vulnerable to attacks such as XSS and CSRF. Two URLs have the same origin when their “protocol + domain name + port” are all the same; even if two different domain names point to the same IP address, they are not same-origin.
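The “protocol + domain name + port” comparison can be written down directly. The sketch below uses Python's urllib.parse and fills in the default port for http and https; the function names origin and same_origin are chosen for the example and are not a browser API.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (protocol, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


def same_origin(url_a: str, url_b: str) -> bool:
    """True only if protocol, host and port all match."""
    return origin(url_a) == origin(url_b)


print(same_origin("http://a.example.com/x", "http://a.example.com:80/y"))  # same
print(same_origin("http://a.example.com/", "https://a.example.com/"))      # protocol differs
print(same_origin("http://a.example.com/", "http://b.example.com/"))       # host differs
print(same_origin("http://a.example.com/", "http://a.example.com:8080/"))  # port differs
```

Note that the comparison never looks at IP addresses, which is why two domain names that resolve to the same server are still different origins.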
9. The same-origin policy restricts the following:
- Cookies, localStorage, IndexedDB and other stored content
- DOM nodes
- Ajax requests: the request is sent, but the browser intercepts the response
However, three tags are allowed to load resources across origins: img, link and script.
<h4 id="10, cross domain solution">10. Cross-domain solutions</h4>
<p><strong>JSONP takes advantage of the fact that <code>&lt;script&gt;</code> tags are not subject to cross-origin restrictions, so a web page can obtain dynamically generated JSON data from another origin. JSONP requests must be supported by the other party's server.</strong></p>
<p><strong>CORS requires support from both the browser and the back end. IE 8 and 9 need to implement it</strong> through XDomainRequest.</p>
<p>The browser carries out CORS communication automatically, so the back end is the key. As long as the back end implements CORS, cross-origin requests work.</p>
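<p>What "implementing CORS in the back end" means can be sketched as a function that decides which response headers to send. The header names are the standard CORS ones; the allow-list, the allowed methods and the function name cors_headers are assumptions made for the example, not taken from any framework.</p>

```python
ALLOWED_ORIGINS = {"https://app.example.com"}  # hypothetical allow-list


def cors_headers(request_origin: str, method: str = "GET") -> dict:
    """Return the CORS response headers for a request, or {} to deny it."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    headers = {
        "Access-Control-Allow-Origin": request_origin,
        "Vary": "Origin",  # the response differs per Origin, so caches must key on it
    }
    if method == "OPTIONS":  # preflight request sent by the browser
        headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
        headers["Access-Control-Allow-Headers"] = "Content-Type"
        headers["Access-Control-Max-Age"] = "600"
    return headers


print(cors_headers("https://app.example.com"))
print(cors_headers("https://other.example.net"))
```

<p>If the response carries no Access-Control-Allow-Origin header matching the page's origin, the browser withholds the response from the script, which is the interception described in section 9.</p>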
<p>The postMessage() method allows scripts from different origins to communicate with each other asynchronously, which makes cross-document, multi-window and cross-origin messaging possible.</p>
<p>WebSocket is a two-way communication protocol. After the connection is established, both the WebSocket server and the client can actively send data to and receive data from each other.</p>
<h5 id="5nginx reverse proxy">5. Nginx reverse proxy</h5>
<p>The principle is similar to that of a Node middleware proxy. You need to set up an intermediate nginx server to forward requests.</p>
<h4 id="11, page rendering optimization">11. Page rendering optimization</h4>
<p>Based on the understanding of the rendering process, the following optimizations are recommended:</p>
<li>Keep the HTML document structure as shallow as possible, preferably no more than 6 levels deep</li>
<li>Put scripts as late in the page as possible so they do not block page loading</li>
<li>A small amount of first-screen styling can be placed inline in a tag</li>
<li>Keep the style structure as simple as possible</li>
<li>In scripts, reduce DOM operations to reduce reflow, and cache style information read from the DOM where possible</li>
<li>Avoid modifying styles directly from JS where possible; change the class name instead</li>
<li>Reduce DOM lookups and cache their results</li>
<li>Try to stop animations when they are off screen or while the page is scrolling</li>
<h4 id="12, forced cache and negotiated cache">12. Strong (forced) cache and negotiated cache</h4>
<li>Strong caching: when a resource is requested for the first time, the server sets an expiration time in the HTTP response headers; until it expires, the resource is served directly from the browser cache. The usual response header fields are Cache-Control and Expires</li>
<li>Negotiated caching: the browser checks with the server whether the resource has been modified, using the HTTP response header fields ETag or Last-Modified. If the resource has been modified, it is fetched from the server again; if not, it is read from the browser cache</li>
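<p>The strong-cache decision can be sketched as a freshness check on Cache-Control: max-age. This is a simplified model that ignores Expires and most other directives; the function names max_age and is_fresh are made up for the example, and times are plain seconds.</p>

```python
import re


def max_age(cache_control: str):
    """Extract max-age (in seconds) from a Cache-Control value, or None."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, stored_at: float, now: float) -> bool:
    """True if a cached copy can be reused without contacting the server."""
    if "no-store" in cache_control or "no-cache" in cache_control:
        return False
    age_limit = max_age(cache_control)
    return age_limit is not None and (now - stored_at) < age_limit


print(is_fresh("public, max-age=3600", stored_at=0, now=1800))  # within the hour
print(is_fresh("public, max-age=3600", stored_at=0, now=7200))  # expired
print(is_fresh("no-cache", stored_at=0, now=1))                 # must revalidate
```

<p>Only when this check fails does the browser move on to the negotiated cache and ask the server.</p>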
<h4 id="13, the difference between get and post requests">13. The difference between GET and POST requests</h4>
<li>GET passes parameters in the URL, while POST puts them in the request body. (In the HTTP protocol the URL is carried at the start of the request rather than in the body, so in practice its size is tightly limited.)</li>
<li>Parameters passed in the URL of a GET request are limited in length (a limit imposed by browsers and servers rather than by HTTP itself), while POST has no such limit.</li>
<li>GET is harmless when the browser goes back, while POST submits the request again</li>
<li>GET requests are actively cached by the browser, while POST requests are not unless this is set up manually</li>
<li>GET is less secure than POST, because the parameters are exposed directly in the URL, so it should not be used to pass sensitive information</li>
<li>For parameter data types, GET accepts only ASCII characters, while POST has no restriction</li>
<li>GET requests can only be URL-encoded (application/x-www-form-urlencoded), while POST supports multiple encodings</li>
<li><strong>GET produces one TCP packet; POST produces two TCP packets</strong>. For a GET request, the browser sends the HTTP header and data together, and the server responds with 200 (returning data). For POST, the browser sends the header first, the server responds with 100 Continue, the browser then sends the data, and the server responds with 200 OK (returning data). This behavior depends on the client: it is not required by the HTTP specification, and not every browser or library sends a POST in two steps</li>
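<p>The first difference in the list (parameters in the URL versus in the body) can be seen by building both requests with Python's urllib. Nothing is sent over the network here; the URL https://example.com/search and the parameters are placeholders.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "http cache", "page": 2}
encoded = urlencode(params)  # URL-encoded form: "q=http+cache&page=2"

# GET: the parameters travel in the URL's query string.
get_request = Request("https://example.com/search?" + encoded)

# POST: the same parameters travel in the request body.
post_request = Request(
    "https://example.com/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

<p>Either way the parameters are readable by anyone who can see the request, so neither method protects sensitive data without HTTPS.</p>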
<h4 id="14, introduce the next 304 process">14. The 304 process</h4>
<p>a. When a browser requests a resource, it first checks the resource's Expires and Cache-Control headers (the strong cache). Expires depends on the local clock, so changing the local time can invalidate the cache. Cache-Control: max-age specifies the maximum lifetime instead. On a hit, the status is still shown as 200, but no data is requested from the server, and “from cache” is clearly visible in the browser.</p>
<p>b. When the strong cache misses, the negotiated cache phase begins. ETag is checked first: an ETag identifies one specific version of a resource, and any change to the resource changes its ETag. The server decides whether the cache is hit by comparing the If-None-Match value sent by the client with the current ETag.</p>
<p>c. The Last-Modified / If-Modified-Since phase of the negotiated cache: when the client requests a resource for the first time, the server adds Last-Modified to the response headers. Last-Modified is a timestamp of the resource's last modification. When the resource is requested again, the request headers include If-Modified-Since, whose value is the Last-Modified returned earlier. On receiving If-Modified-Since, the server compares it with the resource's last modification time to decide whether the cache is hit; on a hit it responds with 304 and no body.</p>
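<p>Step b can be sketched as a tiny server-side handler. This is an illustration under assumptions: the ETag is derived from a hash of the body (servers are free to compute it differently), and the names make_etag and respond are invented for the example. The Last-Modified / If-Modified-Since check in step c follows the same shape with a timestamp in place of the tag.</p>

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the content; any change to the body changes it."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # not modified: the browser reuses its copy
    return 200, headers, body


# First visit: no validator yet, so the full body is returned.
first = respond(b"version 1", {})
# Second visit: the browser sends back the ETag it received.
second = respond(b"version 1", {"If-None-Match": first[1]["ETag"]})
# The resource changed on the server: the old ETag no longer matches.
third = respond(b"version 2", {"If-None-Match": first[1]["ETag"]})

print(first[0], second[0], third[0])
```

<p>The 304 response has no body, so the saving is the transfer of the resource itself, not the round trip.</p>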
<h4 id="15, HTTP status code">15. HTTP status codes</h4>
<li>1xx (provisional response): status codes that indicate a provisional response and require the requester to continue the operation</li>
<li>100 (Continue): the requester should continue with the request. The server returns this code to indicate that it has received the first part of the request and is waiting for the rest</li>
<li>101 (Switching Protocols): the requester has asked the server to switch protocols, and the server has acknowledged and is ready to switch</li>
<li>2xx (success): status codes that indicate the request was processed successfully</li>
<li>200 (OK): the server has successfully processed the request. Usually, this means that the server provided the requested web page</li>
<li>201 (Created): the request succeeded and the server created a new resource</li>
<li>202 (Accepted): the server has accepted the request but has not yet processed it</li>
<li>203 (Non-Authoritative Information): the server has successfully processed the request, but the information returned may come from another source</li>
<li>204 (No Content): the server successfully processed the request but returned no content</li>
<li>205 (Reset Content): the server successfully processed the request, but no content was returned</li>
<li>206 (Partial Content): the server successfully processed a partial GET request</li>
<li>3xx (redirection): further action is required to complete the request; usually, these status codes are used for redirection</li>
<li>300 (Multiple Choices): the server can perform several operations for the request. The server can select one according to the user agent, or provide a list of operations for the requester to choose from</li>
<li>301 (Moved Permanently): the requested page has been permanently moved to a new location. When the server returns this response (to a GET or HEAD request), the requester is automatically forwarded to the new location</li>
<li>302 (Found, temporarily moved): the server is currently responding to the request with a page from a different location, but the requester should continue to use the original location for future requests</li>
<li>303 (See Other): the server returns this code when the requester should make a separate GET request to a different location to retrieve the response</li>
<li>304 (Not Modified): the requested page has not been modified since the last request. The server returns this response without the contents of the web page</li>
<li>305 (Use Proxy): the requester can only access the requested page through a proxy. If the server returns this response, it also indicates that the requester should use a proxy</li>
<li>307 (Temporary Redirect): the server is currently responding to the request with a page from a different location, but the requester should continue to use the original location for future requests</li>
<li>4xx (request error): these status codes indicate that the request may contain an error that prevents the server from processing it</li>
<li>400 (Bad Request): the server does not understand the syntax of the request</li>
<li>401 (Unauthorized): the request requires authentication. The server may return this response for a page that requires login</li>
<li>403 (Forbidden): the server refuses the request</li>
<li>404 (Not Found): the server cannot find the requested page</li>
<li>405 (Method Not Allowed): the method specified in the request is disabled</li>
<li>406 (Not Acceptable): the requested page cannot be returned with the content characteristics requested</li>
<li>407 (Proxy Authentication Required): this status code is similar to 401 (Unauthorized), but specifies that the requester should authenticate with the proxy</li>
<li>408 (Request Timeout): the server timed out while waiting for the request</li>
<li>409 (Conflict): the server encountered a conflict while completing the request. The server must include information about the conflict in the response</li>
<li>410 (Gone): the server returns this response if the requested resource has been permanently deleted</li>
<li>411 (Length Required): the server does not accept requests without a valid Content-Length header field</li>
<li>412 (Precondition Failed): the server does not meet one of the preconditions the requester set in the request</li>
<li>413 (Request Entity Too Large): the server cannot process the request because the request entity is larger than the server is able to handle</li>
<li>414 (Request-URI Too Long): the requested URI (usually a URL) is too long for the server to process</li>
<li>415 (Unsupported Media Type): the format of the request is not supported by the requested page</li>
<li>416 (Requested Range Not Satisfiable): the server returns this status code if the page cannot provide the requested range</li>
<li>417 (Expectation Failed): the server did not meet the requirements of the Expect request header field</li>
<li>5xx (server error): these status codes indicate that an internal error occurred while the server was trying to process the request. These errors are usually caused by the server itself, not by the request</li>
<li>500 (Internal Server Error): the server encountered an error and was unable to complete the request</li>
<li>501 (Not Implemented): the server does not have the capability to complete the request. For example, this code may be returned when the server does not recognize the request method</li>
<li>502 (Bad Gateway): the server, acting as a gateway or proxy, received an invalid response from the upstream server</li>
<li>503 (Service Unavailable): the server is currently unavailable (due to overload or maintenance downtime). Usually, this is only a temporary state</li>
<li>504 (Gateway Timeout): the server, acting as a gateway or proxy, did not receive a response from the upstream server in time</li>
<li>505 (HTTP Version Not Supported): the server does not support the HTTP protocol version used in the request</li>
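<p>The grouping above follows the first digit of the code, which can be shown with Python's built-in http.HTTPStatus enum. The class labels in STATUS_CLASSES and the function name describe are choices made for this example.</p>

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "provisional response",
    2: "success",
    3: "redirection",
    4: "request error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase and its class."""
    status = HTTPStatus(code)  # raises ValueError for an unknown code
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (206, 304, 404, 502):
    print(describe(code))
```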
<p><strong>Thank you</strong></p>
<p>And thanks to my own hard work. <a href="https://blog.guizimo.top/">Personal blog</a>, <a href="https://tangleia.github.io/">GitHub</a></p>