7. “Illustrated HTTP” – HTTP Header and HTTP Collaboration Server

Time:2022-11-23


Key points

  1. There are many kinds of header fields. This chapter covers the groups below. There is a lot of material, so it is enough to be familiar with the common headers.

    1. Introduction to header fields
    2. Non-HTTP/1.1 header fields
    3. General header fields
    4. Request header fields
    5. Response header fields
    6. Payload header fields (entity header fields)
    7. Other header fields
  2. Collaboration servers are the intermediaries (middleware) set up alongside HTTP to speed up access. The book says relatively little about them and I have not added anything of my own, so a quick skim is enough.


7-1. HTTP header

Although you rarely notice it, the HTTP header is used on the Internet every day. The book spends more than 50 pages on it, which shows how important it is.

An HTTP message consists of three parts: the message header, a blank line, and the message body. The message header carries the information that the client and server need in order to process the request or response, while the message body is the payload, that is, the data actually being transferred.

A request message is composed of the method, the URI, the HTTP version, and the HTTP header fields.

The following is an example request message:

GET / HTTP/1.1
Host: hackr.jp
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*; q=0.8
Accept-Language: ja,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
If-Modified-Since: Fri, 31 Aug 2007 02:02:20 GMT
If-None-Match: "45bae1-16a-46d776ac"
Cache-Control: max-age=0

A response message is composed of the HTTP version, the status code with its reason phrase, and the HTTP header fields. The following is an example response message:

HTTP/1.1 304 Not Modified
Date: Thu, 07 Jun 2012 07:21:36 GMT
Server: Apache
Connection: close
Etag: "45bae1-16a-46d776ac"
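As a rough illustration of the three-part structure described earlier (header section, blank line, body), the following Python sketch splits a raw message into its parts. The function name parse_http_message is hypothetical, and the parsing is deliberately minimal: it assumes CRLF line endings and ignores details such as folded header lines.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP message into (start line, header list, body)."""
    # The header section ends at the first blank line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        # Split on the first colon only, because values such as dates
        # contain colons themselves.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


# The response example from this section, written out with CRLF line endings.
raw_response = (
    "HTTP/1.1 304 Not Modified\r\n"
    "Date: Thu, 07 Jun 2012 07:21:36 GMT\r\n"
    "Server: Apache\r\n"
    "Connection: close\r\n"
    'Etag: "45bae1-16a-46d776ac"\r\n'
    "\r\n"
)
start_line, headers, body = parse_http_message(raw_response)
print(start_line)
for name, value in headers:
    print(name, "=>", value)
```

A 304 response carries no body, so body is an empty string here.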

7.0 Introduction to header fields

Header fields are an important part of HTTP.

HTTP header field structure

A header field consists of a field name and a field value separated by a colon, for example Host: hackr.jp. A field value can be a single value or several values, and multiple values are separated by commas, for example Accept-Encoding: gzip, deflate.

What if the same header field appears more than once in a message? According to the book, the specification does not settle this clearly, so the result depends on how the browser or other implementation handles it. Some browsers give priority to the first occurrence of the field, while others give priority to the last.
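The difference between "first wins" and "last wins" can be sketched in Python as follows. All names here (group_headers, first_wins, last_wins, combined) are hypothetical helpers for illustration, not part of any library, and the duplicated field in the example is made up.

```python
from collections import defaultdict


def group_headers(pairs):
    """Group (name, value) pairs by lower-cased field name, keeping order."""
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name.lower()].append(value)
    return dict(grouped)


def first_wins(grouped, name):
    # Policy of implementations that prefer the first occurrence.
    return grouped[name.lower()][0]


def last_wins(grouped, name):
    # Policy of implementations that prefer the last occurrence.
    return grouped[name.lower()][-1]


def combined(grouped, name):
    # For list-valued fields, repeated fields can be joined with commas.
    return ", ".join(grouped[name.lower()])


pairs = [
    ("Host", "hackr.jp"),
    ("Accept-Encoding", "gzip"),
    ("accept-encoding", "deflate"),  # same field again, different case
]
grouped = group_headers(pairs)
print(first_wins(grouped, "Accept-Encoding"))
print(last_wins(grouped, "Accept-Encoding"))
print(combined(grouped, "Accept-Encoding"))
```

Field names are compared case-insensitively, which is why the sketch lower-cases them before grouping.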

Classification of header fields

  • General header fields: used in both request messages and response messages.
  • Request header fields: used when the client sends a request message to the server.
  • Response header fields: used when the server returns a response message to the client.
  • Entity (payload) header fields: describe the entity body carried in the message. They can appear in both requests and responses.

HTTP/1.1 header fields

The HTTP/1.1 header fields fall into the four categories above. Each category is covered field by field in a later section:

  • General header fields: section 7.2
  • Request header fields: section 7.3
  • Response header fields: section 7.4
  • Payload (entity) header fields: section 7.5

7.1 Non-HTTP/1.1 header fields

In addition to the header fields defined in HTTP/1.1, the non-standard header fields used in HTTP communication are collected in RFC 4229, HTTP Header Field Registrations. If you are interested, you can read that document directly.

Caching Proxy Behavior

Header fields are divided into two classes according to how caches and proxies treat them: end-to-end headers and hop-by-hop headers.

End-to-end headers are forwarded to the final recipient of the request or response. They must be stored in any response generated from a cache, and they must always be forwarded.

Hop-by-hop headers are valid for a single hop only and are not forwarded by caches or proxies. In HTTP/1.1 and later, any other hop-by-hop header has to be named in the Connection header field. The hop-by-hop header fields in HTTP/1.1 are the eight listed below; every other field is an end-to-end header:

Connection
Keep-Alive
Proxy-Authenticate
Proxy-Authorization
Trailer
TE
Transfer-Encoding
Upgrade
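A proxy's handling of the list above could be sketched like this in Python. The function strip_hop_by_hop and the X-Internal field are hypothetical, and the set of names is copied from the list above. The sketch also drops any field named in the Connection header, as described in the Connection section below.

```python
# Hop-by-hop header names, taken from the list above (lower-cased).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "trailer", "te", "transfer-encoding", "upgrade",
}


def strip_hop_by_hop(headers):
    """Return the (name, value) pairs a proxy would forward."""
    extra = set()
    for name, value in headers:
        if name.lower() == "connection":
            # Every token listed in Connection is hop-by-hop for this hop.
            extra.update(t.strip().lower() for t in value.split(",") if t.strip())
    drop = HOP_BY_HOP | extra
    return [(n, v) for n, v in headers if n.lower() not in drop]


incoming = [
    ("Host", "hackr.jp"),
    ("Connection", "keep-alive, X-Internal"),
    ("Keep-Alive", "timeout=5"),
    ("X-Internal", "1"),  # made-up field, named in Connection above
    ("Accept", "*/*"),
]
forwarded = strip_hop_by_hop(incoming)
print(forwarded)
```

Only the end-to-end fields (Host and Accept in this example) survive the hop.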

7.2 General header fields

General header fields are used in both request and response messages. They are covered one by one below:

7.2.1 Cache-Control

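As the name implies, this header field controls how caches behave. For example: Cache-Control: private, max-age=0, no-cache. Several directives can be combined, separated by commas. Setting an explicit maximum age keeps a cached response from staying valid for too long or expiring too soon.

The directives are divided into cache request directives and cache response directives:

  • Cache request directives: no-cache, no-store, max-age, max-stale, min-fresh, no-transform, only-if-cached, cache-extension
  • Cache response directives: public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, max-age, s-maxage, cache-extension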

public directive (Cache-Control: public)

The public directive indicates that the response may be cached and served to other users as well, that is, the cached copy is public.

private directive (Cache-Control: private)

The opposite of public: the response is intended for one specific user. A cache server may store it for that user only and must not serve it to requests from other users.

no-cache directive (Cache-Control: no-cache)

The purpose is to prevent stale resources from being returned from the cache. In a request, no-cache means the client will not accept a cached response as it stands, so the cache must forward the request to the origin server. In a response, it means the cache must not reuse the response for later requests without first confirming its validity with the origin server.

Note ⚠️: It is easy to read no-cache literally as "do not cache", but it really means "do not serve a cached copy without revalidation". The cache may still store the resource, as long as it confirms the validity with the origin server before using it.

Cache-Control: no-cache=Location

If the server's response names a header field as a parameter of no-cache, a cache must not reuse that header field without revalidation, while the rest of the response can still be cached. Only the response directive can take this parameter, so this is a restriction that the server places on caching.

Directives that control what may be stored in a cache

no-store directive (Cache-Control: no-store)

Indicates that the request or response contains confidential information. Caches must not store any part of the request or response locally.

s-maxage directive (Cache-Control: s-maxage=604800, in seconds)

Works like the max-age directive, except that it applies only to shared cache servers used by multiple users. It has no effect on a cache that repeatedly returns responses to the same user.

Note ⚠️: When s-maxage is used, the Expires header field and the max-age directive are ignored.

max-age directive

In a request from the client: specifies the maximum age of a cached response that the client will accept. A cached response older than this value is not accepted. If the value is 0, the cache usually has to forward the request to the origin server every time.

max-stale directive (Cache-Control: max-stale=3600, in seconds)

Indicates that the client is willing to accept a cached response even after it has expired. If no value is given, the client accepts the response no matter how long ago it expired. If a value is given, the client accepts an expired response as long as it is no more than that many seconds past its expiration time.

only-if-cached directive (Cache-Control: only-if-cached)

Indicates that the client wants a response only if the cache server already holds the resource locally. The cache server does not reload the response or revalidate it with the origin server. If the cache cannot answer from its local copy, it returns a 504 Gateway Timeout status code.

504 Gateway Timeout: a server acting as a gateway or proxy did not receive a response from the upstream server in time. The difference from 408 is that 408 means the server timed out waiting for the client's request, while 504 means the gateway or proxy timed out waiting for the upstream server.

must-revalidate directive (Cache-Control: must-revalidate)

Indicates that the proxy must re-verify with the origin server that the cached response it is about to return is still valid. If the proxy cannot reach the origin server to do so, the cache must return a 504 (Gateway Timeout) status code to the client.

Note ⚠️: must-revalidate overrides a max-stale directive in the request, so max-stale has no effect when both are present.

proxy-revalidate directive (Cache-Control: proxy-revalidate)

Requires every shared cache server to confirm the validity of a cached response with the origin server before returning it in answer to a later request.

no-transform directive (Cache-Control: no-transform)

Caches must not change the media type of the entity body, in either requests or responses. This prevents, for example, a cache or proxy from recompressing images.

Cache-Control extensions (cache-extension token)

Cache-Control: private, community="UCI"

Cache-Control can be extended with additional tokens, as in the line above. The Cache-Control header field has no community directive of its own, but the extension token mechanism allows one to be added. Only cache servers that understand the extension act on it; all other cache servers simply ignore it.
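To make the directive syntax concrete, here is a minimal Python sketch that turns a Cache-Control value into a dictionary. The function name parse_cache_control is hypothetical, and the parser is simplified: it splits on commas, so it would mis-handle a quoted value that itself contains a comma.

```python
def parse_cache_control(value: str) -> dict:
    """Map each directive name to its argument (None when it has none)."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; quoted arguments lose their quotes.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


basic = parse_cache_control("private, max-age=0, no-cache")
extended = parse_cache_control('private, community="UCI"')
print(basic)
print(extended)

# A cache that does not know "community" simply never looks it up,
# which mirrors how unknown extension tokens are ignored.
```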

7.2.2 Connection

The Connection header field has two roles:

  • Controlling which header fields are no longer forwarded by proxies.
  • Managing persistent connections.

Controlling header fields that are no longer forwarded by proxies

Connection can name the header fields that a proxy must not forward any further (that is, hop-by-hop headers). The proxy deletes the named fields before it forwards the message.

Managing persistent connections

If the server wants to close the connection explicitly, it sets the value of the Connection header field to close. Note that connections in HTTP/1.1 are persistent (Keep-Alive) by default.

Earlier HTTP versions, by contrast, default to non-persistent connections. To get the same behavior as HTTP/1.1 on those versions, the client has to send Connection: Keep-Alive.

7.2.3 Date (Date: Tue, 03 Jul 2012 04:40:59 GMT)

Indicates the date and time at which the HTTP message was created.

HTTP/1.1 uses the date and time format specified in RFC 1123:

Date: Tue, 03 Jul 2012 04:40:59 GMT

Versions before HTTP/1.1 used the format defined in RFC 850, shown below:

Date: Tue, 03-Jul-12 04:40:59 GMT

A third format matches the output of the asctime() function in the C standard library:

Date: Tue Jul 03 04:40:59 2012
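A receiver that wants to accept all three formats can simply try them in turn. The Python sketch below does that with datetime.strptime. The name parse_http_date and the format table are illustrative assumptions, the formats are written to match the three example lines above exactly, and the day and month names assume an English (C) locale.

```python
from datetime import datetime, timezone

# One strptime pattern per example line above.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 1123 style
    "%a, %d-%b-%y %H:%M:%S GMT",  # RFC 850 style, as written above
    "%a %b %d %H:%M:%S %Y",       # asctime() style
)


def parse_http_date(text: str) -> datetime:
    """Try each known format and return a timezone-aware UTC datetime."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized HTTP date: {text!r}")


for sample in (
    "Tue, 03 Jul 2012 04:40:59 GMT",
    "Tue, 03-Jul-12 04:40:59 GMT",
    "Tue Jul 03 04:40:59 2012",
):
    print(parse_http_date(sample).isoformat())
```

All three samples describe the same instant, so the loop prints the same timestamp three times.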

7.2.3 Pragma (Pragma: no-cache)

Pragma is a legacy field from versions before HTTP/1.1. It is kept in HTTP/1.1 only for backward compatibility with HTTP/1.0, and only one form is specified: Pragma: no-cache

It is used by the client to tell intermediate servers that it does not want a cached response. If every intermediate server understood HTTP/1.1, Cache-Control: no-cache alone would be the ideal way to specify cache handling. Since that cannot be assumed, requests usually carry both fields:

Cache-Control: no-cache
Pragma: no-cache

7.2.4 Trailer (Trailer: Expires)

Declares in advance which header fields are recorded after the message body. It is mainly used with HTTP/1.1 chunked transfer encoding.

HTTP/1.1 200 OK
Date: Tue, 03 Jul 2012 04:40:56 GMT
Content-Type: text/html
...
Transfer-Encoding: chunked
Trailer: Expires
...(message body)...
0
Expires: Tue, 28 Sep 2004 23:59:59 GMT

In the example above, Trailer announces the Expires field, so Expires (the expiration date of the resource) appears after the message body, following the final chunk of size 0.

7.2.5 Transfer-Encoding (Transfer-Encoding: chunked)

Specifies the transfer coding applied to the message body for transmission. In HTTP/1.1, the transfer coding used in practice is chunked transfer encoding.
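The wire format of a chunked body, including a trailer such as the Expires field in the Trailer example, can be sketched in Python as follows. The helper encode_chunked is hypothetical and only shows the framing: each chunk is preceded by its size in hexadecimal, a chunk of size 0 ends the body, and any trailer fields follow it.

```python
def encode_chunked(chunks, trailers=None):
    """Frame byte chunks as a chunked body, optionally followed by trailers."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the end of the body
        # Chunk size in hex, CRLF, the chunk data, CRLF.
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n"  # last chunk: size 0
    for name, value in (trailers or []):
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"  # blank line closes the message
    return bytes(out)


body = encode_chunked(
    [b"Hello, ", b"world"],
    trailers=[("Expires", "Tue, 28 Sep 2004 23:59:59 GMT")],
)
print(body.decode("ascii"))
```

The sender can start transmitting before it knows the total length, which is why Content-Length is not needed with this coding.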

7.2.6 Upgrade

Used to check whether a newer version of HTTP, or a different protocol altogether, can be used for the communication. The protocol named does not have to be HTTP.

The example in the book uses the TLS protocol. Pay attention to the details of the messages: the Upgrade header field applies only between the client and the immediately adjacent server, so Connection: Upgrade must be sent along with it for Upgrade to take effect.

A server that receives a request with an Upgrade header field can answer with status code 101 Switching Protocols.

The classic use of Upgrade today is switching to the WebSocket protocol.

7.2.7 Via

Via traces the path that request and response messages take between the client and the server. Each proxy or gateway that a message passes through appends its own server information to the Via header field before forwarding the message. Via is used not only to trace how a message was forwarded but also to avoid request loops.

Each proxy server that the request passes through adds one more entry to the Via field. Because Via records the path of the message, it is often used together with the TRACE method: when a proxy receives a request whose Max-Forwards value is 0, it stops forwarding the request and returns the response itself.

7.2.8 Warning

The Warning header of HTTP/1.1 evolved from the HTTP/1.0 response header Retry-After.

The format is as follows:

Warning: [warning code] [warning host:port] "[warning text]" ([date and time])

HTTP/1.1 defines seven warning codes. They serve only as a reference and may be extended in the future.

7.3 Request header fields

Request header fields are the fields sent from the client to the server in a request message.

7.3.1 Accept (Accept: text/html,application/xhtml+xml,application/xml;q=0.)

The Accept header field tells the server which media types the user agent can handle and the relative priority of each media type.

  • Text files

    • text/html, text/plain, text/css …
    • application/xhtml+xml, application/xml …
  • Image files

    • image/jpeg, image/gif, image/png …
  • Video files

    • video/mpeg, video/quicktime …
  • Binary files used by applications

    • application/octet-stream, application/zip …

Media types are written in the form type/subtype, and several can be specified at once. To express priority, append ;q= followed by a weight between 0 and 1, with up to three decimal places. When q is omitted, the weight defaults to 1.0. If the server can provide the content in several media types, it returns the media type with the highest weight first.
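The weighting rule can be illustrated with a small Python sketch that parses an Accept value and orders the media types by their q weight. The function parse_accept is a hypothetical helper; it ignores parameters other than q and keeps the original order among equal weights.

```python
def parse_accept(value: str):
    """Return (media_type, q) pairs, highest weight first."""
    items = []
    for part in value.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default weight when q is omitted
        for param in fields[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val.strip())
        items.append((media_type, q))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(items, key=lambda item: item[1], reverse=True)


# The Accept value from the request example at the top of this chapter.
accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*; q=0.8"
ranked = parse_accept(accept)
for media_type, q in ranked:
    print(media_type, q)
```

The same parsing approach applies to Accept-Charset, Accept-Encoding, and Accept-Language, which use the same q syntax.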

7.3.2 Accept-Charset (Accept-Charset: iso-8859-5, unicode-1-1;q=0.8)

Tells the server which character sets the user agent supports and the relative priority of each. As with the Accept header field, several character sets can be given at once, and the q weight expresses relative priority.

This header field is used in server-driven content negotiation.

7.3.3 Accept-Encoding (Accept-Encoding: gzip, deflate)

Tells the server which content codings the user agent supports and their order of priority. Several content codings can be specified at once. Common content codings are:

gzip: The format (RFC 1952) produced by the file compression program gzip (GNU zip). It uses the Lempel-Ziv algorithm (LZ77) and a 32-bit cyclic redundancy check (CRC).

compress: The format produced by the UNIX file compression program compress. It uses the Lempel-Ziv-Welch algorithm (LZW).

deflate: A combination of the zlib format (RFC 1950) and the deflate compression algorithm (RFC 1951).

identity: The default coding, which applies no compression or transformation.

As with Accept, a q value can be used to express priority. An asterisk (*) can also be used as a wildcard that matches any coding.

7.3.4 Accept-Language (Accept-Language: zh-cn,zh;q=0.7,en-us,en;q=0.3)

Tells the server which natural languages the user agent can handle and their order of priority. Several languages can be specified at once.

A q value can again be used to express priority. The server returns the content in the highest-priority language that it supports.

7.3.5 Authorization (Authorization: Basic dWVub3NlbjpwYXNzd29yZA==)

As the name suggests, this header field carries the user agent's authentication information to the server, and it is often used when integrating with APIs. Typically, a user agent that has not yet authenticated receives a 401 response from the server, which tells it that authentication is required, and then repeats the request with an Authorization header field added.
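For the Basic scheme shown in the heading, the field value is just "user:password" encoded with Base64. The Python sketch below builds and decodes such a value. The function names are hypothetical, and the credentials used are the ones that the example value in the heading decodes to. Base64 is an encoding, not encryption, so Basic authentication is only safe over HTTPS.

```python
import base64


def basic_auth_value(user: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(value: str):
    """Recover (user, password) from a Basic Authorization value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Split on the first colon only: the password may contain colons.
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header_value = basic_auth_value("uenosen", "password")
print("Authorization:", header_value)
print(decode_basic_auth(header_value))
```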

7.3.6 Expect (Expect: 100-continue)

The client uses this field to tell the server that it expects a particular behavior. If the server cannot meet the expectation, it returns status code 417 Expectation Failed. HTTP/1.1 defines only one expectation, Expect: 100-continue: a client that wants to wait for a 100 (Continue) response before sending the request body must send this field.

417 Expectation Failed: the server cannot meet the expectation given in the Expect field.

The 100 (Continue) status code exists in HTTP/1.1 so that, before sending the message body, the client can first find out from the request headers alone whether the server is willing to accept it.

The typical case is a client that wants to send a large amount of data: if the server cannot or will not process it, the client is told in advance and does not need to send the data at all.

This adds a small handshake step within a single request, but it is not state in the strict sense, so HTTP/1.x is still stateless.

7.3.7 From

Indicates the e-mail address of the user who operates the user agent. Note that the e-mail address is sometimes recorded in the User-Agent header field instead.

7.3.8 Host (Host: www.hackr.jp)

Host is the only header field that the HTTP/1.1 specification requires in every request.

It gives the Internet host name and port number of the server that holds the requested resource.

Why must there be a Host header? It is closely tied to virtual hosting, where several domain names are served from a single server. Requests for all of those domains arrive at the same IP address, so without Host the server could not tell which site a request is meant for.

7.3.9 If-Match

Request header fields that start with If- like this one are conditional requests. The server evaluates the condition it receives and executes the request only if the condition is true.

The server accepts the request only when the value of If-Match matches the ETag value of the resource. If they do not match, it returns status code 412 Precondition Failed. An asterisk (*) can also be given as the If-Match value. In that case the ETag value is ignored and the request is processed as long as the resource exists.

7.3.10 If-Modified-Since (If-Modified-Since: Thu, 15 Apr 2004 00:00:00 GMT)

If the resource was updated after the date given in this field, the server processes the request normally. If the resource has not been updated since then, the server returns a 304 Not Modified response.

If-Modified-Since is used to confirm that the local copy of a resource held by a proxy or client is still valid.

7.3.11 If-None-Match

The opposite of If-Match: the request is processed only when the value of If-None-Match differs from the ETag value of the resource. It is used with GET and HEAD requests to obtain the latest version of a resource, much like the If-Modified-Since header field.
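How a server might combine If-None-Match and If-Modified-Since for a GET request is sketched below in Python. The function evaluate_conditional_get is hypothetical and simplified: header names are looked up with exact case, ETags are compared as plain strings (no weak comparison), and If-None-Match is checked first when both fields are present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def evaluate_conditional_get(request_headers, current_etag, last_modified):
    """Return 304 if the client's copy is still valid, otherwise 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304  # the client already holds this version
        return 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304  # not modified after the given date
    return 200


# Values taken from the request and response examples earlier in this chapter.
etag = '"45bae1-16a-46d776ac"'
modified = datetime(2007, 8, 31, 2, 2, 20, tzinfo=timezone.utc)
request = {
    "If-Modified-Since": "Fri, 31 Aug 2007 02:02:20 GMT",
    "If-None-Match": etag,
}
print(evaluate_conditional_get(request, etag, modified))
```

This mirrors the 304 Not Modified response shown in the example at the start of the chapter.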

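7.3.12 Proxy-Authorization (Proxy-Authorization: Basic dGlwOjkpNLAGfFY5)

After receiving an authentication challenge from a proxy server, the client sends this field to give the proxy the information needed for authentication. It works like HTTP access authentication between client and server, except that the authentication takes place between the client and the proxy.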

7.3.13 Range (Range: bytes=5001-10000)

The Range header field tells the server which part of the resource the client wants. The example above requests bytes 5001 through 10000 of the resource.

If the server can process the range request, it returns a 206 Partial Content response containing only that part. If it cannot, it returns 200 OK with the entire resource.

206 Partial Content: the server sends only part of the resource.
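The 206-or-200 decision can be sketched in Python as below. The function serve_range is a hypothetical helper that understands only a single range of the form bytes=first-last; every other form (open-ended ranges, suffix ranges, multiple ranges) simply falls back to a full 200 response in this sketch, and a real server may choose a different response for ranges it cannot satisfy.

```python
def serve_range(body: bytes, range_header):
    """Return (status, headers, payload) for an optional Range header."""
    if range_header and range_header.startswith("bytes="):
        first, _, last = range_header[len("bytes="):].partition("-")
        if first.isdigit() and last.isdigit():
            start, end = int(first), int(last)
            if start <= end and start < len(body):
                end = min(end, len(body) - 1)  # clamp to the resource size
                headers = {"Content-Range": f"bytes {start}-{end}/{len(body)}"}
                # Byte positions are inclusive, hence end + 1 in the slice.
                return 206, headers, body[start:end + 1]
    return 200, {}, body


resource = bytes(20000)  # stand-in for a 20,000-byte resource
status, headers, payload = serve_range(resource, "bytes=5001-10000")
print(status, headers, len(payload))
```

The Content-Range value built here is the payload header field described in section 7.5.7.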

7.3.14 Referer (Referer: http://www.hackr.jp/index.htm)

The Referer header field tells the server the URI of the resource from which the request originated.

Note that the originating URI may contain sensitive information such as an ID or password. If it is written to Referer and passed to other servers, that information can leak.

The correct English spelling is Referrer. The misspelled form Referer has been kept in the header field name ever since.

7.3.15 TE (TE: gzip, deflate;q=0.5)

Tells the server which transfer codings the client can handle in the response and their relative priority. It resembles the Accept-Encoding field but applies to transfer codings. TE: trailers can also be specified to indicate that the client accepts trailer fields with chunked transfer encoding.

7.3.16 User-Agent (User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;)

User-Agent passes the name of the browser or other user agent that created the request, along with related information, to the server.

7.4 Response header fields

Response header fields are used when the server returns a response message to the client.

7.4.1 Accept-Ranges (Accept-Ranges: bytes)

Tells the client whether the server can handle range requests. The value bytes means that range requests in bytes are supported. When range requests cannot be handled, the value is none (Accept-Ranges: none).

7.4.2 Age (Age: 600)

Tells the client how long ago the origin server created the response. The value is in seconds. If the server creating the response is a cache server, Age is the time elapsed since the cached response was last revalidated with the origin server. A proxy that creates a response must add the Age header field.

7.4.3 ETag (ETag: "82e22293907ce725faf67773957acd12")

Tells the client the entity tag, a string that uniquely identifies a resource. The server assigns an ETag to each resource, and when the resource is updated the ETag value has to be updated as well.

ETag makes it possible to tell apart resources that share the same URI but differ in content, for example versions of a page in different languages. ETags are either strong or weak. A strong ETag changes whenever the entity changes, however slightly. A weak ETag only indicates that the resource is essentially the same; it changes only when the resource changes in a meaningful way, and it is marked with a W/ prefix at the start of the value.

7.4.4 Location (Location: http://www.usagid…)

Directs the recipient of the response to a resource at a different location from the request URI. It is normally returned together with a 3xx Redirection response, and almost all browsers try to access the redirect target as soon as they receive this field.

7.4.5 Proxy-Authenticate (Proxy-Authenticate: Basic realm="Usagidesign Auth")

The Proxy-Authenticate header field sends the authentication information required by the proxy server to the client. It differs from HTTP access authentication between server and client in that the authentication takes place between the proxy server and the client.

7.4.6 Retry-After (Retry-After: 120)

Tells the client how long to wait before retrying the request. It is mainly used with status code 503 Service Unavailable or with 3xx Redirect responses. The value can be either a specific date and time or the number of seconds to wait after the response was created.

7.4.7 Server (Server: Apache/2.2.17 (Unix))

Tells the client what HTTP server software is running on the server. It may include the software version number and similar details.

7.4.8 Vary (Vary: Accept-Language)

With Vary: Accept-Language, a cache may return its stored response only to requests whose Accept-Language field has the same value as the request that produced the cached response. Otherwise the response has to be fetched from the origin server.

Vary is therefore a cache-control tool: the origin server uses it to tell proxy servers how their local caches may be used.

A cached response can be returned only to a request whose header fields named in Vary match. Even if a request targets the same resource as an earlier one, it still has to be fetched from the origin server when the header fields named in Vary differ.
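One way to picture this is that the cache key is not just the URL but the URL plus the values of the request headers named in Vary. The Python sketch below builds such a key. The function cache_key and the dictionary used as a cache are assumptions made for illustration only.

```python
def cache_key(url: str, request_headers: dict, vary: str):
    """Build a cache key from the URL and the request headers named in Vary."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # A missing header is treated as an empty value.
    return (url,) + tuple((n, lowered.get(n, "")) for n in sorted(names))


cache = {}
vary = "Accept-Language"

# First request: the Japanese version is fetched from the origin and stored.
key_ja = cache_key("/index.html", {"Accept-Language": "ja"}, vary)
cache[key_ja] = "<html>Japanese page</html>"

# Same URL, same Accept-Language: served from the cache.
hit = cache_key("/index.html", {"accept-language": "ja"}, vary) in cache
# Same URL, different Accept-Language: must go back to the origin server.
miss = cache_key("/index.html", {"Accept-Language": "en"}, vary) in cache
print(hit, miss)
```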

7.4.9 WWW-Authenticate (WWW-Authenticate: Basic realm="Usagidesign Auth")

Used for HTTP access authentication. It tells the client which authentication scheme applies to the resource identified by the request URI. It is always returned together with a 401 Unauthorized response. In the example above, realm="Usagidesign Auth" is a string that identifies the protected area the resource belongs to.

401 Unauthorized: the client must authenticate to access the requested resource. The response must contain a WWW-Authenticate header field with a challenge. If the request already included credentials, the 401 means those credentials were not accepted. If the 401 carries the same challenge as the previous response and the browser has already tried to authenticate at least once, the browser should show the user the entity contained in the response, since it may include diagnostic information.

7.5 Payload header fields

Since the newer HTTP 2.0 era specifications have dropped the concept of the entity header, these fields are called payload header fields here. A payload header describes the entity content carried in a request or response message, much like the description of the goods on a courier waybill.

7.5.1 Allow (Allow: GET, HEAD)

Tells the client all the HTTP methods that the resource identified by the request URI supports. When the server receives a method it does not support, it returns a 405 Method Not Allowed response.

405 Method Not Allowed: the server received and recognized the request but rejects that particular request method. The response must include an Allow header field listing the request methods that the resource accepts.

Request methods that modify data on the server, such as PUT and DELETE, are often not allowed.

7.5.2 Content-Encoding (Content-Encoding: gzip)

Tells the client which content coding the server applied to the entity body. Content coding means compressing the entity without losing any of its information.

The main content codings are:

  • gzip
  • compress
  • deflate
  • identity

7.5.3 Content-Language (Content-Language: zh-CN)

Tells the client the natural language used in the entity body.

7.5.4 Content-Length (Content-Length: 15000)

Gives the size of the entity body in bytes. Content-Length must not be used when a transfer coding (such as chunked) is applied to the entity body.

Refer to https://tools.ietf.org/html/rfc7231 (4.4) to understand how Content-Length is calculated for each encoding.

7.5.5 Content-Location (Content-Location: http://www.hackr.jp/index-ja.html)

Gives the URI that corresponds to the message payload. Unlike Location, it identifies the resource that is actually returned in the message body.

For example, when server-driven negotiation based on Accept-Language returns a page whose URI differs from the one actually requested, that URI is written in this field.

7.5.6 Content-MD5 (Content-MD5: OGFkZDUwNGVhNGY3N2MxMDIwZmQ4NTBmY2IyTY==)

The sender computes an MD5 digest of the message payload and sends it in this field so that the recipient can check that the payload arrived intact.

The binary MD5 digest is Base64-encoded before it is written to the field, because HTTP header fields cannot carry binary data. To verify the payload, the recipient runs the same MD5 algorithm over the body it received and compares the result with the field value. MD5 is a one-way hash, so nothing is decrypted.

Note, however, that this check can only detect accidental changes. If the content is tampered with, the Content-MD5 value can be recalculated and replaced as well, and the recipient has no way to notice, so the field offers no real security guarantee.
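The computation described above fits in a few lines of Python using the standard hashlib and base64 modules. The function names content_md5 and verify_content_md5 are hypothetical, and the sample body is made up for illustration.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the binary (16-byte) MD5 digest of the body."""
    digest = hashlib.md5(body).digest()  # binary digest, not the hex string
    return base64.b64encode(digest).decode("ascii")


def verify_content_md5(body: bytes, header_value: str) -> bool:
    """Recompute the digest on the receiving side and compare."""
    return content_md5(body) == header_value


sent_body = b"<html>hello</html>"
header_value = content_md5(sent_body)
print("Content-MD5:", header_value)

# An accidental change in transit is detected ...
print(verify_content_md5(b"<html>hellO</html>", header_value))
# ... but an attacker who rewrites the body can rewrite the header as well.
tampered = b"<html>evil</html>"
print(verify_content_md5(tampered, content_md5(tampered)))
```

The last line shows why the field gives no protection against deliberate tampering: a recomputed digest always matches.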

7.5.7 Content-Range (Content-Range: bytes 5001-10000/10000)

In a response to a range request, tells the client which part of the entity the returned payload corresponds to. The value is in bytes and gives the range being sent and the size of the entire entity.

7.5.8 Content-Type (Content-Type: text/html; charset=UTF-8)

Describes the media type of the object in the entity body. As with the Accept header field, the value is written in the form type/subtype.

The charset parameter names the character set, for example iso-8859-1 or euc-jp.

7.5.9 Expires

The Expires header field tells the client the date and time at which the resource expires. If the origin server does not want the resource to be cached, Expires should be set to the same value as the Date header field.

Note that when Cache-Control specifies the max-age directive, max-age takes priority over the Expires header field.

7.5.10 Last-Modified (Last-Modified: Wed, 23 May 2012 09:59:55 GMT)

Indicates when the resource was last modified, which is normally the time at which the resource identified by the Request-URI was changed. When the content is generated dynamically, for example by a CGI script, this time may be changed to the time at which the underlying data was last updated.

7.6 Header fields for cookies

Although cookies are not part of the HTTP/1.1 specification, they are widely used on the web. Their basic job is state management: keeping track of a user's visit. Writing a small amount of data to the client also simplifies what the user has to do on the next visit and takes some load off the server.

7.6.1 Cookie (Cookie: status=enable)

With this header field the client tells the server that it wants HTTP state management: the request carries the cookies previously received from the server. When there are several cookies, they can all be sent in the same request.

For properly issued cookies, the validity period, the sender's domain and path, and the protocol can be checked, so cookie data is relatively safe from outside attacks.

A word on the history of cookies. Cookies were originally developed and specified by Netscape, and the following specifications appeared afterwards:

  • Netscape standard (the de facto standard)

    Released around 1994. The cookies in common use today are essentially this model. The Netscape standard was set out in a five-page document written by a 24-year-old expert. I could not find a link to the original specification; RFC6265 gives some clues about it.

  • RFC2109 (Troublemaker No. 1)

    Surprisingly, this is a standard released by W3C. It was meant to be compatible with the Netscape standard (in practice, to replace it). However, the standard was too strict and many implementers got it wrong, so in the end everyone went back to the Netscape standard.

  • RFC2965 (Troublemaker No. 2)

    RFC2965 defines Cookie2 and tries to fix the shortcomings of RFC2109. It was intended to replace RFC2109.

    Servers that send RFC2965 cookies use the Set-Cookie2 header in addition to the Set-Cookie header. Note that RFC2965 cookies are very sensitive to the port.

    RFC2965 used to be available at http://www.w3.org/Protocols/rfc2965/rfc2965.txt, but that copy, part of W3C's embarrassing past, has been deleted.

    It can still be read at: RFC 2965 – HTTP State Management Mechanism (ietf.org)

    Unfortunately, W3C did not succeed this time either, because hardly any servers adopted it.

  • RFC6265: W3C finally gave up competing over the standard. RFC6265 redefines the standard on the basis of the Netscape standard and has become the industry's de facto standard. (It inherits from the big brother and unifies everything.)

    In practice, though, what is implemented is still the Netscape standard rather than any RFC.

    Seen this way, RFC6265 is a standard that was implemented first and documented afterwards. It is less a new standard than a publicly recognized written specification of existing practice: the difference between an oral agreement and one set down in black and white.

    RFC 6265 – HTTP State Management Mechanism (ietf.org)

    Aside: only standards that meet the market's needs are accepted by the public. Even an organization as large as W3C cannot shake a standard that is already recognized.

Finally, thanks to the IETF, which can fairly be called the library of the Internet, or a cornerstone of its development. Along the way I also dug up some of the RFC history that W3C has quietly dropped, haha.

(The IETF is organized spontaneously by its participants and manages itself. Anyone can take part, it is democratic and equal, it has no voting mechanism, and it embodies the spirit of freedom, openness, cooperation, and sharing.)

The Cookie-related header fields are Set-Cookie (a response header, see 7.6.2) and Cookie (a request header, see 7.6.4).

7.6.2 Set-Cookie

Set-Cookie is the response header with which the server prepares the client to start using cookies. The basic format is as follows:

Set-Cookie: status=enable; expires=Tue, 05 Jul 2011 07:26:31 

The basic field attributes are as follows:

expires attribute: The validity period of the cookie. If it is omitted, the cookie lasts only for the browser session, that is, until the browser is closed. Note also that once the server has sent a cookie to the client, the server has no way to delete it explicitly. It can only overwrite the client's copy, for example with a cookie that has already expired.

path attribute: Restricts the cookie to the specified directory on the server. There are ways to bypass this restriction, so this attribute should not be treated as a security mechanism.

domain attribute: The domain is matched by suffix, so a cookie set for a domain is also sent to its subdomains. In effect the attribute works like a whitelist that lets multiple domains receive the cookie, so it is safer not to specify it unless that is really needed.

secure attribute (Set-Cookie: name=value; secure): Restricts the cookie so that it is sent only over HTTPS connections. The cookie is sent when the domain is accessed over HTTPS, but not when the same domain is accessed over plain HTTP. If the attribute is omitted, the cookie is sent back over both HTTP and HTTPS.
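As a rough illustration of how these attributes end up in the header, here is a minimal Python sketch using the standard library's http.cookies module. The helper name build_set_cookie and the value "deleted" are assumptions made for this example; the expires date is the one from the example above.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, **attrs):
    """Return the value of a Set-Cookie header for a single cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    for key, attr_value in attrs.items():
        # Attribute names use hyphens ("max-age"); keyword arguments cannot.
        cookie[name][key.replace("_", "-")] = attr_value
    return cookie[name].OutputString()


# Session cookie: no expires attribute, so it lasts until the browser closes.
session_cookie = build_set_cookie("status", "enable", path="/")

# Persistent cookie that is only sent over HTTPS.
persistent_cookie = build_set_cookie(
    "status", "enable",
    expires="Tue, 05 Jul 2011 07:26:31 GMT",
    path="/",
    secure=True,
)

# The server cannot delete a cookie directly; it overwrites it with one
# that expires immediately.
expired_cookie = build_set_cookie("status", "deleted", path="/", max_age=0)

for header_value in (session_cookie, persistent_cookie, expired_cookie):
    print("Set-Cookie:", header_value)
```

The module decides the order and capitalization of the attributes in the output, and attribute names are case-insensitive to the browser.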

7.6.3 HttpOnly attribute

Introduction: HttpOnly is an extension to Cookie. It prevents JavaScript from reading the cookie, which stops cookie theft through cross-site scripting (XSS).

Declaration method:

Set-Cookie: name=value; HttpOnly

With this declaration, JavaScript's document.cookie can no longer read the content of a cookie that carries the HttpOnly attribute.

In fact, the HttpOnly extension was not invented to prevent XSS attacks, but it later became widely used as an important means of mitigating them.
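To check whether a server really sets these protective flags, a small helper like the following can inspect a Set-Cookie value. This is only a sketch: the function name cookie_flags is made up for this example, and it does a plain string split rather than full cookie parsing.

```python
def cookie_flags(set_cookie_value):
    """Report which protective flags appear in a Set-Cookie header value."""
    # Attributes follow the first "name=value" pair, separated by ";".
    attributes = [
        part.strip().lower() for part in set_cookie_value.split(";")[1:]
    ]
    return {
        "httponly": "httponly" in attributes,
        "secure": "secure" in attributes,
    }


flags = cookie_flags("name=value; HttpOnly")
print(flags)
```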

An XSS attack can be delivered through a crafted URL similar to the following (excerpt):

http://example.jp/login?ID="> <script>var+f=document.getElementById("login"); +f.action="h </script><span+s="

7.6.4 Cookie(Cookie: status=enable)

The Cookie header field tells the server that the client wants HTTP state management support: the client includes in its request the cookies it previously received from the server. Multiple cookies can be sent in one header.
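On the server side, the Cookie header has to be split back into name/value pairs. A minimal Python sketch with the standard http.cookies module might look like this; the helper name parse_cookie_header and the second cookie (lang=ja) are assumptions added for illustration.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value):
    """Turn the value of a Cookie request header into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


# Several cookies arrive in one header, separated by "; ".
cookies = parse_cookie_header("status=enable; lang=ja")
print(cookies)
```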

7.7 Other header fields

The other header fields exist because HTTP is open to extension. They are non-standard fields whose behavior is decided by the implementer rather than by a web standard, yet they are used quite frequently.

7.7.1 X-Frame-Options

This field is a response header. It controls whether the page may be displayed inside a frame tag on another page, and its main purpose is to prevent clickjacking attacks.

The following two options are available:

  • DENY: The page may not be displayed in a frame at all.
  • SAMEORIGIN: The page may be displayed in a frame only by pages of the same origin.

Mainstream browsers already support this field. The following is a reference configuration for Apache:

<IfModule mod_headers.c>
Header append X-FRAME-OPTIONS "SAMEORIGIN"
</IfModule>
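If the header has to be added in application code instead of the web server configuration, the same idea can be sketched as a small WSGI wrapper in Python. The names add_frame_options and hello_app are hypothetical and exist only for this example. Unlike Apache's append, the wrapper replaces any existing value so the header appears once.

```python
def add_frame_options(app, policy="SAMEORIGIN"):
    """Wrap a WSGI app so every response carries X-Frame-Options."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing value so the header appears exactly once.
            kept = [(k, v) for k, v in headers
                    if k.lower() != "x-frame-options"]
            kept.append(("X-Frame-Options", policy))
            return start_response(status, kept, exc_info)
        return app(environ, start_with_header)
    return wrapped


def hello_app(environ, start_response):
    """Hypothetical application used only to exercise the wrapper."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>hello</p>"]


protected_app = add_frame_options(hello_app)
```

protected_app can be passed to any WSGI server in place of hello_app.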

7.7.2 X-XSS-Protection(X-XSS-Protection: 1)

The X-XSS-Protection header field is an HTTP response header. Its main function is to switch the browser's XSS protection mechanism on or off.

Syntax:

X-XSS-Protection: 0
X-XSS-Protection: 1
X-XSS-Protection: 1; mode=block
X-XSS-Protection: 1; report=<reporting-uri>

Meaning of the values:

  • 0: Disable XSS filtering.
  • 1: Enable XSS filtering (usually the browser default). If a cross-site scripting attack is detected, the browser sanitizes the page by removing the unsafe parts.
  • 1; mode=block: Enable XSS filtering. If an attack is detected, the browser blocks the page from loading instead of sanitizing it.
  • 1; report=<reporting-URI> (Chromium only): Enable XSS filtering. If a cross-site scripting attack is detected, the browser sanitizes the page and sends a violation report, using the functionality of the CSP report-uri directive.
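The four forms above can be told apart with a few lines of Python. This is only an illustrative sketch: the function name parse_xss_protection and the shape of the returned dict are assumptions, not part of any browser or library API.

```python
def parse_xss_protection(value):
    """Split an X-XSS-Protection value into its switch and directives."""
    parts = [part.strip() for part in value.split(";")]
    result = {"enabled": parts[0] == "1", "block": False, "report": None}
    for directive in parts[1:]:
        key, _, argument = directive.partition("=")
        key = key.strip().lower()
        if key == "mode" and argument.strip().lower() == "block":
            result["block"] = True
        elif key == "report":
            result["report"] = argument.strip()
    return result


print(parse_xss_protection("1; mode=block"))
```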

7.7.3 DNT

DNT is an HTTP request header. The name is short for Do Not Track, and it is mainly used to keep advertisers from collecting personal information.

The values that can be specified in the DNT header field are as follows.

  • 0: Agree to be tracked
  • 1: Refuse to be tracked

A useful Chrome extension worth recommending here is “uBlock Origin”. Its icon resembles a small red shield.

Its best feature is that you can pick HTML elements directly on the page to remove advertising content, which is very easy to use.

7.7.4 P3P

P3P (The Platform for Privacy Preferences) is a technology that uses this header to express a website's handling of private information in a form that only programs can interpret, with the aim of protecting user privacy.

The steps to set up P3P are as follows:

Step 1: Create the P3P privacy policy.

Step 2: After creating the P3P privacy policy reference file, save it as /w3c/p3p.xml.

Step 3: Create a compact policy from the P3P privacy policy and output it in the HTTP response.

Regarding P3P, you can continue to read the following:

The Platform for Privacy Preferences 1.0 (P3P1.0) Specification: http://www.w3.org/TR/P3P/

Deprecation of the X- prefix: the X- prefix was used to mark non-standard parameters and to distinguish them from standard ones. In practice it not only caused naming confusion but could also interfere with normal communication, so “RFC 6648 – Deprecating the "X-" Prefix and Similar Constructs in Application Protocols” later deprecated this usage.

7-2. HTTP Collaboration Server

7.1 Single virtual host with multiple domain names

HTTP/1.1 allows one server to host multiple sites, which is what makes web hosting services possible. Domain names are mapped to IP addresses through DNS, so a domain name must be resolved before the site can be accessed. By the time the request reaches the server, it is addressed by IP. When several domain names share one IP address, the server relies on the Host header field to tell which site is being requested.
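How a server might pick the site from the Host header can be sketched in a few lines of Python. The SITES table, its directory paths, and the function name select_site are all hypothetical. The sketch also ignores IPv6 literals in the Host value.

```python
# Hypothetical mapping from hostname to document root on one server.
SITES = {
    "hackr.jp": "/var/www/hackr",
    "example.jp": "/var/www/example",
}


def select_site(headers, sites=SITES):
    """Pick the document root for a request based on its Host header."""
    host = headers.get("Host", "")
    # The Host value may carry a port ("example.jp:8080"); hostnames are
    # case-insensitive.
    hostname = host.split(":", 1)[0].strip().lower()
    return sites.get(hostname)


print(select_site({"Host": "hackr.jp"}))
```

Both hostnames resolve to the same IP address, so the Host header is the only thing that distinguishes the two requests.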

7.2 Communication Forwarding Procedure

There are three technical terms for programs that forward communication: proxy, gateway, and tunnel. Their concepts are distinguished one by one below.

Proxy: The proxy acts as a “middleman” between the server and the client. Its basic behavior is to receive the request sent by the client and forward it to another server. A proxy is usually used to speed up access to the target site or as a springboard.

Gateway: A server that forwards the communication data of other servers. It works like a messenger that passes the “words” of one server on to another. When it receives a request from a client, it handles the request as if it were the origin server that owns the resource, so the client may not even notice that it is talking to a gateway.

Tunnel: An application that relays communication between a client and a server that are far apart.

7.2.1 Proxy

The main change a proxy makes to a message is in the Via header. Every proxy that forwards the message must add its own forwarding information to the Via header.
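The Via bookkeeping can be sketched in Python as follows. The function name add_via and the proxy names are assumptions for illustration. Each Via entry is written as the protocol version followed by the proxy's name, and entries are joined with commas in the order the message passed through the proxies.

```python
def add_via(headers, proxy_name, http_version="1.1"):
    """Return a copy of the headers with this proxy appended to Via."""
    entry = f"{http_version} {proxy_name}"
    forwarded = dict(headers)
    existing = forwarded.get("Via")
    # Later proxies append to the list instead of replacing it.
    forwarded["Via"] = f"{existing}, {entry}" if existing else entry
    return forwarded


request = {"Host": "hackr.jp"}
after_first_proxy = add_via(request, "proxy-a.example")
after_second_proxy = add_via(after_first_proxy, "proxy-b.example")
print(after_second_proxy["Via"])
```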

Proxies are classified by whether they modify the message and whether they cache data, which gives the terms transparent proxy and caching proxy:

  • transparent proxy: A proxy that forwards requests and responses without modifying the messages in any way. A proxy that does modify them is called a non-transparent proxy.
  • caching proxy: A caching proxy keeps a copy of the resource on the proxy server when it forwards the response to the client. When the same resource is requested again, it can return the cached copy instead of fetching it from the origin server.

7.2.2 Cache server

The role of the cache server is to reduce the load on the origin server. With a cache, the same resource does not have to be returned by the origin server over and over again, because clients can get it directly from the cache server. This topic is described in detail in the book “How the Network is Connected”.
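The idea can be sketched in Python as a tiny in-memory cache in front of an origin fetch function. The class name CachingProxy, the fixed max_age of 60 seconds, and the fake_origin function are assumptions for illustration. A real cache server decides freshness from headers such as Cache-Control and Expires.

```python
import time


class CachingProxy:
    """Serve repeated requests from memory while the copy is still fresh."""

    def __init__(self, fetch_from_origin, max_age=60, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._max_age = max_age
        self._clock = clock
        self._cache = {}  # url -> (time stored, body)

    def get(self, url):
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # fresh copy: the origin is not contacted
        body = self._fetch(url)
        self._cache[url] = (now, body)
        return body


origin_hits = []


def fake_origin(url):
    """Stand-in for the origin server; records every request it receives."""
    origin_hits.append(url)
    return f"body of {url}"


proxy = CachingProxy(fake_origin, max_age=60)
proxy.get("/index.html")
proxy.get("/index.html")  # served from the cache
print(len(origin_hits))
```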

7.2.3 Tunnel

A tunnel establishes a communication line to another server on request, and the two ends then communicate over it using encryption such as SSL.

Protocols that came before HTTP

  • FTP: It is older than the TCP/IP protocol family. Although HTTP has overtaken it, it is still widely used for uploading files.
  • NNTP (Network News Transfer Protocol): The protocol used to transfer messages in the NetNews electronic conference rooms.
  • Archie: A protocol for searching the file information made public through anonymous FTP.
  • WAIS (Wide Area Information Servers): A protocol used to search multiple databases through keywords.
  • Gopher: A protocol for finding information in computers connected to the Internet.