Easy to understand HTTP caching policy

Time: 2021-9-1

In my last article, a source code analysis of koa-static, I noted that it sets some HTTP cache headers, such as Cache-Control, on the static files it returns. That led me to discuss HTTP caching policy with a friend:

My friend said: "HTTP has too many headers for controlling the cache: Cache-Control, ETag, Last-Modified, and plenty more. They are messy, and the logical relationship between them is not strong. It feels like something we just have to memorize as basic knowledge!"

I was a little surprised: "Why memorize this? Every technology exists to solve a problem. If you don't understand the problem, learning the technology by rote is boring and doesn't stick. HTTP caching policy exists to solve one problem: information asymmetry between the client and the server. The client caches some resources to speed things up, but on the next request the client does not know whether the resource has been updated, and the server does not know which version the client has cached, so it cannot tell whether it needs to send the resource again. It is really an information synchronization problem, and HTTP caching policy is the solution. If we step outside purely technical thinking, we find that this kind of synchronization problem is very common in everyday life, and the ways we solve it there follow the same ideas. Seen from that angle, the topic is easy to understand!"

So I told him a story about renting discs to watch Ultraman when I was a child.

Renting discs to watch Ultraman

Here is the story. When I was a child I loved watching cartoons, especially Ultraman, but back then we had no computer and no internet. All I had was a DVD player, so I often went to the disc rental shop to rent Ultraman.

ETag

One day I finished episode 10 of Ultraman Ace and wanted to keep watching. So I went to the owner of the rental shop: "Boss, I've finished episode 10. Do you have anything new?" The boss said, "Yes, episode 11 just came out. Take it!"

This simple exchange actually contains one HTTP caching technique: ETag! In terms of a network request, I am the client and the rental shop is the server. Renting a disc is the equivalent of sending a request. But when I walked in, the boss did not know which episode I had seen; our information was out of sync. So I gave him a tag, here "episode 10". The boss took this tag and compared it with the tag of the latest item in his stock. He found that his latest tag was "episode 11", so he knew there was an update and handed me episode 11.

Last-Modified & If-Modified-Since

Later I finished Ultraman Ace and started watching Ultraman Taro. This time the boss was being sneaky. He had not bought genuine copies of Ultraman Taro; he burned the discs himself. He did not know which episode each one was, but he cleverly wrote the date on the disc. So the disc I had came without a cover, just a bare date: December 1, 2000. When I finished it, I went back to the boss: "Boss, I've finished your December 1, 2000 disc. Do you have anything new?" Here, December 1, 2000 marks the date my copy was last updated, which corresponds to another HTTP caching technique: Last-Modified and If-Modified-Since. You can imagine that the boss labeled the date Last-Modified, so the full text on the disc read "Last-Modified: December 1, 2000", and what I asked was, "Do you have any updates, If-Modified-Since December 1, 2000?"

Expires and max-age

Next, I finished Ultraman Taro and started watching Ultraman Leo. Ultraman Leo was different from the previous two. When I went to rent it, the boss said, "Don't come in every day for Ultraman Leo. I restock once a week, so just come and get it every Monday!" This sentence also corresponds to HTTP caching techniques: Expires and max-age. I know that what I have is the latest until next Monday, and that it expires (Expires) next Monday. So the statement "what I have is the latest" has a lifetime; its age is limited. Its maximum age (max-age) is next Monday's update time minus the current time.

Immutable

One more. I finished Ultraman Leo and started watching Ultraman NEXTER. This one was different again. When I went to rent it, the boss said, "Kid, you're lucky this time. Ultraman NEXTER has finished its run. Take the whole set, and you won't have to ask every day!" Which HTTP caching technique does this sentence correspond to? immutable, of course. immutable means exactly what it says: it cannot change. Just like Ultraman NEXTER, the series is finished, so there is no need to ask for updates.

Getting down to business

That's enough chit-chat. Let's get down to business! The point of the example is to show that the problems HTTP caching solves are very common in everyday life, and that starting from these familiar scenarios makes them easier to understand. Now let's look at HTTP caching seriously:

Two mechanisms

As the small examples above show, sometimes I have to ask the boss to find out whether there is an update. In the first example: "Boss, I've finished episode 10. Do you have anything new?" When the client must communicate with the server to learn whether there is an update, we call it the negotiated cache. In other scenarios, I can tell whether there is an update without asking. In the third example, I know the series is updated weekly, so I do not ask before Monday and only ask again on Monday. Using the local copy directly, without negotiating with the server, is called the forced cache. In technical terms, the forced cache uses the local copy without sending a request, while the negotiated cache sends a request to ask the server whether the resource has been updated. Let's look at the two in detail:

Negotiated cache

The first and second examples have to ask the server every time, so they are negotiated caching.

ETag and If-None-Match

ETag is short for Entity Tag. It is an identifier for a URL resource, similar to a file's MD5 and calculated in a similar way. When the server responds, it can compute a hash of the returned content, or use a numeric version number like our "episode 10"; the exact value depends on the server's calculation strategy. The server then adds it to the response headers, where it may look like this:

ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

When the client receives this ETag, it saves it together with the response body. On the next request, it puts the value into the matching request header, If-None-Match, which may look like this:

If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

The server then compares the If-None-Match value from the request with the ETag of the current version:

  1. If they are the same, it returns 304 (Not Modified) with headers only and no body, telling the browser to use its cache directly.
  2. If they differ, it returns 200 and the latest content.
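The two branches above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular server implements it: `make_etag` and `respond` are hypothetical helper names, SHA-1 of the body is just one possible way to compute the tag, and weak tags (the `W/` prefix) are ignored for simplicity.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from the body. SHA-1 is an illustrative choice, not a requirement."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a GET request (hypothetical helper)."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may list several tags separated by commas, or be "*".
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in client_tags or etag in client_tags:
            # Same version: headers only, no body.
            return 304, headers, b""
    # No tag sent, or the tag is outdated: send the latest content.
    return 200, headers, body
```

Calling `respond(b"episode 11")` gives a 200 with an ETag header; sending that same tag back as `if_none_match` gives a 304 with an empty body.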

ETag has another, less commonly used request header: If-Match. Its meaning is the opposite of If-None-Match. The semantics of If-None-Match are "download if it does not match", while If-Match is usually used in POST or PUT requests, with the semantics "submit if it matches". For example, while you are editing a product, someone else may be editing it too, and may submit before you do. By then the ETag on the server has changed, so your If-Match no longer holds and the server returns a 412 error, Precondition Failed. If If-Match holds, the server processes the request and returns 200 as normal.
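The product-editing scenario can be sketched as follows. The `Product` class and `put_product` function are made up for this illustration, and the ETag is again assumed to be a SHA-1 hash of the content.

```python
import hashlib
from typing import Optional


class Product:
    """Hypothetical resource whose ETag is derived from its current content."""

    def __init__(self, content: str):
        self.content = content

    @property
    def etag(self) -> str:
        return '"' + hashlib.sha1(self.content.encode("utf-8")).hexdigest() + '"'


def put_product(product: Product, new_content: str, if_match: Optional[str]) -> int:
    """Apply the edit only if the client's ETag still matches; return the status code."""
    if if_match is not None and if_match != "*" and if_match != product.etag:
        # Someone else changed the product after this client loaded it.
        return 412
    product.content = new_content
    return 200
```

If two editors load the same version and both submit with the ETag they saw, the first gets 200 and the second gets 412, so the second edit cannot silently overwrite the first.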

Last-Modified & If-Modified-Since

Last-Modified and If-Modified-Since are also used as a pair, in the same way as ETag and If-None-Match. The difference is that ETag carries a version number or hash, while Last-Modified carries the time the resource was last modified. Last-Modified goes in the response headers and may look like this:

Last-Modified: Sat, 21 Oct 2000 07:28:00 GMT

On later requests, the browser puts the matching If-Modified-Since into the request headers, which looks like this:

If-Modified-Since: Sat, 21 Oct 2000 07:28:00 GMT

After receiving this header, the server will compare it with the modification time of the current version:

  1. The current version was modified later than this time, meaning it has changed since then: return 200 and the new content.
  2. The current version's modification time is the same as this time, meaning there is no update: return 304 with headers only and no body, and the client uses its cache directly.
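Here is a rough sketch of that comparison using only the Python standard library. The function name `respond_if_modified_since` is hypothetical, and the modification time is assumed to be a timezone-aware UTC datetime. Because HTTP dates carry whole seconds only, microseconds are dropped before comparing.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond_if_modified_since(last_modified: datetime, body: bytes,
                              if_modified_since: Optional[str] = None):
    """Return (status, headers, body), honoring If-Modified-Since (hypothetical helper)."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            since = None  # unparseable date: behave as if the header were absent
        # HTTP dates have one-second precision, so drop microseconds first.
        if since is not None and last_modified.replace(microsecond=0) <= since:
            return 304, headers, b""
    return 200, headers, body
```

A client that echoes back the Last-Modified value it received gets a 304; a client holding an older date gets a 200 with the new content.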

The counterpart of If-Modified-Since is If-Unmodified-Since. If-Modified-Since can be read as "download only if there has been an update", so If-Unmodified-Since is "download only if there has been no update". If the client sends If-Unmodified-Since, like this:

If-Unmodified-Since: Sat, 21 Oct 2000 07:28:00 GMT

After the server gets this header, it will also compare with the modification time of the current version:

  1. If there has been no update since this time, the server returns 200 along with the content.
  2. If there has been an update since this time, the "if" condition does not hold, and the server returns the error code 412 (Precondition Failed).
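The precondition check is the mirror image of the previous one. A small sketch, with the hypothetical name `check_if_unmodified_since` and the same assumption of a timezone-aware UTC modification time:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def check_if_unmodified_since(last_modified: datetime, header_value: str) -> int:
    """Return 200 if the resource is unchanged since the given date, else 412."""
    since = parsedate_to_datetime(header_value)
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    if last_modified.replace(microsecond=0) > since:
        return 412  # modified after the date the client named: precondition failed
    return 200
```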

ETag and Last-Modified priority

ETag and Last-Modified are both negotiated caching and both require the server to calculate and compare. If both are present, which one wins? The answer is ETag: ETag has higher priority than Last-Modified. The reason is a limitation in the design of Last-Modified: its precision is only one second. If a resource changes frequently, several times within the same second, Last-Modified cannot tell the versions apart. ETag, however, gets a new value on every modification, so it is more precise than Last-Modified. But ETag is not free. If your ETag is designed as a hash, it is calculated on each request, which costs extra server resources. Which one to use depends on your own project.
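To make the priority concrete, here is a sketch of a server-side check that looks at If-None-Match first and only falls back to If-Modified-Since when no tag was sent. The function name `is_not_modified` and the dictionary of request headers are assumptions for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(request_headers: dict, etag: str, last_modified: datetime) -> bool:
    """Decide whether a 304 can be sent. If-None-Match wins when both headers are present."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        tags = [tag.strip() for tag in if_none_match.split(",")]
        # If-Modified-Since is deliberately not consulted in this branch.
        return "*" in tags or etag in tags
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        return last_modified.replace(microsecond=0) <= since
    return False
```

Suppose a resource is modified twice within the same second. Its Last-Modified value stays the same but its ETag changes. A client that sends both headers is correctly told the resource has changed, while a client that sends only the date would wrongly be told to keep its stale copy.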

Forced cache

The third and fourth examples in the chit-chat above are forced caching: I know that for a certain period I do not need to ask the server at all and can use the cache directly. Of the techniques in those two examples, Expires is a header of its own, while max-age and immutable are directives of the Cache-Control header.

Expires

Expires is relatively simple. The server includes this field in the response headers:

Expires: Sat, 21 Oct 2000 07:28:00 GMT

Until that time, the browser does not send a request for the resource and uses the cached copy directly.

Cache-Control

Cache-Control is more complex and has many directives that can be set. max-age is just one of them, and it looks like this:

Cache-Control: max-age=20000

This means that for the next 20000 seconds the resource does not need to be requested again; the browser just uses the cache.

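The immutable directive mentioned above also belongs to Cache-Control, but at the time of writing it is still marked experimental and browser support is uneven. Setting Cache-Control: immutable tells the browser that the resource will not change, so it can keep using the cached copy without asking again. Strictly speaking, this holds for as long as the cached copy is fresh, which is why immutable is normally paired with a long max-age.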

Other common directives are:

no-cache: the cached copy must be validated with the server before each use (negotiated cache validation).

no-store: nothing from the client request or the server response is stored, that is, no cache is used at all.

Cache-Control has many more directives; see the MDN documentation.
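As a rough illustration of how a client might read these directives, the sketch below parses a Cache-Control value and decides whether a cached copy may be used without contacting the server. `parse_cache_control` and `is_fresh` are hypothetical names, and the parsing is deliberately simplified (for instance, it does not handle commas inside quoted values).

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or True}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """True if a copy of the given age can be used without asking the server."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        # no-store: nothing should have been cached; no-cache: always validate first.
        return False
    max_age = directives.get("max-age")
    if max_age is None or max_age is True:
        return False  # no usable max-age: this sketch does not guess a lifetime
    try:
        return age_seconds < int(max_age)
    except ValueError:
        return False
```

With `max-age=20000`, a copy that is 100 seconds old is fresh, while one that is 20000 seconds old is not.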

Priority of Expires and Cache-Control

One sentence is enough: if the Cache-Control response header sets the max-age or s-maxage directive, the Expires header is ignored.
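The rule can be sketched as a function that computes how long a response stays fresh. This is a simplified model of a private (browser-style) cache, so it only looks at max-age and leaves out s-maxage, which applies to shared caches. The name `freshness_lifetime` and the plain dictionary of headers are assumptions for the example.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict):
    """Return the freshness lifetime in seconds, or None if the response gives none."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age is present, so Expires is not even looked at.
        return int(match.group(1))
    expires = headers.get("Expires")
    date = headers.get("Date")
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            return 0  # an invalid date is treated as already expired
        return max(0, int(delta.total_seconds()))
    return None
```

For a response whose Expires is one hour after its Date, the lifetime is 3600 seconds; as soon as the same response also carries `max-age=20000`, the lifetime becomes 20000 seconds.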

Priority of negotiated cache and forced cache

This one is easy to understand. The negotiated cache has to send a request to negotiate with the server, whereas if the forced cache is in effect, no request is sent at all. So the priority is: check the forced cache first. If it is in effect, use the cache directly. If it has expired, send a request and negotiate with the server to decide whether the cached copy can still be used.
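The whole flow can be simulated with a toy client and server. Everything here is made up for illustration: `FakeServer` stands in for the rental shop, `CachingClient` for me, and time is passed in explicitly as a number of seconds so the example stays deterministic.

```python
import hashlib


class FakeServer:
    """Hypothetical origin server that counts requests and supports ETag validation."""

    def __init__(self, body: bytes, max_age: int):
        self.body = body
        self.max_age = max_age
        self.requests = 0

    def get(self, if_none_match=None):
        self.requests += 1
        etag = '"' + hashlib.sha1(self.body).hexdigest() + '"'
        headers = {"ETag": etag, "Cache-Control": "max-age=%d" % self.max_age}
        if if_none_match == etag:
            return 304, headers, b""
        return 200, headers, self.body


class CachingClient:
    """Hypothetical client: forced cache first, negotiated cache second."""

    def __init__(self, server: FakeServer):
        self.server = server
        self.entry = None  # (body, etag, stored_at, max_age)

    def fetch(self, now: float):
        if self.entry is not None:
            body, etag, stored_at, max_age = self.entry
            # Step 1: forced cache. Still fresh, so no request is sent at all.
            if now - stored_at < max_age:
                return body, "cache"
            # Step 2: negotiated cache. Ask the server whether our copy is current.
            status, headers, new_body = self.server.get(if_none_match=etag)
            if status == 304:
                new_max_age = int(headers["Cache-Control"].split("=")[1])
                self.entry = (body, etag, now, new_max_age)
                return body, "revalidated"
        else:
            status, headers, new_body = self.server.get()
        # First visit or changed content: store the new copy.
        new_max_age = int(headers["Cache-Control"].split("=")[1])
        self.entry = (new_body, headers["ETag"], now, new_max_age)
        return new_body, "network"
```

With a max-age of 60 seconds, a fetch at 0 seconds hits the network, a fetch at 30 seconds is served from the cache with no request, and a fetch at 100 seconds sends a conditional request that comes back as 304 when the content is unchanged.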

Summary

Starting from familiar everyday scenes, this article has explained that HTTP caching is a mechanism for speeding up access and solving an information synchronization problem. This kind of out-of-sync information is very common in everyday life, and many of the solutions carry over. With that mindset, the HTTP caching mechanism is easy to understand. Its key points are:

  1. The HTTP caching mechanism falls into two categories: forced cache and negotiated cache.
  2. Forced cache means not asking (sending no request) and using the cache directly.
  3. The common techniques for forced cache are Expires and Cache-Control.
  4. The value of Expires is a point in time: the cache is valid until then, and no request is needed.
  5. Cache-Control has many directives. The common max-age sets how long the cache stays valid, in seconds; no request is needed within that period.
  6. immutable is also a Cache-Control directive. It indicates that the resource never needs to be requested again, but browser support is poor. For other Cache-Control directives, see the MDN documentation.
  7. max-age in Cache-Control takes priority over Expires.
  8. The common techniques for negotiated cache are ETag and Last-Modified.
  9. ETag is a hash or version number computed for the resource; the matching request header is usually If-None-Match.
  10. Last-Modified is the time the resource was last modified; the matching request header is usually If-Modified-Since. Its precision is one second.
  11. ETag changes on every modification while Last-Modified is only precise to the second, so ETag is more accurate and has higher priority, but it has to be calculated, so the server overhead is greater.
  12. If forced cache and negotiated cache are both present, check the forced cache first. If it is in effect, use the cache directly without sending a request. If it is not, send a request and let the negotiated cache decide.

References:

ETag, MDN documentation: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/ETag

Last-Modified, MDN documentation: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Last-Modified

Expires, MDN documentation: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Expires

Cache-Control, MDN documentation: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Cache-Control

That's the end of the article. Thank you for spending your valuable time reading it. If it helped or inspired you, please don't be stingy with your likes and GitHub stars; your support is what keeps the author writing.

Follow my official account, The big front of the attack, to get high-quality original articles as soon as they are published.

Source code for the "front end advanced knowledge" article series: https://github.com/dennis-jiang/Front-End-Knowledges
