Understanding HTTP Caching Through One Diagram

Time: 2019-5-30

Drawing on several references about browser caching, this article summarizes the caching process in a single diagram.

The first time a browser sends an HTTP request to a web server, the server returns the requested resource and adds some caching fields to the response header, such as Cache-Control, Expires, Last-Modified, ETag, and Date. When the browser requests the resource again, it uses one of two mechanisms as appropriate: strong cache and negotiation cache.

  • Strong caching: Browsers get data directly from local caches without interacting with servers.
  • Negotiation Caching: Browsers send requests to servers to determine whether local caching can be used.
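  • Similarities and differences: Both methods ultimately serve the resource from the local cache (negotiation caching does so only when the server confirms the copy is still usable); the former does not need to interact with the server, and the latter does.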

Let’s assume that the browser has accessed the server, the server has returned the header fields associated with the cache, and the browser has cached the related resources. Strong caching and negotiation caching are analyzed in the following figure:

[Figure 1: HTTP caching flow. The red line marks the strong cache path; the blue line marks the negotiation cache path.]

Strong cache

The path marked by the red line in Figure 1 is the strong cache. After the user initiates an HTTP request, the browser finds that a cached copy of the requested resource already exists locally and then checks whether that copy has expired. Two HTTP header fields control how long the cache is valid: Expires and Cache-Control. The browser determines whether the cache has expired in two steps:

  1. Check whether Cache-Control carries an s-maxage or max-age directive. If it does, add the s-maxage/max-age value to the response generation time (Date) to obtain the expiration time, and compare it with the current time (s-maxage applies only to public cache servers shared by multiple users);
  2. If Cache-Control has no s-maxage or max-age directive, compare the expiration time in Expires with the current time. Expires is an absolute time.

Note: In HTTP/1.1, when the Cache-Control header field specifies an s-maxage or max-age directive, that directive takes priority over the Expires field.
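The two steps above can be sketched in Python roughly as follows. This is a minimal illustration, not how any particular browser implements it: the function names expiry_time and is_fresh are hypothetical, the headers are assumed to be a plain dictionary, and s-maxage is ignored because a browser cache is not a shared cache.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def expiry_time(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return when a cached response expires, or None if that is unknown."""
    cache_control = headers.get("Cache-Control", "")
    # Step 1: a max-age directive wins. Expiration = Date + max-age seconds.
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match and "Date" in headers:
        generated_at = parsedate_to_datetime(headers["Date"])
        return generated_at + timedelta(seconds=int(match.group(1)))
    # Step 2: no max-age, so fall back to the absolute time in Expires.
    if "Expires" in headers:
        return parsedate_to_datetime(headers["Expires"])
    return None


def is_fresh(headers: Mapping[str, str], now: datetime) -> bool:
    """True if the cached response can be used without contacting the server."""
    expires_at = expiry_time(headers)
    return expires_at is not None and now < expires_at


# Hypothetical cached response: max-age says one hour, Expires says ten minutes.
cached = {
    "Date": "Thu, 30 May 2019 08:00:00 GMT",
    "Cache-Control": "max-age=3600",
    "Expires": "Thu, 30 May 2019 08:10:00 GMT",
}
now = datetime(2019, 5, 30, 8, 30, tzinfo=timezone.utc)
print(is_fresh(cached, now))
```

In this example the copy is still treated as fresh at 08:30 because max-age takes priority; with the Cache-Control field removed, the same copy would count as expired based on Expires alone.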

Here are a few more common Cache-Control directives:

  • no-cache: The local cache must not be used directly; the negotiation cache is required, i.e. the browser must confirm with the server whether the cached copy is still usable.
  • no-store: Disables caching.
  • public: Indicates that other users can also use the cached response, which suits public (shared) caching servers.
  • private: Indicates that only a specific user can use the cached response, so public (shared) caching servers must not store it.
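As a rough sketch of how these directives might drive a cache's decision, the following Python splits a Cache-Control value into directives and maps them to a storage policy. The function names and the three policy strings are hypothetical, and the parser is deliberately naive: it assumes no directive value contains a quoted comma.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def storage_policy(value: str, shared_cache: bool = False) -> str:
    """Return a hypothetical policy label for a cache seeing this header."""
    directives = parse_cache_control(value)
    if "no-store" in directives:
        return "do-not-store"
    if "private" in directives and shared_cache:
        # A shared (public) cache server must not keep a private response.
        return "do-not-store"
    if "no-cache" in directives:
        # Stored, but every use must be confirmed with the server first.
        return "store-but-revalidate"
    return "store"


print(parse_cache_control("public, max-age=600"))
print(storage_policy("private, max-age=60", shared_cache=True))
```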

After these two steps, if the cache has not expired, the browser uses the local copy directly and the status code is 200. If the cache has expired, the browser enters the negotiation cache flow, or the server returns the new resource.

Negotiation cache

When the browser finds that the cache has expired, the cached copy is not necessarily unusable, because the resource on the server may still be unchanged. The browser therefore negotiates with the server and lets the server decide whether the local cache can still be used. At this point, the browser checks whether the cached response carries an ETag or Last-Modified field. If it has neither, the browser sends an ordinary HTTP request and the server returns the resource. If it has them, the browser adds an If-None-Match field to the request header (when ETag is present) and an If-Modified-Since field (when Last-Modified is present).

Note: If If-None-Match and If-Modified-Since are sent at the same time, the server only needs to compare If-None-Match with ETag. If they are consistent, the server considers the cache still usable and returns status code 304. The browser then reads the local cache directly, which completes the negotiation process, i.e. the blue line in the figure. If they are inconsistent, the server returns another status code and the requested resource as appropriate. The process is explained in detail below:
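The browser side of this negotiation amounts to copying validators from the cached response into the next request. A minimal Python sketch, assuming headers are plain dictionaries and using the hypothetical function name conditional_headers:

```python
from typing import Dict, Mapping


def conditional_headers(cached_headers: Mapping[str, str]) -> Dict[str, str]:
    """Build the extra request headers used to revalidate an expired copy."""
    request_headers: Dict[str, str] = {}
    etag = cached_headers.get("ETag")
    last_modified = cached_headers.get("Last-Modified")
    if etag:
        # The ETag from the last response goes into If-None-Match.
        request_headers["If-None-Match"] = etag
    if last_modified:
        # The Last-Modified from the last response goes into If-Modified-Since.
        request_headers["If-Modified-Since"] = last_modified
    return request_headers


cached = {"ETag": '"abc123"', "Last-Modified": "Thu, 30 May 2019 08:00:00 GMT"}
print(conditional_headers(cached))
```

An empty result means the cached response carried neither validator, so the browser falls back to an ordinary request.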

1. ETag and If-None-Match

Both values are unique identifier strings allocated by the server for each resource.

  • The browser requests a resource, and the server adds an ETag field to the response header. Whenever the resource changes, the server-side ETag value is updated accordingly;
  • When the browser requests the resource again, it adds an If-None-Match field to the request header, whose value is the ETag value from the last response;
  • The server compares ETag with If-None-Match. If they differ, the server handles the request and returns the updated resource. If they match, the resource has not been updated, so the server returns status code 304 and the browser can continue to use the local cache. Note that the response header still includes the ETag field in this case, even though its value has not changed.
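The server-side comparison described in the list above might look like the following Python sketch. The function names and the (status, headers, body) return shape are assumptions made for illustration; the comparison ignores the weak-validator prefix W/ and accepts a comma-separated list in If-None-Match, but does not handle the special value "*".

```python
from typing import Dict, Mapping, Tuple


def _opaque(tag: str) -> str:
    """Strip whitespace and a leading W/ so weak and strong tags compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond_with_etag(
    request_headers: Mapping[str, str], current_etag: str, body: bytes
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for a GET on one resource."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if _opaque(current_etag) in candidates:
            # Unchanged: no body, but the ETag is still sent with the 304.
            return 304, {"ETag": current_etag}, b""
    # Changed, or no validator sent: return the full resource.
    return 200, {"ETag": current_etag}, body


print(respond_with_etag({"If-None-Match": '"v1"'}, '"v1"', b"hello")[0])
print(respond_with_etag({"If-None-Match": '"v1"'}, '"v2"', b"hello")[0])
```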

2. Last-Modified and If-Modified-Since

Both values are time strings in GMT format.

  • After the browser first requests a resource from the server, the server adds a Last-Modified field to the response header, indicating the last modification time of the resource;
  • When the browser requests the resource again, it adds an If-Modified-Since field to the request header, whose value is the Last-Modified value from the last server response;
  • The server compares Last-Modified with If-Modified-Since. If they differ, the server handles the request and returns the updated resource. If they match, the resource has not been updated, so the server returns status code 304 and the browser can continue to use the local cache. Unlike with ETag, the response header does not include the Last-Modified field in this case.
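A comparable sketch for the time-based check, again in Python with hypothetical function names. It assumes the server knows the resource's modification time as a UTC datetime, and it treats the resource as unchanged when that time is not later than If-Modified-Since, truncated to whole seconds because the header carries no finer precision.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Mapping, Tuple


def not_modified_since(header_value: str, last_modified: datetime) -> bool:
    """True if the resource has not changed after the time in the header."""
    try:
        since = parsedate_to_datetime(header_value)
        return last_modified.replace(microsecond=0) <= since
    except (TypeError, ValueError):
        # Unparseable or timezone-less date: ignore the condition.
        return False


def respond_with_last_modified(
    request_headers: Mapping[str, str], last_modified: datetime, body: bytes
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, body) for a GET on one resource."""
    since = request_headers.get("If-Modified-Since")
    if since is not None and not_modified_since(since, last_modified):
        # Unchanged: 304 without a body and without Last-Modified.
        return 304, {}, b""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    return 200, headers, body


mtime = datetime(2019, 5, 30, 8, 0, 0, tzinfo=timezone.utc)
request = {"If-Modified-Since": "Thu, 30 May 2019 08:00:00 GMT"}
print(respond_with_last_modified(request, mtime, b"hello")[0])
```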

3. Advantages of ETag over Last-Modified

The following is quoted from: http Negotiation Cache VS Strong Cache

You may think that Last-Modified is enough for the browser to know whether the locally cached copy is fresh enough, so why is ETag needed? ETag was introduced in HTTP/1.1 mainly to solve several problems that are difficult to solve with Last-Modified:

  • Some files are rewritten periodically without their content changing (only the modification time changes). In this case we do not want the client to think the file has been modified and GET it again.
  • Some files are modified very frequently, within less than a second (for example, N times in 1s). If-Modified-Since can only check at a granularity of seconds, so such modifications cannot be detected (in other words, the MTIME recorded by UNIX is only accurate to the second).
  • Some servers can’t get the exact last modification time of the file.

In these cases, ETag controls caching more accurately, because an ETag is a unique identifier for the resource that the server generates automatically, and a new ETag value is generated every time the resource changes. Last-Modified and ETag can be used together, but the server will validate ETag first.
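To illustrate why a content-derived identifier avoids the first two problems, here is a small Python sketch that builds an ETag from a hash of the resource body. This is only one possible scheme, assumed here for illustration; servers are free to generate ETags differently, and the function name make_etag is hypothetical.

```python
import hashlib


def make_etag(content: bytes) -> str:
    """Derive a quoted ETag from the resource content alone."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    return '"' + digest + '"'


# Rewriting a file with identical content changes its modification time
# but not its ETag, so the client is not forced to download it again.
first_write = make_etag(b"body { color: red; }")
rewritten = make_etag(b"body { color: red; }")
print(first_write == rewritten)

# Two different versions saved within the same second share one
# modification time at second granularity, but get different ETags.
edited = make_etag(b"body { color: blue; }")
print(first_write == edited)
```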

User behavior

Finally, a diagram is attached to illustrate the impact of user behavior on browser caching:
[Figure: Impact of user behavior on browser caching]