HTTP cache: 200 from cache or 304 not modified?

Time:2020-12-3

The other day I read an article about caching, "Thoroughly understand HTTP caching mechanism: three factor decomposition method based on cache strategy". I found it very interesting, so I plan to study HTTP caching systematically.

I divide caching into two parts: cache storage and cache comparison.

  1. Basic concepts

  2. Hit cache speed comparison

  3. 200 from cache  vs  304 Not Modified 

  4. Thinking: storing static resources in localStorage

(1) Basic concepts

(1) Cache storage

  • Pragma: no-cache. An HTTP/1.0-era header, still sent for compatibility

  • Expires: 0. An HTTP/1.0-era header

  • Cache-Control. An HTTP/1.1 header with the following common directives:

    • public / private: the response may be stored by shared caches / only by a private cache

    • no-cache: the response may still be stored locally, but it must be revalidated with the server before each reuse

    • no-store: the response is not stored at all, neither on the client nor in intermediate caches

    • max-age: the freshness lifetime in seconds, counted from the time of the response; it replaces the expiration time given by Expires

For example, to tell the client not to cache while staying compatible with HTTP/1.0, you can write:

Pragma: no-cache
Expires: 0
Cache-Control: no-store

This is roughly equivalent to the following (strictly speaking, max-age=0 still lets the client store the response and only forces revalidation, whereas no-store forbids storing it):

Pragma: no-cache // Pragma is kept for HTTP/1.0 compatibility
Cache-Control: max-age=0 // the Expires header is removed (the explanation below says why) and merged into max-age

Explanation:
Private cache: HTTP: The Definitive Guide tells us that the browser's cache is one kind of private cache. Enter about:cache in the browser to view the contents of its cache; a "disk cache statistics" page is displayed. This page is very interesting and shows a lot of information.
Expires: an expiration date in GMT format, sent as a header field of the web server's response. It tells the browser that, until the expiration time, it can take the data directly from its own cache without sending another request. However, Expires belongs to HTTP/1.0. Browsers now use HTTP/1.1 by default, so it plays little role in practice. One drawback of Expires is that the expiration time it returns is a server-side time. If the client clock and the server clock differ greatly (for example, the clocks are not synchronized or a time zone is set incorrectly), the error is large. Therefore, since HTTP/1.1, Cache-Control: max-age=seconds is used instead.
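
The clock problem described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names, the 60-second lifetime and the 5-minute clock skew are all made up for this example, and a real browser does more bookkeeping than this.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Expires is an absolute server-side date, compared with the client clock."""
    return client_now < parsedate_to_datetime(expires_header)


def fresh_by_max_age(max_age: int, age_seconds: float) -> bool:
    """max-age is relative: only the time elapsed since the response matters."""
    return age_seconds < max_age


# The server intends the response to stay fresh for 60 seconds.
server_now = datetime(2020, 12, 3, 12, 0, 0, tzinfo=timezone.utc)
expires_header = format_datetime(server_now + timedelta(seconds=60), usegmt=True)

# Hypothetical client whose clock runs 5 minutes ahead of the server.
skewed_client_now = server_now + timedelta(minutes=5)

expires_result = fresh_by_expires(expires_header, skewed_client_now)
max_age_result = fresh_by_max_age(60, age_seconds=0)
print(expires_header)
print(expires_result, max_age_result)
```

With the skewed clock, the Expires check already treats the brand-new response as expired, while the max-age check still treats it as fresh.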

(2) Cache comparison

  • Last-Modified: an HTTP/1.0-era time-based header, still in use

  • ETag (entity tag): a header added in HTTP/1.1. A server may build it from inode + mtime (explained below), or as a hash string generated from the entity content (similar to an MD5 or SHA-1 digest). Either way it identifies the state of the resource: when the resource changes, the ETag changes too.

Explanation:
Inode: contains the meta information of the file, including the following

  • The number of bytes of the file, the user ID of the file owner, and the group ID of the file

  • Read, write and execute permissions of files

  • Three timestamps: ctime is the time the inode last changed, mtime is the time the file content last changed, and atime is the time the file was last opened

  • The number of links, that is, how many file names point to this inode and its file data blocks

mtime: the time when the content of the file was last changed

2.1 Why ETag is needed when Last-Modified already exists

  1. Some servers cannot obtain the last modification time of a file accurately, so they cannot use it to judge whether the file has been updated.

  2. Some files are modified very frequently, even several times within one second, but Last-Modified is only accurate to the second.

  3. For some files the last modification time changes while the content does not. We do not want the client to think such a file has been modified.
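
The one-second limit in point 2 is easy to see in a short sketch. The two timestamps are hypothetical values chosen to fall 400 ms apart within the same second.

```python
from email.utils import formatdate

# Two hypothetical modification times, 400 ms apart within the same second.
first_write = 1606996800.100
second_write = 1606996800.500

# HTTP dates carry no sub-second part, so both writes produce the same header.
last_modified_1 = formatdate(first_write, usegmt=True)
last_modified_2 = formatdate(second_write, usegmt=True)

print(last_modified_1)
print(last_modified_1 == last_modified_2)
```

A client that cached the file after the first write would therefore be told by Last-Modified alone that nothing has changed.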

2.2 What are the problems with ETag

ETag has problems of its own, so it is often turned off in practice. For example, the same file has different inodes on different physical machines. In a distributed web system, requests that land on different machines then receive different ETags, so the 304 fails and the response degrades to a full 200. The solution is to take the inode out of the ETag algorithm and use only mtime, but then it is no different from Last-Modified. Of course, you can also improve on this and compute the ETag of static resources from a hash of the content. In general, once a CDN is deployed, few people are affected, so this does not cause many problems. Reference: this part comes from an answer by hefangshi.
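
The difference between the two ETag algorithms can be sketched as follows. This is a minimal illustration, not the format any particular server uses: the function names and the tag layout are assumptions, and two copies of a file in a temporary directory stand in for the same file on two physical machines.

```python
import hashlib
import os
import tempfile


def etag_from_stat(path: str) -> str:
    """ETag built from inode + mtime (hypothetical layout)."""
    st = os.stat(path)
    return f'"{st.st_ino:x}-{st.st_mtime_ns:x}"'


def etag_from_content(path: str) -> str:
    """ETag built from a hash of the file content only."""
    with open(path, "rb") as f:
        return '"' + hashlib.sha1(f.read()).hexdigest() + '"'


# Two copies of the same file stand in for two physical machines.
with tempfile.TemporaryDirectory() as tmp:
    copies = [os.path.join(tmp, name) for name in ("machine_a.js", "machine_b.js")]
    for path in copies:
        with open(path, "wb") as f:
            f.write(b"console.log('hello');\n")

    stat_tags = [etag_from_stat(p) for p in copies]
    hash_tags = [etag_from_content(p) for p in copies]

print(stat_tags[0] == stat_tags[1])
print(hash_tags[0] == hash_tags[1])
```

The inode-based tags differ even though the bytes are identical, while the content-hash tags match, which is why hashing keeps 304 responses working across machines.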

(2) Hit cache speed comparison

[Figure: the cache-hit process, referring to a figure in HTTP: The Definitive Guide]

(1) Cache hit speed

Speed ranking: cache hit > successful cache revalidation > cache miss = failed cache revalidation.

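1.1 Cache hit priority

Cache-Control (HTTP/1.1) > Expires > Pragma (HTTP/1.0). These headers, in this order of priority, decide whether the response is 200 (from cache).

1.2 Successful cache revalidation

Last-Modified (HTTP/1.0) and ETag (HTTP/1.1) are used to verify whether to return 304 Not Modified. Under the original HTTP/1.1 rules (RFC 2616), when the request carries both validators, 304 can be returned only if both are satisfied. The later RFC 7232 instead gives If-None-Match precedence and ignores If-Modified-Since whenever If-None-Match is present.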
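
The priority order above can be sketched as a small decision function. It is a simplified illustration: the function name and return shape are invented for this example, heuristic freshness is left out, and real caches handle many more directives.

```python
from email.utils import parsedate_to_datetime


def decide_freshness(headers: dict) -> tuple:
    """Return (deciding_header, lifetime_seconds), checking
    Cache-Control first, then Expires, then Pragma."""
    cache_control = headers.get("Cache-Control")
    if cache_control is not None:
        directives = {}
        for part in cache_control.split(","):
            name, _, value = part.strip().partition("=")
            directives[name.lower()] = value
        if "no-store" in directives or "no-cache" in directives:
            return ("Cache-Control", 0.0)
        if "max-age" in directives:
            return ("Cache-Control", float(directives["max-age"]))

    if "Expires" in headers and "Date" in headers:
        try:
            delta = (parsedate_to_datetime(headers["Expires"])
                     - parsedate_to_datetime(headers["Date"]))
        except (TypeError, ValueError):
            # An unparsable value such as "0" counts as already expired.
            return ("Expires", 0.0)
        return ("Expires", max(delta.total_seconds(), 0.0))

    if headers.get("Pragma", "").strip().lower() == "no-cache":
        return ("Pragma", 0.0)
    return ("none", 0.0)


response = {
    "Date": "Thu, 03 Dec 2020 12:00:00 GMT",
    "Expires": "Thu, 03 Dec 2020 13:00:00 GMT",
    "Cache-Control": "max-age=60",
    "Pragma": "no-cache",
}
print(decide_freshness(response))
```

A lifetime greater than zero means the browser may answer with 200 (from cache) until that many seconds have passed; zero means it has to go back to the server.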


  • The server response header Last-Modified corresponds to the client request header If-Modified-Since

  • The server response header ETag corresponds to the client request header If-None-Match
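
The header pairs above drive the server-side revalidation check, which might look like the sketch below. It assumes the RFC 7232 evaluation order, in which If-None-Match is checked first and If-Modified-Since is consulted only when no If-None-Match is sent; a server following the older "both must match" rule would behave differently. The function name is made up, and weak ETag comparison is left out for brevity.

```python
from email.utils import parsedate_to_datetime


def revalidate(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # Simplified: exact string comparison against each listed tag.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in candidates or "*" in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            unchanged = (parsedate_to_datetime(last_modified)
                         <= parsedate_to_datetime(if_modified_since))
        except (TypeError, ValueError):
            return 200  # an unparsable date is ignored
        return 304 if unchanged else 200

    return 200  # unconditional request: send the full response


CURRENT_ETAG = '"abc123"'
CURRENT_LAST_MODIFIED = "Thu, 03 Dec 2020 12:00:00 GMT"

status = revalidate({"If-None-Match": '"abc123"'}, CURRENT_ETAG, CURRENT_LAST_MODIFIED)
print(status)
```

When the function returns 304 the server sends headers only, and the browser reuses the body it already has.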

(3) 200 from cache vs 304 not modified

Why does the console not always show 200 from cache when the cache is hit? The reason lies in how the browser was asked to load the page:

  • Trigger 200 from cache:

    • Click on the link directly

    • Enter the URL and press enter to access it

    • QR code scanning

  • Trigger 304 not modified:

    • Triggered when the page is refreshed

    • Triggered when a long cache lifetime is set but the entity tag (ETag) is not removed

How to choose between them

Without doubt, the choice is to hit the cache as often as possible and to invalidate it by updating the version number of the static file. The recommended form of version number is file.xxx.js rather than file.js?v=xxx.

The reasons are given in these two articles:
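
One common way to obtain the xxx part is to derive it from the file content, so that the name changes exactly when the content does. The sketch below assumes that approach; the function name, the SHA-256 digest and the 8-character length are illustrative choices, not something the articles prescribe.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, length: int = 8) -> str:
    """Turn file.js into file.<hash>.js, where <hash> depends on the content."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_name = versioned_name("static/file.js", b"console.log(1);")
new_name = versioned_name("static/file.js", b"console.log(2);")
print(old_name)
print(new_name)
```

Because every release that changes the file produces a new URL, the old URL can be given a very long cache lifetime and will keep returning 200 from cache.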

  1. Best Practices for Speeding Up Your Web Site

  2. How to develop and deploy front-end code in large companies

(4) Thinking

While studying caching, I came across this question on Zhihu: "What are the disadvantages of storing static resources (JS / CSS) in localStorage? Why is it not widely used?" Judging from the experts' answers, the main reason is that the maintenance cost is too high. If the approach were really much faster, that cost could be ignored and it would be worth studying. However, if reading the script from localStorage and executing it again may be slower than a direct 304 from the browser, there is no need to use this method.

(5) Reference articles:

  1. Gap caused by configuration error: 200 OK (from cache) and 304 not modified

  2. http://www.benhallbenhall.com/2012/03/http-codes-200-from-cache-304/