The other day I read an article about caching, "Thoroughly understand HTTP caching mechanism: three factor decomposition method based on cache strategy". I found it very interesting, so I decided to study HTTP caching systematically.
I divide caching into two parts: cache storage and cache comparison.
Hit cache speed comparison
200 from cache vs 304 Not Modified
Thinking: localStorage
（1） Basic concepts
(1) Cache storage
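Pragma (value no-cache): an HTTP/1.0-era attribute, still used for compatibility
Expires: an HTTP/1.0-era attribute
Cache-Control: introduced in HTTP/1.1; it has several common parameters, such as no-store and max-age, which appear in the examples below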
For example, to tell the client not to cache while staying compatible with HTTP/1.0, you can write:
Pragma: no-cache
Expires: 0
Cache-Control: no-store
Or, dropping Expires:
Pragma: no-cache // Pragma keeps compatibility with HTTP/1.0
Cache-Control: max-age=0 // the Expires attribute is removed (the description of Expires below explains why) and merged into max-age
Note that max-age=0 still lets the browser store the response; it only forces revalidation before the stored copy is reused. Only no-store prevents storage altogether.
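As a minimal sketch of the first header set above, a server-side helper could assemble the headers like this. The function name `no_cache_headers` and its `http10_compatible` flag are hypothetical and not tied to any framework.

```python
from typing import Dict


def no_cache_headers(http10_compatible: bool = True) -> Dict[str, str]:
    """Build response headers that tell clients not to cache the response."""
    # Cache-Control is the HTTP/1.1 mechanism; no-store forbids storing at all.
    headers = {"Cache-Control": "no-store"}
    if http10_compatible:
        # Pragma and Expires are only added for HTTP/1.0-era clients.
        headers["Pragma"] = "no-cache"
        # "0" is not a valid date, so clients treat it as already expired.
        headers["Expires"] = "0"
    return headers
```

Calling `no_cache_headers(False)` returns only the HTTP/1.1 header, which may be enough when no HTTP/1.0 clients or proxies are involved.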
Private cache: HTTP: The Definitive Guide tells us that the browser's own cache is one kind of private cache. Enter about:cache in the browser to view the contents of the browser cache; a "disk cache statistics" page is displayed. This page is very interesting and shows a lot of information.
Expires: an expiration date in GMT format, sent as a header field of the web server's response message. It tells the browser that, until the expiration time, it can fetch the data directly from the browser cache without sending another request. However, Expires dates from HTTP/1.0. Browsers now use HTTP/1.1 by default, so its role has largely been superseded. One drawback of Expires is that the expiration time it returns is server-side time. If the client's clock differs greatly from the server's (for example, the clocks are not synchronized, or a time zone is set incorrectly), the error is large. Therefore, since HTTP/1.1, Cache-Control: max-age=<seconds> is used instead, and it takes precedence when both are present.
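The clock problem can be illustrated with a small sketch. The two helper functions and all timestamps below are hypothetical; the point is only that an absolute date depends on both clocks agreeing, while a relative lifetime depends on the client's clock alone.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Absolute check: compare the server's Expires date with the client's clock."""
    return client_now < parsedate_to_datetime(expires_header)


def fresh_by_max_age(max_age: int, stored_at: datetime, client_now: datetime) -> bool:
    """Relative check: only the time elapsed on the client's own clock matters."""
    return (client_now - stored_at).total_seconds() < max_age


# At 12:00 GMT on its own clock, the server says "valid for one hour".
server_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
expires = format_datetime(server_now + timedelta(hours=1), usegmt=True)

# The client's clock runs two hours ahead of the server's.
client_received_at = server_now + timedelta(hours=2)
client_checks_at = client_received_at + timedelta(minutes=5)

# Expires: the response already looks expired the moment it arrives.
expires_says_fresh = fresh_by_expires(expires, client_received_at)

# max-age=3600: five minutes have elapsed on the client, so it is still fresh.
max_age_says_fresh = fresh_by_max_age(3600, client_received_at, client_checks_at)
```

With the skewed clock, the Expires check rejects a response the server intended to be cacheable for an hour, while the max-age check is unaffected.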
(2) Cache comparison
Last-Modified: an HTTP/1.0-era time attribute, still in use
Etag: an attribute added in HTTP/1.1, commonly generated from inode + mtime (explained below). It can also be a hash string generated from the entity content (similar to an MD5 or SHA1 digest) that identifies the state of the resource. When the resource changes, the Etag changes too.
inode: contains the metadata of a file, including the following
The number of bytes in the file, the user ID of the file owner, and the group ID of the file
The read, write and execute permissions of the file
Three timestamps: ctime is the time the inode last changed, mtime is the time the file content last changed, and atime is the time the file was last opened
The number of links, that is, how many file names point to this inode
The location of the file's data blocks
mtime: the time the content of the file was last changed
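The inode fields listed above can be read with `os.stat` from the Python standard library. This is a small sketch assuming POSIX semantics; the helper name `inode_info` and the temporary demo file are made up for illustration.

```python
import os
import tempfile
from typing import Any, Dict


def inode_info(path: str) -> Dict[str, Any]:
    """Return the inode metadata of a file as a plain dictionary."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                    # inode number
        "size_bytes": st.st_size,              # number of bytes in the file
        "uid": st.st_uid,                      # user ID of the owner
        "gid": st.st_gid,                      # group ID of the file
        "mode": oct(st.st_mode & 0o777),       # read/write/execute permissions
        "ctime": st.st_ctime,                  # last inode change (on POSIX)
        "mtime": st.st_mtime,                  # last content change
        "atime": st.st_atime,                  # last access
        "links": st.st_nlink,                  # file names pointing to this inode
    }


# Create a throwaway file so the sketch can run anywhere.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    demo_path = f.name

info = inode_info(demo_path)
os.remove(demo_path)
```

The location of the data blocks is not exposed by `os.stat`, so it is left out here.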
2.1 why Etag is needed when there is Last-Modified
Some servers cannot obtain the last modification time of a file accurately, so they cannot tell from it whether the file has been updated.
Some files are modified very frequently, several times within a second, but Last-Modified is only accurate to the second.
Sometimes the last modification time of a file changes while its content does not, and we do not want the client to think the file has been modified.
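The one-second granularity is easy to see when a modification time is formatted as an HTTP date. The sketch below uses `email.utils.formatdate` from the standard library; the helper name and the two timestamps are hypothetical.

```python
from email.utils import formatdate


def last_modified_header(mtime: float) -> str:
    """Format a file mtime (seconds since the epoch) as an HTTP date in GMT."""
    return formatdate(mtime, usegmt=True)


# Two edits 0.7 s apart that fall within the same second.
first_edit = 1_700_000_000.2
second_edit = 1_700_000_000.9

# Both edits produce the same header value, so a client revalidating with
# If-Modified-Since cannot tell that the second edit happened.
same_header = last_modified_header(first_edit) == last_modified_header(second_edit)
```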
2.2 what are the problems with Etag
Etag has problems of its own, so it is often turned off in practice. For example, the inode of the same file differs from one physical machine to another. In a distributed web system, requests that land on different machines therefore get different Etags for the same file, so 304 revalidation fails and degrades to a full 200 response. One solution is to remove the inode from the Etag algorithm and use only mtime, but that makes it equivalent to Last-Modified. You can also improve on this by computing the Etag of static resources from a hash of their content. In general, where a CDN is deployed, this does not affect many people, so it does not cause too many problems.
Reference: this explanation comes from an answer by hefangshi.
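The difference between the two Etag schemes can be sketched as follows. Two copies of one file in a temporary directory stand in for two servers; the function names, file names, and tag formats are assumptions for illustration and do not reproduce any particular web server's algorithm.

```python
import hashlib
import os
import shutil
import tempfile


def etag_inode_mtime(path: str) -> str:
    """Etag built from inode number and mtime; the inode is machine-specific."""
    st = os.stat(path)
    return '"%x-%x"' % (st.st_ino, int(st.st_mtime))


def etag_content_hash(path: str) -> str:
    """Etag built from a hash of the content; identical on every machine."""
    with open(path, "rb") as f:
        return '"%s"' % hashlib.sha1(f.read()).hexdigest()


# Same content and same mtime, but two distinct files (and so two inodes).
workdir = tempfile.mkdtemp()
copies = [os.path.join(workdir, name) for name in ("server_a.js", "server_b.js")]
for path in copies:
    with open(path, "wb") as f:
        f.write(b"console.log('hi');")
    os.utime(path, (1_700_000_000, 1_700_000_000))

inode_tags = [etag_inode_mtime(p) for p in copies]
hash_tags = [etag_content_hash(p) for p in copies]
shutil.rmtree(workdir)
```

The inode-based tags differ between the two copies, while the content-hash tags agree, which is why hashing avoids the failed revalidation described above. The trade-off is that the server has to read the file (or cache the hash) to produce the tag.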
（2） Hit cache speed comparison
Referring to a figure in HTTP: The Definitive Guide, we can see the process of a cache hit:
(1) Cache hit speed
Cache hit > cache revalidation success > cache miss = cache revalidation failure
1.1 cache hit priority
Cache-Control (HTTP/1.1) > Expires > Pragma (HTTP/1.0)
These headers decide whether the response is served directly from the cache (200 from cache).
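The priority order above can be sketched as a function that looks at the headers in that order and stops at the first one that gives an answer. This is a simplified illustration, not a full implementation of browser caching: the function name is hypothetical, header names are assumed to be spelled exactly as shown, and only the no-store, no-cache and max-age directives are handled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def freshness_lifetime(headers: Dict[str, str], now: datetime) -> Optional[float]:
    """Seconds the response may be served from cache, or None if no header decides."""
    # 1. Cache-Control (HTTP/1.1) has the highest priority.
    cache_control = headers.get("Cache-Control")
    if cache_control is not None:
        directives = [d.strip().lower() for d in cache_control.split(",")]
        if "no-store" in directives or "no-cache" in directives:
            return 0.0
        for directive in directives:
            if directive.startswith("max-age="):
                return float(directive.split("=", 1)[1])

    # 2. Expires is consulted only if Cache-Control did not decide.
    expires = headers.get("Expires")
    if expires is not None:
        try:
            remaining = (parsedate_to_datetime(expires) - now).total_seconds()
            return max(0.0, remaining)
        except (TypeError, ValueError):
            return 0.0  # an invalid date such as "0" counts as already expired

    # 3. Pragma (HTTP/1.0) comes last.
    if headers.get("Pragma", "").lower() == "no-cache":
        return 0.0
    return None
```

A positive result means the browser can answer with 200 from cache for that many seconds; zero means it has to go back to the server first.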
1.2 cache revalidation successful
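Last-Modified (HTTP/1.0) and Etag (HTTP/1.1) are used to verify whether to return 304 Not Modified. If the request carries both validators, both must be checked, and 304 can be returned only when both are satisfied. (This is the rule from the original HTTP/1.1 specification; the current specification, RFC 9110, has the server ignore If-Modified-Since when If-None-Match is present.)
The server response header Last-Modified corresponds to the client request header If-Modified-Since
The server response header Etag corresponds to the client request header If-None-Match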
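A server-side revalidation check following the both-must-match rule stated above might look like the sketch below. The function name and the sample values are hypothetical, and weak validators (the W/ prefix) are deliberately ignored to keep the example short.

```python
from email.utils import parsedate_to_datetime
from typing import Dict


def revalidate(request_headers: Dict[str, str], etag: str, last_modified: str) -> int:
    """Return 304 if every validator the client sent still matches, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_none_match is None and if_modified_since is None:
        return 200  # unconditional request: send the full body

    # Etag (response) pairs with If-None-Match (request).
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if "*" not in client_tags and etag not in client_tags:
            return 200

    # Last-Modified (response) pairs with If-Modified-Since (request).
    if if_modified_since is not None:
        modified = parsedate_to_datetime(last_modified)
        since = parsedate_to_datetime(if_modified_since)
        if modified > since:
            return 200

    return 304  # every validator that was sent is still valid
```

For example, `revalidate({"If-None-Match": '"abc"'}, '"abc"', "Mon, 01 Jan 2024 12:00:00 GMT")` yields 304, while a stale tag or an older If-Modified-Since date yields 200.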
（3） 200 from cache vs 304 not modified
Why does the console not always show 200 from cache when the cache is hit? The reason lies in how the browser makes the request:
Trigger 200 from cache:
Click on the link directly
Enter the URL and press enter to access it
QR code scanning
Trigger 304 not modified:
Triggered when the page is refreshed
Triggered when a long cache is set but the entity tags are not removed
How to choose between them
Without doubt, the choice is to hit the cache (200 from cache) as often as possible, and to invalidate the cache by updating the version number of the static file. The recommended form for the version number is file.xxx.js rather than file.js?v=xxx.
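One common way to produce the xxx part is to derive it from the file's content, so that the name changes exactly when the content does. The sketch below assumes that approach; the function name, the choice of SHA-256, and the 8-character length are illustrative assumptions, not something the recommendation above prescribes.

```python
import hashlib
import os


def versioned_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the extension: file.js -> file.<hash>.js."""
    stem, ext = os.path.splitext(filename)
    digest = hashlib.sha256(content).hexdigest()[:length]
    return "%s.%s%s" % (stem, digest, ext)


# Changing the content changes the name, so old cached copies are never reused.
old_name = versioned_name("file.js", b"console.log('v1');")
new_name = versioned_name("file.js", b"console.log('v2');")
```

Because each version has its own URL, the files can be served with a very long max-age, and publishing a new version only requires updating the references in the HTML.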
The reasons are discussed in two articles on this topic.
（4） Thinking: localStorage
While studying caching, I came across this question on Zhihu: "What are the disadvantages of storing static resources (JS / CSS) in localStorage? Why is it not widely used?" After reading the experts' answers, the main reason seems to be that the maintenance cost is too high. If the approach were really much faster, that cost could be ignored and it would be worth studying. However, if reading from localStorage and then executing the code may be slower than the browser's own 304, there is no need to use this method.