HTTP caching is essential knowledge in web development. Caching reduces unnecessary network requests, speeds up page loads, improves the user experience, relieves pressure on the server, and saves bandwidth. Understanding the basics lets us use the cache well, and it also helps us debug a common development problem: resources that do not update because of the client cache.
HTTP caching is divided into the strong cache (also called the forced cache) and the negotiated cache (also called the conditional cache).
With the strong cache, the browser does not send a request to the server until the resource has expired. It uses the local copy (from disk or memory) directly. The strong cache is controlled by the Expires, Cache-Control, and Pragma headers.
Expires is a response header defined by HTTP/1.0. Its value is an absolute time stamp (an HTTP date in GMT). When the client requests the resource again, it compares its own clock with that time stamp. If the client time is later than the time stamp, the cached copy has expired. Otherwise, the cached resource is used directly.
A disadvantage of Expires is that the expiration time is produced by the server's clock but checked against the client's clock. If the two clocks differ significantly, the freshness check is off by the same amount. For this reason, HTTP/1.1 introduced Cache-Control: max-age=<seconds>, which expresses the lifetime as a relative duration. If Cache-Control: max-age and Expires are both present, Cache-Control takes effect.
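The clock-skew problem can be illustrated with a minimal sketch. The function name `is_fresh_by_expires` and the dates are made up for this example; only the standard library is used, and a real browser cache does considerably more than this single comparison.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Return True while the client's clock is still before the Expires date."""
    expires_at = parsedate_to_datetime(expires_header)
    return client_now < expires_at


# The server stamps an absolute expiry one hour after its own "now".
server_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
expires_header = format_datetime(server_now + timedelta(hours=1), usegmt=True)

# A client whose clock agrees with the server treats the copy as fresh.
in_sync = is_fresh_by_expires(expires_header, server_now)

# A client whose clock runs two hours fast treats the same copy as expired.
skewed = is_fresh_by_expires(expires_header, server_now + timedelta(hours=2))
```

With max-age the lifetime is a duration rather than a wall-clock date, so a constant offset between the two clocks matters far less.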
Cache-Control is a header introduced in HTTP/1.1. Its common directives are as follows:
- max-age: the unit is seconds. The lifetime is counted from the time the response was generated; once that many seconds have elapsed, the cached copy is stale
- no-cache: the response may be stored, but the strong cache is skipped; the cache must verify with the server that the copy is still fresh before every reuse
- no-store: caching is forbidden entirely (including the negotiated cache); the latest resource is requested from the server every time
- private: the response is intended for a single user. The browser may store it, but intermediate proxies and CDNs must not cache it
- public: the response may be cached by intermediate proxies, CDNs, and other shared caches
- must-revalidate: the cached copy can be used while it is fresh, but once it has expired it must be revalidated with the server before reuse
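As a rough illustration of how these directives drive the strong-cache decision, here is a simplified sketch. The helper names `parse_cache_control` and `can_reuse_without_request` are hypothetical, and the logic deliberately ignores many details of real caches (shared vs. private caches, `s-maxage`, stale handling, and so on).

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: value-or-None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def can_reuse_without_request(header: str, age_seconds: int) -> bool:
    """Strong-cache decision for a stored response of the given age."""
    d = parse_cache_control(header)
    # no-store: nothing should have been stored; no-cache: always revalidate.
    if "no-store" in d or "no-cache" in d:
        return False
    max_age = d.get("max-age")
    # Fresh only while the age is below max-age.
    return max_age is not None and age_seconds < int(max_age)
```

For example, `can_reuse_without_request("public, max-age=3600", 120)` allows the cached copy to be used without contacting the server, while the same call with `"no-cache, max-age=3600"` does not.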
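Pragma is also a field defined by HTTP/1.0. It has only one commonly used value, no-cache, which has the same meaning as no-cache in Cache-Control. The specification defines Pragma only for requests and says it is ignored when Cache-Control is present and understood. Some browsers nevertheless honor Pragma: no-cache in a response for backward compatibility, in which case it overrides both Cache-Control and Expires. When the three appear at the same time, the order commonly observed is Pragma > Cache-Control > Expires, but only the precedence of Cache-Control over Expires is guaranteed by the standard.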
When the strong cache has expired, or the response was marked no-cache, the negotiated cache comes into play. If the response headers of the previous request contained Last-Modified or ETag, the browser carries If-Modified-Since (the Last-Modified value returned by the previous response) and If-None-Match (the ETag value returned by the previous response) in the request headers. The server compares these values with the current state of the resource to decide whether the cached copy is still valid. If it is, the server returns 304 without a body and the browser uses the local cache. Otherwise, the server returns 200 together with the latest version of the resource.
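The client side of this exchange can be sketched as follows. The function names `conditional_headers` and `resolve_body` are made up for illustration, and headers are modelled as plain dictionaries rather than any particular HTTP library's types.

```python
def conditional_headers(cached_response_headers: dict) -> dict:
    """Build revalidation headers from the validators of the stored response."""
    headers = {}
    etag = cached_response_headers.get("ETag")
    last_modified = cached_response_headers.get("Last-Modified")
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def resolve_body(status: int, cached_body: bytes, response_body: bytes) -> bytes:
    """304 means 'keep using your copy'; any other status carries a new body."""
    return cached_body if status == 304 else response_body
```

A stored response with `ETag: "abc"` and `Last-Modified: Mon, 01 Jan 2024 12:00:00 GMT` therefore yields a request carrying both `If-None-Match` and `If-Modified-Since`.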
1. Last-Modified / If-Modified-Since
- When the browser requests a URL for the first time, the server returns status code 200. The HTTP response headers also include a Last-Modified field indicating when the file was last modified on the server.
- When the browser requests the same URL again, it adds an If-Modified-Since field to the HTTP request headers to ask the server whether the file has been modified since that time.
- If the resource has not changed on the server, the server returns status 304 and the browser uses its cache. This ensures that the browser does not download the same resource repeatedly, and that the client still gets the latest resource promptly when it changes on the server.
2. ETag / If-None-Match
- When the browser requests a URL for the first time, the server returns status code 200, and the HTTP response headers include an ETag, an identifier string generated by the server for that version of the resource.
- When the browser requests the same URL again, it adds an If-None-Match field to the HTTP request headers to ask the server whether the file has been modified. If the ETag still matches, the server returns 304; otherwise it returns 200 with the new content and a new ETag.
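The server side of both validators can be sketched in a few lines. This is a simplified illustration, not a complete implementation: `make_etag` and `respond` are hypothetical names, the hash-based ETag is just one possible scheme, and weak ETags (`W/"..."`) are not handled.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content (one possible scheme among many)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, modified: datetime, request_headers: dict):
    """Return (status, headers, body) for a GET with optional validators."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Last-Modified": format_datetime(modified, usegmt=True),
    }
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is ignored.
        tags = [t.strip() for t in if_none_match.split(",")]
        not_modified = etag in tags or "*" in tags
    elif if_modified_since is not None:
        # HTTP dates have one-second resolution, so drop microseconds.
        since = parsedate_to_datetime(if_modified_since)
        not_modified = modified.replace(microsecond=0) <= since
    else:
        not_modified = False

    if not_modified:
        return 304, headers, b""
    return 200, headers, body
```

A first call with empty request headers returns 200 plus the validators; replaying those validators against unchanged content returns 304 with an empty body.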
Why use ETag in addition to Last-Modified? ETag mainly solves some problems that Last-Modified cannot:
- Some files are rewritten periodically without their content changing (only the modification time changes). In this case we do not want the client to conclude that the file has been modified and download it again.
- Some files are modified very frequently, at sub-second intervals (for example, N times within one second). If-Modified-Since has a granularity of one second, so such modifications cannot be detected (likewise, a UNIX mtime may only be accurate to the second)
- Some servers cannot obtain the last modification time of a file accurately
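The first two cases can be reproduced with a small experiment. This sketch assumes a content-hash ETag (real servers may derive ETags differently), uses a hypothetical helper named `validators`, and sets file times explicitly with `os.utime` so the outcome does not depend on timing.

```python
import hashlib
import os
import tempfile
from email.utils import formatdate


def validators(path: str):
    """Return (last_modified, etag) for a file on disk."""
    last_modified = formatdate(int(os.stat(path).st_mtime), usegmt=True)
    with open(path, "rb") as f:
        etag = '"' + hashlib.sha256(f.read()).hexdigest()[:16] + '"'
    return last_modified, etag


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app.js")

    # Case 1: same content, but the modification time moves forward.
    with open(path, "wb") as f:
        f.write(b"console.log(1)")
    os.utime(path, (1_700_000_000, 1_700_000_000))
    lm_a, etag_a = validators(path)
    os.utime(path, (1_700_000_600, 1_700_000_600))
    lm_b, etag_b = validators(path)

    # Case 2: the content changes, but within the same whole second.
    with open(path, "wb") as f:
        f.write(b"console.log(2)")
    os.utime(path, (1_700_000_600, 1_700_000_600))
    lm_c, etag_c = validators(path)
```

In case 1 Last-Modified changes while the ETag stays the same, so only the ETag avoids a needless download. In case 2 Last-Modified stays the same while the ETag changes, so only the ETag detects the update.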
Let’s take a look at the browser caching process through the diagram
1. Browser first request
2. Second browser request
- Question 1
Q: If there is no Cache-Control or other strong-cache field in the response headers, what does the browser do?
A: If there is no strong-cache field but Last-Modified is present, the browser falls back to heuristic caching: it typically treats (Date - Last-Modified) * 10% as the max-age, where Date is the time the response was generated. The resulting cache time is therefore unpredictable. This usually goes unnoticed during development, but it causes problems in testing or production, where stale files may be served. The formula is also not the same in every browser: because no explicit cache policy was given, each browser vendor caches according to the policy it considers appropriate.
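The 10% rule in the answer above works out as follows. This is only a sketch of the commonly described heuristic: the function name `heuristic_freshness` and the `percent` parameter are made up here, and actual browsers may use different factors or caps.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header: str, last_modified_header: str,
                        percent: int = 10) -> timedelta:
    """Estimate a freshness lifetime as a percentage of the time since last change."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    # Guard against a Last-Modified that lies after Date.
    unchanged_for = max(date - last_modified, timedelta(0))
    return unchanged_for * percent / 100


lifetime = heuristic_freshness(
    "Thu, 11 Jan 2024 00:00:00 GMT",   # Date of the response
    "Mon, 01 Jan 2024 00:00:00 GMT",   # Last-Modified of the resource
)
```

A file last changed ten days before it was served would be reused for about one day without any request, which is why a freshly deployed file can appear not to update.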
- Question 2
Q: How do we deal with the entry page being cached, for example by the WeChat in-app browser?
A: Add Cache-Control: no-cache, or a max-age with a short lifetime, to the entry page. It must be set in the HTTP response headers, not in a meta tag of the HTML page: some browsers recognize the meta tag, while others do not.
- Question 3
Q: How do we solve the caching of Ajax GET requests in IE8 and IE9?
A: Have the back-end interface return Cache-Control: no-cache or max-age=0. Alternatively, the front end can append a random or time-stamp parameter to the URL, as jQuery does for Ajax requests with cache: false.
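The URL-parameter workaround amounts to making every GET URL unique. The sketch below shows the idea in Python for consistency with the other examples; `bust_cache` is a hypothetical helper, and the `_` parameter name mirrors the one jQuery appends when `cache: false` is set.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def bust_cache(url: str, now_ms=None) -> str:
    """Append a `_=<timestamp>` parameter so each GET has a unique URL."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    parts = urlsplit(url)
    # Drop any previous `_` parameter, keep everything else in order.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "_"]
    query.append(("_", str(now_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the cache key includes the query string, the browser cannot match the request to an earlier response. The trade-off is that nothing is ever reused, so correct response headers remain the cleaner fix.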
Summary & Practice
- For the front end, give the entry page no-cache or a short max-age, and give JS and CSS files a long max-age. Reference the JS and CSS files through file names that include a content hash generated by the build tool, so that the browser only requests files that have changed and uses the cache directly for everything else.
- For back-end interfaces, it is best to add no-cache, so that older browsers do not cache responses unexpectedly.
- Set an explicit cache policy and explicit cache headers for each kind of resource instead of relying on browser defaults, which differ from browser to browser.
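These recommendations can be sketched as two small helpers. The names `hashed_name` and `cache_control_for`, the one-year max-age, and the extension-based rule are all assumptions for illustration; in practice the hash usually comes from the build tool and the policy from the server or CDN configuration.

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR = 31536000  # seconds


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a content hash before the extension, e.g. app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    p = PurePosixPath(filename)
    return f"{p.stem}.{digest}{p.suffix}"


def cache_control_for(path: str) -> str:
    """Pick an explicit policy: long-lived for hashed assets, no-cache otherwise."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in {".js", ".css"}:
        # Safe only because the file name changes whenever the content does.
        return f"public, max-age={ONE_YEAR}"
    # Entry pages and API responses are revalidated on every use.
    return "no-cache"
```

Since a changed file gets a new name, the long max-age never serves stale code: the revalidated entry page simply points at the new URL.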