HTTP: The Definitive Guide study notes – caching


Cache introduction

Advantages of caching

  1. Reduce redundant data transfers and save traffic

  2. Ease network bottlenecks, so pages load quickly without needing more bandwidth

  3. Reduce demand on the origin server, so it responds faster and avoids overload

  4. Reduce distance delay: fetching from a nearby cache shortens transmission time

When to use a cache

  1. Redundant data transfers (GET requests)

  2. Bandwidth bottlenecks

  3. Flash crowds (sudden congestion)

  4. ...

Caching process

Common cache scenarios

  1. Cache hit: the client sends a request to the cache, and the cache sends its stored copy back to the client

  2. Cache miss: the cache has no copy matching the client's request, so it forwards the request to the origin server

  3. Cache revalidation hit: the cache has a copy but must check whether it is still fresh, so it sends a validation request to the origin server. If the server confirms the copy is still fresh, the cache sends the copy to the client
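The scenarios above can be sketched as a small decision function. This is a minimal illustration, not code from the book: `CachedCopy`, its `is_fresh` flag, and `origin_says_unchanged` are hypothetical names, and a real cache would derive freshness from response headers. The fourth outcome, a revalidation miss, is the case where the server reports that the object has changed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedCopy:
    body: bytes
    is_fresh: bool  # hypothetical flag; real caches compute this from headers


def classify(copy: Optional[CachedCopy], origin_says_unchanged: bool = True) -> str:
    """Name the scenario for one request, given what the cache holds."""
    if copy is None:
        return "miss"  # nothing stored: forward to the origin server
    if copy.is_fresh:
        return "hit"  # serve the stored copy directly
    # Stale copy: the outcome depends on the origin server's answer.
    return "revalidation hit" if origin_says_unchanged else "revalidation miss"
```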


Cache processing steps

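  1. Receiving: the cache reads the incoming request message from the network

  2. Parsing: the cache parses the message and extracts the URL and headers

  3. Lookup: the cache checks whether a local copy is available. If not, it fetches a copy from the origin server and stores it locally

  4. Freshness check: the cache checks whether the copy is fresh enough

  5. Response creation: the cache builds a response message from new headers and the cached body

  6. Sending: the cache sends the response to the client

  7. Logging: the cache records the transaction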
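The seven steps can be traced in a toy in-memory cache. This is only a sketch under stated assumptions: `TinyCache` and its `fetch` callback (which returns a body and a time-to-live in seconds) are invented for illustration, and an expired copy is simply refetched here instead of being revalidated with a conditional request.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TinyCache:
    """Toy cache; names and structure are illustrative only."""

    def __init__(self, fetch: Callable[[str], Tuple[bytes, float]]):
        self.fetch = fetch  # hypothetical origin fetcher: url -> (body, ttl seconds)
        self.store: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expiry time)
        self.log: List[str] = []

    def handle(self, raw_request: str, now: Optional[float] = None) -> bytes:
        now = time.time() if now is None else now
        # Steps 1-2: receive the request and parse out the method and URL.
        method, url, _version = raw_request.split("\r\n")[0].split(" ")
        # Step 3: look for a local copy.
        entry = self.store.get(url)
        # Step 4: check whether the copy is fresh enough.
        if entry is None:
            outcome = "miss"
        elif entry[1] <= now:
            outcome = "expired"
        else:
            outcome = "hit"
        if outcome != "hit":
            # Simplification: refetch instead of sending a conditional request.
            body, ttl = self.fetch(url)
            entry = (body, now + ttl)
            self.store[url] = entry
        # Step 5: build the response from new headers and the cached body.
        body = entry[0]
        head = f"HTTP/1.1 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
        response = head.encode("ascii") + body
        # Step 7: log the transaction (done before returning).
        self.log.append(f"{method} {url} {outcome}")
        # Step 6: "send" the response by returning it to the caller.
        return response
```

Passing `now` explicitly makes the behavior easy to check without waiting for real time to pass.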


Document expiration

Expires: an HTTP/1.0 header that gives an absolute expiration time
Cache-Control: max-age: an HTTP/1.1 header that defines the maximum age of the document, that is, the longest time it may legally be served as fresh (unit: seconds)
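As a rough sketch of how a cache might turn these two headers into one expiry time, the function below prefers max-age (added to the Date header) and falls back to Expires. The function names are made up for illustration, and the headers are assumed to arrive as a plain mapping with canonical capitalization.

```python
import re
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def expiry_time(headers: Mapping[str, str]) -> Optional[datetime]:
    """Absolute expiry time, preferring max-age over Expires."""
    match = re.search(r"max-age\s*=\s*(\d+)", headers.get("Cache-Control", ""))
    if match and "Date" in headers:
        # max-age is relative: it counts seconds from the response Date.
        generated = parsedate_to_datetime(headers["Date"])
        return generated + timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        # Expires is already an absolute time.
        return parsedate_to_datetime(headers["Expires"])
    return None  # no explicit expiration information


def is_fresh(headers: Mapping[str, str], now: datetime) -> bool:
    """True while the document has not yet reached its expiry time."""
    expires = expiry_time(headers)
    return expires is not None and now < expires
```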

Server revalidation

Just because a cached document has expired doesn't mean it actually differs from the copy on the origin server. It only means it is time to check: the cache asks the server whether the document has changed. Two request headers are used for this:
If-Modified-Since
If-None-Match (entity tags)


When the browser first receives a 200 response from the server, the cache records the value of the Last-Modified header.
During revalidation, the If-Modified-Since header is added to the request, with its value set to the stored Last-Modified value. It tells the server: if the object was modified after this time, my cached copy is stale, so send me the new version; otherwise my copy is still fresh and the body does not need to be sent again.
The server generally responds in one of three ways:

  1. Revalidation hit: the object has not been modified, and the server returns 304 Not Modified (a revalidation hit is faster than a cache miss; a revalidation miss takes almost as long as a cache miss)

  2. Revalidation miss: the object has been modified, and the server returns 200 OK with the new content

  3. Object deleted: the server returns 404 Not Found, and the cache deletes its copy as well
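The three responses can be sketched from the server's side as follows. This is an illustrative function, not a real server API: `revalidation_status` is a hypothetical name, `last_modified` is assumed to be a timezone-aware datetime, and `None` stands for an object that no longer exists.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def revalidation_status(last_modified: Optional[datetime], if_modified_since: str) -> int:
    """Status code a server might return for an If-Modified-Since request."""
    if last_modified is None:
        return 404  # object deleted: the cache should drop its copy too
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so ignore microseconds.
    if last_modified.replace(microsecond=0) > since:
        return 200  # revalidation miss: send the full new object
    return 304  # revalidation hit: the cached copy is still valid
```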


But there are some special cases where the last-modified date is not adequate:

  1. The document is rewritten periodically (for example, by a background process) but actually contains the same data; the modification date changes even though the content does not

  2. The document has been modified, but the change is unimportant, so caches have no need to reload it

  3. Some servers cannot determine the last modification time

  4. Some documents change at sub-second intervals; for these servers, a modification date with one-second granularity is not precise enough

In these cases, the solution is to label the document with a version identifier, the entity tag (ETag)

If-None-Match can carry multiple entity tags, which tells the server that the cache holds copies of the object corresponding to each of those tags
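A minimal sketch of how a server might check an If-None-Match value that lists several tags. The helper names are hypothetical, the comparison ignores the weak `W/` prefix, and splitting on commas is a simplification (it would break on a tag that itself contains a comma).

```python
from typing import List


def parse_etags(header_value: str) -> List[str]:
    """Split an If-None-Match value into entity tags, dropping any W/ prefix."""
    tags = []
    for part in header_value.split(","):
        tag = part.strip()
        if tag.startswith("W/"):
            tag = tag[2:]
        if tag:
            tags.append(tag)
    return tags


def none_match(header_value: str, current_etag: str) -> bool:
    """True when no listed tag matches, so the server should send the full object."""
    tags = parse_etags(header_value)
    if "*" in tags:
        return False  # "*" matches any existing version
    current = current_etag[2:] if current_etag.startswith("W/") else current_etag
    return current not in tags
```

When `none_match` returns False, the cache already holds the current version and a 304 Not Modified response is enough.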

Controlling cacheability

HTTP defines several ways for the server to specify how long a document may be cached. In order of priority:

  1. Cache-Control: public

  2. Cache-Control: no-store

  3. Cache-Control: no-cache

  4. Cache-Control: must-revalidate

  5. Cache-Control: max-age

  6. Expires: date

  7. No expiration information at all, leaving the browser or cache to decide for itself

Here, no-store forbids the cache from keeping a copy of the response: the cache forwards the response to the client and then deletes the object

A no-cache response can in fact be stored in the local cache, but it cannot be served to the client until it has been revalidated with the origin server
Cache-Control: max-age sets the freshness lifetime in seconds
Cache-Control: s-maxage=3600 applies only to shared caches
If max-age=0, the cached copy is never served as fresh, and the cache refreshes it on every access
must-revalidate tells the cache that it may not serve a stale copy without first revalidating it with the origin server
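To make the directives concrete, here is a small sketch that parses a Cache-Control value and answers two questions: may the response be stored, and for how long may it be served without revalidation. The function names are invented for illustration, and the logic covers only the directives mentioned in this article.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Turn 'public, max-age=60' into {'public': None, 'max-age': '60'}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if arg else None
    return directives


def may_store(value: str, shared: bool) -> bool:
    """no-store forbids storing; private forbids storing in shared caches."""
    d = parse_cache_control(value)
    if "no-store" in d:
        return False
    if shared and "private" in d:
        return False
    return True


def freshness_lifetime(value: str, shared: bool) -> Optional[int]:
    """Seconds the copy may be served without revalidation (None: not stated)."""
    d = parse_cache_control(value)
    if "no-store" in d or "no-cache" in d:
        return 0  # must go back to the origin server every time
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # s-maxage overrides max-age in shared caches
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None
```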
(Figure: a message that is allowed to be cached)
(Figure: a message that cannot be cached)

Hit rate

Cache hit rate: the proportion of all requests that are served from the cache
Byte hit rate: the proportion of all transferred bytes that are served from the cache
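The two ratios can differ a lot when object sizes vary, which a short calculation makes visible. This is a sketch over a hypothetical request log where each entry records whether the response came from the cache and how many bytes it carried.

```python
from typing import Iterable, Tuple


def hit_rates(requests: Iterable[Tuple[bool, int]]) -> Tuple[float, float]:
    """Return (cache hit rate, byte hit rate) for (from_cache, size) entries."""
    total = hits = total_bytes = hit_bytes = 0
    for from_cache, size in requests:
        total += 1
        total_bytes += size
        if from_cache:
            hits += 1
            hit_bytes += size
    hit_rate = hits / total if total else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate


# Two small hits and two large misses: half the requests, a tenth of the bytes.
log = [(True, 100), (True, 100), (False, 800), (False, 1000)]
print(hit_rates(log))
```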

Telling whether a response came from a cache

Compare the Date header of the response with the current time. If the Date value is earlier than the current time, the response came from a cache
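A sketch of that comparison, assuming the client clock is reasonably accurate. The `tolerance` parameter is an addition for illustration, not part of the original notes: it allows a few seconds for network latency and clock skew before a response is judged to be cached.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def looks_cached(date_header: str, now: datetime,
                 tolerance: timedelta = timedelta(seconds=5)) -> bool:
    """Guess whether a response is a cached copy from its Date header."""
    generated = parsedate_to_datetime(date_header)
    # A response generated well before "now" was probably served by a cache.
    return now - generated > tolerance
```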

Proxy and cache

Cache topologies

Private cache: the browser cache (in Chrome, enter chrome://cache/ in the address bar to view the local cache)
Shared proxy cache: a caching proxy server

Cache hierarchy

In practice, caches are usually arranged in a hierarchy. Requests that miss in a smaller cache are directed to a larger parent cache.
Smaller, cheaper caches are placed near the client. At higher levels, progressively larger and more powerful caches hold the documents shared by many users
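A toy model of such a hierarchy, where a miss in a small cache falls through to its parent and finally to the origin server. `LevelCache` and the `origin` callback are hypothetical names, and expiration is left out to keep the sketch short.

```python
from typing import Callable, Dict, Optional


class LevelCache:
    """One level in a cache hierarchy (illustrative only)."""

    def __init__(self, name: str, parent: Optional["LevelCache"] = None,
                 origin: Optional[Callable[[str], bytes]] = None):
        if parent is None and origin is None:
            raise ValueError("a top-level cache needs an origin fetcher")
        self.name = name
        self.parent = parent
        self.origin = origin
        self.store: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url in self.store:
            return self.store[url]  # hit at this level
        if self.parent is not None:
            body = self.parent.get(url)  # miss: ask the larger parent cache
        else:
            body = self.origin(url)  # top of the hierarchy: go to the origin
        self.store[url] = body
        return body
```

With two child caches sharing one parent, a document requested through both children is fetched from the origin only once.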


Cache meshes, content routing, and peering

A cache mesh is also called a content router: proxy caches talk to each other in more complex ways and make dynamic communication decisions, such as which parent cache to contact, or whether to bypass the caches completely and connect directly to the origin server
Its functions include:

  1. Dynamically choose between a parent cache and the origin server, based on the URL

  2. Dynamically select a specific parent cache, based on the URL

  3. Search for a cached copy in the local cache before going to the parent cache

  4. Allow other caches to access parts of its cached content, but do not allow unauthorized access to the cache

These more complex relationships let caches in different organizations act as peers and share their contents (sibling caches). HTTP itself does not support sibling caches, so additional protocols such as HTCP are used for this peer-to-peer style communication between caches
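The first three functions can be sketched as a routing decision. Everything here is hypothetical: the `ROUTES` table, the cache names, and the host-suffix matching rule are invented for illustration, since real content routers use their own configuration formats.

```python
from typing import Collection, List, Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical routing table: (host suffix, upstream); None means go to the origin.
ROUTES: List[Tuple[str, Optional[str]]] = [
    ("images.example.com", "parent-cache-a"),
    (".example.com", "parent-cache-b"),
    ("", None),  # default: bypass the caches entirely
]


def choose_upstream(url: str, local_store: Collection[str],
                    routes: List[Tuple[str, Optional[str]]] = ROUTES) -> str:
    """Decide where a request should go next."""
    if url in local_store:
        return "local"  # check the local cache before any parent
    host = urlsplit(url).hostname or ""
    for suffix, upstream in routes:
        if host.endswith(suffix):
            return upstream or "origin"  # first matching rule wins
    return "origin"
```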

Summary: header fields related to caching

Header field        Value      Description
Cache-Control       public     The response can be cached (default)
                    private    The response can be cached by the browser, but not by intermediate caches (such as a CDN)
                    no-store   Neither the browser nor intermediate caches may store the response
                    no-cache   The response can be cached, but must not be used without a freshness check
                    max-age    A number of seconds (a relative time). Date + max-age is the absolute expiration time
Expires             date       Absolute expiration time, from the HTTP/1.0 standard
Date                date       Response only. The time the response message was generated
Last-Modified       date       Response only. The time the document was last modified
If-Modified-Since   date       Request only, used for the freshness check. Asks whether the document has been modified since this time. If it has not, the cached copy is still valid and is served. If it has, the server returns the new document
If-None-Match       tag        Request only. Tells the server which versions the cache already holds
Pragma              no-cache   A header from the HTTP/1.0 era, equivalent in function to Cache-Control
