Do you really understand cache avalanche, cache breakdown and cache penetration?



Hello, everyone, I'm asong. Today I'd like to talk about some cache problems that come up often in interviews. Why write this article now? I was looking through the notes I put together while preparing for interviews and found that caching still takes up a large share of interview questions. I spent a long time memorizing this material, and because it was rote memorization, I have since forgotten most of it. So today I want to tidy this part up and keep a proper record. My ability is limited, so this article stays easy to follow and does not go into complicated cache usage scenarios. All right, let's start.

Cache application

Most of our everyday projects use a cache in one way or another, and there are many scenarios for it. A cache is an important component of a distributed system. It mainly addresses the performance of hot-data access under high concurrency and large data volumes by providing fast data access. Most of us can list scenarios like these, yet not everyone can say what caching really is. The basic idea is one we all know well: trading space for time. Caching is not as lofty as it may sound, even though it can improve system performance considerably. The same idea is used widely in operating systems and elsewhere. For example, the CPU cache holds data from main memory to bridge the gap between CPU speed and memory speed, and memory in turn caches data from disk because disk access is too slow. As another example, paging schemes add a translation lookaside buffer (TLB, the "fast table") on top of the page table to speed up translation from virtual addresses to physical addresses. We can think of the TLB as a special kind of cache.

That was a brief introduction to the basic idea of caching. Now back to business systems: so that users do not wait too long when they request data, we put a cache layer in front of the database.

Put simply, when we query a piece of data, we check the cache first. If the data is in the cache, we return it directly. If it is not, we query the database and then return the result. With this setup, a few problems can occur.
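The read path just described can be sketched in a few lines of Go. This is only an illustration: `Store`, `Get` and `ErrNotFound` are names made up for this example, both the cache and the database are plain in-memory maps standing in for Redis and MySQL, and writing the value back into the cache after a miss is an assumption that the text implies but does not spell out.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is returned when the key is in neither the cache nor the database.
var ErrNotFound = errors.New("not found")

// Store is a hypothetical stand-in for both Redis and MySQL: a plain map.
type Store map[string]string

// Get checks the cache first, then the database, and backfills the cache
// when the database has the value.
func Get(cache, db Store, key string) (string, error) {
	if v, ok := cache[key]; ok {
		return v, nil // cache hit: the database is not touched
	}
	v, ok := db[key]
	if !ok {
		return "", ErrNotFound
	}
	cache[key] = v // backfill so the next request is a cache hit
	return v, nil
}

func main() {
	cache := Store{}
	db := Store{"item:1": "keyboard"}

	v, err := Get(cache, db, "item:1")
	fmt.Println(v, err) // keyboard <nil>

	_, cached := cache["item:1"]
	fmt.Println(cached) // true
}
```

Every problem discussed below is a different way for requests to slip past the first `if` and land on the database.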

1. Cache avalanche

1.1 what is cache avalanche

Let's work through an example, using Mr. Ma's online shopping site. When we open its home page, we see pictures, recommended shops, and so on. These are hot data. Why do they load so fast? Because they are cached. Suppose all of these hot keys share the same expiration time, and Mr. Ma now runs a flash sale. Say the sale brings 8000 requests per second. With the cache in place, it would have absorbed about 6000 of those requests per second, but all the keys in the cache expire at that moment. Now all 8000 requests per second fall on the database, which cannot carry them. Alarms go off, and in a real incident the database may simply hang and stop responding. If there is no plan for handling the failure, the anxious DBA restarts the database, only to see it killed again immediately by new traffic. This is a cache avalanche. The cause described here is keys expiring at the same time; another possible cause is the cache service itself going down.


1.2 solutions

We can look at the solutions in three phases: before, during, and after the incident.

1.2.1 in advance

If the avalanche is caused by the cache service going down, deploy Redis in a highly available setup. Master-slave replication plus Sentinel, or Redis Cluster, keeps Redis from collapsing as a whole. If the avalanche is caused by a large number of keys expiring at the same time, add a random value to the expiration time of each key when writing it to Redis, so that keys do not expire together in bulk. Alternatively, set hot data to never expire and update the cache whenever the data is updated.
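Adding a random value to the expiration time is a one-liner. Here is a minimal sketch in Go, assuming a hypothetical helper `JitteredTTL` whose result would be passed as the expiration when writing the key to Redis. The 30 minute base and 5 minute jitter are made-up numbers.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// JitteredTTL returns base plus a random extra duration in [0, maxJitter),
// so keys written at the same moment do not all expire together.
func JitteredTTL(base, maxJitter time.Duration) time.Duration {
	if maxJitter <= 0 {
		return base // rand.Int63n panics on n <= 0, so skip the jitter
	}
	return base + time.Duration(rand.Int63n(int64(maxJitter)))
}

func main() {
	// Three keys cached at the same moment get three different lifetimes.
	for i := 0; i < 3; i++ {
		fmt.Println(JitteredTTL(30*time.Minute, 5*time.Minute))
	}
}
```

How wide the jitter window should be depends on how long the database needs to rebuild the expired keys; the wider the window, the flatter the load.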

1.2.2 in process

What if we did not plan for a cache avalanche and one actually happens in production? Then we need other measures to limit the damage. We can use an ehcache local cache together with hystrix rate limiting and degradation to keep MySQL from being killed.

The ehcache local cache is there so that, when the Redis cluster is completely unavailable, the local cache can hold out for a while.
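ehcache is a Java library, so the following is not ehcache itself but a sketch of the same idea in Go: keep a local copy of what the remote cache returned and serve it when the remote cache fails. `RemoteGetter`, `GetWithLocalFallback` and `ErrCacheDown` are hypothetical names, and the local cache is just a map with no size limit or expiration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCacheDown simulates the remote cache (Redis) being unavailable.
var ErrCacheDown = errors.New("remote cache unavailable")

// RemoteGetter is a hypothetical stand-in for a Redis client call.
type RemoteGetter func(key string) (string, error)

// GetWithLocalFallback asks the remote cache first and remembers the answer
// locally. If the remote call fails, it serves the local copy when it has one.
func GetWithLocalFallback(remote RemoteGetter, local map[string]string, key string) (string, bool) {
	v, err := remote(key)
	if err == nil {
		local[key] = v // keep a local copy for bad times
		return v, true
	}
	v, ok := local[key]
	return v, ok
}

func main() {
	local := map[string]string{}
	healthy := func(key string) (string, error) { return "price=42", nil }
	down := func(key string) (string, error) { return "", ErrCacheDown }

	v, ok := GetWithLocalFallback(healthy, local, "item:1")
	fmt.Println(v, ok) // price=42 true

	// Redis goes down: the local copy still answers.
	v, ok = GetWithLocalFallback(down, local, "item:1")
	fmt.Println(v, ok) // price=42 true

	// A key that was never seen locally cannot be served.
	_, ok = GetWithLocalFallback(down, local, "item:2")
	fmt.Println(ok) // false
}
```

A real local cache would also bound its size and expire entries, since the local copy can be stale.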

hystrix handles rate limiting and degradation. For example, suppose 5000 requests arrive in one second and we configure the component to let only 2000 requests through per second. The other 3000 requests take the rate-limiting path and call a degraded handler that we write ourselves, for example one that returns default values. This protects MySQL, the last line of defense, from being killed by a flood of requests.
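The 5000 / 2000 / 3000 arithmetic above can be reproduced with a very small limiter. This is not hystrix (which is a Java library) but a rough Go sketch of the same behaviour, using a fixed one-second window. `Limiter`, `NewLimiter`, `Allow` and `Query` are hypothetical names, and the time is passed in as Unix seconds to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Limiter is a fixed-window counter: at most limit requests per second.
type Limiter struct {
	mu     sync.Mutex
	limit  int
	window int64 // Unix second of the current window
	count  int   // requests already let through in this window
}

func NewLimiter(limit int) *Limiter { return &Limiter{limit: limit} }

// Allow reports whether a request arriving at nowSec may pass.
func (l *Limiter) Allow(nowSec int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if nowSec != l.window {
		l.window, l.count = nowSec, 0 // a new second starts a new window
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

// Query runs the real lookup when allowed and degrades to a default otherwise.
func Query(l *Limiter, nowSec int64, lookup func() string, fallback string) string {
	if l.Allow(nowSec) {
		return lookup()
	}
	return fallback
}

func main() {
	l := NewLimiter(2000)
	now := time.Now().Unix()
	passed, degraded := 0, 0
	for i := 0; i < 5000; i++ { // 5000 requests in the same second
		if Query(l, now, func() string { return "from MySQL" }, "default") == "default" {
			degraded++
		} else {
			passed++
		}
	}
	fmt.Println(passed, degraded) // 2000 3000
}
```

A fixed window is the simplest choice; token buckets or sliding windows smooth out bursts at window boundaries, at the cost of a little more code.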

1.2.3 afterwards

If the cache service has gone down, Redis persistence helps here. With RDB + AOF enabled, Redis loads its data from disk automatically on restart and recovers the cached data quickly.

To sum up: beforehand, make Redis highly available and stagger the expiration times; during the incident, rely on a local cache plus rate limiting and degradation; afterwards, recover the cache quickly from the persisted data.

2. Cache penetration

2.1 what is cache penetration

Normally the data a user asks for exists. In the abnormal case the data exists in neither the cache nor the database, yet the user keeps sending requests, so every request reaches the database. Such a user is likely an attacker, and the attack puts excessive pressure on the database and can bring it down.


2.2 solutions

2.2.1 add parameter verification

When I first started working, my boss told me that a back-end engineer should never trust what comes from the front end, so data must be validated on the back end. We can add validation at the interface layer and reject invalid requests right away, without running any further operations.
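As a small illustration of validating at the interface layer, here is a sketch in Go that rejects an id before any cache or database lookup happens. `ParseID` and `ErrInvalidID` are hypothetical names, and the rule that a valid id is a positive integer is an assumption for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// ErrInvalidID is returned for ids that can never exist in the database.
var ErrInvalidID = errors.New("invalid id")

// ParseID converts the raw request parameter into a positive integer id.
// Anything else is rejected before the cache or the database is touched.
func ParseID(raw string) (int64, error) {
	id, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || id <= 0 {
		return 0, ErrInvalidID
	}
	return id, nil
}

func main() {
	for _, raw := range []string{"42", "-1", "abc"} {
		id, err := ParseID(raw)
		fmt.Println(raw, id, err)
	}
}
```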

2.2.2 cache null value

As described above, penetration happens because the cache holds no key for these nonexistent records, so every query goes to the database.

So we can store a null value under these keys in the cache. When a later request asks for such a key, the cache returns null directly and the request never goes to the database. Do not forget to set an expiration time on these entries.
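Here is a minimal sketch of caching null values, in Go. The cache and the database are in-memory maps, `Service`, `NewService`, `entry` and the two TTL constants are made up for the example, and time is passed in as Unix seconds so that the behaviour is easy to follow. Giving the null marker a shorter lifetime than a real value is a common choice, but it is an assumption here, not something the text prescribes.

```go
package main

import "fmt"

const (
	hitTTL  = 600 // seconds to cache a row that exists (assumed value)
	nullTTL = 60  // seconds to cache a "no such row" marker (assumed value)
)

type entry struct {
	value     string
	found     bool  // false means the database had no row for this key
	expiresAt int64 // Unix seconds
}

// Service puts a cache in front of a database; both are in-memory stand-ins.
type Service struct {
	cache  map[string]entry
	db     map[string]string
	DBHits int // how many times the database was queried
}

func NewService(db map[string]string) *Service {
	return &Service{cache: map[string]entry{}, db: db}
}

// Get returns the value and whether it exists. Misses are cached as well,
// so repeated requests for a nonexistent key stop at the cache.
func (s *Service) Get(key string, nowSec int64) (string, bool) {
	if e, ok := s.cache[key]; ok && nowSec < e.expiresAt {
		return e.value, e.found
	}
	s.DBHits++
	v, found := s.db[key]
	ttl := int64(hitTTL)
	if !found {
		ttl = nullTTL
	}
	s.cache[key] = entry{value: v, found: found, expiresAt: nowSec + ttl}
	return v, found
}

func main() {
	s := NewService(map[string]string{"user:1": "asong"})
	for i := 0; i < 3; i++ {
		s.Get("user:999", 0) // the same nonexistent key, requested repeatedly
	}
	fmt.Println(s.DBHits) // 1
}
```

The expiration matters because the row may be inserted later; once the null marker expires, the next request sees the real data.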

2.2.3 bloom filter

A more advanced Redis technique is the Bloom filter. It works somewhat like a hash set and tells us whether an element (a key) is in a set, which also prevents cache penetration. The principle is simple: an efficient data structure and algorithm quickly determine whether a key exists in the database. If the filter says the key does not exist, return right away. If it says the key may exist, query the DB, refresh the KV cache, and then return. Note that a Bloom filter can report false positives but never false negatives, so a "may exist" answer still has to be checked against the DB.
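To make the data structure less mysterious, here is a small in-memory Bloom filter in Go. It is a sketch for illustration only, not the Redis implementation: the type `Bloom` and its methods are made up, the bit positions are derived from two FNV hashes with double hashing, and the sizes (1024 bits, 3 hash functions) are arbitrary.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a fixed-size bit array probed by k hash functions.
type Bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of hash functions
}

func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes returns two base hashes; position i is (h1 + i*h2) mod m.
func (b *Bloom) hashes(key string) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write([]byte(key))
	h2 := fnv.New64()
	h2.Write([]byte(key))
	return h1.Sum64(), h2.Sum64() | 1 // force h2 odd so it is never zero
}

// Add sets the k bits that belong to key.
func (b *Bloom) Add(key string) {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= 1 << (pos % 64)
	}
}

// MightContain returns false only if key was definitely never added.
// A true result can be a false positive.
func (b *Bloom) MightContain(key string) bool {
	h1, h2 := b.hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 3)
	f.Add("user:1") // load every existing key at startup

	fmt.Println(f.MightContain("user:1")) // true: go on to cache and DB
	// For a key that was never added the answer is usually false,
	// and then the request can be rejected without touching the DB.
	fmt.Println(f.MightContain("user:999"))
}
```

The filter has to be kept in sync with the database: every newly inserted key must also be added to the filter, otherwise valid requests would be rejected.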

Three methods are introduced above. Which method is the best? Let’s analyze:

The first method, parameter validation, should always be in place, but it only filters out obviously invalid values, such as a negative id. If the request carries a well-formed id, parameter validation does not help.

With the second method, a malicious attack can bring a huge number of keys that do not exist, so caching null values is not a good fit. For data like this, with an abnormally large number of keys and a low repeat rate, there is no need to cache anything. We use the third method and filter the requests directly.

For nonexistent keys that are limited in number and requested repeatedly, the second method works well.

3. Cache breakdown

3.1 what is cache breakdown

In a high-concurrency system, a large number of requests may query the same key at the same time. If that key happens to expire at that moment, all of those requests hit the database. This is called cache breakdown.

Cache breakdown looks a bit like cache avalanche, but the two differ. An avalanche is many keys expiring across the board and overwhelming the DB. Breakdown is about a single key that is very hot and constantly carries heavy concurrent traffic. When that key expires, the sustained traffic breaks through the cache and goes straight to the database, like punching a hole in an otherwise intact bucket.

The problem with cache breakdown is that the number of database requests spikes at that moment and the pressure on the database rises sharply.

3.2 solutions

3.2.1 no expiration

The simple, blunt approach: let the hot data never expire and refresh it on a schedule. Whether this is appropriate depends on the scenario. It works, for example, for the home page of a shopping site like the one in the avalanche example.

3.2.2 mutex

To avoid cache breakdown, we can have the first request take a mutex before querying the database, so that the other requests block until the lock is released. When the waiting threads then get in, they find that the value is already cached and read it from the cache, which protects the database. Because other threads are blocked, system throughput drops, so whether to use this approach depends on the actual business.
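The mutex approach can be sketched in Go with a lock and a second cache check after the lock is acquired. Everything here is an assumption made for illustration: `Loader`, `NewLoader` and `hammer` are made-up names, the cache is an in-process map, and a single process-wide mutex is used. With several application instances the lock would have to live somewhere shared, for example in Redis.

```go
package main

import (
	"fmt"
	"sync"
)

// Loader guards the database with a mutex so that only one request
// rebuilds a missing cache entry while the others wait.
type Loader struct {
	cacheMu sync.RWMutex
	cache   map[string]string
	loadMu  sync.Mutex
	dbHits  int                     // protected by loadMu
	query   func(key string) string // hypothetical database call
}

func NewLoader(query func(key string) string) *Loader {
	return &Loader{cache: map[string]string{}, query: query}
}

func (l *Loader) read(key string) (string, bool) {
	l.cacheMu.RLock()
	defer l.cacheMu.RUnlock()
	v, ok := l.cache[key]
	return v, ok
}

// Get returns the cached value, loading it from the database at most once.
func (l *Loader) Get(key string) string {
	if v, ok := l.read(key); ok {
		return v
	}
	l.loadMu.Lock() // everyone who missed queues up here
	defer l.loadMu.Unlock()
	if v, ok := l.read(key); ok {
		return v // someone else loaded it while we were waiting
	}
	l.dbHits++
	v := l.query(key)
	l.cacheMu.Lock()
	l.cache[key] = v
	l.cacheMu.Unlock()
	return v
}

// hammer sends n concurrent requests for the same key and waits for them.
func hammer(l *Loader, key string, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.Get(key)
		}()
	}
	wg.Wait()
}

func main() {
	l := NewLoader(func(key string) string { return "value of " + key })
	hammer(l, "hot-key", 100)
	fmt.Println(l.dbHits) // 1
}
```

The second `read` after taking the lock is what keeps the waiting requests away from the database. The single `loadMu` also serializes loads of unrelated keys, which is the throughput cost mentioned above; a lock per key would reduce it.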


Well, that's the end of the analysis. This has been a brief introduction to cache avalanche, breakdown, and penetration in Redis. The three look similar but differ in important ways. They are almost guaranteed to come up in an interview once caching is mentioned, so do not mix them up. Avalanche, penetration, and breakdown are the biggest problems with caching: they may never happen, but once they do, they are fatal.

Again, these three topics really matter, because we have to take them into account when designing a project, so we need to understand the reasons behind them.

Finally, a preview: the next article is "routine of cache update". Coming soon.

At the end, a small bonus. I have recently been reading the book "microservice architecture design pattern", which is very good, and I have a PDF of it. If you need it, follow the official account DreamWorks [Golang] and reply [micro service] in the background to get it.

I also translated the gin documentation into Chinese and maintain it regularly. If you need it, reply [gin] in the background to download it.

I'm asong, an ordinary programmer. Let's grow stronger together. I have set up a golang discussion group. If you would like to join, add me on vx and I will pull you into the group. Thanks for following, and see you next time~~~


