The cache-aside caching pattern

Time:2020-1-21

Preface

This article looks at the cache-aside pattern for caching.

Cache Aside

There are two main points:

  • On a read, the application tries the cache first. On a miss, it reads the data from the database and, once that succeeds, puts it into the cache.

  • On an update, the application writes the database first and then invalidates the cache entry. Why not update the cache after writing the database? Mainly because two concurrent writes can apply their cache updates in a different order than their database writes, which leaves dirty data in the cache.

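public V read(K key) {
  // Try the cache first.
  V result = cache.getIfPresent(key);
  if (result == null) {
    // Cache miss: load from the database and populate the cache.
    result = readFromDatabase(key);
    if (result != null) { // Guava and Caffeine caches reject null values
      cache.put(key, result);
    }
  }

  return result;
}

public void write(K key, V value) {
  // Write the database first, then invalidate (not update) the cache entry.
  writeToDatabase(key, value);
  cache.invalidate(key);
}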
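To make the concurrent-write argument concrete, here is a minimal sketch that replays one unlucky interleaving of two writers by hand, using plain maps in place of a real database and cache. The class and method names (UpdateCacheRace, replayUpdate, replayInvalidate) are made up for illustration and do not come from any library.

```java
import java.util.HashMap;
import java.util.Map;

public class UpdateCacheRace {
    // Both writers "write the database, then update the cache",
    // but writer A's cache update arrives after writer B's.
    static String[] replayUpdate() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();

        db.put("k", "A");     // writer A: database write
        db.put("k", "B");     // writer B: database write (final value)
        cache.put("k", "B");  // writer B: cache update
        cache.put("k", "A");  // writer A: late cache update overwrites B

        return new String[] { db.get("k"), cache.get("k") };
    }

    // Same interleaving, but each writer invalidates instead of updating.
    static String[] replayInvalidate() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        cache.put("k", "old"); // value cached before either write

        db.put("k", "A");     // writer A: database write
        db.put("k", "B");     // writer B: database write (final value)
        cache.remove("k");    // writer B: invalidate
        cache.remove("k");    // writer A: late invalidate, harmless

        return new String[] { db.get("k"), cache.get("k") };
    }

    public static void main(String[] args) {
        String[] u = replayUpdate();
        System.out.println("update:     db=" + u[0] + " cache=" + u[1]);
        String[] i = replayInvalidate();
        System.out.println("invalidate: db=" + i[0] + " cache=" + i[1]);
    }
}
```

With updates, the cache ends up holding A while the database holds B. With invalidation, the order of the two invalidations does not matter: the entry is simply gone and the next read reloads B.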

Dirty data

Cache-aside can still produce dirty data. Suppose a read misses the cache and fetches the value from the database. Before it puts that value into the cache, a write arrives, updates the database, and invalidates the cache. The read then puts the old value into the cache, which is now dirty.

In theory this can happen, but in practice the probability is likely very low. It requires a cache miss on the read to coincide with a concurrent write, and the read must query the database before the write commits yet populate the cache after the write has invalidated it. Since a database write is usually much slower than a read (and typically has to take locks), these conditions rarely all line up.

Maven

        <!-- https://mvnrepository.com/artifact/com.github.ben-manes.caffeine/caffeine -->
        <dependency>
            <groupId>com.github.ben-manes.caffeine</groupId>
            <artifactId>caffeine</artifactId>
            <version>2.5.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>22.0</version>
        </dependency>

Reproducing the problem in code

Here we use code to reproduce the dirty-data scenario, first with Guava's LoadingCache.

  • A read comes in and misses the cache, which triggers a load from the data source. The load has not returned yet.

  • A write comes in, updates the data source, and invalidates the cache.

  • The load returns the old value it read, and that dirty data is stored in the cache.

@Test
    public void testCacheDirty() throws InterruptedException, ExecutionException {
        AtomicReference<Integer> db = new AtomicReference<>(1);

        LoadingCache<String, Integer> cache = CacheBuilder.newBuilder()
                .build(
                new CacheLoader<String, Integer>() {
                    public Integer load(String key) throws InterruptedException {
                        LOGGER.info("loading reading from db ...");
                        Integer v = db.get();
                        LOGGER.info("loading read from db get:{}",v);
                        Thread.sleep(1000L); // delay the return by 1 second so the write and invalidate happen first, leaving a dirty cache
                        LOGGER.info("loading Read from db return : {}",v);
                        return v;
                    }
                }
        );

        Thread t2 = new Thread(() -> {
            try {
                Thread.sleep(500L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            LOGGER.info("Writing to db ...");
            db.set(2);
            LOGGER.info("Wrote to db");
            cache.invalidate("k");
            LOGGER.info("Invalidated cached");
        });

        t2.start();

        // Trigger cache loading here, before t2 invalidates the key.
        // The sleep inside load() makes the load return only after the invalidate,
        // so the value that ends up cached is stale.
        LOGGER.info("fire loading cache");
        LOGGER.info("get from cache: {}",cache.get("k"));

        t2.join();

        for(int i=0;i<3;i++){
            LOGGER.info("get from cache: {}",cache.get("k"));
        }
    }

Output

15:54:05.751 [main] INFO com.example.demo.CacheTest - fire loading cache
15:54:05.772 [main] INFO com.example.demo.CacheTest - loading reading from db ...
15:54:05.772 [main] INFO com.example.demo.CacheTest - loading read from db get:1
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Writing to db ...
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Wrote to db
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Invalidated cached
15:54:06.778 [main] INFO com.example.demo.CacheTest - loading Read from db return : 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1

Using Caffeine

@Test
    public void testCacheDirty() throws InterruptedException, ExecutionException {
        AtomicReference<Integer> db = new AtomicReference<>(1);

        com.github.benmanes.caffeine.cache.LoadingCache<String, Integer> cache = Caffeine.newBuilder()
                .build(key -> {
                    LOGGER.info("loading reading from db ...");
                    Integer v = db.get();
                    LOGGER.info("loading read from db get:{}",v);
                    Thread.sleep(1000L); // delay the return by 1 second so the write happens while the load is still in flight
                    LOGGER.info("loading Read from db return : {}",v);
                    return v;
                });

        Thread t2 = new Thread(() -> {
            try {
                Thread.sleep(500L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            LOGGER.info("Writing to db ...");
            db.set(2);
            LOGGER.info("Wrote to db");
            cache.invalidate("k");
            LOGGER.info("Invalidated cached");
        });

        t2.start();

        // Trigger cache loading here, before t2 invalidates the key.
        // The sleep inside the loader keeps the load in flight while t2 writes and invalidates.
        // With Guava this left stale data in the cache.
        LOGGER.info("fire loading cache");
        LOGGER.info("get from cache: {}",cache.get("k"));

        t2.join();

        for(int i=0;i<3;i++){
            LOGGER.info("get from cache: {}",cache.get("k"));
        }
    }

Output

16:05:10.141 [main] INFO com.example.demo.CacheTest - fire loading cache
16:05:10.153 [main] INFO com.example.demo.CacheTest - loading reading from db ...
16:05:10.153 [main] INFO com.example.demo.CacheTest - loading read from db get:1
16:05:10.634 [Thread-1] INFO com.example.demo.CacheTest - Writing to db ...
16:05:10.635 [Thread-1] INFO com.example.demo.CacheTest - Wrote to db
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading Read from db return : 1
16:05:11.172 [main] INFO com.example.demo.CacheTest - get from cache: 1
16:05:11.172 [Thread-1] INFO com.example.demo.CacheTest - Invalidated cached
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading reading from db ...
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading read from db get:2
16:05:12.177 [main] INFO com.example.demo.CacheTest - loading Read from db return : 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2

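As the log shows, Caffeine behaves differently here: invalidate does not return until the in-flight load has finished ("Invalidated cached" is logged only after the load returns), and it then removes the stale entry that was just loaded. The next get therefore triggers a fresh load and reads 2, so the dirty data is cleared.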
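One plausible explanation for the log above, offered as an assumption rather than a statement about Caffeine's internals, is that the loader runs inside a per-key compute on a ConcurrentHashMap, so a removal of the same key has to wait for it. The sketch below shows that behaviour with a bare ConcurrentHashMap from the JDK. The class name BlockingRemoveSketch and the event labels are made up for illustration. The Javadoc only says other updates "may be blocked" during a computation, so treat the ordering as typical rather than guaranteed.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class BlockingRemoveSketch {
    // Runs one slow load and one concurrent remove on the same key,
    // and returns the observed events in order.
    static List<String> run() {
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch loadStarted = new CountDownLatch(1);

        Thread remover = new Thread(() -> {
            try {
                loadStarted.await(); // wait until the load is in flight
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("remove called");
            cache.remove("k"); // blocks while "k" is being computed
            events.add("remove returned");
        });
        remover.start();

        // The "load": holds the key's bin lock until the function returns.
        cache.computeIfAbsent("k", key -> {
            events.add("load started");
            loadStarted.countDown();
            pause(300);
            events.add("load finished");
            return 1; // the stale value read before the write
        });

        try {
            remover.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        events.add("present=" + cache.containsKey("k"));
        return events;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The removal returns only after the load has finished and then deletes the freshly stored stale value, which matches the ordering of "loading Read from db return" and "Invalidated cached" in the Caffeine log.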

References

  • Cache update routines

  • Caffeine: Java 8 high performance cache library package