Handwriting a cache framework from scratch (12): the random nature of redis expire and its implementation

Time: 2020-10-23

preface

Earlier articles in this series:

- Java handwriting redis from scratch: how to avoid losing in-memory data on restart?
- Java handwriting redis from scratch (5): another way to implement the redis expiration policy
- Java handwriting redis from scratch (7): the LRU cache eviction strategy explained
- Java handwriting redis from scratch (8): performance optimization of the plain LRU eviction algorithm

In an earlier article we implemented an expiration feature similar to the one in redis. One problem remains unsolved, however: the traversal does not sample keys randomly, so every sweep starts from the head of the map, which can leave many keys in a "starvation" state.

For a refresher, see:

- Java handwriting redis from scratch (5): another way to implement the redis expiration policy

In this section, let's implement a random version of expiration, and take a closer look at the subtleties of redis along the way.

Review of previous implementations

Before we start our new journey, let’s review the original implementation.

Principle of expire

The idea behind expiration is fairly simple: we can schedule a fixed-rate task, for example one sweep per second, to clear expired entries.

Storage of expired information

/**
 * Expired map: trades space for time.
 *
 * @since 0.0.3
 */
private final Map<K, Long> expireMap = new HashMap<>();

@Override
public void expire(K key, long expireAt) {
    expireMap.put(key, expireAt);
}

We define a map whose key is the cache key to be expired and whose value is the absolute expiration timestamp.
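As a tiny standalone illustration of this space-for-time idea (the class and method names here are made up for the demo, not part of the project), storing the timestamps separately makes the expiry check a single map lookup:

```java
import java.util.HashMap;
import java.util.Map;

public class ExpireMapDemo {

    // key -> absolute expire timestamp (space for time)
    private final Map<String, Long> expireMap = new HashMap<>();

    public void expire(String key, long expireAt) {
        expireMap.put(key, expireAt);
    }

    // A key is expired once "now" reaches its stored timestamp.
    public boolean isExpired(String key, long now) {
        Long expireAt = expireMap.get(key);
        return expireAt != null && now >= expireAt;
    }
}
```

A key with no entry in expireMap never expires, which matches redis's behavior for keys without a TTL.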

Polling cleanup

We fix the sweep interval at 100ms, clearing at most 100 entries per sweep.

/**
 * Maximum number of entries cleared per sweep.
 * @since 0.0.3
 */
private static final int LIMIT = 100;

/**
 * The cache implementation.
 * @since 0.0.3
 */
private final ICache<K,V> cache;

/**
 * Scheduled executor for the cleanup task.
 * @since 0.0.3
 */
private static final ScheduledExecutorService EXECUTOR_SERVICE = Executors.newSingleThreadScheduledExecutor();

public CacheExpire(ICache<K, V> cache) {
    this.cache = cache;
    this.init();
}

/**
 * Initialize the scheduled task.
 * @since 0.0.3
 */
private void init() {
    EXECUTOR_SERVICE.scheduleAtFixedRate(new ExpireThread(), 100, 100, TimeUnit.MILLISECONDS);
}

A single-threaded executor is defined to perform the cleanup task.

Clear task

The task is simple: it traverses the expired-key map, checks each entry's timestamp, and clears the entry if it has expired.

To avoid a single run taking too long, at most 100 entries are processed per sweep.

/**
 * The periodically executed cleanup task.
 * @since 0.0.3
 */
private class ExpireThread implements Runnable {
    @Override
    public void run() {
        //1. Skip if there is nothing to expire
        if(MapUtil.isEmpty(expireMap)) {
            return;
        }
        //2. Process at most LIMIT keys.
        // Iterate over a copy: expireKey() removes entries from expireMap,
        // which would otherwise throw ConcurrentModificationException.
        int count = 0;
        for(Map.Entry<K, Long> entry : new ArrayList<>(expireMap.entrySet())) {
            if(count >= LIMIT) {
                return;
            }
            expireKey(entry);
            count++;
        }
    }
}

/**
 * Perform the expiration check for one entry.
 * @param entry the entry to check
 * @since 0.0.3
 */
private void expireKey(Map.Entry<K, Long> entry) {
    final K key = entry.getKey();
    final Long expireAt = entry.getValue();
    //Logical processing of deletion
    long currentTime = System.currentTimeMillis();
    if(currentTime >= expireAt) {
        expireMap.remove(key);
        //Then remove the cache, which can be compensated by lazy deletion
        cache.remove(key);
    }
}
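The "lazy deletion" compensation mentioned in the comment above means that expiry is also checked on every read, so a key the sweeper missed still disappears the moment it is accessed. A minimal sketch (the class and method names are illustrative, not the project's API):

```java
import java.util.HashMap;
import java.util.Map;

public class LazyExpireDemo {

    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> expireMap = new HashMap<>();

    public void put(String key, String value, long expireAt) {
        data.put(key, value);
        expireMap.put(key, expireAt);
    }

    // On read, an expired key is removed and treated as if it never existed.
    public String get(String key, long now) {
        Long expireAt = expireMap.get(key);
        if (expireAt != null && now >= expireAt) {
            expireMap.remove(key);
            data.remove(key);
            return null;
        }
        return data.get(key);
    }
}
```

The periodic sweep and lazy deletion together mirror redis's two-pronged expiry: the sweep bounds memory growth, and the read path guarantees correctness.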

Redis's periodic task

Process

To see what is wrong with our approach, compare it with redis's periodic cleanup process.

Redis maintains an internal periodic task, which by default runs 10 times per second (controlled by the hz configuration).

The periodic task deletes expired keys with an adaptive algorithm: the reclaim pace adjusts to the proportion of expired keys. The process is as follows.

(Figure: flow of redis's periodic expired-key cleanup task)

Process description

1) The periodic task randomly samples 20 keys from each database's expire dictionary and removes any that turn out to be expired.

2) If more than 25% of the sampled keys were expired, the reclaim logic loops until the ratio drops below 25% or the run times out; the timeout in slow mode is 25 ms.

3) If the reclaim logic timed out, redis runs it again in fast mode before its next internal event. The fast-mode timeout is 1 ms, and fast mode runs at most once every 2 seconds.

4) The deletion logic of the two modes is identical; only the timeouts differ.
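The steps above can be sketched as a small standalone simulation (a rough approximation of redis's adaptive cycle, with made-up names, and sampling with replacement for brevity):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class AdaptiveExpireSketch {

    static final int SAMPLE = 20;        // keys sampled per round
    static final double RATIO = 0.25;    // keep looping while expired ratio exceeds this

    /**
     * Runs sampling rounds until the expired ratio drops to 25% or below,
     * or the time budget is used up; returns the number of keys removed.
     */
    static int cycle(Map<String, Long> expireMap, long now, long budgetMillis) {
        long deadline = System.currentTimeMillis() + budgetMillis;
        int removed = 0;
        while (true) {
            List<String> keys = new ArrayList<>(expireMap.keySet());
            if (keys.isEmpty()) {
                return removed;
            }
            int expired = 0;
            int sample = Math.min(SAMPLE, keys.size());
            for (int i = 0; i < sample; i++) {
                String key = keys.get(ThreadLocalRandom.current().nextInt(keys.size()));
                Long at = expireMap.get(key);
                if (at != null && now >= at) {
                    expireMap.remove(key);
                    expired++;
                    removed++;
                }
            }
            // Stop when the sampled expired ratio is low enough or time is up.
            if ((double) expired / sample <= RATIO) {
                return removed;
            }
            if (System.currentTimeMillis() >= deadline) {
                return removed;
            }
        }
    }
}
```

A caller in fast mode would pass a 1 ms budget, slow mode a 25 ms budget; the loop body is identical, exactly as step 4 describes.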

PS: the fast/slow mode design here is quite clever: the task's timeout adapts to the proportion of expired entries.

The randomness also matters: it cleans up expired entries fairly, instead of always traversing from the head and never reaching the data at the tail.

Next, we mainly implement the random-sampling feature.

Converting the keys to a collection via Map#keySet

Implementation idea

Keep the original expireMap unchanged, convert its keys to a collection, and pick from it at random.

This is also one of the most popular answers on the Internet.

Java code implementation

Basic attributes

public class CacheExpireRandom<K,V> implements ICacheExpire<K,V> {

    private static final Log log = LogFactory.getLog(CacheExpireRandom.class);

    /**
     * Maximum number of entries cleared per run.
     * @since 0.0.16
     */
    private static final int COUNT_LIMIT = 100;

    /**
     * Expired map: trades space for time.
     *
     * @since 0.0.16
     */
    private final Map<K, Long> expireMap = new HashMap<>();

    /**
     * The cache implementation.
     * @since 0.0.16
     */
    private final ICache<K,V> cache;

    /**
     * Whether fast mode is enabled.
     * @since 0.0.16
     */
    private volatile boolean fastMode = false;

    /**
     * Scheduled executor for the cleanup task.
     * @since 0.0.16
     */
    private static final ScheduledExecutorService EXECUTOR_SERVICE = Executors.newSingleThreadScheduledExecutor();

    public CacheExpireRandom(ICache<K, V> cache) {
        this.cache = cache;
        this.init();
    }

    /**
     * Initialize the scheduled task.
     * @since 0.0.16
     */
    private void init() {
        EXECUTOR_SERVICE.scheduleAtFixedRate(new ExpireThreadRandom(), 10, 10, TimeUnit.SECONDS);
    }

}

Timed tasks

Here we stay consistent with redis and support fastMode.

The logic of fast mode and slow mode is exactly the same; only the timeout differs.

I adjusted the timeout values based on my own understanding; the overall flow is unchanged.

/**
 * The periodically executed cleanup task.
 * @since 0.0.16
 */
private class ExpireThreadRandom implements Runnable {
    @Override
    public void run() {
        //1. Skip if there is nothing to expire
        if(MapUtil.isEmpty(expireMap)) {
            log.info("expireMap is empty, skipping this run.");
            return;
        }
        //2. Fast mode, if enabled
        if(fastMode) {
            expireKeys(10L);
        }
        //3. Slow mode
        expireKeys(100L);
    }
}

Core implementation of expired information

When the expiration logic runs, we first record a deadline; once it is exceeded, execution is interrupted immediately.

fastMode is reset to false by default and set to true whenever a run hits its time limit.

/**
 * Expire keys within the given time limit.
 * @param timeoutMills time limit in milliseconds
 * @since 0.0.16
 */
private void expireKeys(final long timeoutMills) {
    //Deadline for this run
    final long timeLimit = System.currentTimeMillis() + timeoutMills;
    //Reset fastMode
    this.fastMode = false;
    //2. Process keys
    int count = 0;
    while (true) {
        //2.1 Stop conditions
        if(count >= COUNT_LIMIT) {
            log.info("Expiration reached the per-run limit of {} keys, finishing this run.", COUNT_LIMIT);
            return;
        }
        if(System.currentTimeMillis() >= timeLimit) {
            this.fastMode = true;
            log.info("Expiration hit the time limit; interrupting this run and setting fastMode = true.");
            return;
        }
        //2.2 Expire one random key
        K key = getRandomKey();
        Long expireAt = expireMap.get(key);
        boolean expireFlag = expireKey(key, expireAt);
        log.debug("key: {} expiration result: {}", key, expireFlag);
        //2.3 Update the counter
        count++;
    }
}
Getting a random key

/**
 * Randomly pick one key.
 * @return a random key
 * @since 0.0.16
 */
private K getRandomKey() {
    Random random = ThreadLocalRandom.current();
    Set<K> keySet = expireMap.keySet();
    List<K> list = new ArrayList<>(keySet);
    int randomIndex = random.nextInt(list.size());
    return list.get(randomIndex);
}

This is the most common implementation found online: convert all the keys to a list, then pick one element at random.

Performance improvement

Defects of the method

The getRandomKey() method pays far too high a price just to obtain one random key.

If the number of keys is large, creating the list is itself time-consuming, and the space usage doubles outright.

So it is unclear why this is the most popular solution online.

Optimization ideas – avoid space waste

The simplest improvement is to avoid creating the list altogether.

All we need is a random index bounded by the size; we can then traverse to that position:

private K getRandomKey2() {
    Random random = ThreadLocalRandom.current();
    int randomIndex = random.nextInt(expireMap.size());
    //Traverse the keys up to the random index
    Iterator<K> iterator = expireMap.keySet().iterator();
    int count = 0;
    while (iterator.hasNext()) {
        K key = iterator.next();
        if(count == randomIndex) {
            return key;
        }
        count++;
    }
    //Normal logic never reaches this point
    throw new CacheRuntimeException("corresponding information does not exist");
}

Optimization idea – batch operation

The method above avoids creating a list while still satisfying the randomness requirement.

However, traversing from the head to the random index is still a relatively slow process: O(n) time complexity.

If we sample 100 times, the pessimistic cost is 100 * O(n).

We can borrow the idea of batching, for example taking 100 keys per traversal, to reduce the cost:

/**
 * Batch-fetch several keys in one traversal.
 * @param sizeLimit maximum number of keys to fetch
 * @return the keys
 * @since 0.0.16
 */
private Set<K> getRandomKeyBatch(final int sizeLimit) {
    Random random = ThreadLocalRandom.current();
    int randomIndex = random.nextInt(expireMap.size());
    //Traverse the keys
    Iterator<K> iterator = expireMap.keySet().iterator();
    int count = 0;
    Set<K> keySet = new HashSet<>();
    while (iterator.hasNext()) {
        //Enough keys collected
        if(keySet.size() >= sizeLimit) {
            return keySet;
        }
        K key = iterator.next();
        //From the random index onward, collect every key.
        if(count >= randomIndex) {
            keySet.add(key);
        }
        count++;
    }
    //If the random start lands near the end, fewer than sizeLimit keys are
    //collected; return whatever was gathered instead of failing.
    return keySet;
}

We pass in a size limit, and a single traversal yields up to that many keys.
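Putting the batch method to work, the cleanup loop could look roughly like this (a simplified standalone version with hypothetical names, not the exact CacheExpireRandom class above):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

public class BatchExpireDemo {

    private final Map<String, Long> expireMap = new HashMap<>();

    public void expire(String key, long expireAt) {
        expireMap.put(key, expireAt);
    }

    public boolean isEmpty() {
        return expireMap.isEmpty();
    }

    // One traversal from a random start index, collecting up to sizeLimit keys.
    Set<String> getRandomKeyBatch(int sizeLimit) {
        Set<String> keySet = new HashSet<>();
        if (expireMap.isEmpty()) {
            return keySet;
        }
        int randomIndex = ThreadLocalRandom.current().nextInt(expireMap.size());
        int count = 0;
        for (String key : expireMap.keySet()) {
            if (keySet.size() >= sizeLimit) {
                break;
            }
            if (count >= randomIndex) {
                keySet.add(key);
            }
            count++;
        }
        return keySet;
    }

    // Remove the expired keys in one sampled batch; returns how many were removed.
    public int expireBatch(int sizeLimit, long now) {
        int removed = 0;
        for (String key : getRandomKeyBatch(sizeLimit)) {
            Long expireAt = expireMap.get(key);
            if (expireAt != null && now >= expireAt) {
                expireMap.remove(key);
                removed++;
            }
        }
        return removed;
    }
}
```

Since the random start can land near the tail, a batch may hold fewer than sizeLimit keys; repeated calls still drain the map, because every non-empty batch contains at least the map's last key.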

Optimization idea: O(1) time complexity

When I first thought about the randomness, my initial idea was to redundantly store the keys in a list as well, then pick from the list at random.

However, keeping that list in sync costs O(n) per removal, on top of the extra space for the list itself, which is not worth it.

If the map instead stores doubly-linked-list nodes, the problem can also be solved, but it is relatively cumbersome; it was implemented in an earlier article, so I will not repeat it here.
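The doubly-linked-list approach is not the only option. A common trick (not from the original article) is to keep the keys in an array list plus an index map, removing a key by swapping it with the last slot; add, remove, and random pick are then all O(1):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/**
 * O(1) insert / remove / random-key structure: keys live in an array list,
 * and an index map records each key's slot; removal swaps with the last slot.
 */
public class RandomKeyPool<K> {

    private final List<K> keys = new ArrayList<>();
    private final Map<K, Integer> indexMap = new HashMap<>();

    public void add(K key) {
        if (indexMap.containsKey(key)) {
            return;
        }
        indexMap.put(key, keys.size());
        keys.add(key);
    }

    public void remove(K key) {
        Integer index = indexMap.remove(key);
        if (index == null) {
            return;
        }
        K last = keys.remove(keys.size() - 1);
        if (!last.equals(key)) {
            // Move the last key into the freed slot to keep the list compact.
            keys.set(index, last);
            indexMap.put(last, index);
        }
    }

    public K randomKey() {
        if (keys.isEmpty()) {
            return null;
        }
        return keys.get(ThreadLocalRandom.current().nextInt(keys.size()));
    }

    public int size() {
        return keys.size();
    }
}
```

The expire map would be maintained alongside this pool: add on expire(), remove on deletion, and randomKey() replaces the O(n) traversal.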

In fact, the randomness here still has some shortcomings.

(1) What if the random sampling picks the same key repeatedly?

The current solution simply counts every pick. When the amount of data is large, the probability of repeats is fairly low, and lazy deletion compensates for misses, so it is harmless.

(2) A randomly picked key may well not be expired yet.

It would be better to use our earlier approach of classifying keys by expiration time, so that the expiration times of the fetched keys stay under our control.

Of course, each method has its own advantages and disadvantages; choose according to the actual situation.
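The "classify by expiration time" idea in point (2) can be sketched with a sorted map keyed by expire time, so a sweep only ever touches keys that are actually due (a simplified illustration with hypothetical names, not the earlier article's exact code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class SortedExpireDemo {

    // expireAt -> keys expiring at that instant; headMap yields only due buckets.
    private final TreeMap<Long, List<String>> buckets = new TreeMap<>();

    public void expire(String key, long expireAt) {
        buckets.computeIfAbsent(expireAt, t -> new ArrayList<>()).add(key);
    }

    // Returns every key whose expire time is <= now, removing those buckets.
    public List<String> drainExpired(long now) {
        List<String> expired = new ArrayList<>();
        SortedMap<Long, List<String>> due = buckets.headMap(now, true);
        for (List<String> keys : due.values()) {
            expired.addAll(keys);
        }
        due.clear(); // clearing the view removes the entries from the TreeMap
        return expired;
    }
}
```

The trade-off is O(log n) insertion instead of O(1), in exchange for sweeps that never waste time on keys that are not yet due.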

Summary

At this point, an expiration feature similar to redis's is basically in place.

That wraps up the implementation of redis-style expiration. There is of course still plenty of room for optimization; feel free to share your own approaches in the comments section.

Open source address: https://github.com/houbb/cache

If you found this article helpful, please like, comment, and share it; your encouragement is my biggest motivation~

Did you learn something, or do you have more ideas? Welcome to discuss with me in the comments section; I look forward to meeting your thoughts.

Original address

Implementation principle of redis expiration expiration for cache travel-09

reference material

Java randomly selects the key in map

Selecting random key and value sets from a Map in Java