Notes on a pitfall with the Redis SCAN command

Time:2020-10-14

1、I thought I knew the Redis commands well: the various data types and the operations built on them. But when I recently used SCAN, I found that my understanding of the Redis cursor was very limited. So here is a record of the pitfall I ran into. The background is as follows:

Because memory on the Redis server was tight, the company needed to delete some useless keys that had no expiration time set: about 5 million of them. The number sounds scary, but I have been working with Redis for years. How hard could it be?

My plan was to filter out the 5 million keys with a Lua script and then delete them. A Lua script runs on the Redis server, so it executes fast, and each batch needs only one round trip to the server. The script would filter the keys and delete 10,000 at a time, so a shell script looping 500 times would delete them all. I had used Lua scripts for similar batch updates before: a batch of 30,000 took on the order of a second and did not noticeably block Redis. At that rate, the 5 million keys would be gone in about 10 minutes.

So I started writing the Lua script. The first step is filtering.

Anyone who has used Redis knows that it executes commands on a single thread, so the KEYS command is out of the question for filtering. KEYS walks the entire keyspace in one go, which blocks Redis and delays the commands of normal business traffic.

Five million keys can only be handled by incremental iteration. Redis provides the SCAN command for this. Each call returns only a small number of elements, so it is well suited to iterating over large datasets and can be used in production.

The SCAN command returns a two-element array. The first element is the cursor to use for the next call, and the second is a list of keys. When the iteration is complete, the returned cursor is 0.
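To make the reply shape concrete, here is a minimal sketch in Python (not part of the original script) that splits a raw SCAN reply into its parts. The helper name `parse_scan_reply` and the sample replies are made up for illustration.

```python
def parse_scan_reply(reply):
    """Split a raw SCAN reply into (next_cursor, keys, finished).

    `reply` is assumed to look like ["<cursor>", [key, key, ...]],
    the two-element array described above.
    """
    raw_cursor, keys = reply
    next_cursor = int(raw_cursor)  # the cursor arrives as a string of digits
    finished = next_cursor == 0    # only a cursor of 0 marks the end
    return next_cursor, list(keys), finished


# Hypothetical replies, invented for this example.
print(parse_scan_reply(["655360", ["authToken_1", "authToken_2"]]))
print(parse_scan_reply(["0", []]))
```

The helper decides completion from the cursor alone and never from the length of the key list.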

2、The first version of the Lua script I wrote was as follows:


local c = 0
local resp = redis.call('SCAN',c,'MATCH','authToken*','COUNT',10000)
c = tonumber(resp[1])
local dataList = resp[2]

for i=1,#dataList do
 local d = dataList[i]
 local ttl = redis.call('TTL',d)
 if ttl == -1 then
  redis.call('DEL',d)
 end
end

if c==0 then
 return 'all finished'
else
 return 'end'
end

In my local test Redis environment, I mocked 200,000 test keys by executing the following command:


eval "for i = 1, 200000 do redis.call('SET','authToken_' .. i,i) end" 0

Then I uploaded the Lua script with the SCRIPT LOAD command to get its SHA1 value, and ran it by passing that value to EVALSHA.
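For readers who drive Redis from a client library instead of redis-cli, the same two steps might look like the sketch below. It assumes a redis-py style client object that exposes `script_load` and `evalsha`. The helper name `load_and_run` and the file name in the usage comment are hypothetical.

```python
def load_and_run(client, script_source, *args):
    """Register a Lua script, then run it once by its SHA1 digest.

    `client` is assumed to be a redis-py style client.
    """
    sha = client.script_load(script_source)  # SCRIPT LOAD returns the SHA1 digest
    result = client.evalsha(sha, 0, *args)   # EVALSHA <sha> 0 [arg ...], no KEYS passed
    return sha, result


# Usage sketch (needs a running server and the redis-py package):
#   import redis
#   client = redis.Redis(decode_responses=True)
#   with open("delete_tokens.lua") as f:   # hypothetical file name
#       sha, result = load_and_run(client, f.read())
```

Keeping the returned digest around means later runs can call EVALSHA directly without uploading the script again.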

After each run deleted its batch of 10,000 keys, I executed DBSIZE (this is my local Redis and contains only the mock data, so DBSIZE equals the number of prefixed keys).

Strangely, the first couple of runs looked normal, but by the third run DBSIZE had changed to 169999: one key more than expected had been deleted. I did not pay much attention to it. But eventually, with 124204 keys left, the number stopped changing. No matter how many more times I ran the script, DBSIZE stayed at 124204.

Then I ran the SCAN command directly.

It turned out that although the cursor had not reached the end, the list of keys was empty.

This result confused me for a while. I checked the Lua script carefully and found no problem. Was there a bug in Redis's SCAN command? Or was my understanding wrong?

So I went to the Redis command documentation to read its explanation of the COUNT option.

After studying it carefully, I learned that COUNT does not fix the number of elements returned. So the problem probably had to do with COUNT, but the documentation's explanation was hard to follow, and I still did not know what exactly was going wrong.

3、Later, prompted by a colleague, I found another, plainer explanation of the COUNT option of the SCAN command.

After reading it, everything clicked. The number following COUNT is not the number of elements returned per call. It is a hint for how many dictionary slots the SCAN command traverses per call.

Not every dictionary slot holds data that matches my filter. And because my script started from cursor 0 on every run, Redis kept traversing the same slots from the beginning of the dictionary, slots that earlier runs had already emptied, and found nothing left in them that I needed. That explains what I saw at the end: even with COUNT set to 10000, each run covered the same stretch of the dictionary, so DBSIZE stayed at 124204.

Therefore, when iterating with SCAN, every call must pass the cursor returned by the previous call as its cursor argument, so that it continues the previous iteration.
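The difference between restarting at cursor 0 and carrying the cursor forward can be reproduced with a toy model in Python. This is only a sketch under loud assumptions: the "dictionary" is a plain list of slots walked in sequential order (real Redis cursors do not advance sequentially), each slot holds one key, and the cap of roughly 10 × COUNT slots per call is my reading of how Redis bounds the work of one SCAN call, not something the text above states. All function names are invented for the example.

```python
WORK_CAP_FACTOR = 10  # assumption: one call gives up after about 10 * COUNT slots


def scan(buckets, cursor, count):
    """Toy SCAN: walk slots starting at `cursor`, return (next_cursor, keys)."""
    keys, visited = [], 0
    while visited < count * WORK_CAP_FACTOR and len(keys) < count:
        keys.extend(buckets[cursor])
        cursor += 1
        visited += 1
        if cursor == len(buckets):  # walked past the last slot: iteration complete
            return 0, keys
    return cursor, keys


def delete_batch(buckets, cursor, count):
    """One script run: scan from `cursor`, then delete every key that was found."""
    next_cursor, keys = scan(buckets, cursor, count)
    for bucket in buckets:
        bucket[:] = [k for k in bucket if k not in keys]
    return next_cursor


def remaining(buckets):
    return sum(len(b) for b in buckets)


def make_buckets(n=40):
    # One key per slot keeps the arithmetic easy to follow.
    return [[f"authToken_{i}"] for i in range(n)]


# First version of the script: every run restarts at cursor 0.
stuck = make_buckets()
for _ in range(30):
    delete_batch(stuck, 0, count=2)
print("restart at 0, keys left:", remaining(stuck))

# Second version: feed the returned cursor into the next run.
done = make_buckets()
cursor = delete_batch(done, 0, count=2)
while cursor != 0:
    cursor = delete_batch(done, cursor, count=2)
print("carry cursor, keys left:", remaining(done))
```

With these toy numbers, the restart-at-0 loop stalls with 20 of the 40 keys left and a scan that returns a non-zero cursor with an empty key list, while the cursor-carrying loop empties the table.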

With that, my doubts were resolved, and I wrote a new version of the Lua script:


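-- The cursor is passed in by the caller; use 0 for the first run
local c = tonumber(ARGV[1])
local resp = redis.call('SCAN',c,'MATCH','authToken*','COUNT',10000)
-- resp[1] is the cursor for the next call, resp[2] is the list of matched keys
c = tonumber(resp[1])
local dataList = resp[2]

for i=1,#dataList do
 local d = dataList[i]
 local ttl = redis.call('TTL',d)
 -- TTL returns -1 for a key that exists but has no expiration set
 if ttl == -1 then
  redis.call('DEL',d)
 end
end

-- Return the next cursor so the caller can continue; 0 means the scan is complete
return c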

After uploading it locally, I ran it repeatedly, each time passing in the cursor returned by the previous run.

The output showed that SCAN cannot guarantee that the number of keys returned per call equals the given COUNT, but the iteration as a whole proceeded without problems. In the end the cursor returned 0, meaning the end had been reached, and all 200,000 test keys had been deleted.

This Lua script can run directly in production as long as it is called in a loop from the shell. I estimate that the 5 million keys can be deleted in about 12 minutes.
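The post runs that loop from the shell. As an alternative sketch, the same driver could be written in Python against a redis-py style client. The function name `delete_all`, the `pause_seconds` knob, and the `max_runs` safety limit are assumptions added for the example, and `sha` is the digest obtained from SCRIPT LOAD for the cursor-returning script.

```python
import time


def delete_all(client, sha, pause_seconds=0.0, max_runs=100_000):
    """Call the cursor-returning script until it reports cursor 0.

    Returns the number of script runs that were needed.
    """
    cursor = 0  # the first run starts a fresh iteration
    for run in range(1, max_runs + 1):
        # ARGV[1] of the script is the cursor returned by the previous run.
        cursor = int(client.evalsha(sha, 0, cursor))
        if cursor == 0:
            return run
        time.sleep(pause_seconds)  # optional breathing room for other clients
    raise RuntimeError("cursor did not return to 0 within max_runs")


# Usage sketch (needs a running server and the redis-py package):
#   import redis
#   client = redis.Redis()
#   runs = delete_all(client, sha, pause_seconds=0.05)
```

The loop stops only when the script returns 0. A run that deletes nothing is not treated as the end, since SCAN may return an empty key list before the iteration is complete.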

Know not only what works but also why. I had used SCAN before, but I did not know the details. On top of that, the translation of the documentation was not very accurate, so I wasted nearly an hour staring at the wrong results. I am writing this down to deepen my understanding.
