Do not misuse the Redis KEYS command

Time:2020-10-8

KEYS command

Time complexity: O(N), where N is the number of keys in the database, assuming that the key names and the given pattern are of limited length.

The Redis KEYS command finds all keys that match a given pattern.

Although the time complexity of this operation is O(N), the constant factor is quite low. For example, Redis running on an ordinary laptop can scan a database of one million keys in about 40 milliseconds.
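To make the O(N) cost concrete, here is a rough back-of-the-envelope sketch. It simply extrapolates the 40 ms per million keys figure linearly; the function name and the sample database sizes are made up for illustration, and real timings depend on hardware, key length and pattern.

```python
# Figure quoted above: about 40 ms to scan 1,000,000 keys on a laptop.
MS_PER_MILLION_KEYS = 40.0


def estimated_keys_ms(num_keys: int) -> float:
    """Linear (O(N)) estimate of how long one KEYS call keeps the server busy."""
    return MS_PER_MILLION_KEYS * num_keys / 1_000_000


for n in (1_000_000, 50_000_000, 250_000_000):
    print(f"{n:>11,} keys -> ~{estimated_keys_ms(n) / 1000:.2f} s blocked")
```

Under this assumption, a database of 250 million keys would keep the server busy for roughly 10 seconds on a single KEYS call.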

Command format: KEYS pattern

Warning: use the KEYS command in a production environment only with extreme care. Running it against a large database can hurt performance. The command is intended for debugging and special operations, such as changing your keyspace layout. Don’t use KEYS in regular application code. If you need to find a subset of keys in the keyspace, consider using SCAN or sets.

Supported matching patterns:

  • h?llo matches hello, hallo and hxllo
  • h*llo matches hllo and heeeello
  • h[ae]llo matches hello and hallo, but does not match hillo
  • h[^e]llo matches hallo and hbllo, but does not match hello
  • h[a-b]llo matches hallo and hbllo

Use \ to escape special characters that you want to match literally.
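The pattern rules above can be approximated in a few lines of Python. This is only an illustrative sketch that translates a pattern into a regular expression; it is not Redis's own matcher, the helper name redis_glob_to_regex is hypothetical, and it assumes bracket expressions contain only plain characters, ranges and an optional leading ^.

```python
import re


def redis_glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a KEYS-style glob pattern into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            # Backslash escape: the next character is matched literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if c == "?":
            out.append(".")       # exactly one character
        elif c == "*":
            out.append(".*")      # any run of characters, possibly empty
        elif c == "[":
            end = pattern.find("]", i + 1)
            if end == -1 or end == i + 1:
                out.append(re.escape(c))  # unterminated or empty: literal "["
            else:
                # [ae], [^e] and [a-b] mean the same thing in a regex.
                out.append("[" + pattern[i + 1:end] + "]")
                i = end
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out), re.DOTALL)


for pat in ("h?llo", "h*llo", "h[ae]llo", "h[^e]llo", "h[a-b]llo", r"h\?llo"):
    words = ["hello", "hallo", "hbllo", "hillo", "hllo", "heeeello", "h?llo"]
    print(pat, "->", [w for w in words if redis_glob_to_regex(pat).fullmatch(w)])
```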

Background

1. Redis executes commands on a single thread, so every operation is atomic and concurrency cannot cause data anomalies.

2. Long-running Redis commands are therefore very dangerous: they occupy the single thread for a long time, and every other request is delayed until they finish.
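The effect of a single thread can be shown with a toy calculation. The sketch below is not a benchmark: the durations are invented, and the helper name completion_times_ms is hypothetical. It only shows that, when commands run one at a time, every request queued behind a slow command inherits its delay.

```python
from itertools import accumulate


def completion_times_ms(durations_ms):
    """Commands run one at a time, so each finishes after all earlier ones."""
    return list(accumulate(durations_ms))


# One 1 ms GET, then a 10-second KEYS call, then three more 1 ms GETs.
queue = [1, 10_000, 1, 1, 1]
print(completion_times_ms(queue))  # [1, 10001, 10002, 10003, 10004]
```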

Scenario

When the KEYS command is executed in production, it gets slower and slower as the database grows, because Redis is single-threaded and the command has to visit every key. It occupies the single thread for a long time, blocking Redis and driving up its CPU usage; every other request is delayed, and in the worst case the server hosting Redis may go down. This is a serious risk that must not be overlooked in real production use. Imagine Redis being blocked for more than 10 seconds: in a cluster deployment, the cluster may decide that the node has failed and trigger a failover.

If no application thread can get data from Redis, the application may suffer a cache avalanche in severe cases: all the threads fall back to the database at the same moment, and the database goes down as well.

Other dangerous commands

More generally, any command with O(N) time complexity should be used with caution in production. Commands such as HGETALL, LRANGE, SMEMBERS, ZRANGE and SINTER are not off limits, but their time complexity is O(N), so you need to know that N is small, or bound it explicitly, before using them. Otherwise the cache may become unresponsive.

1. The FLUSHDB command deletes all keys in the current database.

2. The FLUSHALL command deletes the data of the entire Redis server (all keys in all databases).

3. The CONFIG command lets a connected client change the server configuration.
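For the O(N) read commands mentioned in this section, one way to keep N bounded is to read in fixed-size slices. The sketch below does this for LRANGE. It assumes client is a redis-py redis.Redis instance (the article does not name a client library), the helper name iter_list_in_chunks is hypothetical, and it assumes the list is not modified while it is being read.

```python
def iter_list_in_chunks(client, key, chunk_size=100):
    """Yield every element of a list, reading at most chunk_size per LRANGE call."""
    start = 0
    while True:
        # LRANGE's stop index is inclusive, hence the "- 1".
        chunk = client.lrange(key, start, start + chunk_size - 1)
        if not chunk:
            break
        yield from chunk
        if len(chunk) < chunk_size:
            break  # a short chunk means the end of the list was reached
        start += chunk_size


# Usage (assumes the redis-py package and a reachable server):
#   import redis
#   for item in iter_list_in_chunks(redis.Redis(), "mylist"):
#       ...
```

Each call now costs O(chunk_size) instead of O(N), so other clients get a turn between calls.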

How to disable dangerous commands

In the SECURITY section of redis.conf, add the following directives to disable the specified commands:


rename-command FLUSHALL ""
rename-command FLUSHDB ""
rename-command CONFIG ""
rename-command KEYS ""

In addition, when disabling the FLUSHALL command, you need to set appendonly no in the configuration file; otherwise the server cannot be started.

If you want to keep a command but make it hard to invoke, you can rename it to an unguessable name:


rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52

Tip: renaming a command that is logged to the AOF file or propagated to replicas can cause problems.
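One possible way to produce an unguessable replacement name like the one shown above is to draw it from a cryptographic random source. This is just a sketch; the helper name random_command_name is hypothetical, and the article does not say how its example value was generated.

```python
import secrets


def random_command_name(num_bytes: int = 20) -> str:
    """Return a random hex string; 20 bytes gives 40 hex characters."""
    return secrets.token_hex(num_bytes)


# Prints a line that can be pasted into redis.conf.
print(f"rename-command CONFIG {random_command_name()}")
```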

Suggestions for improvement

1、 If you really need this kind of lookup, you can maintain your own index of key names, for example by storing the names in different sets, one per category. That makes lookups fast. The obvious drawback is that it costs extra memory, so weigh it carefully. For keys whose names follow a regular sequence, you can also save space by storing only the start and end values of the sequence.

2、 The SCAN family of commands can be used in place of the KEYS and SMEMBERS commands.
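The first suggestion can be sketched as follows. It assumes client is a redis-py redis.Redis instance (an assumption, since the article names no client library); the helper names and the index key are hypothetical. The two writes are separate calls and therefore not atomic, which a real implementation may want to address.

```python
def set_indexed(client, index_key, key, value):
    """Write a key and record its name in an index set (two calls, not atomic)."""
    client.set(key, value)
    client.sadd(index_key, key)


def delete_indexed(client, index_key, key):
    """Delete a key and remove its name from the index set."""
    client.delete(key)
    client.srem(index_key, key)


def keys_in_index(client, index_key):
    """Iterate over the indexed key names incrementally instead of calling KEYS."""
    return client.sscan_iter(index_key)


# Usage (assumes the redis-py package and a reachable server):
#   import redis
#   r = redis.Redis()
#   set_indexed(r, "index:users", "user:42", "alice")
#   for name in keys_in_index(r, "index:users"):
#       ...
```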

Scan command

Since version 2.8, Redis supports the SCAN command, which scans Redis records in batches. This increases the total time of the whole query, which can affect the calling service, but it does not stall the Redis service itself. The basic usage of the SCAN command is as follows:

Command format: SCAN cursor [MATCH pattern] [COUNT count]

The SCAN command takes three parameters: the first is the cursor, the second is the glob-style pattern to match, and the third is a hint for how many elements to examine in a single call.

The SCAN command and the related SSCAN, HSCAN and ZSCAN commands are used to incrementally iterate over a collection of elements:

  • The scan command is used to iterate over the database keys in the current database.
  • The sscan command is used to iterate over elements in a set key.
  • The hscan command iterates over key value pairs in hash keys.
  • The zscan command is used to iterate over elements (including element members and element scores) in a sorted set.

The four commands listed above all support incremental iteration and return only a small number of elements per call, so they can be used in production without the problems caused by KEYS and SMEMBERS: when KEYS is run against a large database, or SMEMBERS against a large set key, they may block the server for several seconds.

However, incremental iteration commands have drawbacks of their own. SMEMBERS returns all the elements that the set key contains at the moment it runs, whereas with SCAN-style commands the key may be modified while the iteration is in progress, so they can offer only limited guarantees about the returned elements.

Since scan, sscan, hscan and zscan work in a very similar way, we will introduce these four commands together, but remember:

  • The first parameter of the sscan command, hscan command, and zscan command is always a database key.
  • The scan command does not take a database key as its first argument, because it iterates over all database keys in the current database.
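As a sketch of replacing SMEMBERS with SSCAN, the generator below walks a set key in small batches. It assumes client is a redis-py redis.Redis instance, whose sscan method takes the set key as its first argument and returns a (cursor, members) pair; the helper name iter_set_members is hypothetical.

```python
def iter_set_members(client, set_key, count=100):
    """Yield the members of a set key batch by batch instead of all at once."""
    cursor = 0  # 0 starts a new iteration
    while True:
        # Unlike SCAN, SSCAN takes the key to iterate as its first argument.
        cursor, members = client.sscan(set_key, cursor=cursor, count=count)
        yield from members
        if int(cursor) == 0:  # a returned cursor of 0 ends the iteration
            break


# Usage (assumes the redis-py package and a reachable server):
#   import redis
#   for member in iter_set_members(redis.Redis(), "myset"):
#       ...
```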

Basic usage of scan command

The SCAN command is a cursor-based iterator: each call returns a new cursor to the user, and the user must pass this new cursor as the cursor argument of the next SCAN call to continue the iteration.

When the cursor argument of SCAN is set to 0, the server starts a new iteration; when the server returns a cursor of 0 to the user, the iteration has finished.

The following is an example of an iterative process for the scan command:


redis 127.0.0.1:6379> scan 0
1) "17"
2) 1) "key:12"
 2) "key:8"
 3) "key:4"
 4) "key:14"
 5) "key:16"
 6) "key:17"
 7) "key:15"
 8) "key:10"
 9) "key:3"
 10) "key:7"
 11) "key:1"

redis 127.0.0.1:6379> scan 17
1) "0"
2) 1) "key:5"
 2) "key:18"
 3) "key:0"
 4) "key:2"
 5) "key:19"
 6) "key:13"
 7) "key:6"
 8) "key:9"
 9) "key:11"

In the example above, the first iteration uses 0 as the cursor to start a new iteration.

The second iteration uses the cursor returned by the first iteration, that is, the first element of the command's reply: 17.

As can be seen from the above example, the reply to the scan command is an array containing two elements. The first array element is a new cursor for the next iteration, and the second array element is an array that contains the elements returned by this iteration.

When the scan command is called the second time, the command returns cursor 0, which indicates that the iteration has ended and the entire collection has been traversed.

Start a new iteration with 0 as the cursor, and call the scan command until the command returns cursor 0. We call this process a full iteration.
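A full iteration as described above can be sketched in Python like this. The sketch assumes client is a redis-py redis.Redis instance, whose scan method returns a (cursor, keys) pair; the helper name scan_all_keys is hypothetical. Collecting everything into one list is only sensible when the result is known to fit in memory.

```python
def scan_all_keys(client, match=None, count=None):
    """Run one full iteration: start at cursor 0 and stop when 0 comes back."""
    keys = []
    cursor = 0
    while True:
        # Each reply carries the next cursor and the keys of this batch.
        cursor, page = client.scan(cursor=cursor, match=match, count=count)
        keys.extend(page)
        if int(cursor) == 0:
            return keys


# Usage (assumes the redis-py package and a reachable server):
#   import redis
#   all_keys = scan_all_keys(redis.Redis(), match="key:*")
```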

Guarantee of scan command

The SCAN command and the other incremental iteration commands provide the following guarantee: a full iteration returns every element that was present in the dataset from the start of the full iteration to its end. In other words, if an element is in the dataset when the iteration starts and is still there when it ends, some call in the iteration returns it to the user.

However, because incremental commands only use cursors to record iteration status, these commands have the following disadvantages:

  • The same element may be returned more than once. It’s up to the application to deal with duplicate elements. For example, consider using elements returned by iterations only for operations that can be safely repeated multiple times.
  • If an element is added to or removed from the dataset during an iteration, the element may or may not be returned, which is undefined.
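When the operation applied to each element cannot safely be repeated, the application can filter duplicates itself. The sketch below remembers every key it has already yielded; it assumes client is a redis-py redis.Redis instance, and the helper name scan_unique_keys is hypothetical. The trade-off is that the seen set grows with the number of keys, so this only suits keyspaces that fit in client memory.

```python
def scan_unique_keys(client, match=None, count=None):
    """Yield each key at most once, even if SCAN reports it several times."""
    seen = set()
    cursor = 0
    while True:
        cursor, page = client.scan(cursor=cursor, match=match, count=count)
        for key in page:
            if key not in seen:
                seen.add(key)
                yield key
        if int(cursor) == 0:
            break
```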

The number of elements returned per scan command execution

The incremental iteration command does not guarantee that a given number of elements will be returned each time it is executed.

An incremental command may even return zero elements, but the application must not treat the iteration as finished as long as the cursor returned by the command is not 0.

However, the number of elements returned by the command always follows certain rules:

  • For a large dataset, the incremental iteration command can return up to dozens of elements at a time
  • For a small enough dataset, if the dataset is stored internally as a compactly encoded data structure (as used for small set keys, small hash keys and small sorted set keys), the incremental iteration command returns all elements in the dataset in one call.

Finally, the user can specify roughly how many elements each iteration should return through the count option provided by the incremental iteration command.

Count option

Although the incremental iteration command does not guarantee the number of elements returned in each iteration, we can use the count option to adjust the behavior of the command to a certain extent.

Basically, the count option lets the user tell the iteration command how many elements should be returned from the dataset in each iteration.

Although the count option is only a hint to the incremental iteration command, it works in most cases.

  • The default value for the count parameter is 10.
  • When iterating over a sufficiently large database, set key, hash key, or sorted set key implemented as a hash table, if the user does not use the match option, the number of elements returned by the command is usually the same as or slightly more than that specified by the count option.
  • When iterating an integer set (intset) or a compressed list (ziplist), the incremental iteration command usually ignores the value specified by the count option and returns all elements contained in the dataset to the user in the first iteration.

Not every iteration uses the same count value

Users can change the count value from one iteration to the next as needed. Just remember to pass the cursor returned by the previous iteration to the next one.

Match option

Like the keys command, the incremental iteration commands accept a glob-style pattern so that the command returns only the elements matching the given pattern. This is done by passing the match option when executing the incremental iteration command.

The following is an example of iteration using the match option:


redis 127.0.0.1:6379> sadd myset 1 2 3 foo foobar feelsgood
(integer) 6

redis 127.0.0.1:6379> sscan myset 0 match f*
1) "0"
2) 1) "foo"
 2) "feelsgood"
 3) "foobar"

Note that pattern matching is applied after the command has fetched elements from the dataset and before it returns them to the client. Therefore, if only a few elements in the iterated dataset match the pattern, the iteration command may return no elements on many of its calls.

Here is an example of this:


redis 127.0.0.1:6379> scan 0 MATCH *11*
1) "288"
2) 1) "key:911"

redis 127.0.0.1:6379> scan 288 MATCH *11*
1) "224"
2) (empty list or set)

redis 127.0.0.1:6379> scan 224 MATCH *11*
1) "80"
2) (empty list or set)

redis 127.0.0.1:6379> scan 80 MATCH *11*
1) "176"
2) (empty list or set)

redis 127.0.0.1:6379> scan 176 MATCH *11* COUNT 1000
1) "0"
2) 1) "key:611"
 2) "key:711"
 3) "key:118"
 4) "key:117"
 5) "key:311"
 6) "key:112"
 7) "key:111"
 8) "key:110"
 9) "key:113"
 10) "key:211"
 11) "key:411"
 12) "key:115"
 13) "key:116"
 14) "key:114"
 15) "key:119"
 16) "key:811"
 17) "key:511"
 18) "key:11"

As you can see, most of the iterations above do not return any elements.

In the last iteration, we forced the command to scan more elements for this iteration by setting the count option parameter to 1000, so that the command returned more elements.
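The session above suggests a simple heuristic for sparse matches: when a call returns nothing, raise count for the next call. The sketch below is one way to write that; the tenfold step, the cap of 1000 and the helper name scan_matching are arbitrary choices for illustration, and client is assumed to be a redis-py redis.Redis instance.

```python
def scan_matching(client, pattern, count=10, max_count=1000):
    """Yield keys matching pattern, raising COUNT after each empty batch."""
    cursor = 0
    while True:
        cursor, page = client.scan(cursor=cursor, match=pattern, count=count)
        yield from page
        if int(cursor) == 0:
            break
        if not page:
            # An empty batch is not the end; examine more keys next time.
            count = min(count * 10, max_count)
```

A larger count means fewer round trips but more work per call, so the cap should stay small enough that a single call does not block the server noticeably.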

Perform multiple iterations concurrently

Any number of clients can iterate over the same dataset at the same time. On each call, a client passes in a cursor and gets a new cursor back. The cursor contains the entire state of the iteration, so the server does not need to record any state for it.

Stop iteration midway

Because the entire state of an iteration is held in the cursor, and the server does not keep any state for the iteration, the client can abandon an iteration partway through without notifying the server.

Even if any number of iterations stop midway, there will be no problem.
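Stopping early needs no cleanup, which makes patterns like "fetch the first few matches" straightforward. The sketch below assumes client is a redis-py redis.Redis instance, whose scan_iter method is a lazy generator over SCAN calls; the helper name first_keys is hypothetical.

```python
from itertools import islice


def first_keys(client, limit, match=None):
    """Return at most limit keys, abandoning the iteration once enough are found."""
    # islice stops pulling from the generator after limit items, so no
    # further SCAN calls are issued and nothing has to be closed on the server.
    return list(islice(client.scan_iter(match=match), limit))
```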

Incremental iteration with wrong cursor

Using a broken, negative, out-of-range, or otherwise invalid cursor to perform an incremental iteration does not crash the server, but the behavior of the command is undefined.

Undefined behavior means that the guarantees the incremental commands make about their return values may no longer hold.

Only two types of cursors are legal:

1. When starting a new iteration, the cursor must be 0.

2. The cursor returned by the incremental iteration command to continue the iteration process.

Guarantee of the end of iteration

The incremental iteration algorithm is guaranteed to terminate only if the size of the iterated dataset stays bounded by some maximum. In other words, if the dataset keeps growing, the incremental iteration command may never finish a full iteration.

Intuitively, when a data set keeps growing, more and more work needs to be done to access all the elements in the dataset. Whether an iteration can be ended depends on whether the user performs the iteration faster than the dataset grows.

Time complexity:

The complexity of each call to an incremental iteration command is O(1), and the complexity of a full iteration of the dataset is O(N), where N is the number of elements in the dataset.

Return value:

The scan command, sscan command, hscan command and zscan command all return a multi bulk reply containing two elements: the first element of the reply is an unsigned 64 bit integer (the cursor) represented as a string, and the second element of the reply is another multi bulk reply, which contains the elements returned by this iteration.

Each element returned by the scan command is a database key.

Each element returned by the sscan command is a set member.

Each element returned by the hscan command is a key value pair, which consists of a key and a value.

Each element returned by the zscan command is a sorted set element, which consists of a member and a score.
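To illustrate the reply shapes, the sketch below parses raw replies by hand. It assumes the reply arrives as a two-element list with a string cursor, and that HSCAN and ZSCAN deliver their pairs flattened into one list (field, value, field, value, ...), as in the classic wire protocol. Many client libraries already do this conversion for you, and all function names here are hypothetical.

```python
def pair_up(flat):
    """Turn [a1, b1, a2, b2, ...] into [(a1, b1), (a2, b2), ...]."""
    it = iter(flat)
    return list(zip(it, it))


def parse_scan_reply(reply):
    """SCAN and SSCAN: a string cursor plus a plain list of keys or members."""
    cursor, elements = reply
    return int(cursor), list(elements)


def parse_hscan_reply(reply):
    """HSCAN: a string cursor plus flattened field/value pairs."""
    cursor, flat = reply
    return int(cursor), dict(pair_up(flat))


def parse_zscan_reply(reply):
    """ZSCAN: a string cursor plus flattened member/score pairs."""
    cursor, flat = reply
    return int(cursor), [(member, float(score)) for member, score in pair_up(flat)]


print(parse_scan_reply(["17", ["key:12", "key:8"]]))
print(parse_hscan_reply(["0", ["name", "alice", "age", "30"]]))
print(parse_zscan_reply(["0", ["a", "1", "b", "2.5"]]))
```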

For more usage, see the redis scan command:

http://doc.redisfans.com/key/scan.html

References:

https://redis.io

https://www.dazhuanlan.com/2019/12/17/5df832fd189f6
