Redis in Action 09. Implementing task queues, message pulling, and file distribution

Time: 2021-3-4

Task queue P133

By putting information about the tasks to be executed into a queue and processing that queue later, you can defer time-consuming operations. This approach of handing work over to a task processor is called a task queue. P133

FIFO queue P133

You can use a Redis list to store task information: use RPUSH to push the information about a task to be executed onto the right end of the list, and use the blocking pop command BLPOP to pop it from the queue (blocking is fine because the task processor has no other work to do besides executing tasks). P134

Send a task

// Push the task parameters onto the right end of the list for the specified queue
// (redis is assumed to be a redigo-style client package such as github.com/gomodule/redigo/redis)
func SendTask(conn redis.Conn, queueName string, param string) (bool, error) {
    count, err := redis.Int(conn.Do("RPUSH", queueName, param))
    if err != nil {
        return false, err
    }
    // RPUSH returns the length of the list after the push, so any positive value means the push succeeded
    return count >= 1, nil
}

Execute tasks

// Continuously fetch task parameters from the list for the given queue and execute the tasks
func RunTask(conn redis.Conn, queueName string, taskHandler func(param string)) {
    for {
        // Block for at most 10 seconds; the reply is [queue name, element]
        result, err := redis.Strings(conn.Do("BLPOP", queueName, 10))
        // Execute the task only if it was fetched successfully
        if err == nil && len(result) == 2 {
            taskHandler(result[1])
        }
    }
}

The code above is a generic way for a task queue to interact with Redis, and it is simple to use: to send a task, serialize its input parameters into a string and pass that in; to process tasks, provide a callback.
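
As an illustration of serializing parameters into a string, here is a minimal sketch using JSON. The mailParams type and the helper names are hypothetical and not part of the original notes; any serialization format that round-trips through a string would work equally well.

```go
package main

import "encoding/json"

// mailParams is a hypothetical parameter type for a "send mail" task.
type mailParams struct {
	To      string `json:"to"`
	Subject string `json:"subject"`
}

// encodeParam serializes the parameters into the string handed to SendTask.
func encodeParam(p mailParams) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeParam restores the parameters inside the task handler callback.
func decodeParam(param string) (mailParams, error) {
	var p mailParams
	err := json.Unmarshal([]byte(param), &p)
	return p, err
}
```

The sender would pass the result of encodeParam as the param argument of SendTask, and the callback given to RunTask would start by calling decodeParam.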

Task priority P136

Building on this, the FIFO task queue can be turned into a priority task queue, in which high-priority tasks are executed before low-priority ones. BLPOP pops the first element of the first non-empty list, so we only need to sort the queue names in descending order of priority and pass that array to BLPOP. Of course, if high-priority tasks are produced faster than they are consumed, low-priority tasks will never be executed. P136

Execute high-priority tasks first

// Continuously fetch task parameters from the lists for the given queues and execute the tasks
// The priority of queueNames decreases from front to back
func RunTasks(conn redis.Conn, queueNames []string, queueNameToTaskHandler map[string]func(param string)) {
    // Verify that every queue has a corresponding handler
    for _, queueName := range queueNames {
        if _, exists := queueNameToTaskHandler[queueName]; !exists {
            panic(fmt.Sprintf("queueName(%v) not in queueNameToTaskHandler", queueName))
        }
    }
    // Put all arguments into one slice: the queue names followed by the timeout in seconds
    args := make([]interface{}, 0, len(queueNames)+1)
    for _, queueName := range queueNames {
        args = append(args, queueName)
    }
    args = append(args, 10)
    for {
        result, err := redis.Strings(conn.Do("BLPOP", args...))
        // The reply is [queue name, element]; execute the task only if it was fetched successfully
        if err == nil && len(result) == 2 {
            // Find the handler for the queue the task came from and run it
            taskHandler := queueNameToTaskHandler[result[0]]
            taskHandler(result[1])
        }
    }
}
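
RunTasks expects queueNames to be sorted by priority already. If priorities are kept as numbers, the ordering step could look like the sketch below. The blpopArgs helper and the priority map are hypothetical, introduced only to illustrate the sorting described above.

```go
package main

import "sort"

// blpopArgs orders the queue names by descending priority and appends the
// BLPOP timeout (in seconds) as the last argument.
func blpopArgs(priorities map[string]int, timeoutSeconds int) []interface{} {
	names := make([]string, 0, len(priorities))
	for name := range priorities {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		pi, pj := priorities[names[i]], priorities[names[j]]
		if pi != pj {
			return pi > pj // higher priority first
		}
		return names[i] < names[j] // tie-break so the order is deterministic
	})
	args := make([]interface{}, 0, len(names)+1)
	for _, name := range names {
		args = append(args, name)
	}
	return append(args, timeoutSeconds)
}
```

The returned slice can be passed directly as the variadic arguments of a BLPOP call.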

Delayed tasks P136

In real business scenarios, some tasks need to run at a specified time, such as sending scheduled emails. In that case we also need to store each task's execution time and move tasks into the task queue described above once they are due. A sorted set works for this: use the execution timestamp as the score, and a JSON string containing the task information, queue name, and so on as the member.

Send a delayed task

// Stores the information about a delayed task, for serialization and deserialization
type delayedTaskInfo struct {
    UnixNano  int64  `json:"unixNano"`
    QueueName string `json:"queueName"`
    Param     string `json:"param"`
}

// Send a delayed task
func SendDelayedTask(conn redis.Conn, queueName string, param string, executeAt time.Time) (bool, error) {
    // If the execution time has already arrived, send directly to the task queue
    if executeAt.UnixNano() <= time.Now().UnixNano() {
        return SendTask(conn, queueName, param)
    }
    // Not due yet, so the task goes into the sorted set
    // Serialize the related information (UnixNano records when the task was submitted)
    infoJson, err := json.Marshal(delayedTaskInfo{
        UnixNano:  time.Now().UnixNano(),
        QueueName: queueName,
        Param:     param,
    })
    if err != nil {
        return false, err
    }
    // Add to the sorted set; ZADD takes the score before the member
    count, err := redis.Int(conn.Do("ZADD", "delayed_tasks", executeAt.UnixNano(), infoJson))
    if err != nil {
        return false, err
    }
    // ZADD returns the number of newly added members, so 1 means success
    return count == 1, nil
}
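
One caveat when using UnixNano values as scores: Redis stores sorted-set scores as double-precision floats, which represent integers exactly only up to 2^53, and current nanosecond timestamps are well beyond that. The small check below (exactAsScore is a hypothetical helper) shows the effect. Millisecond timestamps stay within the exact range and may be a safer choice if exact ordering matters.

```go
package main

// exactAsScore reports whether n survives a round trip through float64,
// the type Redis uses for sorted-set scores.
func exactAsScore(n int64) bool {
	return int64(float64(n)) == n
}
```

For example, exactAsScore(1614816000000000001), a nanosecond timestamp, is false, while exactAsScore(1614816000000), the same instant in milliseconds, is true.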

Pull due delayed tasks and put them into the task queue

// Poll the delayed tasks and move the ones that are due into their task queues
func PollDelayedTask(conn redis.Conn) {
    for {
        // Get the task with the earliest execution time
        infoMap, err := redis.StringMap(conn.Do("ZRANGE", "delayed_tasks", 0, 0, "WITHSCORES"))
        if err != nil || len(infoMap) != 1 {
            // Nothing to do (or the call failed): sleep for 1ms and retry
            time.Sleep(time.Millisecond)
            continue
        }
        for infoJson, score := range infoMap {
            // Scores are doubles returned as strings, possibly in floating-point notation, so parse them as floats
            executeAt, err := strconv.ParseFloat(score, 64)
            if err != nil {
                log.Errorf("#PollDelayedTask -> convert score to float error, infoJson: %v, score: %v", infoJson, score)
                // Follow-up processing: delete the entry so that it does not delay other tasks
                _, _ = conn.Do("ZREM", "delayed_tasks", infoJson)
                continue
            }
            if int64(executeAt) > time.Now().UnixNano() {
                // Not due yet: sleep for 1ms and retry
                time.Sleep(time.Millisecond)
                continue
            }
            // Deserialize
            info := new(delayedTaskInfo)
            if err := json.Unmarshal([]byte(infoJson), info); err != nil {
                log.Errorf("#PollDelayedTask -> infoJson unmarshal error, infoJson: %v, score: %v", infoJson, score)
                // Follow-up processing: delete the entry so that it does not delay other tasks
                _, _ = conn.Do("ZREM", "delayed_tasks", infoJson)
                continue
            }
            // Remove the entry from the sorted set; only the poller that actually removed it enqueues the task
            count, err := redis.Int(conn.Do("ZREM", "delayed_tasks", infoJson))
            if err == nil && count == 1 {
                _, _ = SendTask(conn, info.QueueName, info.Param)
            }
        }
    }
}

Sorted sets have no blocking pop mechanism like lists do, so the program has to loop continuously, trying to fetch a due task from the sorted set, which increases network and processor load. An adaptive approach can be added to the function: automatically lengthen the sleep time when no executable task has been found for a while, or derive the sleep time from the execution time of the next task, while capping the sleep time at 100ms so that tasks are still executed promptly. P138
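
The second variant (sleeping until the next task is due, capped at 100ms) could be sketched as follows. nextSleep and maxSleep are hypothetical names; the function takes UnixNano values to match the polling code above and is only one possible way to implement the idea.

```go
package main

import "time"

// maxSleep is the 100ms upper bound mentioned in the text.
const maxSleep = 100 * time.Millisecond

// nextSleep returns how long the poller should sleep before checking again.
// hasTask is false when the sorted set is empty; executeAtNano is the
// execution time of the earliest task, as a UnixNano value.
func nextSleep(nowNano int64, hasTask bool, executeAtNano int64) time.Duration {
	if !hasTask {
		return maxSleep // nothing queued: back off to the maximum
	}
	wait := time.Duration(executeAtNano - nowNano)
	if wait <= 0 {
		return 0 // the task is already due
	}
	if wait > maxSleep {
		return maxSleep // cap so that newly added tasks are noticed in time
	}
	return wait
}
```

In the polling loop, the fixed time.Sleep(time.Millisecond) calls would then be replaced by time.Sleep(nextSleep(...)).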

Message pulling P139

When two or more clients send messages to and receive messages from each other, the messages are usually delivered in one of two ways: P139

  • Push messaging: the sender ensures that all recipients have successfully received the message. Redis has the built-in PUBLISH and SUBSCRIBE commands for message pushing (see 05. Introduction to other Redis commands, which covers the usage and drawbacks of these two commands)
  • Pull messaging: the recipient fetches the stored messages itself

Single recipient P140

With a single recipient, you only need to store the sent messages in a list belonging to each recipient. Use RPUSH to send a message to the specified recipient, and use LTRIM to remove the first few elements of the list, that is, the messages that have already been received. P140
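
A minimal sketch of this single-recipient flow is shown below. It assumes that the recipient reads pending messages with LRANGE before trimming them with LTRIM, and that each list has only one consumer. The Doer interface, the DoerFunc adapter, and the "inbox:" key prefix are hypothetical; Doer has the same Do signature as redigo's redis.Conn, so a real connection could be passed in.

```go
package main

import "fmt"

// Doer is a hypothetical one-method interface matching the Do method of a
// redigo connection.
type Doer interface {
	Do(commandName string, args ...interface{}) (interface{}, error)
}

// DoerFunc adapts a plain function to Doer, which is handy for stubs.
type DoerFunc func(commandName string, args ...interface{}) (interface{}, error)

func (f DoerFunc) Do(commandName string, args ...interface{}) (interface{}, error) {
	return f(commandName, args...)
}

// inboxKey is an assumed key-naming scheme.
func inboxKey(recipient string) string { return "inbox:" + recipient }

// SendMessage appends a message to the recipient's list.
func SendMessage(c Doer, recipient, message string) error {
	_, err := c.Do("RPUSH", inboxKey(recipient), message)
	return err
}

// FetchMessages reads up to max pending messages, then trims them off the list.
func FetchMessages(c Doer, recipient string, max int) ([]string, error) {
	key := inboxKey(recipient)
	reply, err := c.Do("LRANGE", key, 0, max-1)
	if err != nil {
		return nil, err
	}
	items, ok := reply.([]interface{})
	if !ok {
		return nil, fmt.Errorf("unexpected LRANGE reply type %T", reply)
	}
	messages := make([]string, 0, len(items))
	for _, item := range items {
		b, ok := item.([]byte)
		if !ok {
			return nil, fmt.Errorf("unexpected list element type %T", item)
		}
		messages = append(messages, string(b))
	}
	if len(messages) > 0 {
		// Keep only the elements after the ones that were just read.
		if _, err := c.Do("LTRIM", key, len(messages), -1); err != nil {
			return nil, err
		}
	}
	return messages, nil
}
```

Messages pushed between the LRANGE and the LTRIM land on the right end of the list, so trimming the same number of elements from the left does not drop them.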

Multiple recipients P141

The multiple-recipient case is like a group chat: members of the group can send messages, and the other members can receive them. The following data structures can store the data needed to implement this:

  • String: the group's auto-increment message ID

    • INCR: increments the ID and returns it
  • Zset: stores each message in the group; the score is the message's auto-increment ID within the group

    • ZRANGEBYSCORE: gets the messages not yet received
  • Zset: stores the ID of the latest message received by each member of the group; 0 if the member has not received any message yet

    • ZCARD: gets the number of members in the group
    • ZRANGE: with some processing, shows which messages have been received by which members
    • ZRANGE: gets the minimum ID, which makes it possible to delete the messages that everyone has received
  • Zset: stores, for one person, the ID of the latest message received in each of their groups; the entry is removed when leaving a group and initialized to 0 when joining one

    • ZCARD: gets the number of groups
    • ZRANGE: with some processing, makes it possible to fetch the unreceived messages of all groups in one batch
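
To make the interplay of these structures concrete, here is an in-memory Go model of a single group. It is not Redis code: the Group type and its methods are hypothetical, and the comments only indicate which Redis command each step would roughly correspond to (ZREMRANGEBYSCORE for the cleanup is an assumption, as the notes do not name a delete command).

```go
package main

import "sort"

// Group is an in-memory stand-in for the Redis structures of one group.
type Group struct {
	nextID   int64            // string counter, advanced with INCR
	messages map[int64]string // zset of messages, score = message ID
	seen     map[string]int64 // zset of members, score = last received ID
}

// NewGroup creates a group whose members have not received anything yet.
func NewGroup(members ...string) *Group {
	g := &Group{messages: map[int64]string{}, seen: map[string]int64{}}
	for _, m := range members {
		g.seen[m] = 0
	}
	return g
}

// Send assigns the next ID and stores the message (INCR, then ZADD).
func (g *Group) Send(text string) int64 {
	g.nextID++
	g.messages[g.nextID] = text
	return g.nextID
}

// Fetch returns the member's unreceived messages in ID order and records
// the newest ID (ZRANGEBYSCORE, then ZADD on the member zset).
func (g *Group) Fetch(member string) []string {
	last := g.seen[member]
	ids := make([]int64, 0)
	for id := range g.messages {
		if id > last {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		out = append(out, g.messages[id])
		g.seen[member] = id
	}
	return out
}

// Cleanup deletes the messages that every member has received and returns
// how many were removed (ZRANGE 0 0 WITHSCORES for the minimum ID, then a
// range delete such as ZREMRANGEBYSCORE).
func (g *Group) Cleanup() int {
	if len(g.seen) == 0 {
		return 0
	}
	first := true
	var lowest int64
	for _, id := range g.seen {
		if first || id < lowest {
			lowest, first = id, false
		}
	}
	removed := 0
	for id := range g.messages {
		if id <= lowest {
			delete(g.messages, id)
			removed++
		}
	}
	return removed
}
```

The per-person zset (latest received ID per group) is left out of the model; it holds the same information as seen, indexed by person instead of by group.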

File distribution P145

Aggregating user data by geographic location P146

We now have the daily activity times and specific operations of each IP, and need to count how many people were active in each city each day (similar to counting daily active users).

The raw data is huge, so it has to be read into memory in batches for aggregation. The aggregated data, however, is relatively small, so the statistics can be aggregated entirely in memory and the results then written to Redis. This effectively reduces the number of round trips between the program and the Redis service and shortens the task's running time.
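
A minimal sketch of the in-memory aggregation step is shown below. The activity record, the cityDay key, and the IP-to-city lookup function are hypothetical. The sketch counts each IP at most once per city and day, which is one reading of "number of people"; if counting every operation is wanted instead, the inner set can be replaced by a plain counter.

```go
package main

// activity is a hypothetical record: one operation by an IP on a given day.
type activity struct {
	Day string // for example "2021-03-04"
	IP  string
}

// cityDay identifies one counter in the aggregated result.
type cityDay struct {
	Day  string
	City string
}

// Aggregator accumulates the distinct IPs per (day, city) across batches.
type Aggregator struct {
	lookup func(ip string) string // hypothetical IP-to-city lookup
	seen   map[cityDay]map[string]struct{}
}

// NewAggregator creates an empty aggregator using the given lookup.
func NewAggregator(lookup func(ip string) string) *Aggregator {
	return &Aggregator{lookup: lookup, seen: map[cityDay]map[string]struct{}{}}
}

// AddBatch folds one batch of raw records into the in-memory state.
func (a *Aggregator) AddBatch(batch []activity) {
	for _, rec := range batch {
		key := cityDay{Day: rec.Day, City: a.lookup(rec.IP)}
		if a.seen[key] == nil {
			a.seen[key] = map[string]struct{}{}
		}
		a.seen[key][rec.IP] = struct{}{}
	}
}

// Counts returns the aggregated result; only this small map would be
// written to Redis once all batches have been processed.
func (a *Aggregator) Counts() map[cityDay]int {
	out := make(map[cityDay]int, len(a.seen))
	for key, ips := range a.seen {
		out[key] = len(ips)
	}
	return out
}
```

Redis is touched only once per counter at the end, instead of once per raw record.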

Log distribution and processing

Now, the local log of a machine needs to be analyzed by multiple log processors.

This scenario is similar to a group, so we can reuse the message pull component mentioned above, which supports multiple recipients.

Local machine:

  1. Send all logs to the group, followed by an end message
  2. Wait for all log processors to finish (the group's completion counter equals the number of group members minus 1)
  3. Clean up all logs sent in this round

Log processor:

  1. Keep pulling messages from the group and processing them until the end message is pulled
  2. Run INCR on the group's completion counter to indicate that the current log processor has finished
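
The two roles above can be sketched as plain functions, with Redis left out. The endMarker sentinel and both function names are hypothetical. In a real setup the messages would come from the group's message zset and the counter would be a Redis string advanced with INCR.

```go
package main

// endMarker is a hypothetical sentinel standing in for the "end message".
const endMarker = "__END__"

// consume handles log lines until the end marker is seen. It reports how
// many lines were handled and whether the marker was reached; when it was,
// the log processor would INCR the group's completion counter.
func consume(messages []string, handle func(line string)) (int, bool) {
	handled := 0
	for _, m := range messages {
		if m == endMarker {
			return handled, true
		}
		handle(m)
		handled++
	}
	return handled, false
}

// allProcessorsDone mirrors the sender's wait condition: the group holds the
// sender plus the log processors, so every processor has finished once the
// completion counter equals the member count minus 1.
func allProcessorsDone(completionCount, memberCount int) bool {
	return completionCount == memberCount-1
}
```

If consume returns false for the second value, the processor has not seen the end message yet and should keep pulling.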

This article was first published on the official account: full Fu machine, and is open-sourced on GitHub: reading-notes/redis-in-action