Golang Version of Multiple Transactions in Distributed Message Queue Based on 2PC and Delayed Update



Distributed Multi-Message Transaction Problem

In message-queue scenarios it is sometimes necessary to send several messages together, but message queues such as Kafka only provide transactional guarantees for a single message, not for a group of messages. The solution described here, taken from a Kafka sub-project, combines 2PC with delayed update to achieve a distributed multi-message transaction.


2PC
2PC, commonly known as two-phase commit, splits an operation into two phases, a prepare phase and a commit phase, to ensure as far as possible that the operation executes atomically (strictly speaking full atomicity cannot be guaranteed; for now just keep the concept in mind).
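To make the two phases concrete, here is a minimal, self-contained sketch of a 2PC coordinator. It is not part of the article's code; the `Participant` interface and `resource` type are hypothetical stand-ins for real resources:

```go
package main

import "fmt"

// Participant is a hypothetical resource taking part in the transaction.
type Participant interface {
	Prepare() error // phase 1: vote on whether the operation can proceed
	Commit()        // phase 2: make the operation permanent
	Rollback()      // undo a successful prepare when another vote fails
}

// twoPhaseCommit drives both phases: if any participant fails to
// prepare, every participant that already prepared is rolled back.
func twoPhaseCommit(participants []Participant) bool {
	prepared := make([]Participant, 0, len(participants))
	for _, p := range participants {
		if err := p.Prepare(); err != nil {
			for _, q := range prepared {
				q.Rollback()
			}
			return false
		}
		prepared = append(prepared, p)
	}
	for _, p := range participants {
		p.Commit()
	}
	return true
}

// resource is a toy participant that can be told to refuse preparation.
type resource struct {
	canPrepare bool
	state      string
}

func (r *resource) Prepare() error {
	if !r.canPrepare {
		return fmt.Errorf("prepare refused")
	}
	r.state = "prepared"
	return nil
}
func (r *resource) Commit()   { r.state = "committed" }
func (r *resource) Rollback() { r.state = "rolled back" }

func main() {
	a := &resource{canPrepare: true}
	b := &resource{canPrepare: false}
	fmt.Println(twoPhaseCommit([]Participant{a}))    // both phases succeed
	fmt.Println(twoPhaseCommit([]Participant{a, b})) // b refuses, a is rolled back
}
```

The sketch also shows why full atomicity is out of reach: if the coordinator crashes between the two phases, participants are left prepared but not committed.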

Delayed updates

Delayed update is a very common technique: when the condition for executing an operation is not yet satisfied, the data is temporarily stored by some means, and the operation is executed once the condition is met.
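A minimal sketch of the idea (not from the article; `delayedBuffer` and the "order-1" key are illustrative): operations are buffered per key instead of being applied, and flushed in one batch when the caller decides the condition is satisfied.

```go
package main

import "fmt"

// delayedBuffer temporarily stores values per key until the caller
// decides the execution condition is satisfied, then flushes them.
type delayedBuffer struct {
	pending map[string][]string
}

func newDelayedBuffer() *delayedBuffer {
	return &delayedBuffer{pending: make(map[string][]string)}
}

// add stores a value instead of applying it immediately
func (d *delayedBuffer) add(key, value string) {
	d.pending[key] = append(d.pending[key], value)
}

// flush applies all stored values for key once the condition is met,
// then discards them
func (d *delayedBuffer) flush(key string, apply func(string)) {
	for _, v := range d.pending[key] {
		apply(v)
	}
	delete(d.pending, key)
}

func main() {
	buf := newDelayedBuffer()
	// condition not yet satisfied: only store the operations
	buf.add("order-1", "reserve stock")
	buf.add("order-1", "charge card")
	// condition satisfied: execute everything that was stored
	buf.flush("order-1", func(op string) { fmt.Println("apply:", op) })
}
```

This is exactly the role the worker's local delayed storage plays in the design below, with the transaction-commit message acting as the condition.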

Implementation of Distributed Transaction Based on 2PC and Delay Queue

system architecture

The implementation is quite simple: after the original group of business messages, a transaction-commit message is appended (the commit message can be related to the previously sent messages through something like a unique ID). As long as the worker has not consumed the commit message, it keeps the business messages in local delayed storage; only when the transaction-commit message is received does it perform the business-logic processing.

operation flow

Producer side:

  1. Send the business messages of the group one by one
  2. Send the transaction-commit message

Consumer (worker) side:

  1. Consume the message queue and store business messages in local delayed storage
  2. On receiving the transaction-commit message, retrieve all data for the transaction from the local delayed storage, process it, and then delete it from the delayed storage

code implementation

Core components

MemoryQueue: simulates the message queue; receives events and distributes them to listeners
Worker: simulates a concrete business service; receives messages, stores them in local delayed-update storage, and triggers the business callback when the transaction is committed

Event and EventListener

Event: identifies an event; users wrap business data into an Event and push it into the MemoryQueue
EventListener: the event-callback interface, invoked after the MemoryQueue receives data
When an event is sent, its key carries a prefix that identifies its kind. There are three prefixes: TaskPrefix, CommitTaskPrefix, and ClearTaskPrefix.

const (
    // TaskPrefix business task key prefix
    TaskPrefix string = "task-"
    // CommitTaskPrefix commit task key prefix
    CommitTaskPrefix string = "commit-"
    // ClearTaskPrefix clear task key prefix
    ClearTaskPrefix string = "clear-"
)

// Event event type
type Event struct {
    Key   string
    Name  string
    Value interface{}
}

// EventListener for receiving message callbacks
type EventListener interface {
    onEvent(event *Event)
}

MemoryQueue is the in-memory message queue. It receives user data through the Push interface, registers EventListeners through AddListener, and internally, via poll, distributes data from the event channel to all listeners.

// MemoryQueue Memory Message Queue
type MemoryQueue struct {
    done      chan struct{}
    queue     chan Event
    listeners []EventListener
    wg        sync.WaitGroup
}

// NewMemoryQueue creates a memory queue with the given buffer size
func NewMemoryQueue(size int) *MemoryQueue {
    return &MemoryQueue{
        done:  make(chan struct{}),
        queue: make(chan Event, size),
    }
}

// Push adds data to the queue
func (mq *MemoryQueue) Push(eventType, name string, value interface{}) {
    mq.wg.Add(1)
    mq.queue <- Event{Key: eventType + name, Name: name, Value: value}
}

// AddListener adds a listener; returns false if it is already registered
func (mq *MemoryQueue) AddListener(listener EventListener) bool {
    for _, item := range mq.listeners {
        if item == listener {
            return false
        }
    }
    mq.listeners = append(mq.listeners, listener)
    return true
}

// Notify distributes an event to all listeners
func (mq *MemoryQueue) Notify(event *Event) {
    defer mq.wg.Done()
    for _, listener := range mq.listeners {
        listener.onEvent(event)
    }
}

func (mq *MemoryQueue) poll() {
    for {
        select {
        case <-mq.done:
            return
        case event := <-mq.queue:
            mq.Notify(&event)
        }
    }
}

// Start starts the memory queue
func (mq *MemoryQueue) Start() {
    go mq.poll()
}

// Stop waits for pending events to be processed, then stops the queue
func (mq *MemoryQueue) Stop() {
    mq.wg.Wait()
    close(mq.done)
}

The Worker receives data from the MemoryQueue and dispatches each event locally according to its type, selecting the corresponding callback by the event-key prefix.

// ConfigUpdateCallback business callback invoked on transaction commit
type ConfigUpdateCallback func(data map[string]string)

// Worker Work Process
type Worker struct {
    name                string
    deferredTaskUpdates map[string][]Task
    onCommit            ConfigUpdateCallback
}

// NewWorker creates a worker with a commit callback
func NewWorker(name string, onCommit ConfigUpdateCallback) *Worker {
    return &Worker{name: name, deferredTaskUpdates: map[string][]Task{}, onCommit: onCommit}
}

func (w *Worker) onEvent(event *Event) {
    switch {
    // business task event: store in local delayed storage
    case strings.Contains(event.Key, TaskPrefix):
        w.onTaskEvent(event)
    // clear event: drop delayed tasks for the group
    case strings.Contains(event.Key, ClearTaskPrefix):
        w.onTaskClear(event)
    // commit event: apply all delayed tasks for the group
    case strings.Contains(event.Key, CommitTaskPrefix):
        w.onTaskCommit(event)
    }
}

Event Handling Tasks

Event handling is split across three methods: onTaskEvent, onTaskCommit, and onTaskClear.

// onTaskClear removes a group's tasks from the local delayed storage
func (w *Worker) onTaskClear(event *Event) {
    task, ok := event.Value.(Task)
    if !ok {
        // log the type error and ignore the event
        return
    }
    if _, found := w.deferredTaskUpdates[task.Group]; !found {
        return
    }
    delete(w.deferredTaskUpdates, task.Group)
    // Locally started tasks could also be stopped here
}

// onTaskCommit receives the transaction commit, retrieves the data from the delayed storage and performs the business-logic processing
func (w *Worker) onTaskCommit(event *Event) {
    // Get all tasks previously received for this group
    tasks, found := w.deferredTaskUpdates[event.Name]
    if !found {
        return
    }

    // Merge the task configurations
    config := w.getTasksConfig(tasks)
    if w.onCommit != nil {
        w.onCommit(config)
    }
    delete(w.deferredTaskUpdates, event.Name)
}

// onTaskEvent receives task data; it is only stored in local temporary storage and must not be applied yet
func (w *Worker) onTaskEvent(event *Event) {
    task, ok := event.Value.(Task)
    if !ok {
        // log the type error and ignore the event
        return
    }

    // Save the task into the delayed update map
    configs, found := w.deferredTaskUpdates[task.Group]
    if !found {
        configs = make([]Task, 0)
    }
    configs = append(configs, task)
    w.deferredTaskUpdates[task.Group] = configs
}

// getTasksConfig merges the configurations of all tasks in the list
func (w *Worker) getTasksConfig(tasks []Task) map[string]string {
    config := make(map[string]string)
    for _, t := range tasks {
        config = t.updateConfig(config)
    }
    return config
}

Main process

func main() {

    // Create and start the memory queue
    queue := NewMemoryQueue(10)
    queue.Start()

    // Create a worker and register it with the queue
    name := "test"
    worker := NewWorker(name, func(data map[string]string) {
        for key, value := range data {
            println("worker get task key: " + key + " value: " + value)
        }
    })
    queue.AddListener(worker)

    taskName := "test"
    // Business task events to send
    configs := []map[string]string{
        {"task1": "SendEmail", "params1": "Hello world"},
        {"task2": "SendMQ", "params2": "Hello world"},
    }

    // Clear stale data, send the business messages, then commit the transaction
    queue.Push(ClearTaskPrefix, taskName, nil)
    for _, conf := range configs {
        queue.Push(TaskPrefix, taskName, Task{Name: taskName, Group: taskName, Config: conf})
    }
    queue.Push(CommitTaskPrefix, taskName, nil)
    // Stop the queue (waits until all events are processed)
    queue.Stop()
}

# go run main.go
worker get task key: params1 value: Hello world
worker get task key: task1 value: SendEmail
worker get task key: params2 value: Hello world
worker get task key: task2 value: SendMQ


In a distributed environment, a CP model is often unnecessary; in many cases it is enough to satisfy eventual consistency.

This design, based on 2PC and a delay queue, relies mainly on an event-driven architecture.

In Kafka Connect, every node change triggers a task redistribution, so the delayed storage is simply an in-memory HashMap: even if the node performing the message distribution crashes, another event is triggered, the data in the HashMap is cleared, and the next round proceeds. There is no need to guarantee that the data in the delayed storage is never lost.

So the solution can make trade-offs like this depending on the environment and requirements; there is no need to bolt a CP-model middleware onto everything. And of course, it is simpler this way.

To be continued! More articles are available at http://www.sreguide.com/