Design the cache well, and the service will basically never go down

Time: 2021-07-27

This article is adapted from the go-zero live stream in the fourth issue of “go open source theory”. The video is long, so the write-up is divided into two parts. The content has been trimmed and reorganized for this article.

Hello, I’m glad to be here at “go open source theory” to share some of the stories, design ideas and usage behind an open source project. Today’s project is go-zero, a web and RPC framework that integrates various engineering practices. I’m Kevin, the author of go-zero, and my GitHub ID is kevwan.

go-zero overview

Although go-zero was only open-sourced on August 7, 2020, it had already been tested at large scale in production, and it distills nearly 20 years of my engineering experience. After open-sourcing it, I received positive feedback from the community, and the project gained 6K stars in a little over five months. It has topped the GitHub Go daily, weekly and monthly lists many times, and it won the Gitee Most Valuable Project (GVP) award and the most popular project of the year at Open Source China. The WeChat community is also extremely active: in a community group of 3000+ people, go-zero enthusiasts share their experience with go-zero and discuss the problems they run into.

How does go-zero automatically manage the cache?

Cache design principles

We only delete the cache and do not update it. Once the data in the DB is modified, we will directly delete the corresponding cache instead of updating it.

Let’s see how to delete the cache in the correct order.

  • Delete the cache before updating the DB

Consider two concurrent requests. Request A needs to update the data and deletes the cache first. Request B then reads the data. Because the cache is empty at that moment, B loads the data from the DB and writes it back to the cache. Then A updates the DB. From this point on, the data in the cache stays dirty until the cache entry expires or another request updates the data.

  • Update the DB before deleting the cache

Request A updates the DB first. Request B then reads the data and gets the old value from the cache. This is equivalent to A's update not having happened yet, and under eventual consistency that is acceptable. A then deletes the cache, and subsequent requests get the latest data.
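
The two orderings above can be replayed step by step. The following is a minimal sketch, not go-zero code: the `state` type and its methods are hypothetical, a single string stands in for the row, and the interleaving is executed sequentially so the result is deterministic.

```go
package main

import "fmt"

// state is a hypothetical stand-in for one row:
// db plays the database, cache plays redis.
type state struct {
	db     string
	cache  string
	cached bool
}

// read returns the cached value; on a miss it loads the value
// from the DB and writes it back to the cache.
func (s *state) read() string {
	if !s.cached {
		s.cache, s.cached = s.db, true
	}
	return s.cache
}

func (s *state) updateDB(v string) { s.db = v }
func (s *state) deleteCache()      { s.cache, s.cached = "", false }

// deleteThenUpdate replays the first ordering: A deletes the cache,
// B reads and refills it with the old value, then A updates the DB.
func deleteThenUpdate() *state {
	s := &state{db: "old", cache: "old", cached: true}
	s.deleteCache()   // request A
	s.read()          // request B
	s.updateDB("new") // request A
	return s
}

// updateThenDelete replays the second ordering: A updates the DB,
// B reads the old cached value, then A deletes the cache.
func updateThenDelete() *state {
	s := &state{db: "old", cache: "old", cached: true}
	s.updateDB("new") // request A
	s.read()          // request B
	s.deleteCache()   // request A
	return s
}

// describe reports the DB value and what the next reader would see.
func describe(name string, s *state) string {
	return fmt.Sprintf("%s: db=%s, next read=%s", name, s.db, s.read())
}

func main() {
	fmt.Println(describe("delete then update", deleteThenUpdate()))
	fmt.Println(describe("update then delete", updateThenDelete()))
}
```

In the first ordering the next reader still sees `old` although the DB holds `new`. In the second ordering the next reader sees `new`.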

Let’s take another look at the normal request process:

  1. The first request updates the DB and deletes the cache
  2. The second request reads the cache. If there is no data, it reads the data from the DB and writes it back to the cache
  3. Subsequent read requests can be read directly from the cache

Now let’s look at DB queries. Suppose a row record has seven columns, A through G:

  1. Queries for only some of the columns, such as ABC, CDE or EFG
  2. Queries for a single complete row record
  3. Queries for some or all columns of multiple row records

Of these three cases, first, we do not use partial-column queries, because partial results cannot be cached: once they are cached and the data is then updated, there is no way to tell which cache entries need to be deleted. Second, for multi-row queries, we build the mapping from query conditions to primary keys in the business layer, according to the actual scenario and needs. Third, for queries of a single complete row record, go-zero has built-in cache management. So the core principle is: go-zero only caches complete row records.
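
One possible shape of that business-layer handling of multi-row queries is sketched below. This is an assumption for illustration, not go-zero code: the `table` type, `idsByVendor` and `findByVendor` are hypothetical names, and slices and maps stand in for the DB and redis. The condition is resolved to primary keys first, and every row is then read through the complete-row cache.

```go
package main

import "fmt"

type row struct {
	ID     int64
	Vendor string
	Name   string
}

// table is a hypothetical in-memory stand-in: rows plays the DB table,
// cache plays the complete-row cache keyed by primary key.
type table struct {
	rows    []row
	cache   map[int64]row
	dbLoads int // number of full-row loads from the DB, for illustration
}

// idsByVendor plays the business-layer mapping from a query condition to
// primary keys; a real service might run a query that selects only the ids.
func (t *table) idsByVendor(vendor string) []int64 {
	var ids []int64
	for _, r := range t.rows {
		if r.Vendor == vendor {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

// findOne reads one complete row through the primary-key cache.
func (t *table) findOne(id int64) (row, error) {
	if r, ok := t.cache[id]; ok {
		return r, nil
	}
	for _, r := range t.rows {
		if r.ID == id {
			t.dbLoads++
			t.cache[id] = r
			return r, nil
		}
	}
	return row{}, fmt.Errorf("row %d not found", id)
}

// findByVendor answers a multi-row query using only complete cached rows.
func (t *table) findByVendor(vendor string) ([]row, error) {
	var out []row
	for _, id := range t.idsByVendor(vendor) {
		r, err := t.findOne(id)
		if err != nil {
			return nil, err
		}
		out = append(out, r)
	}
	return out, nil
}

func main() {
	tbl := &table{
		rows: []row{
			{ID: 1, Vendor: "acme", Name: "bolt"},
			{ID: 2, Vendor: "acme", Name: "nut"},
			{ID: 3, Vendor: "other", Name: "gear"},
		},
		cache: map[int64]row{},
	}
	rows, err := tbl.findByVendor("acme")
	fmt.Println(rows, err, tbl.dbLoads)
}
```

Because only complete rows are cached, updating one row still means deleting exactly one cache entry.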

Let’s look in detail at how go-zero handles caching in its three built-in scenarios:

  1. Primary key based caching

    PRIMARY KEY (`id`)

    This kind of cache is the easiest to handle: we only need to cache the row record in redis, using the primary key as the key.

  2. Cache based on unique index

    When designing the index-based cache, I borrowed from the way database indexes are designed. In a database, when you look up data through an index, the engine first searches the index -> primary key tree to find the primary key, and then queries the row record through the primary key. In other words, it introduces a layer of indirection to solve the problem of mapping an index to a row record. The cache design of go-zero uses the same principle.

    Index based cache is divided into single column unique index and multi column unique index:

    • The unique index of a single column is as follows:

      UNIQUE KEY `product_idx` (`product`)
    • The unique indexes of multiple columns are as follows:

      UNIQUE KEY `vendor_product_idx` (`vendor`, `product`)

    For go-zero, however, single-column and multi-column indexes differ only in how the cache key is generated; the control logic behind them is the same. Beyond data consistency, the built-in cache management of go-zero also prevents cache breakdown, penetration and avalanche (these were discussed in detail at the GopherChina conference; see the GopherChina talk video).

    In addition, go-zero has built-in statistics on cache accesses and hit ratio, as shown below:

    dbcache(sqlc) - qpm: 5057, hit_ratio: 99.7%, hit: 5044, miss: 13, db_fails: 0

    These fairly detailed statistics make it easy to analyze how the cache is being used. Where the cache hit ratio is very low or the request volume is very small, we can remove the cache, which also reduces cost.
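
To show how the numbers in that log line relate to each other, here is a minimal sketch. The `stat` type and `summary` method are hypothetical, not go-zero's implementation, and the sketch assumes that `qpm` is simply `hit + miss` for a one-minute interval, which matches the sample line (5044 + 13 = 5057).

```go
package main

import "fmt"

// stat holds hypothetical hit and miss counters for one reporting interval.
type stat struct {
	hit  uint64
	miss uint64
}

// summary formats the counters in the same shape as the sample log line.
// The hit ratio is hit / (hit + miss), printed as a percentage.
func (s stat) summary() string {
	total := s.hit + s.miss
	ratio := 0.0
	if total > 0 {
		ratio = 100 * float64(s.hit) / float64(total)
	}
	return fmt.Sprintf("qpm: %d, hit_ratio: %.1f%%, hit: %d, miss: %d",
		total, ratio, s.hit, s.miss)
}

func main() {
	// Prints: qpm: 5057, hit_ratio: 99.7%, hit: 5044, miss: 13
	fmt.Println(stat{hit: 5044, miss: 13}.summary())
}
```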

Cache code interpretation

1. Cache logic based on primary key

The specific implementation code is as follows:

func (cc CachedConn) QueryRow(v interface{}, key string, query QueryFn) error {
  return cc.cache.Take(v, key, func(v interface{}) error {
    return query(cc.db, v)
  })
}

Here, the Take method first tries to get the data from the cache by key. If the data is there, it is returned directly. If not, the query method reads the complete row record from the DB, and the record is written back to the cache before the data is returned. The whole logic is simple and easy to understand.

Let’s take a closer look at the implementation of Take:

func (c cacheNode) Take(v interface{}, key string, query func(v interface{}) error) error {
  return c.doTake(v, key, query, func(v interface{}) error {
    return c.SetCache(key, v)
  })
}

The logic of Take is as follows:

  • Use key to look up the data in the cache
  • If found, return the data
  • If not found, use the query method to read the data
  • After reading, call c.SetCache(key, v) to set the cache

The code of doTake, with explanations, is as follows:

// v - data object to be read
// key - cache key
// query - method used to read the complete data from the DB
// cacheVal - method used to write the cache
func (c cacheNode) doTake(v interface{}, key string, query func(v interface{}) error,
  cacheVal func(v interface{}) error) error {
  // use a barrier to prevent cache breakdown: within one process, only one request loads the data for a given key
  val, fresh, err := c.barrier.DoEx(key, func() (interface{}, error) {
    // read data from the cache
    if err := c.doGetCache(key, v); err != nil {
      // if it is a placeholder set earlier (to prevent cache penetration), return the preset errNotFound
      // if it is an unknown error, return it directly: we must not ignore a cache error and send every request to the DB,
      // because that would bring the DB down under high concurrency
      if err == errPlaceholder {
        return nil, c.errNotFound
      } else if err != c.errNotFound {
        // why we just return the error instead of query from db,
        // because we don't allow the disaster pass to the DBs.
        // fail fast, in case we bring down the dbs.
        return nil, err
      }

      // query the DB
      // if the returned error is errNotFound, set a placeholder in the cache to prevent cache penetration
      if err = query(v); err == c.errNotFound {
        if err = c.setCacheWithNotFound(key); err != nil {
          logx.Error(err)
        }

        return nil, c.errNotFound
      } else if err != nil {
        // count the DB failure
        c.stat.IncrementDbFails()
        return nil, err
      }

      // write the data to the cache
      if err = cacheVal(v); err != nil {
        logx.Error(err)
      }
    }

    // return the JSON-serialized data
    return jsonx.Marshal(v)
  })
  if err != nil {
    return err
  }
  if fresh {
    return nil
  }

  // got the result from previous ongoing query
  c.stat.IncrementTotal()
  c.stat.IncrementHit()

  // write the data into the v object passed in
  return jsonx.Unmarshal(val.([]byte), v)
}

2. Cache logic based on unique index

Because this case is more complex, I marked the corresponding code blocks and logic with different colors. block 2 is in fact the same as the primary-key-based cache, so here we mainly discuss the logic of block 1.

block 1 covers two cases:

  1. The primary key can be found from the cache through the index

    In this case, we go straight to block 2 with the primary key. From there on, the logic is the same as the primary-key-based cache above

  2. The primary key cannot be found from the cache through the index

    • Query the complete row record from the DB through the index; if there is an error, return it
    • Once the complete row record is found, write both the primary key -> complete row record cache and the index -> primary key cache to redis
    • Return the requested row record data

    // v - data object to be read
    // key - cache key generated from the index
    // keyer - builds the primary-key-based cache key from the primary key
    // indexQuery - reads the complete data from the DB by index; it must return the primary key
    // primaryQuery - reads the complete data from the DB by primary key
    func (cc CachedConn) QueryRowIndex(v interface{}, key string, keyer func(primary interface{}) string,
      indexQuery IndexQueryFn, primaryQuery PrimaryQueryFn) error {
      var primaryKey interface{}
      var found bool

      // first, query the cache by index to see whether the index -> primary key mapping is cached
      if err := cc.cache.TakeWithExpire(&primaryKey, key, func(val interface{}, expire time.Duration) (err error) {
        // the index -> primary key mapping is not cached, so query the complete data by index
        primaryKey, err = indexQuery(cc.db, v)
        if err != nil {
          return
        }

        // the complete data has been queried by index: set found so it is used directly later, without reading it from the cache again
        found = true
        // save the primary key -> complete data mapping to the cache; TakeWithExpire itself saves the index -> primary key mapping
        return cc.cache.SetCacheWithExpire(keyer(primaryKey), v, expire+cacheSafeGapBetweenIndexAndPrimary)
      }); err != nil {
        return err
      }

      // the data has already been found through the index, so return directly
      if found {
        return nil
      }

      // read the data from the cache by primary key; on a miss, primaryQuery reads it from the DB and it is written back to the cache before returning
      return cc.cache.Take(v, keyer(primaryKey), func(v interface{}) error {
        return primaryQuery(cc.db, v, primaryKey)
      })
    }

    Let’s look at a practical example

    func (m *defaultUserModel) FindOneByUser(user string) (*User, error) {
      var resp User
      // generate the index-based cache key
      indexKey := fmt.Sprintf("%s%v", cacheUserPrefix, user)

      err := m.QueryRowIndex(&resp, indexKey,
        // generate the cache key of the complete data from the primary key
        func(primary interface{}) string {
          return fmt.Sprintf("user#%v", primary)
        },
        // index-based DB query
        func(conn sqlx.SqlConn, v interface{}) (i interface{}, e error) {
          query := fmt.Sprintf("select %s from %s where user = ? limit 1", userRows, m.table)
          if err := conn.QueryRow(&resp, query, user); err != nil {
            return nil, err
          }
          return resp.Id, nil
        },
        // primary-key-based DB query
        func(conn sqlx.SqlConn, v, primary interface{}) error {
          query := fmt.Sprintf("select %s from %s where id = ?", userRows, m.table)
          return conn.QueryRow(&resp, query, primary)
        })

      // error handling: if sqlc.ErrNotFound is returned, return the ErrNotFound defined in this package instead,
      // so that callers need not know whether a cache is used; this also isolates the underlying dependency
      switch err {
      case nil:
        return &resp, nil
      case sqlc.ErrNotFound:
        return nil, ErrNotFound
      default:
        return nil, err
      }
    }

All of the automatic cache management code above can be generated by goctl. Within our team, CRUD and cache code is basically all generated by goctl, which saves a lot of development time. Cache code is also very error-prone: even with solid coding experience, it is hard to get it right every time. We therefore recommend using an automatic cache code generation tool wherever possible to avoid mistakes.

Need more?

If you want to understand the go-zero project better, please visit the official website for concrete examples.

Video playback address

www.bilibili.com/video/BV1Jy4y127X…

Project address

github.com/tal-tech/go-zero

You are welcome to use go-zero and star the project to support us!

This work is licensed under a CC license. Reprints must credit the author and link to this article.