Add the built-in function as tidb contributor |

Time:2020-10-26

Recently, we have made some improvements to tidb code, greatly simplifying the process of adding built-in functions. This tutorial will share with you how to add builtin functions to tidb. First, introduce some necessary background knowledge, then introduce the process of adding builtin function, and finally take a function as an example.

background knowledge

After the SQL statement is sent to tidb, it will first go through the parser, change the text parse into ast (abstract syntax tree), generate the execution plan through query optimizer, get an executable plan, and then get the result by executing this plan. During this period, how to get the table will be involved How to filter, calculate, sort, aggregate, filter weight, and evaluate expressions.
For a builtin function, it is important to parse the syntax and how to evaluate it. In the syntax parsing part, we need to know how to write yacc and how to modify tidb’s lexical parser. We have done this part of work in advance, and most of the parsing work of builtin function has been completed.
The evaluation of the builtin function needs to be completed under the expression evaluation framework of tidb. Each builtin function is considered as an expression, which is represented by a scalarfunction. Each builtin function obtains the corresponding function type and function signature through its function name and parameters, and then evaluates it through the function signature.
Generally speaking, the above process is relatively complicated for those who are not familiar with tidb. We have done some work on this part, and have done a unified treatment of some procedural and tedious work. At present, we have completed the parsing of most of the unimplemented buitlin functions and the search for function signatures, but the function implementation part is left blank.In other words, as long as you find the function implementation left blank and complete it, it can be used as a pr.

The whole process of adding builtin function

  • An unimplemented function was found
    Search in the expression directory of tidb source codeerrFunctionNotExistsYou can find all the unimplemented functions and select one of the functions of interest, such as Sha2 function:

func (b *builtinSHA2Sig) eval(row []types.Datum) (d types.Datum, err error) {
    return d, errFunctionNotExists.GenByArgs("SHA2")
}
  • Implement function signature
    The next thing to do is to implement the eval method. Please refer to the MySQL document for the function functions, and refer to the functions that have been implemented for specific implementation methods.

  • Adding type inference information to typeinferer
    In the plan/ typeinferer.go Add the return result type of this function in handlefunccallexpr() in. Please keep the result consistent with that of MySQL. See MySQL const for all type definitions.

*Note that in addition to filling in the return value type, most functions also need to get the length of the return value.
  • Write unit test
    In the expression directory, add unit tests for the implementation of the function. At the same time, you should also add unit tests in the plan / type informer_ test.go Add the unit test of typeinformer in the file

  • Run make dev to make sure all test cases can run through

Examples

Here, the PR of the new sha1() function is taken as an example to explain in detail
First lookexpression/builtin_encryption.go
Complete the evaluation method of sha1()

func (b *builtinSHA1Sig) eval(row []types.Datum) (d types.Datum, err error) {
    //First, the parameter is evaluated, which is generally not modified
    args, err := b.evalArgs(row)
    if err != nil {
        return types.Datum{}, errors.Trace(err)
    }
    //Please refer to the MySQL documentation for the meaning of each parameter
    // SHA/SHA1 function only accept 1 parameter
    arg := args[0]
    if arg.IsNull() {
        return d, nil
    }
    //Here, a type conversion is made for the parameter value. Please refer to util / types for the implementation of the function/ datum.go
    bin, err := arg.ToBytes()
    if err != nil {
        return d, errors.Trace(err)
    }
    hasher := sha1.New()
    hasher.Write(bin)
    data := fmt.Sprintf("%x", hasher.Sum(nil))
    //Set return value
    d.SetString(data)
    return d, nil
}

Next, add unit tests to the function implementation, seeexpression/builtin_encryption_test.go

var shaCases = []struct {
    origin interface{}
    crypt  string
 }{
    {"test", "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3"},
    {"c4pt0r", "034923dcabf099fc4c8917c0ab91ffcd4c2578a6"},
    {"pingcap", "73bf9ef43a44f42e2ea2894d62f0917af149a006"},
    {"foobar", "8843d7f92416211de9ebb963ff4ce28125932878"},
    {1024, "128351137a9c47206c4507dcf2e6fbeeca3a9079"},
    {123.45, "22f8b438ad7e89300b51d88684f3f0b9fa1d7a32"},
 }
 
 func (s *testEvaluatorSuite) TestShaEncrypt(c *C) {
    defer  testleak.AfterTest (c) () // the tool for monitoring goroutine leakage can be copied directly
    fc := funcs[ast.SHA]
    for _, test := range shaCases {
        in := types.NewDatum(test.origin)
        f, _ := fc.getFunction(datumsToConstants([]types.Datum{in}), s.ctx)
        crypt, err := f.eval(nil)
        c.Assert(err, IsNil)
        res, err := crypt.ToString()
        c.Assert(err, IsNil)
        c.Assert(res, Equals, test.crypt)
    }
    // test NULL input for sha
    var argNull types.Datum
    f, _ := fc.getFunction(datumsToConstants([]types.Datum{argNull}), s.ctx)
    crypt, err := f.eval(nil)
    c.Assert(err, IsNil)
    c.Assert(crypt.IsNull(), IsTrue)
}
*Note that in addition to the normal case, it is better to add some exception cases, such as the input value of nil, or multiple types of parameters

Finally, you need to add type derivation information and test case. Seeplan/typeinferer.goplan/typeinferer_test.go

case ast.SHA, ast.SHA1:
        tp = types.NewFieldType(mysql.TypeVarString)
        chs = v.defaultCharset
        tp.Flen = 40
        {`sha1(123)`, mysql.TypeVarString, "utf8"},
        {`sha(123)`, mysql.TypeVarString, "utf8"},

Editor’s note: add tidb robot wechat, join tidb contributor club, participate in open source projects without barriers, and change the world. Start here (Mengmeng DA).

Add the built-in function as tidb contributor |

It’s said that if you become a tidb contributor, you’ll get a limited edition mug~

Add the built-in function as tidb contributor |