Detailed explanation of MySQL function call optimization

Time:2021-10-15

Function call optimization

MySQL functions are internally marked as deterministic or uncertain. If a function with a fixed value of a given parameter can return different results for different calls, it is uncertain. Examples of uncertain functions:RAND()UUID()

If a function is marked as indeterminate, theWHEREEvaluate the reference to the function in the clause for each row (when selecting from a table) or combination of rows (when selecting from a multi table join).

MySQL also determines when to evaluate a function based on the type of parameter (whether the parameter is a table column or a constant value). Whenever a table column changes its value, you must evaluate the deterministic function that uses the table column as a parameter.

Non deterministic functions may affect query performance. For example, some optimizations may not be available, or more locks may be required. The following discussion usesRAND()But it also applies to other uncertainty functions.

Suppose a table t has the following definitions:


CREATE TABLE t (id INT NOT NULL PRIMARY KEY, col_a VARCHAR(100));

Consider the following two queries:


SELECT * FROM t WHERE id = POW(1,2);
SELECT * FROM t WHERE id = FLOOR(1 + RAND() * 49);

Due to the equality comparison with the primary key, both queries seem to use the primary key lookup, but this only applies to the first query:

  • The first query always produces at most one row becausePOW()A constant with a constant parameter is a constant value and is used for index lookup.
  • The second query contains an expression that uses a nondeterministic functionRAND()It is not a constant in the query, but actually has a new value t for each row of the table. Therefore, the query reads each row of the table, evaluates the predicate of each row, and outputs all rows whose primary key matches the random value. Depending on the ID column value and the value in the rand () sequence, it can be zero row, one row or more rows.

The impact of uncertainty is not limited toSELECTstate. ShouldUPDATEStatement uses a non deterministic function to select the row to modify:


UPDATE t SET col_a = some_expr WHERE id = FLOOR(1 + RAND() * 49);

The general purpose is to update at most one row whose primary key matches the expression. However, it may update zero, one or more rows, depending onidColumn value sumRAND()Values in the sequence.

The behavior just described has an impact on performance and replication:

  • Since it is uncertain that the function will not produce a constant value, the optimizer cannot use other strategies that may apply, such as index lookup. The result may be a table scan.
  • InnoDBIt is possible to upgrade to a range key lock instead of acquiring a single row lock for a matching row.
  • It cannot be determined that the update performed is not safe for replication.

Difficulties stem fromRAND()The fact that the function is evaluated once for each row of the table. To avoid a multi-functional assessment, use one of the following techniques:

  • Move the expression containing the uncertainty function to a separate statement and save the value in the variable. In the original statement, replace the expression with a reference to a variable that the optimizer can treat as a constant value:

SET @keyval = FLOOR(1 + RAND() * 49);
UPDATE t SET col_a = some_expr WHERE id = @keyval;
  • Assign random values to variables in derived tables. This technique enables variables toWHEREA value is assigned before being used in the comparison in clause:

SET optimizer_switch = 'derived_merge=off';
UPDATE t, (SELECT @keyval := FLOOR(1 + RAND() * 49)) AS dt
SET col_a = some_expr WHERE id = @keyval;

As mentioned earlier, theWHEREUncertain expressions in clauses may prevent optimization and cause table scans. But,WHEREIf other expressions are deterministic, the clause can be partially optimized. For example:


SELECT * FROM t WHERE partial_key=5 AND some_column=RAND();

If the optimizer canpartial_keyUsed to reduce the set of selected rows,RAND()The number of execution is less, which can reduce the impact of uncertainty on optimization.

The above is a detailed explanation of MySQL function call optimization. For more information about MySQL function call optimization, please pay attention to other related articles of developeppaer!

Recommended Today

SQL exercise 20 – Modeling & Reporting

This blog is used to review and sort out the common topic modeling architecture, analysis oriented architecture and integration topic reports in data warehouse. I have uploaded these reports to GitHub. If you are interested, you can have a lookAddress:https://github.com/nino-laiqiu/TiTanI recorded a relatively complete development process in my hexo blog deployed on GitHub. You can […]