Analysis of performance problems caused by connection with table valued functions

Time:2021-5-8

Table valued function

    SQL server provides functions similar to other programming languages, and the essence of functions is usually encapsulation of a piece of code and return value. In SQL server, functions can return not only simple data types (int, varchar, etc.), but also a set, that is, a table.
    According to whether to return the set directly or after definition, table valued functions are divided into inline user-defined table valued functions and user-defined table valued functions (hereinafter referred to as table valued functions, omitting the word “user defined”).

inline table-valued functions
    There is no difference between an inline table valued function and an ordinary function. The only difference is that the returned result is a set (table), not a simple data type. A simple inline table valued function is shown in code listing 1 (excerpted from MSDN).


CREATE FUNCTION Sales.ufn_CustomerNamesInRegion
( @Region nvarchar(50) )
RETURNS table
AS
RETURN (
SELECT DISTINCT s.Name AS Store, a.City
FROM Sales.Store AS s
INNER JOIN Person.BusinessEntityAddress AS bea 
ON bea.BusinessEntityID = s.BusinessEntityID 
INNER JOIN Person.Address AS a 
ON a.AddressID = bea.AddressID
INNER JOIN Person.StateProvince AS sp 
ON sp.StateProvinceID = a.StateProvinceID
WHERE sp.Name = @Region
);
GO

Code listing 1. A simple table valued function

User defined table valued functions
For user-defined table valued functions, you need to define the returned table structure at the beginning of the function, and then you can write any code for data operation, insert it into the defined table structure and return it. An example of a slightly responsible user-defined table valued function is shown in code listing 2 (excerpted from MSDN).


CREATE FUNCTION dbo.ufnGetContactInformation(@ContactID int) 
RETURNS @retContactInformation TABLE 
( 
-- Columns returned by the function 
ContactID int PRIMARY KEY NOT NULL, 
FirstName nvarchar(50) NULL, 
LastName nvarchar(50) NULL, 
JobTitle nvarchar(50) NULL, 
ContactType nvarchar(50) NULL 
) 
AS 
-- Returns the first name, last name, job title, and contact type for the specified contact. 
BEGIN 
DECLARE 
@FirstName nvarchar(50), 
@LastName nvarchar(50), 
@JobTitle nvarchar(50), 
@ContactType nvarchar(50); 
-- Get common contact information 
SELECT 
@ContactID = BusinessEntityID, 
@FirstName = FirstName, 
@LastName = LastName 
FROM Person.Person 
WHERE BusinessEntityID = @ContactID; 
-- Get contact job title 
SELECT @JobTitle = 
CASE 
-- Check for employee 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'EM') 
THEN (SELECT JobTitle 
FROM HumanResources.Employee AS e 
WHERE e.BusinessEntityID = @ContactID) 
-- Check for vendor 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'VC') 
THEN (SELECT ct.Name 
FROM Person.ContactType AS ct 
INNER JOIN Person.BusinessEntityContact AS bec 
ON bec.ContactTypeID = ct.ContactTypeID 
WHERE bec.PersonID = @ContactID) 
 
-- Check for store 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'SC') 
THEN (SELECT ct.Name 
FROM Person.ContactType AS ct 
INNER JOIN Person.BusinessEntityContact AS bec 
ON bec.ContactTypeID = ct.ContactTypeID 
WHERE bec.PersonID = @ContactID) 
ELSE NULL 
END; 
-- Get contact type 
SET @ContactType = 
CASE 
-- Check for employee 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'EM') 
THEN 'Employee' 
-- Check for vendor 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'VC') 
THEN 'Vendor Contact' 
-- Check for store 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'SC') 
THEN 'Store Contact' 
-- Check for individual consumer 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'IN') 
THEN 'Consumer' 
-- Check for general contact 
WHEN EXISTS(SELECT * FROM Person.Person AS p 
WHERE p.BusinessEntityID = @ContactID AND p.PersonType = 'GC') 
THEN 'General Contact' 
END; 
-- Return the information to the caller 
IF @ContactID IS NOT NULL 
BEGIN 
INSERT @retContactInformation 
SELECT @ContactID, @FirstName, @LastName, @JobTitle, @ContactType; 
END; 
RETURN; 
END; 
GO

Code order 2. Table valued function

Why use table valued functions
    It doesn’t look like a table valued function does anything different from a stored procedure, but it does. Because table valued functions can be used to write other queries, but stored procedures can’t. In addition, the combination of table valued function and apply operator can greatly simplify the join operation.

If a stored procedure meets one of the following conditions, consider rewriting it as a table valued function.

The logic of stored procedure is very simple. It’s just a select statement. The reason why you don’t use the view is that you need parameters.
There is no update operation in the stored procedure.
There is no dynamic SQL in the stored procedure.
· only one result set is returned in a stored procedure.
The main purpose of a stored procedure is to generate a temporary result set and store the result set in a temporary table for other queries to call.

The problem of user defined table valued function

    The table valued function is different from the inline table valued function. The inline table valued function is more like a view in the process of processing, which means that in the query optimization stage, the inline table valued function can participate in the optimization of the query optimizer, such as pushing the filter condition (where) to the bottom of the algebraic tree, which means that where can be joined first, Thus, index lookup can be used to reduce IO and improve performance.
    Let’s take a simple example. The following code example is a simple example of joining with a table valued function:
    First, we create table valued functions, which are inline and table valued functions, as shown in code listing 3.

--Number of table valued rows created 
CREATE FUNCTION tvf_multi_Test ( ) 
RETURNS @SaleDetail TABLE ( ProductId INT ) 
AS 
BEGIN 
INSERT INTO @SaleDetail 
SELECT ProductID 
FROM Sales.SalesOrderHeader soh 
INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID 
RETURN 
END 
--Creating inline table valued functions 
CREATE FUNCTION tvf_inline_Test ( ) 
RETURNS TABLE 
AS 
RETURN 
SELECT ProductID 
FROM Sales.SalesOrderHeader soh 
INNER JOIN Sales.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID

Code listing 3. Creating two different functions

Now, we join the two table valued functions using the same query, and the code is shown in Listing 4.

--Table valued functions do join 
SELECT c.personid , 
Prod.Name , 
COUNT(*) 'numer of unit' 
FROM Person.BusinessEntityContact c 
INNER JOIN dbo.tvf_multi_Test() tst ON c.personid = tst.ProductId 
INNER JOIN Production.Product prod ON tst.ProductId = prod.ProductID 
GROUP BY c.personid , 
Prod.Name 
 
--Join with inline table valued functions 
SELECT c.personid , 
Prod.Name , 
COUNT(*) 'numer of unit' 
FROM Person.BusinessEntityContact c 
INNER JOIN dbo.tvf_inline_Test() tst ON c.personid = tst.ProductId 
INNER JOIN Production.Product prod ON tst.ProductId = prod.ProductID 
GROUP BY c.personid , 
Prod.Name

Code listing 4. Join table valued functions and inline table valued functions

The cost of execution is shown in Figure 1.

Figure 1. Cost of the two methods

From the IO point of view, it is obvious that the suboptimal execution plan is selected, and business entity contact selects 121317 searches instead of one scan. The inline table function can correctly know that the cost of a scan is much lower than that of a search.

The root of the problem is the inline table valued function. For SQL server, it is the same as the view, which means that the inline table valued function can participate in the algebraic operation (or algebraic tree optimization) of the logical execution plan, which means that the inline table can be further split (as shown in figure 1, the query of the second inline table, The execution plan specifically knows that the sales order header table and the sales order detail table are included in the introverted table. Since only one column is selected in the query, the execution plan is optimized until it is unnecessary to scan the sales order header table). For the inline table valued function, the execution plan can completely know the metadata such as the index and related statistical information on the involved table.
On the other hand, the table valued function, as shown in the first part of Figure 1, is a black box for the entire execution plan, which has neither statistical information nor index. The table involved in the table valued function is not known in the execution plan (the temporary table # ae4e5168 is shown in Figure 1, not specifically). Therefore, for the whole execution plan, SQL server will assume that the result set returned by the table valued function is very small. When there are many results returned by the table valued function (as shown in this example), a relatively poor execution plan will be generated.
Therefore, to sum up, when the return result of table valued function is very small, it may not affect the performance, but if the return result is a little more, it will certainly affect the quality of the execution plan.

How to deal with it
First of all, in SQL server, we need to find the existing and table valued functions to join. By mining the execution plan, we can find out such statements. The code used is shown in Listing 5.


WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS p) 
SELECT st.text, 
qp.query_plan 
FROM ( 
SELECT TOP 50 * 
FROM sys.dm_exec_query_stats 
ORDER BY total_worker_time DESC 
) AS qs 
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st 
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp 
WHERE qp.query_plan.exist('//p:RelOp[contains(@LogicalOp, "Join")]/*/p:RelOp[(@LogicalOp[.="Table-valued function"])]') = 1

Code listing 5. Find and table valued functions from the execution plan cache to do join queries

The results are shown in Figure 2.

Figure 2. Executing join queries with table valued functions that already exist in the plan cache

Summary
This paper describes the concept of table valued functions, why table valued functions affect performance, and how to find join queries with table valued functions in the execution plan cache. It may not affect the query with a very small number of rows returned by apply or table valued functions. However, if you join a table valued function that returns more results, it may cause performance problems. Therefore, if possible, rewriting the table valued function as an inline table valued function or storing the results of the table valued function in a temporary table for joining can improve performance.

reference material:

http://www.brentozar.com/blitzcache/tvf-join/

http://blogs.msdn.com/b/psssql/archive/2010/10/28/query-performance-and-multi-statement-table-valued-functions.aspx?CommentPosted=true#commentmessage