What index to add in case of join by two optional fields

Question 1

I have query similar to the one below, it joins two tables by field which can have NULL value. The result matchs the data if both tables have the same data or consider the condition as optional.

SELECT B.X, A.Y FROM tableA A
INNER JOIN tableB B ON 
 ISNULL(B.level5, A.f05) = A.f05 
 AND ISNULL(B.level6, A.f06) = A.f06

The best index for table B I could think of is such which

CREATE NONCLUSTERED INDEX [Index_On_Table_B] ON B
( X ASC ) INCLUDE (level5, level6)

It helps a bit and from execution plan I can see the index is used but 90 % of the cost is spent in Nested Loops (Inner join).

Is there any way how to get this query working faster?

Question 2

Sure; fix your data model so that you are not using NULLABLE fields as FK's and PK's.

Question 3

Any time you apply a function to a column in a query, you prevent SQL Server from being able to use an index on that column. Also, it may often not be possible for SQL Server to use one of its more powerful join operations: merge and hash joins, which require equijoins - a simple columnA = columnB. For efficient queries, as many terms in the WHERE clause, and ON clauses, as possible should be sargable.

I think what you actually wanted is

SELECT B.X, A.Y
FROM tableA A INNER JOIN tableB B
ON (B.level5 = A.f05 OR B.level5 IS NULL) AND 
 (B.level6 = A.f06 OR B.level6 IS NULL)

Then I would expect that a non-clustered index on level5, level6 would do the job:

CREATE NONCLUSTERED INDEX Index_On_B ON tableB (level5, level6)

All your current index is doing is storing just the data from the columns in this query, which will cause less I/O than scanning tableB itself, assuming that there are other columns in tableB.

Question 4

Thanks for the suggestion. I have found out the field B.level5 is actually always populated so I can use equality instead of ISNULL function. This way mssql has much better execution plan.

Mike Dimmick Mike Dimmick 1212 bronze badges · Answer 1 · 2013-03-11 19:33:13Z

Any time you apply a function to a column in a query, you prevent SQL Server from being able to use an index on that column. Also, it may often not be possible for SQL Server to use one of its more powerful join operations: merge and hash joins, which require equijoins - a simple columnA = columnB. For efficient queries, as many terms in the WHERE clause, and ON clauses, as possible should be sargable.

I think what you actually wanted is

SELECT B.X, A.Y
FROM tableA A INNER JOIN tableB B
ON (B.level5 = A.f05 OR B.level5 IS NULL) AND 
 (B.level6 = A.f06 OR B.level6 IS NULL)

Then I would expect that a non-clustered index on level5, level6 would do the job:

CREATE NONCLUSTERED INDEX Index_On_B ON tableB (level5, level6)

All your current index is doing is storing just the data from the columns in this query, which will cause less I/O than scanning tableB itself, assuming that there are other columns in tableB.

Thanks for the suggestion. I have found out the field B.level5 is actually always populated so I can use equality instead of ISNULL function. This way mssql has much better execution plan.

Stack Exchange Network

What index to add in case of join by two optional fields

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

What index to add in case of join by two optional fields

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions