search for text across multiple tables

Question 1

I've got the following query.

SELECT xcli.ID_X_Table, cli.ID, cli.Presentation, COALESCE(cli.MobilePhone, cli.BusinessPhone, cli.HomePhone), EmailAddress, cli.Descr
FROM B_Client cli
INNER JOIN X_Table xcli ON xcli.TableName = 'B_Client' AND ISNULL(cli.Flg_Deleted, 0) = 0
WHERE cli.EmailAddress LIKE '%'+@SearchText+'%'
UNION 
SELECT xcli.ID_X_Table
 , cli.ID
 , cli.Presentation
 , ISNULL(cli.MobilePhone, ISNULL(cli.BusinessPhone, cli.HomePhone))
 , cli.EmailAddress
 , cli.Descr
FROM B_Client cli
INNER JOIN X_Table xcli ON xcli.TableName = 'B_Client'
 AND ISNULL(cli.Flg_Deleted, 0) = 0
WHERE RTRIM(ISNULL(cli.ClientName, '')) + ' ' + LTRIM(ISNULL(cli.FirstMiddleName, '')) LIKE '%'+@SearchText+'%'
 OR RTRIM(ISNULL(cli.FirstMiddleName, '')) + ' ' + LTRIM(ISNULL(cli.ClientName, '')) LIKE '%'+@SearchText+'%'
 OR SUBSTRING(RTRIM(ISNULL(cli.FirstMiddleName, '')), 0, PATINDEX('% %', RTRIM(ISNULL(cli.FirstMiddleName, '')))) + ' ' + LTRIM(ISNULL(cli.ClientName, '')) LIKE '%'+@SearchText+'%' 
UNION 
SELECT xcli.ID_X_Table
 , cli.ID
 , cli.Presentation
 , dbo.Get_B_Car_RegNum(car.ID, GETDATE()) as RegNum
 , car.Presentation
 , cli.Descr
FROM B_Car car
INNER JOIN X_Table xcli on xcli.TableName = 'B_Client'
INNER JOIN B_Client cli on cli.ID = [dbo].[Get_B_Car_ID_B_Client](car.ID, GETDATE())
WHERE car.CarPotential = 0
 AND (car.Descr LIKE '%'+@SearchText
 OR (SELECT TOP 1 RegNum FROM B_Car_RegNum
 WHERE ID_B_Car = car.ID
 AND Dte_Start <= GETDATE() ORDER BY Dte_Start DESC) LIKE '%'+@SearchText+'%')

This query searches for user-provided text across multiple tables (it's a sort of primitive search engine). The unions overlap because they are added programmatically if the input is appropriate. For example, if the input contains letters, it won't search for telephone numbers, and so on.

The query is quite slow in production. It takes minutes to return results. The tables contain hundreds of thousands of entries. I'm not sure if this query performs poorly due to its design, or because of the magnitude of data.

Tips and critique are welcome.

Question 2

Welcome to Code Review. What indexes are there on those tables? What does the output of explain look like for your query?
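On SQL Server (which the ISNULL, PATINDEX and TOP syntax in the question suggests) there is no EXPLAIN statement, so here is a rough sketch of how both questions could be answered there. The dbo schema, the varchar(100) type and the sample search value are assumptions, and the GO separators assume SSMS or sqlcmd.

```sql
-- List the indexes on the tables involved (dbo schema is an assumption).
EXEC sp_helpindex 'dbo.B_Client';
EXEC sp_helpindex 'dbo.B_Car';
EXEC sp_helpindex 'dbo.B_Car_RegNum';
GO

-- Return the estimated plan as XML instead of executing the next batch.
-- SET SHOWPLAN_XML has to be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

DECLARE @SearchText varchar(100) = 'smith';  -- hypothetical search value

SELECT cli.ID
FROM dbo.B_Client cli
WHERE cli.EmailAddress LIKE '%' + @SearchText + '%';
GO

SET SHOWPLAN_XML OFF;
GO
```

Running the real query with SET STATISTICS XML ON instead would give the actual plan, at the cost of waiting for the query to finish.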

Question 3

And what are the approximate table sizes?

Question 4

You can find out if it is a volume thing just by running SELECT COUNT(*) on each table.
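A minimal sketch of that check, assuming the tables live in the dbo schema:

```sql
-- Row counts for the tables the search touches.
SELECT COUNT(*) AS ClientRows FROM dbo.B_Client;
SELECT COUNT(*) AS CarRows    FROM dbo.B_Car;
SELECT COUNT(*) AS RegNumRows FROM dbo.B_Car_RegNum;

-- sp_spaceused also reports reserved and index space per table.
EXEC sp_spaceused N'dbo.B_Client';
```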

Question 5

Why are you having to RTRIM your name fields? Make sure there are no trailing spaces on the data being input instead, surely? Function calls like this hamstring SQL as it can no longer use indexes on those fields but instead has to scan the entire table, applying the function to every value.
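If the stored names really do carry stray spaces, a one-off cleanup along these lines might remove the need to trim at query time. This is only a sketch: the dbo schema is an assumption, and the DATALENGTH comparison is used because SQL Server's = and <> ignore trailing spaces.

```sql
-- One-off cleanup: strip leading/trailing spaces that are already stored.
-- DATALENGTH counts trailing spaces, so it detects rows that need trimming.
UPDATE dbo.B_Client
SET ClientName      = LTRIM(RTRIM(ClientName)),
    FirstMiddleName = LTRIM(RTRIM(FirstMiddleName))
WHERE DATALENGTH(ClientName)      <> DATALENGTH(LTRIM(RTRIM(ClientName)))
   OR DATALENGTH(FirstMiddleName) <> DATALENGTH(LTRIM(RTRIM(FirstMiddleName)));
```

Note that the name search would still concatenate columns and use a leading wildcard, so this alone is unlikely to turn the scan into a seek.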

Wes H · Answer 1 · 2018-02-21 14:48:32Z

You have a few different challenges that are affecting performance.

Expressions on the fields used to filter, rather than on the filtered result. For example:
```
RTRIM( ISNULL( cli.ClientName, '' )) + ' ' + LTRIM( ISNULL( cli.FirstMiddleName, '' )) LIKE '%' + @SearchText + '%'
AND ISNULL( cli.Flg_Deleted, 0 ) = 0
```

Using a scalar-valued function in your result set:
```
RegNum = dbo.Get_B_Car_RegNum( car.ID, GETDATE()),
```

Using a scalar-valued function in a join:
```
ON cli.ID = dbo.Get_B_Car_ID_B_Client( car.ID, GETDATE())
```

Using a wildcard at the start of the search string:
```
car.Descr LIKE '%' + @SearchText
```

Possibly, the use of UNION rather than UNION ALL.

The expressions and leading wildcards prevent index seeks and force a scan of the underlying table. For example, changing AND ISNULL( cli.Flg_Deleted, 0 ) = 0 to AND ( cli.Flg_Deleted IS NULL OR cli.Flg_Deleted = 0 ) can allow the optimizer to use an index on Flg_Deleted, because the column is no longer wrapped in an expression.
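As a rough illustration of the Flg_Deleted rewrite, here is a sketch of the email branch with the expression removed. The index name, its included column, the dbo schema and the sample search value are all assumptions, not something taken from the question.

```sql
-- Hypothetical index; name and included column are assumptions.
CREATE INDEX IX_B_Client_Flg_Deleted
    ON dbo.B_Client (Flg_Deleted)
    INCLUDE (EmailAddress);

DECLARE @SearchText varchar(100) = 'example.com';  -- hypothetical search value

-- The column is compared directly, so the optimizer is free to seek on it.
SELECT cli.ID, cli.EmailAddress
FROM dbo.B_Client cli
WHERE (cli.Flg_Deleted IS NULL OR cli.Flg_Deleted = 0)
  AND cli.EmailAddress LIKE '%' + @SearchText + '%';
```

The leading-wildcard LIKE still has to be evaluated against every non-deleted row, so this only helps with the Flg_Deleted part of the filter.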

The scalar-valued function turns your set-based query into a procedural one (one row at a time), for which SQL Server is not optimized. See if you can convert it into an inline table-valued function. You should also consider dumping the result into a table variable and joining on that. That would at least prevent running the UDF for every possible combination of rows in your joined tables.
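A sketch of what the inline table-valued version could look like. The body of the real dbo.Get_B_Car_RegNum is not shown in the question, so this assumes it returns the latest RegNum on or before the given date; the function name, the int and datetime parameter types, and the dbo schema are hypothetical.

```sql
-- Hypothetical inline replacement for the scalar dbo.Get_B_Car_RegNum.
-- Assumes the scalar version returns the latest RegNum on or before @AsOf.
CREATE FUNCTION dbo.Get_B_Car_RegNum_Inline (@ID_B_Car int, @AsOf datetime)
RETURNS TABLE
AS
RETURN
(
    SELECT TOP (1) r.RegNum
    FROM dbo.B_Car_RegNum r
    WHERE r.ID_B_Car = @ID_B_Car
      AND r.Dte_Start <= @AsOf
    ORDER BY r.Dte_Start DESC
);
GO

DECLARE @SearchText varchar(100) = 'ABC';  -- hypothetical search value

-- OUTER APPLY keeps cars without a registration, like the original subquery.
SELECT car.ID, reg.RegNum
FROM dbo.B_Car car
OUTER APPLY dbo.Get_B_Car_RegNum_Inline(car.ID, GETDATE()) reg
WHERE car.CarPotential = 0
  AND (car.Descr LIKE '%' + @SearchText
       OR reg.RegNum LIKE '%' + @SearchText + '%');
```

Because the function is inline, the optimizer can expand it into the calling query instead of invoking it once per row.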

Each UNION will check for duplicates in the top and bottom result sets. If the data is guaranteed unique between the result sets, you'll gain some performance by switching to UNION ALL, removing the distinct sort across the combined results.
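A small self-contained illustration of the difference, using throwaway table variables rather than the real tables:

```sql
DECLARE @A TABLE (ID int);
DECLARE @B TABLE (ID int);

INSERT INTO @A (ID) VALUES (1), (2);
INSERT INTO @B (ID) VALUES (2), (3);

-- UNION removes the duplicate 2: three rows come back.
SELECT ID FROM @A
UNION
SELECT ID FROM @B;

-- UNION ALL skips the duplicate check: four rows come back.
SELECT ID FROM @A
UNION ALL
SELECT ID FROM @B;
```

In the query from the question the same client can match both the email and the name branch, so UNION ALL would only be safe there if duplicate rows are acceptable or removed some other way.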

paparazzo · Answer 2 · 2018-02-21 15:26:57Z

Not sure if this will be faster, but I think it is easier to read.

Materializing the CTE(s), for example into temp tables, may help performance.

with CTEcliB as 
( -- non-deleted clients with trimmed name parts
 SELECT cli.ID, cli.Presentation
 , COALESCE(cli.MobilePhone, cli.BusinessPhone, cli.HomePhone) as 'Phone'
 , cli.EmailAddress, cli.Descr 
 , LTRIM(RTRIM(ISNULL(cli.ClientName, ''))) as 'ClientName' 
 , LTRIM(RTRIM(ISNULL(cli.FirstMiddleName, ''))) as 'FirstMiddleName'
 FROM B_Client cli 
 WHERE ISNULL(cli.Flg_Deleted, 0) = 0 
), CTEcar as 
( -- rn = 1 marks the latest registration per car
 -- (partition by the car key, not the row ID; no ORDER BY, which a CTE does not allow without TOP)
 SELECT ID_B_Car, RegNum 
 , row_number() over (partition by ID_B_Car order by Dte_Start desc) as rn 
 FROM B_Car_RegNum 
 WHERE Dte_Start <= GETDATE()
)
SELECT xcli.ID_X_Table
 , cli.ID, cli.Presentation, cli.Phone, cli.EmailAddress, cli.Descr
 FROM CTEcliB cli
 JOIN X_Table xcli 
 ON xcli.TableName = 'B_Client'
 WHERE cli.EmailAddress LIKE '%'+@SearchText+'%'
 OR cli.ClientName + ' ' + cli.FirstMiddleName LIKE '%'+@SearchText+'%'
 OR cli.FirstMiddleName + ' ' + cli.ClientName LIKE '%'+@SearchText+'%' 
 UNION 
SELECT xcli.ID_X_Table
 , cli.ID, cli.Presentation
 , dbo.Get_B_Car_RegNum(car.ID, GETDATE()) as RegNum
 , car.Presentation, cli.Descr
FROM B_Car car
JOIN X_Table xcli 
 on xcli.TableName = 'B_Client'
JOIN B_Client cli 
 on cli.ID = [dbo].[Get_B_Car_ID_B_Client](car.ID, GETDATE())
WHERE car.CarPotential = 0
 AND ( car.Descr LIKE '%'+@SearchText
 OR (SELECT RegNum 
 FROM CTEcar
 WHERE ID_B_Car = car.ID
 AND rn = 1) LIKE '%'+@SearchText+'%'
 )

I sure hope ON xcli.TableName = 'B_Client' is only returning one row and you have an index on that column.
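SQL Server does not materialize CTEs on its own, so one way to try the materialization idea is a temp table holding the latest registration per car. This is a sketch under assumptions: the temp table name, the index name, the dbo schema and the sample search value are all made up for illustration.

```sql
-- Latest registration per car as of now, stored once in a temp table.
SELECT x.ID_B_Car, x.RegNum
INTO #LatestRegNum
FROM (
    SELECT ID_B_Car, RegNum,
           ROW_NUMBER() OVER (PARTITION BY ID_B_Car
                              ORDER BY Dte_Start DESC) AS rn
    FROM dbo.B_Car_RegNum
    WHERE Dte_Start <= GETDATE()
) x
WHERE x.rn = 1;

-- Index the temp table on the join column (index name is arbitrary).
CREATE CLUSTERED INDEX IX_LatestRegNum ON #LatestRegNum (ID_B_Car);

DECLARE @SearchText varchar(100) = 'ABC';  -- hypothetical search value

-- LEFT JOIN keeps cars that have no registration yet.
SELECT car.ID, reg.RegNum
FROM dbo.B_Car car
LEFT JOIN #LatestRegNum reg ON reg.ID_B_Car = car.ID
WHERE car.CarPotential = 0
  AND (car.Descr LIKE '%' + @SearchText
       OR reg.RegNum LIKE '%' + @SearchText + '%');

DROP TABLE #LatestRegNum;
```

Whether this beats the correlated subquery depends on how many registrations there are per car, so it is worth measuring both.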

Malachi · Answer 3 · 2018-02-21 14:09:35Z

Your WHERE clauses are causing the query to run slowly:

WHERE RTRIM(ISNULL(cli.ClientName, '')) + ' ' + LTRIM(ISNULL(cli.FirstMiddleName, '')) LIKE '%'+@SearchText+'%'
 OR RTRIM(ISNULL(cli.FirstMiddleName, '')) + ' ' + LTRIM(ISNULL(cli.ClientName, '')) LIKE '%'+@SearchText+'%'
 OR SUBSTRING(RTRIM(ISNULL(cli.FirstMiddleName, '')), 0, PATINDEX('% %', RTRIM(ISNULL(cli.FirstMiddleName, '')))) + ' ' + LTRIM(ISNULL(cli.ClientName, '')) LIKE '%'+@SearchText+'%'

The slowdown here is that you are performing two trims and a LIKE comparison on every single row of the data set, and only then returning the rows that match.

Execute the SELECT statement without the WHERE clause and you will see how many rows each trim and LIKE is being performed on.

Another way to see where the slowdown is occurring is to take the UNIONs out, run each SELECT on its own, and see which one is actually running slowly.
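One way to do that measurement on SQL Server is sketched below. The dbo schema and the sample search value are assumptions, and the branches are cut down to the filter that matters.

```sql
-- Report elapsed/CPU time and logical reads for each statement.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @SearchText varchar(100) = 'smith';  -- hypothetical search value

-- How many rows the trims and LIKE have to be evaluated on.
SELECT COUNT(*) AS RowsScanned FROM dbo.B_Client;

-- Branch 1: email search on its own.
SELECT cli.ID
FROM dbo.B_Client cli
WHERE cli.EmailAddress LIKE '%' + @SearchText + '%';

-- Branch 2: name search on its own.
SELECT cli.ID
FROM dbo.B_Client cli
WHERE RTRIM(ISNULL(cli.ClientName, '')) + ' '
      + LTRIM(ISNULL(cli.FirstMiddleName, '')) LIKE '%' + @SearchText + '%';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The timings and read counts show up in the Messages tab in SSMS, one set per statement, which makes it easy to spot the expensive branch.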
