Find occurrences of strings in view B in column from table A

Question 1

I'm trying to find the occurrences of strings in view B in a string in table A. However, the performance is horrendous (takes about 17 minutes to run) and I was wondering if anyone had any suggestions to improve it.

View B has 3,570 records and a single varchar column named MISubCode.

Table A has 160,081 records; its schema is shown in the attached image.

[image: TableBSchema]

There's also a third object (view C), which is mostly used to filter table A, since I don't need to find occurrences in any of the records it returns.

My rationale for solving this was as follows:

1. Count the characters in the search column of table A, producing one row per character position.
2. Filter out the unneeded rows using view C.
3. For each row in the result set from step 2, use the Number column as a start position and check whether a string from view B occurs at that position in the table A column, continuing to the end of the string.

Code:

SELECT *
FROM
(
    -- Steps 1 and 2: one row per character position, minus the rows matched by view C
    SELECT Number, A.StatusInputPoint_DESC, A.StatusInputPoint_CODE, A.StatusInputPoint_CORE_ID
    FROM
    (
        SELECT Number, A.StatusInputPoint_DESC, A.StatusInputPoint_CODE, A.StatusInputPoint_CORE_ID
        FROM tblStatusInputPoints_CORE A
        CROSS JOIN
        (
            -- Ad hoc numbers table: 1, 2, 3, ...
            SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
            FROM sys.all_objects
        ) AS n(Number)
        WHERE Number >= 1 AND Number <= CAST(LEN(A.StatusInputPoint_DESC) AS INT) AND System_ID = 1
    ) AS A
    LEFT JOIN vwAlarmLibrary_MI_StatusPoint_Mask_Match_NoWildCards C
        ON A.StatusInputPoint_CORE_ID = C.StatusInputPoint_CORE_ID
    WHERE C.StatusInputPoint_CORE_ID IS NULL
) AS RemainingStatusPoints
-- Step 3: compare the substring starting at each position with every string in view B
LEFT JOIN dbo.vwAlarmLibrary_MI_Substation_Abbreviations B
    ON SUBSTRING(RemainingStatusPoints.StatusInputPoint_DESC, Number, LEN(B.MISubCode)) = B.MISubCode
WHERE B.MISubCode IS NOT NULL

I'm at a loss as to how to make this faster. I tried several different methods and join combinations and I can't seem to get it to perform well. I'm not an expert in SQL Server so I don't know all the features.

I'm currently using SQL Server 2016

EXAMPLE:

For clarity, let's say I have these strings in view B:

 ABC
 EFGH
 IJ

And I have these strings in table A

 123 ABC 45 IJ
 IJ
 IJ EFGH 22

The result should be this:

 Pos | String        | Occurrence
 5   | 123 ABC 45 IJ | ABC
 12  | 123 ABC 45 IJ | IJ
 1   | IJ            | IJ
 1   | IJ EFGH 22    | IJ
 4   | IJ EFGH 22    | EFGH

Question 2

So you want to take the results from a column in TableA and find all instances of these values in ViewB, or am I misunderstanding your question?

Question 3

No, it's the other way around. For clarity, let's say I have these strings in view B: 'ABC', 'EFGH', 'IJ'. And I have these strings in table A: '123 ABC 45 IJ', 'IJ', 'IJ EFGH 22'. The result should be this: (5 | 123 ABC 45 IJ | ABC), (12 | 123 ABC 45 IJ | IJ), (1 | IJ | IJ), (1 | IJ EFGH 22 | IJ), (4 | IJ EFGH 22 | EFGH).

Question 4

Ok, so find all instances where any column in TableA contains values from a column in ViewB

Question 5

No, I'm trying to figure out how to edit my question, I'll explain in better detail since comments don't preserve spacing.

Question 6

I think you could just simplify this with CHARINDEX and CROSS APPLY(). See how this method stacks up in your environment and paste your execution plans if it happens to be slower, if you can.

To give a concrete example, create these tables:

declare @ViewB table (myStrings varchar(16))
insert into @ViewB
values
('ABC'),
('EFGH'),
('IJ')
declare @TableA table (colToSearch varchar(256))
insert into @TableA
values
('123 ABC 45 IJ'),
('IJ'),
('IJ EFGH 22')

Then this CROSS APPLY() query should be more efficient:

select
 charindex(b.myStrings,a.colToSearch) as Pos
 ,a.colToSearch as String
 ,b.myStrings as Occurance
from
 @ViewB b
cross apply (select * from @TableA) a
where
 charindex(b.myStrings,a.colToSearch) > 0
order by 
 String, charindex(b.myStrings,a.colToSearch)

Alternatively, you could use CROSS JOIN:

select
 charindex(b.myStrings,a.colToSearch) as Pos
 ,a.colToSearch as String
 ,b.myStrings as Occurance
from
 @ViewB b
cross join @TableA a
where
 charindex(b.myStrings,a.colToSearch) > 0
order by 
 String, charindex(b.myStrings,a.colToSearch)

RESULTS

+-----+---------------+-----------+
| Pos | String        | Occurance |
+-----+---------------+-----------+
| 5   | 123 ABC 45 IJ | ABC       |
| 12  | 123 ABC 45 IJ | IJ        |
| 1   | IJ            | IJ        |
| 1   | IJ EFGH 22    | IJ        |
| 4   | IJ EFGH 22    | EFGH      |
+-----+---------------+-----------+
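
One thing to watch with the CHARINDEX queries above: CHARINDEX returns only the first match, so a string that contains the same code twice is reported once, whereas the position-by-position query in the question would report both. If repeated occurrences matter, a recursive CTE is one possible way to keep searching past each hit. This is only a sketch under the same table-variable setup as above; the sample row 'IJ 45 IJ' is made up for illustration, and I have not measured how it performs at 160,081 x 3,570 rows.

```sql
declare @ViewB table (myStrings varchar(16));
insert into @ViewB values ('ABC'), ('EFGH'), ('IJ');

declare @TableA table (colToSearch varchar(256));
insert into @TableA values ('IJ 45 IJ'), ('IJ EFGH 22');  -- made-up sample rows

with Hits as (
    -- anchor: first occurrence of each search string in each row
    select a.colToSearch,
           b.myStrings,
           charindex(b.myStrings, a.colToSearch) as Pos
    from @TableA a
    cross join @ViewB b
    where charindex(b.myStrings, a.colToSearch) > 0

    union all

    -- recursive step: search again, starting one character past the previous hit
    select h.colToSearch,
           h.myStrings,
           charindex(h.myStrings, h.colToSearch, h.Pos + 1)
    from Hits h
    where charindex(h.myStrings, h.colToSearch, h.Pos + 1) > 0
)
select Pos, colToSearch as String, myStrings as Occurrence
from Hits
order by String, Pos
option (maxrecursion 0);  -- Pos strictly increases, so the recursion always ends
```

For 'IJ 45 IJ' this should report IJ at positions 1 and 7, where the plain CHARINDEX query reports only position 1. Starting the next search at Pos + 1 also picks up overlapping matches, which mirrors the behaviour of the original character-by-character approach.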

Question 7

Why cross apply? It's a simple cross join.

Question 8

There's no real difference here, @dnoeth, for that part. Instead of enumerating a row per character position, we just make a Cartesian product to enable the use of CHARINDEX.

Question 9

scsimon you're a genius. This made it significantly faster using CROSS APPLY. Brought it down from 17 minutes to 50 seconds after adding in other filtering necessities from the original query to yours.
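
The final query wasn't posted, so the following is only a guess at what "adding in other filtering necessities" could look like. It assumes the filters are the two visible in the original query: System_ID = 1 and the exclusion of rows present in view C. The table variables are hypothetical stand-ins for tblStatusInputPoints_CORE, the abbreviations view, and the mask-match view, and the column types are assumptions because the real schema was only shown as an image.

```sql
-- Hypothetical stand-ins for the real objects; column types are guesses.
declare @TableA table (
    StatusInputPoint_CORE_ID int,
    StatusInputPoint_CODE    varchar(32),
    StatusInputPoint_DESC    varchar(256),
    System_ID                int
);
insert into @TableA values
    (1, 'P1', '123 ABC 45 IJ', 1),
    (2, 'P2', 'IJ',            1),  -- listed in view C, so filtered out
    (3, 'P3', 'IJ EFGH 22',    2);  -- different System_ID, so filtered out

declare @ViewB table (MISubCode varchar(16));
insert into @ViewB values ('ABC'), ('EFGH'), ('IJ');

declare @ViewC table (StatusInputPoint_CORE_ID int);
insert into @ViewC values (2);

select
    charindex(b.MISubCode, a.StatusInputPoint_DESC) as Pos,
    a.StatusInputPoint_DESC,
    a.StatusInputPoint_CODE,
    a.StatusInputPoint_CORE_ID,
    b.MISubCode
from @TableA a
cross join @ViewB b
where a.System_ID = 1
  -- same effect as the LEFT JOIN ... IS NULL filter in the question
  and not exists (
        select 1
        from @ViewC c
        where c.StatusInputPoint_CORE_ID = a.StatusInputPoint_CORE_ID
      )
  and charindex(b.MISubCode, a.StatusInputPoint_DESC) > 0
order by a.StatusInputPoint_CORE_ID, Pos;
```

With this sample data only the first row survives the filters, giving ABC at position 5 and IJ at position 12. Filtering table A before the string comparison keeps the Cartesian product as small as possible, which is probably where most of the saving comes from.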

Question 10

No worries @Jake I'm glad it sped it up!

Question 11

This might give you better performance

select B.myStrings, A.colToSearch 
 , charindex(b.myStrings,a.colToSearch) as Pos
from @ViewB as B 
join @TableA as A
 on A.colToSearch like '%'+B.myStrings+'%'

Question 12

I've tried this. It doesn't perform any better because it has to perform a table scan that takes up most of the execution plan due to non-sargability of %x%. While CHARINDEX is also not sargable, the execution plan apparently doesn't put all the work on the table scans and instead, 70% of the cost is on the joins.

Question 13

@Jake If you are searching for data inside a varchar, it is going to be non-sargable. The fix is a proper data design that addresses the need.

Question 14

That's not necessarily true (see "Sargability: Why %string% Is Slow"). A WHERE clause with LIKE 'nut%', for example, is considered sargable because it can perform an index seek instead of a scan.
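
To make the distinction concrete, here is a small sketch that can be run with the actual execution plan turned on. The #Words table and its index are invented for the demonstration, and on a table this small the optimizer is free to pick whatever plan it likes, so treat the plan shapes as what one would typically expect rather than a guarantee.

```sql
-- Hypothetical demo table with an index on the searched column.
create table #Words (Word varchar(50) not null);
create index IX_Words_Word on #Words (Word);
insert into #Words values ('nut'), ('nutmeg'), ('walnut'), ('peanut butter');

-- Leading literal: the predicate can be turned into a range seek on IX_Words_Word.
select Word from #Words where Word like 'nut%';      -- nut, nutmeg

-- Leading wildcard: every value has to be examined, so expect a scan.
select Word from #Words where Word like '%nut%';     -- all four rows

-- CHARINDEX wraps the column in a function, so it cannot seek either.
select Word from #Words where charindex('nut', Word) > 0;  -- all four rows

drop table #Words;
```

The last two queries return the same rows, which fits the observation above that swapping LIKE '%x%' for CHARINDEX does not by itself remove the scan.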

Question 15

@Jake Whatever, you seem to have the situation under control.

S3S (346, 1 silver badge, 9 bronze badges) · Accepted Answer · 2017-04-07 00:19:30Z
