sql server partitioned view wrong execution plan on update a temporary table

Question 1

i need your help because we have a performance issues. I would like to update a temporary table that contains a join to the partitioned view.

Schema:

create schema SCHEMA1
go
create schema SCHEMA2
go
create schema DATA
go

Partitioned Tables:

CREATE TABLE SCHEMA1.Positions (
 Month [smallint] NOT NULL,
 Client [tinyint] NOT NULL,
 Type [char](3) NOT NULL,
 Amount [int] NOT NULL,
 CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED 
(
 Month ASC,
 Client ASC,
 Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psDATA(Month)
ALTER TABLE SCHEMA1.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(1)))
GO
ALTER TABLE SCHEMA1.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
CREATE TABLE SCHEMA2.Positions (
 Month [smallint] NOT NULL,
 Client [tinyint] NOT NULL,
 Type [char](3) NOT NULL,
 Amount [int] NOT NULL,
 CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED 
(
 Month ASC,
 Client ASC,
 Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psDATA(Month)
ALTER TABLE SCHEMA2.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(2)))
GO
ALTER TABLE SCHEMA2.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO

Partitioned View:

create view DATA.Positions as
 select * from SCHEMA1.Positions
 union all
 select * from SCHEMA2.Positions
go

Sample Data:

insert SCHEMA1.Positions (Month, Client, Type, Amount)
values (1804, 1, '001', 1000),
 (1802, 1, '003', 10000),
 (1705, 1, '004', 104),
 (1601, 1, '001', 12),
 (1512, 1, '005', 7),
 (1703, 1, '001', 12897),
 (1605, 1, '007', 3245),
 (1401, 1, 'ZUR', 45),
 (1201, 1, 'NOR', 175000)
insert SCHEMA2.Positions (Month, Client, Type, Amount)
values (1804, 2, '001', 1000),
 (1802, 2, '003', 10000),
 (1705, 2, '004', 104),
 (1601, 2, '001', 12),
 (1512, 2, '005', 7),
 (1703, 2, '001', 12897),
 (1605, 2, '007', 3245),
 (1401, 2, 'ZUR', 45),
 (1201, 2, 'NOR', 175000)

expected behaviour:

select *
from DATA.Positions
where Client = 1

expected behaviour

non expected behaviour:

create table #t (
 Month smallint,
 Client tinyint,
 Type char(3),
 Amount int
)
insert #t (Month, Client, Type)
values (1605, 2, '007')
update t
set Amount = p.Amount
from #t t
join DATA.Positions p
 on t.Month = p.Month
 and t.Client = p.Client
 and t.Type = p.Type

non expected behaviour

example with fixed client number and right execution plan:

update t
set Amount = p.Amount
from #t t
join DATA.Positions p
 on t.Month = p.Month
 and 2 = p.Client
 and t.Type = p.Type

expected behaviour

How can I achieve that only one partitioned table (SCHEMA2.Positions) is queried?

Thanks in advance and best regards, Andreas

Edit #1:

Thanks for your fast answer, Martin Smith. To clarifiy the question I will add some primary keys fields (like our production environment have) and some more data.

New Partitioned Tables with more primary key fields:

CREATE TABLE SCHEMA1.Positions (
 Month [smallint] NOT NULL,
 Client [tinyint] NOT NULL,
 SNr int not null,
 RNr int not null,
 VNr int not null,
 PNr int not null,
 Type [char](3) NOT NULL,
 Amount [int] NOT NULL,
 CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED 
(
 Month ASC,
 Client ASC,
 SNr ASC,
 RNr ASC,
 VNr ASC,
 PNr ASC,
 Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psTP3DATA(Month)
ALTER TABLE SCHEMA1.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(1)))
GO
ALTER TABLE SCHEMA1.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
CREATE TABLE SCHEMA2.Positions (
 Month [smallint] NOT NULL,
 Client [tinyint] NOT NULL,
 SNr int not null,
 RNr int not null,
 VNr int not null,
 PNr int not null,
 Type [char](3) NOT NULL,
 Amount [int] NOT NULL,
 CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED 
(
 Month ASC,
 Client ASC,
 SNr ASC,
 RNr ASC,
 VNr ASC,
 PNr ASC,
 Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psTP3DATA(Month)
ALTER TABLE SCHEMA2.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(2)))
GO
ALTER TABLE SCHEMA2.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO

New sample data: SCHEMA1.Positions: 78.105.993 SCHEMA2.Positions: 5.951.206

New query:

create table #t (
 Month smallint,
 Client tinyint,
 SNr int,
 RNr int,
 VNr int,
 PNr int,
 Amount007 int
)
insert #t (Month, Client, SNr, RNr, VNr, PNr)
select top 750000 Month, Client, SNr, RNr, VNr, PNr
from schema2.Positions
update t
set Amount007 = p.Amount
from #t t
join DATA.Positions p
 on t.Month = p.Month
 and t.Client = p.Client
 and t.SNr = p.SNr
 and t.RNr = p.RNr
 and t.VNr = p.VNr
 and t.PNr = p.PNr
where p.Type = '007'

new query plan

new query plan

details clustered index scan SCHEMA1

index scan SCHEMA1

details clustered index scan SCHEMA2

index scan SCHEMA2

Question 2

Adding a constraint (check constraint) to the temp table #t , (client = 2), will help the optimizer to find a better way

Question 3

thanks four your fast answer, but the data in table #t is dynamic. so i don't know which client is inserted.

Question 4

How can I achieve that only one partitioned table (SCHEMA2.Positions) is queried?

You already have (in the initial plan you showed).

The cached execution plan needs to be able to deal with any contents in #t but at runtime it will only access the needed table(s).

The filters both have start up predicates and won't run the subtree underneath unless this is true. For your example data the number of executions of the seek against SCHEMA1.Positions is zero.

enter image description here

For the revised plan this is using a hash join so the ability to selectivity apply a filter on a row by row basis is lost.

You can use

DECLARE @Client INT = (SELECT MAX(Client)
 FROM #t
 HAVING COUNT(DISTINCT Client) = 1)
UPDATE t
SET Amount = p.Amount
FROM #t t
 INNER JOIN DATA.Positions p
 ON t.Month = p.Month
 AND t.Client = p.Client
 AND t.Type = p.Type
 AND ( t.Client = @Client
 OR @Client IS NULL )
OPTION (RECOMPILE);

To provide information to the optimiser in the event that the temp table contains only rows for a specific client and one branch in the view can be ignored.

Question 5

thanks, you are right. so i clarified my question. my execution plan looks weird now.

Question 6

your new plan is not using a row by row approach of nested loops so it can't use the row by row filter. Do you know upfront that the temp table will only ever contain rows for one Client?

Question 7

what do you mean with your first sentence? can you explain it? we used the hint: option(loop join) and sql server now uses a different execution plan (it's really fast now). why does the query analyzer not do it by itself?

Question 8

The nested loops algorithm is for each outer row look up the inner. So it is possible to look at the ClientId in the temp table row and apply a filter on that. The hash join is hashing both inputs as it doesn't know in advance that they may be totally unused if the temp table will only ever contain rows for one Client. Is that the case? Still waiting on an answer to that Q

Question 9

oh excuse me, i forgot to answer your question. no, we don't know in upfront which client is inside the temp table. it can be only one or even all clients.

Martin Smith Martin Smith 88.4k15 gold badges257 silver badges357 bronze badges · Answer 1 · 2018-05-31 07:52:52Z

How can I achieve that only one partitioned table (SCHEMA2.Positions) is queried?

You already have (in the initial plan you showed).

The cached execution plan needs to be able to deal with any contents in #t but at runtime it will only access the needed table(s).

The filters both have start up predicates and won't run the subtree underneath unless this is true. For your example data the number of executions of the seek against SCHEMA1.Positions is zero.

enter image description here

For the revised plan this is using a hash join so the ability to selectivity apply a filter on a row by row basis is lost.

You can use

DECLARE @Client INT = (SELECT MAX(Client)
 FROM #t
 HAVING COUNT(DISTINCT Client) = 1)
UPDATE t
SET Amount = p.Amount
FROM #t t
 INNER JOIN DATA.Positions p
 ON t.Month = p.Month
 AND t.Client = p.Client
 AND t.Type = p.Type
 AND ( t.Client = @Client
 OR @Client IS NULL )
OPTION (RECOMPILE);

To provide information to the optimiser in the event that the temp table contains only rows for a specific client and one branch in the view can be ignored.

thanks, you are right. so i clarified my question. my execution plan looks weird now.
your new plan is not using a row by row approach of nested loops so it can't use the row by row filter. Do you know upfront that the temp table will only ever contain rows for one Client?
what do you mean with your first sentence? can you explain it? we used the hint: option(loop join) and sql server now uses a different execution plan (it's really fast now). why does the query analyzer not do it by itself?
The nested loops algorithm is for each outer row look up the inner. So it is possible to look at the ClientId in the temp table row and apply a filter on that. The hash join is hashing both inputs as it doesn't know in advance that they may be totally unused if the temp table will only ever contain rows for one Client. Is that the case? Still waiting on an answer to that Q
oh excuse me, i forgot to answer your question. no, we don't know in upfront which client is inside the temp table. it can be only one or even all clients.

Stack Exchange Network

sql server partitioned view wrong execution plan on update a temporary table

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

sql server partitioned view wrong execution plan on update a temporary table

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions