i need your help because we have a performance issues. I would like to update a temporary table that contains a join to the partitioned view.
Schema:
create schema SCHEMA1
go
create schema SCHEMA2
go
create schema DATA
go
Partitioned Tables:
CREATE TABLE SCHEMA1.Positions (
Month [smallint] NOT NULL,
Client [tinyint] NOT NULL,
Type [char](3) NOT NULL,
Amount [int] NOT NULL,
CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED
(
Month ASC,
Client ASC,
Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psDATA(Month)
ALTER TABLE SCHEMA1.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(1)))
GO
ALTER TABLE SCHEMA1.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
CREATE TABLE SCHEMA2.Positions (
Month [smallint] NOT NULL,
Client [tinyint] NOT NULL,
Type [char](3) NOT NULL,
Amount [int] NOT NULL,
CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED
(
Month ASC,
Client ASC,
Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psDATA(Month)
ALTER TABLE SCHEMA2.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(2)))
GO
ALTER TABLE SCHEMA2.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
Partitioned View:
create view DATA.Positions as
select * from SCHEMA1.Positions
union all
select * from SCHEMA2.Positions
go
Sample Data:
insert SCHEMA1.Positions (Month, Client, Type, Amount)
values (1804, 1, '001', 1000),
(1802, 1, '003', 10000),
(1705, 1, '004', 104),
(1601, 1, '001', 12),
(1512, 1, '005', 7),
(1703, 1, '001', 12897),
(1605, 1, '007', 3245),
(1401, 1, 'ZUR', 45),
(1201, 1, 'NOR', 175000)
insert SCHEMA2.Positions (Month, Client, Type, Amount)
values (1804, 2, '001', 1000),
(1802, 2, '003', 10000),
(1705, 2, '004', 104),
(1601, 2, '001', 12),
(1512, 2, '005', 7),
(1703, 2, '001', 12897),
(1605, 2, '007', 3245),
(1401, 2, 'ZUR', 45),
(1201, 2, 'NOR', 175000)
expected behaviour:
select *
from DATA.Positions
where Client = 1
expected behaviour
non expected behaviour:
create table #t (
Month smallint,
Client tinyint,
Type char(3),
Amount int
)
insert #t (Month, Client, Type)
values (1605, 2, '007')
update t
set Amount = p.Amount
from #t t
join DATA.Positions p
on t.Month = p.Month
and t.Client = p.Client
and t.Type = p.Type
non expected behaviour
example with fixed client number and right execution plan:
update t
set Amount = p.Amount
from #t t
join DATA.Positions p
on t.Month = p.Month
and 2 = p.Client
and t.Type = p.Type
expected behaviour
How can I achieve that only one partitioned table (SCHEMA2.Positions) is queried?
Thanks in advance and best regards, Andreas
Edit #1:
Thanks for your fast answer, Martin Smith. To clarifiy the question I will add some primary keys fields (like our production environment have) and some more data.
New Partitioned Tables with more primary key fields:
CREATE TABLE SCHEMA1.Positions (
Month [smallint] NOT NULL,
Client [tinyint] NOT NULL,
SNr int not null,
RNr int not null,
VNr int not null,
PNr int not null,
Type [char](3) NOT NULL,
Amount [int] NOT NULL,
CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED
(
Month ASC,
Client ASC,
SNr ASC,
RNr ASC,
VNr ASC,
PNr ASC,
Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psTP3DATA(Month)
ALTER TABLE SCHEMA1.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(1)))
GO
ALTER TABLE SCHEMA1.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
CREATE TABLE SCHEMA2.Positions (
Month [smallint] NOT NULL,
Client [tinyint] NOT NULL,
SNr int not null,
RNr int not null,
VNr int not null,
PNr int not null,
Type [char](3) NOT NULL,
Amount [int] NOT NULL,
CONSTRAINT [PK_Positions] PRIMARY KEY CLUSTERED
(
Month ASC,
Client ASC,
SNr ASC,
RNr ASC,
VNr ASC,
PNr ASC,
Type ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) on psTP3DATA(Month)
ALTER TABLE SCHEMA2.Positions WITH CHECK ADD CONSTRAINT [CK_Positions_Client] CHECK ((Client=(2)))
GO
ALTER TABLE SCHEMA2.Positions CHECK CONSTRAINT [CK_Positions_Client]
GO
New sample data: SCHEMA1.Positions: 78.105.993 SCHEMA2.Positions: 5.951.206
New query:
create table #t (
Month smallint,
Client tinyint,
SNr int,
RNr int,
VNr int,
PNr int,
Amount007 int
)
insert #t (Month, Client, SNr, RNr, VNr, PNr)
select top 750000 Month, Client, SNr, RNr, VNr, PNr
from schema2.Positions
update t
set Amount007 = p.Amount
from #t t
join DATA.Positions p
on t.Month = p.Month
and t.Client = p.Client
and t.SNr = p.SNr
and t.RNr = p.RNr
and t.VNr = p.VNr
and t.PNr = p.PNr
where p.Type = '007'
new query plan
new query plan
details clustered index scan SCHEMA1
index scan SCHEMA1
details clustered index scan SCHEMA2
index scan SCHEMA2
-
Adding a constraint (check constraint) to the temp table #t , (client = 2), will help the optimizer to find a better waySabin B– Sabin B2018年05月31日 07:30:41 +00:00Commented May 31, 2018 at 7:30
-
thanks four your fast answer, but the data in table #t is dynamic. so i don't know which client is inserted.realatok– realatok2018年05月31日 09:18:52 +00:00Commented May 31, 2018 at 9:18
1 Answer 1
How can I achieve that only one partitioned table (SCHEMA2.Positions) is queried?
You already have (in the initial plan you showed).
The cached execution plan needs to be able to deal with any contents in #t
but at runtime it will only access the needed table(s).
The filters both have start up predicates and won't run the subtree underneath unless this is true. For your example data the number of executions of the seek against SCHEMA1.Positions
is zero.
For the revised plan this is using a hash join so the ability to selectivity apply a filter on a row by row basis is lost.
You can use
DECLARE @Client INT = (SELECT MAX(Client)
FROM #t
HAVING COUNT(DISTINCT Client) = 1)
UPDATE t
SET Amount = p.Amount
FROM #t t
INNER JOIN DATA.Positions p
ON t.Month = p.Month
AND t.Client = p.Client
AND t.Type = p.Type
AND ( t.Client = @Client
OR @Client IS NULL )
OPTION (RECOMPILE);
To provide information to the optimiser in the event that the temp table contains only rows for a specific client and one branch in the view can be ignored.
-
thanks, you are right. so i clarified my question. my execution plan looks weird now.realatok– realatok2018年05月31日 09:17:58 +00:00Commented May 31, 2018 at 9:17
-
your new plan is not using a row by row approach of nested loops so it can't use the row by row filter. Do you know upfront that the temp table will only ever contain rows for one Client?Martin Smith– Martin Smith2018年05月31日 09:29:38 +00:00Commented May 31, 2018 at 9:29
-
what do you mean with your first sentence? can you explain it? we used the hint: option(loop join) and sql server now uses a different execution plan (it's really fast now). why does the query analyzer not do it by itself?realatok– realatok2018年05月31日 10:33:01 +00:00Commented May 31, 2018 at 10:33
-
The nested loops algorithm is for each outer row look up the inner. So it is possible to look at the ClientId in the temp table row and apply a filter on that. The hash join is hashing both inputs as it doesn't know in advance that they may be totally unused if the temp table will only ever contain rows for one Client. Is that the case? Still waiting on an answer to that QMartin Smith– Martin Smith2018年05月31日 11:48:21 +00:00Commented May 31, 2018 at 11:48
-
oh excuse me, i forgot to answer your question. no, we don't know in upfront which client is inside the temp table. it can be only one or even all clients.realatok– realatok2018年05月31日 12:05:49 +00:00Commented May 31, 2018 at 12:05
Explore related questions
See similar questions with these tags.