I have a multi-tenant database where I am utilizing an AccountId
column within all tables as a part of the composite Primary Key for tenant isolation. Is it beneficial to create additional non-clustered indexes on each column that is part of the composite Primary Key to assist SQL Server with maintaining accurate statistics and improving query performance when joining to lookup tables?
For instance, within an associative table that defines one-to-many relationships between an Account
and the American State
in which they have offices, in theory which of the two options is preferable given the following structure and sample query?
Create the Account
and State
tables and populate with sample data.
DROP TABLE IF EXISTS [dbo].[Account];
DROP TABLE IF EXISTS [dbo].[State];
-- [Account] table and sample values.
IF OBJECT_ID('[dbo].[Account]', 'U') IS NULL
BEGIN
CREATE TABLE [dbo].[Account] (
[AccountId] [int] IDENTITY(1,1) NOT NULL
,[AccountAlias] [varchar](3) NOT NULL
,[AccountName] [varchar](128) NOT NULL
,CONSTRAINT [PK_Account] PRIMARY KEY CLUSTERED ([AccountId] ASC)
,CONSTRAINT [UQ_Account_Alias] UNIQUE NONCLUSTERED ([AccountAlias] ASC)
,CONSTRAINT [UQ_Account_Name] UNIQUE NONCLUSTERED ([AccountName] ASC)
);
SET IDENTITY_INSERT [dbo].[Account] ON;
INSERT INTO [dbo].[Account] ([AccountId], [AccountAlias], [AccountName])
VALUES (1, 'SA1', 'Sample Account 1'), (2, 'SA2', 'Sample Account 2'), (3, 'SA3', 'Sample Account 3')
SET IDENTITY_INSERT [dbo].[Account] OFF;
END;
GO
-- [State] table and sample values.
IF OBJECT_ID('[dbo].[State]', 'U') IS NULL
BEGIN
CREATE TABLE [dbo].[State] (
[StateId] [tinyint] IDENTITY(1,1) NOT NULL
,[StateCode] [varchar](2) NOT NULL
,[StateName] [varchar](32) NOT NULL
,CONSTRAINT [PK_State] PRIMARY KEY CLUSTERED ([StateId] ASC)
,CONSTRAINT [UQ_State_Code] UNIQUE NONCLUSTERED ([StateCode] ASC)
,CONSTRAINT [UQ_State_Name] UNIQUE NONCLUSTERED ([StateName] ASC)
);
SET IDENTITY_INSERT [dbo].[State] ON;
INSERT INTO [dbo].[State] ([StateId], [StateCode], [StateName])
VALUES (1, 'AL', 'Alabama'), (2, 'AK', 'Alaska'), (3, 'AZ', 'Arizona'), (4, 'AR', 'Arkansas'), (5, 'CA', 'California')
SET IDENTITY_INSERT [dbo].[State] OFF;
END;
GO
Create AccountState
OPTION 1 - composite Primary Key only
DROP TABLE IF EXISTS [dbo].[AccountState];
IF OBJECT_ID('[dbo].[AccountState]', 'U') IS NULL
BEGIN
CREATE TABLE [dbo].[AccountState] (
[AccountId] [int] NOT NULL
,[StateId] [tinyint] NOT NULL
,CONSTRAINT [PK_AccountState] PRIMARY KEY CLUSTERED ([AccountId] ASC, [StateId] ASC)
,CONSTRAINT [FK_AccountState_Account] FOREIGN KEY ([AccountId]) REFERENCES [dbo].[Account]([AccountId])
,CONSTRAINT [FK_AccountState_State] FOREIGN KEY ([StateId]) REFERENCES [dbo].[State]([StateId])
);
INSERT INTO [dbo].[AccountState] ([AccountId], [StateId])
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 1 AND S.[StateId] IN (1, 2, 3)
UNION
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 2 AND S.[StateId] IN (3, 4, 5)
UNION
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 3 AND S.[StateId] IN (1, 3, 5)
END;
GO
Create AccountState
OPTION 2 - composite Primary Key + non-clustered indexes
DROP TABLE IF EXISTS [dbo].[AccountState];
IF OBJECT_ID('[dbo].[AccountState]', 'U') IS NULL
BEGIN
CREATE TABLE [dbo].[AccountState] (
[AccountId] [int] NOT NULL
,[StateId] [tinyint] NOT NULL
,CONSTRAINT [PK_AccountState] PRIMARY KEY CLUSTERED ([AccountId] ASC, [StateId] ASC)
,CONSTRAINT [FK_AccountState_Account] FOREIGN KEY ([AccountId]) REFERENCES [dbo].[Account]([AccountId])
,CONSTRAINT [FK_AccountState_State] FOREIGN KEY ([StateId]) REFERENCES [dbo].[State]([StateId])
,INDEX [IX_AccountState_Account] NONCLUSTERED ([AccountId])
,INDEX [IX_AccountState_State] NONCLUSTERED ([StateId])
);
INSERT INTO [dbo].[AccountState] ([AccountId], [StateId])
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 1 AND S.[StateId] IN (1, 2, 3)
UNION
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 2 AND S.[StateId] IN (3, 4, 5)
UNION
SELECT A.[AccountId], S.[StateId]
FROM [dbo].[Account] A CROSS JOIN [dbo].[State] S
WHERE A.[AccountId] = 3 AND S.[StateId] IN (1, 3, 5)
END;
GO
Sample query
SELECT
A.[AccountName]
,S.[StateName]
FROM
[dbo].[AccountState] A_S
JOIN [dbo].[Account] A
ON A_S.[AccountId] = A.[AccountId]
JOIN [dbo].[State] S
ON A_S.[StateId] = S.[StateId]
WHERE
S.[StateCode] = 'CA'
Of the two options, what type of index combination would be seen as most suitable to scale? Composite primary key only or composite primary key plus additional non-clustered indexes? Or is there another option that is more viable?
1 Answer 1
I'm not sure that the sample query is a fair example to base a recommendation on. The sample query is not a typical multi-tenant application query since it is not specific to a particular client. It is more of a support or management query looking to gain insight about all (or at least multiple) customers. Of course, it could also be maintenance related (e.g. garbage collection would look for oldest dates and not care about AccountId
). So let's separate this out:
General
I don't see any benefit to ever having a non-clustered index that is solely the leading / left-most key column in the clustered index. The clustered index is already in that order and so statistics exist for it. So, the
IX_AccountState_Account
index in Option 2 is purely waste, and a drag on the system.Support / Management / Maintenance queries
These queries, especially the maintenance ones, can work across
AccountId
values. As such, some queries will certainly benefit from non-clustered indexes on clustered index key columns that are notAccountId
(or to put it more generally: that are not the left-most / leading key column). This assumes that you have queries that filter / sort on only the entity ID. Not having any queries like that means you probably don't need this index.Application queries
These queries should always include
AccountId
, so I don't see how they would benefit from an index that is solely on the entity ID.While I have worked on multi-tenant systems similar to what you are describing, I just thought of something that never occurred to me before: since the clustered index statistics are based on the left-most / leading column and not the combination of key columns, it might be beneficial to manually create statistics for those objects. I seem to recall that it is just the density portion that will account for all key columns (not just the leading column), but that might help the query optimizing. This needs to be tested as I have not tried it (and I won't be able to do such a test anytime soon).
-
I agree that the sample query isn't the best example as I shared a stripped down table set to be as concise as possible with a hopeful emphasis on the
JOIN
s to the lookup tables. My uncertainty was whether the leading key column would be utilized when it is the only column of theON
clause. It appears it will use it per point 1. Following that, it seems thatStateId
would need an NCI to assist with theJOIN
. Point 2 will be something to dig into on a query-by-query basis per the execution plans. Point 3 has been ingrained into all my developers - always includeAccountId
.PicoDeGallo– PicoDeGallo2019年03月06日 17:28:00 +00:00Commented Mar 6, 2019 at 17:28 -
@PicoDeGallo Whether or not to add a NCI on
StateId
is part of Point 2. Only do it if you have cross-customer queries that filter / order based onStateId
. But it shouldn't be needed for queries like the sample query because it is the given value that will be looked-up in the parent table. I don't believe it would be used in the context of the sample query, but you should test on your system with your data (actuall execution plan, not estimated) to see if it is used. I am also curious if the manual stats on the same composite keys as the clustered index helps.Solomon Rutzky– Solomon Rutzky2019年03月06日 17:43:02 +00:00Commented Mar 6, 2019 at 17:43 -
Indeed, I should have clarified. I was leaning towards needing the NCI based off of existing queries within the system that do perform such a
JOIN
as opposed to the sample query. We have several associative tables where we have a composite Primary Key where the secondary column is a lookup table (such asState
). At present we do not have any NCI on these table columns, but I will be carrying out a deeper performance analysis to see if SQL Server will positively leverage them. I will try to report back with my findings when I've reached that scope of work.PicoDeGallo– PicoDeGallo2019年03月07日 16:59:53 +00:00Commented Mar 7, 2019 at 16:59
Explore related questions
See similar questions with these tags.