It's common knowledge that the order of records from a simple one-table query is not guaranteed to be in the order of the primary key/clustered index. Adding a simple ORDER BY is no problem of course, but I'd like to understand why this is the case. Shouldn't ordering on a clustered index always be quicker than a nonclustered index? What's going on in the mind of SQL Server which would cause the latter to ever be preferred?
Here's an example of a table definition and a small data set that demonstrates this scenario.
CREATE TABLE FranchisorSetup.ContactMethod
(
RowID INT NOT NULL IDENTITY(1,1) CONSTRAINT [PK_FranchisorSetup.ContactMethod] PRIMARY KEY CLUSTERED,
[Name] VARCHAR(16) NOT NULL CONSTRAINT [UK_FranchisorSetup.ContactMethod_Name] UNIQUE NONCLUSTERED
)
GO
SET IDENTITY_INSERT FranchisorSetup.ContactMethod ON
INSERT INTO FranchisorSetup.ContactMethod
(
RowID,
Name
)
VALUES
(0, 'Work Phone'),
(1, 'Work Cell'),
(2, 'Work Fax'),
(3, 'Work Email'),
(4, 'Pager'),
(5, 'Radio'),
(6, 'Assistant'),
(7, 'Home Phone'),
(8, 'Alt Email'),
(9, 'Personal Cell')
GO
SET IDENTITY_INSERT FranchisorSetup.ContactMethod OFF
As you can see below, the results come back ordered by the Name column, which has a nonclustered unique index.
SELECT * FROM FranchisorSetup.ContactMethod
/* results:
RowID Name
8 Alt Email
6 Assistant
7 Home Phone
4 Pager
9 Personal Cell
5 Radio
1 Work Cell
3 Work Email
2 Work Fax
0 Work Phone
*/
1 Answer
Shouldn't ordering on a clustered index always be quicker than a nonclustered index?
Let's think about why this is usually the case.
- A clustered index contains all the table's data, ordered by the clustered index key (RowID in your example).
- A non-clustered index contains the index fields (Name in your example) as well as the clustered index key.
Ordering on the clustered index is usually faster because only one lookup is needed: You just fetch the data from the clustered index (which contains all the table's data), and you are done. In contrast, ordering by a non-clustered index means you have to:
- Fetch the data from the non-clustered index (which contains the non-clustered index fields as well as the clustered index key) and then
- use the clustered index key to fetch the actual data.
Now, the thing is: in your toy example, we are done after step 1, because your non-clustered index already contains all the data required (Name, plus the clustered index key RowID). We can see that in the execution plan (sqlfiddle of your example):
SELECT * FROM ContactMethod
--> yields a single non-clustered index scan
Estimated I/O Cost: 0.003125
Estimated CPU Cost: 0.000168
SELECT * FROM ContactMethod ORDER BY RowID
--> yields a single clustered index scan
Estimated I/O Cost: 0.003125
Estimated CPU Cost: 0.000168
So it doesn't really matter: the estimated amount of work is exactly the same, and SQL Server just happens to choose the non-clustered index.
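If you want to reproduce this comparison outside of sqlfiddle, one option is to ask SQL Server for the estimated plans directly. This is a minimal sketch, assuming the FranchisorSetup.ContactMethod table from the question exists in the current database; the operators and cost figures you see may differ with version and statistics.

```sql
-- While SHOWPLAN_ALL is ON, statements are compiled but not executed.
-- SQL Server returns one row per plan operator, including the
-- EstimateIO and EstimateCPU columns, instead of the query results.
SET SHOWPLAN_ALL ON;
GO

-- No ORDER BY: the optimizer may read from any index that covers the query.
SELECT * FROM FranchisorSetup.ContactMethod;

-- Explicit ORDER BY: the only way to guarantee the order of the result.
SELECT * FROM FranchisorSetup.ContactMethod ORDER BY RowID;
GO

SET SHOWPLAN_ALL OFF;
GO
```

Each SET SHOWPLAN_ALL statement sits in its own batch because it has to be the only statement in the batch.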
Let's modify your example and add another column, which is not covered by your non-clustered index:
ALTER TABLE FranchisorSetup.ContactMethod ADD moreData int;
And, voilà, your non-clustered index scan is replaced by a clustered index scan, since the non-clustered index scan would require two steps instead of one (sqlfiddle of the extended example).
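To see the two-step path that the optimizer is avoiding here, you can force the non-clustered index with a table hint. This is a sketch for experimentation only, assuming the moreData column has been added as above. The index name is taken from the UNIQUE constraint in the question, and the key lookup is what I would expect in the resulting plan rather than something shown in the fiddle.

```sql
-- Force the non-clustered index that backs the UNIQUE constraint on Name.
-- moreData is not stored in that index, so each row should need an extra
-- lookup into the clustered index (expected: index scan plus key lookup).
SELECT RowID, Name, moreData
FROM FranchisorSetup.ContactMethod
     WITH (INDEX([UK_FranchisorSetup.ContactMethod_Name]));
```

Comparing this plan with the unhinted query should make the extra step visible. Index hints are best kept to experiments like this one, since they override the optimizer's cost-based choice.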