2

I have a table with over 10 million rows tracking user activities. I created a nonclustered index on (UserID, DtModified DESC), which performs well for queries filtering a single user. However, when querying multiple users, SQL Server first joins on the UserActivities table, then sorts the results by last modified before selecting the rows.

Since I’m using this query for pagination, if the combined users have 10,000 rows, SQL Server retrieves all of them, sorts them, and then selects only the first 50 rows. This approach becomes inefficient when searching for users with a large number of records.

Is there a way to improve performance with better indexing? Any advice would be greatly appreciated. Thanks!

P.S. Erik Darling previously suggested columnstore indexes in another post, but that isn’t an option for me right now.

Plan with single user: https://www.brentozar.com/pastetheplan/?id=uvYVLDaq9D

Plan with multiple users: https://www.brentozar.com/pastetheplan/?id=EmkR1GGa3p

-- Create a temporary table to demonstrate the issue
CREATE TABLE #UserActivities (
 ActivityID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
 UserID INT NOT NULL,
 ActivityType VARCHAR(50) NOT NULL,
 DtModified DATETIME2 NOT NULL,
 Details NVARCHAR(MAX) NULL
);
-- Create the index we want to evaluate
CREATE NONCLUSTERED INDEX IX_UserID_DtModified 
ON #UserActivities(UserID, DtModified DESC);
-- Insert sample data (10,000 rows for demonstration)
INSERT INTO #UserActivities (UserID, ActivityType, DtModified, Details)
SELECT 
 ABS(CHECKSUM(NEWID())) % 1000 + 1 AS UserID, -- 1,000 distinct users
 CASE WHEN n % 10 = 0 THEN 'Login' 
 WHEN n % 5 = 0 THEN 'Purchase'
 ELSE 'PageView' END AS ActivityType,
 DATEADD(MINUTE, -ABS(CHECKSUM(NEWID())) % 525600, GETDATE()) AS DtModified,
 'Sample activity details for user ' + CAST(ABS(CHECKSUM(NEWID())) % 1000 + 1 AS VARCHAR(10)) AS Details
FROM (
 SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
 FROM sys.objects a
 CROSS JOIN sys.objects b -- Cross join to get enough rows
) AS Numbers;
-- Demonstrate the index behavior
-- Good case: Single user (uses seek)
SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID = 42
ORDER BY DtModified DESC;
-- Problem case: Multiple users (often uses scan)
SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID IN (42, 100, 789, 1024)
ORDER BY DtModified DESC;
-- Clean up
DROP TABLE #UserActivities;
asked Apr 3 at 14:14
4
  • Have you considered Keyset Pagination see use-the-index-luke.com/sql/partial-results/fetch-next-page and stackoverflow.com/questions/70519518/… Commented Apr 3 at 15:20
  • "first joins on the Users table" what Users table? I only see #UserActivities And your "problem" plan actually has a seek. Commented Apr 3 at 15:22
  • I did consider keyset pagination, but since this approach relies on page numbers, my research suggests that keyset pagination doesn’t work well in such cases. Commented Apr 3 at 15:54
  • Yes, it has a seek but also has to do a sort right after which is what i'm trying to avoid. Commented Apr 3 at 15:54

1 Answer 1

1

You need to match the ORDER BY to the order of the index, to avoid the extra Sort

SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID IN (42, 100, 789, 1024)
ORDER BY UserId ASC, DtModified DESC;

enter image description here

Alternatively, construct a query (dynamically or in your application) that contains a bunch of = predicates with UNION ALL. The optimizer will normally create a Merge Concatentation to keep the rows in the same order that they came from the Index Seek

SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID = 42
UNION ALL
SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID = 100
UNION ALL
SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID = 789
UNION ALL
SELECT UserID, DtModified
FROM #UserActivities
WHERE UserID = 1024
ORDER BY
 DtModified DESC;

enter image description here

A final option, which I found was less guaranteed than the others to get a good plan, is to change the IN list to SELECT 1 UNION SELECT 2 UNION ... (note the use of UNION not UNION ALL).

SELECT
 ua.UserID,
 ua.DtModified
FROM #UserActivities AS ua
WHERE UserID IN
 (
 SELECT 42
 UNION
 SELECT 100
 UNION
 SELECT 789
 UNION
 SELECT 1024
 )
ORDER BY
 ua.DtModified DESC;

enter image description here

db<>fiddle

This still doesn't help you with reading all those rows. If there are 10,000 rows and you only want the last 50, it's still going to have to read them all.

I strongly recommend you consider using Keyset Pagination, as it's far more efficient. It requires a rethink of the whole stack, in order to use some kind of infinite scroll, or by passing a LastKey parameter to your API.

answered Apr 3 at 16:09

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.