
I have a case I can't quite wrap my head around, so I'm coming here in the hope of finding pointers to a query that might also be helpful to someone else.

Below is a query that returns the correct results, but it requires a second query that is identical except that it has no OFFSET and its output is just a COUNT(*) of all the rows.

I have two objectives:

  1. Write the query so that COUNT(*) is returned in the same query. I have been looking at help pieces such as the excellent "SQL SERVER – How to get total row count from OFFSET / FETCH NEXT (Paging)", which shows different ways of solving the problem, but then there's another piece...
  2. Rewrite the join with a window function (e.g. OVER(PARTITION BY ...)) or in some more performant way, as the query seems to generate both an INDEX SCAN and an INDEX SEEK on the table. The real query is a bit more complicated in the WHERE part, but it looks to me as if one scan could be enough if the query were more straightforward, so that the COUNT and MAX could be obtained simultaneously with the outer query. Even this would be a win, but combining it with the overall COUNT would be an even bigger one.

Maybe I'm trying to bite off a bit more than I can chew currently, but on the other hand, maybe this is a chance to learn something.

Here are the table, the data, and the query:

CREATE TABLE Temp
(
 Id INT NOT NULL PRIMARY KEY,
 Created INT NOT NULL,
 ParentId INT,
 SomeInfo INT NOT NULL,
 GroupId INT NOT NULL,
 CONSTRAINT FK_Temp FOREIGN KEY(ParentId) REFERENCES Temp(Id)
);
-- Some root levels nodes.
INSERT INTO Temp VALUES(1, 1, NULL, 1, 1);
INSERT INTO Temp VALUES(2, 2, NULL, 2, 2);
INSERT INTO Temp VALUES(3, 3, NULL, 1, 3);
INSERT INTO Temp VALUES(13, 13, NULL, 1, 1);
-- First order child nodes.
INSERT INTO Temp VALUES(4, 4, 1, 2, 1);
INSERT INTO Temp VALUES(5, 5, 2, 1, 2);
INSERT INTO Temp VALUES(6, 6, 3, 2, 3);
-- Second order child nodes.
INSERT INTO Temp VALUES(7, 7, 4, 1, 1);
INSERT INTO Temp VALUES(8, 8, 5, 2, 2);
INSERT INTO Temp VALUES(9, 9, 6, 1, 3);

-- The query:
SELECT
    Id,
    newestTable.SomeInfo,
    newestTable.Created,
    CASE WHEN newestTable.RootCount > 1 THEN 1 ELSE 0 END AS IsMulti
FROM
    Temp AS originalTable
    INNER JOIN
    (
        SELECT
            SomeInfo,
            MAX(Created) AS Created,
            COUNT(*) AS RootCount
        FROM
            Temp
        WHERE ParentId IS NULL AND GroupId = 1
        GROUP BY SomeInfo
    ) AS newestTable
        ON originalTable.SomeInfo = newestTable.SomeInfo
        AND originalTable.Created = newestTable.Created
/*WHERE
(
    originalTable.SomeInfo = 1
)*/
ORDER BY newestTable.Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;

P.S. The question "How to apply outer limit offset and filters in the subquery to avoid grouping over the complete table used in subquery in Postgresql" also looks interesting.

Edit:

It looks like

SELECT
    Id,
    SomeInfo,
    GroupId,
    ParentId,
    MAX(Created) OVER(PARTITION BY SomeInfo) AS Created,
    COUNT(Id) OVER(PARTITION BY SomeInfo) AS RootCount,
    CASE WHEN COUNT(Id) OVER(PARTITION BY SomeInfo) > 1 THEN 1 ELSE 0 END AS IsMulti
FROM
    Temp
WHERE
(
    GroupId = 1 AND ParentId IS NULL
)
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;

gets close. The problem, though, is that there are now two result rows, and it appears to me this is because the INNER JOIN back to Temp in the original query culls the result to one row. I wonder if there is a way to apply the conditions either before or after the windowing so as to match the original query more closely. (And to be clear, this isn't the same query. There is just so little data that the two queries appear close to each other.)

Paul White
asked Jun 11, 2019 at 21:25
  • The Created column is a monotonically increasing point in time: the time the entry was created. In hindsight I should not have simplified this case and should have used DATETIME. The match of values between Id and Created is accidental on my part. The Id is needed; it is something required from the parent table (in another sense, it could be any value from originalTable). I tried to replicate a case I have seen IRL that I think could be done better. I will probably work on this again tomorrow (it's almost midnight here, need to sleep). :) Commented Jun 13, 2019 at 19:54

1 Answer


It looks to me like what you are missing is the "return only the top Created record for each SomeInfo value" step. As written, you get all rows, each paired with whatever the top Created value is for its SomeInfo. Unfortunately, you can't just add MAX(Created) = Created to the base WHERE clause, because window functions are not allowed in WHERE.

If you wrap the whole thing in a CTE, you can then add the MAX(Created) = Created comparison to the outer WHERE and get what you are looking for (not that I think CTEs are the answer for everything).

WITH CTE (ID, SomeInfo, GroupID, ParentID, Created, MaxCreated, RootCount, IsMulti)
AS
(
    SELECT
        Id,
        SomeInfo,
        GroupId,
        ParentId,
        Created,
        MAX(Created) OVER(PARTITION BY SomeInfo) AS MaxCreated,
        COUNT(Id) OVER(PARTITION BY SomeInfo) AS RootCount,
        CASE WHEN COUNT(Id) OVER(PARTITION BY SomeInfo) > 1 THEN 1 ELSE 0 END AS IsMulti
    FROM
        Temp
)
SELECT ID, SomeInfo, GroupID, ParentID, MaxCreated AS Created, RootCount, IsMulti
FROM CTE
WHERE
(
    GroupId = 1
    AND ParentId IS NULL
    AND Created = MaxCreated
)
ORDER BY MaxCreated ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;

In my quick test it has the same execution plan and takes no additional execution time (see the execution plan below). The result set here is small, though, so it is something you will still need to test against your real data.

Execution Plan from Test

Hopefully that is more of what you are looking for.
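
For the first objective in the question (getting the total row count in the same query), one possible sketch is to add COUNT(*) OVER () to the outer SELECT. Window functions are evaluated before OFFSET / FETCH is applied, so the value is the number of rows that pass the WHERE filter, not the page size. This variant also assumes the GroupId and ParentId filter belongs inside the CTE, so that MaxCreated and RootCount cover only the root rows of group 1, as the derived table in the original query does. The names Filtered and TotalRows are made up for illustration, and unlike the original query this does not join back to every row of Temp, so treat it as a starting point to test rather than a drop-in replacement.

```sql
-- Sketch only: Filtered and TotalRows are hypothetical names, not from the question.
WITH Filtered AS
(
    SELECT
        Id,
        SomeInfo,
        Created,
        -- Computed over the filtered rows only (root rows of group 1).
        MAX(Created) OVER (PARTITION BY SomeInfo) AS MaxCreated,
        COUNT(*) OVER (PARTITION BY SomeInfo) AS RootCount
    FROM Temp
    WHERE GroupId = 1
      AND ParentId IS NULL
)
SELECT
    Id,
    SomeInfo,
    Created,
    CASE WHEN RootCount > 1 THEN 1 ELSE 0 END AS IsMulti,
    -- Total number of rows that pass the WHERE below, before paging is applied.
    COUNT(*) OVER () AS TotalRows
FROM Filtered
WHERE Created = MaxCreated
ORDER BY Created ASC
OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
```

With the sample rows from the question this should return the single row for Id 13, with IsMulti = 1 and TotalRows = 1. Every returned row repeats the same TotalRows value, and a page past the end returns no rows and therefore no count, so the caller needs to handle that case.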

answered Jun 14, 2019 at 20:45
  • You are a persistent guy. Your effort is much appreciated. Do tell if my added explanation doesn't make sense and I will clarify. Commented Jun 17, 2019 at 20:23
  • @Veksi The OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY; should limit the final result set to the first five rows of whatever is returned, in the order specified. I guess I don't understand what you are looking to accomplish with the 2nd CTE. Maybe I just need more data to see the specific use case. Could you add those details to your question post? Commented Jun 17, 2019 at 20:38
