GROUP BY... and hit a wall. GROUP BY squashes all of a user's orders into one row. But you wanted to keep every order AND add a rank. You need both. That is exactly what window functions do, and they are the move that turns you into a SQL ninja.
The idea in one line
A window function computes a value across a group of rows (a "window") while keeping every single row, unlike GROUP BY, which collapses them into one.
The metaphor: the bakery queue
You are standing in line at the bakery. A little screen says you are customer number 4. You can see your position in the line. But here is the thing: you are still you, still standing there as yourself. You did not get blended into "the line." You kept your identity and also got a number.
That is a window function. Every row stays itself, and you bolt on extra info computed from the rows around it (your position, the running total, the person before you).
GROUP BY is the opposite. It is like the baker saying "I do not care about individuals, there are 12 people total." Twelve people become one number. Useful sometimes, but you lost everybody.
The shape: OVER (PARTITION BY ... ORDER BY ...)
A window function always has an OVER (...) clause. That is what makes it a window function.
SELECT
name,
total,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rank_in_user
FROM orders;
Read OVER (...) in plain English:
PARTITION BY user_id --> split rows into one queue per user
ORDER BY total DESC --> inside each queue, line them up biggest first
ROW_NUMBER() --> hand each row its position number
PARTITION BY is like having a separate bakery line per user. ORDER BY decides who stands where. No PARTITION BY? Then it is one big single line for the whole table.
Numbering rows: ROW_NUMBER, RANK, DENSE_RANK
These three all number rows, but they treat ties differently. Say two orders have the same total of 100.
total ROW_NUMBER RANK DENSE_RANK
150 1 1 1
100 2 2 2
100 3 2 2
80 4 4 3
-
ROW_NUMBER(): always 1, 2, 3, 4. No ties allowed, it just picks an order. Great for "give me exactly one row each."
-
RANK(): ties share a number, then it skips ahead (1, 2, 2, 4). Like the Olympics, two silvers and no bronze.
-
DENSE_RANK(): ties share a number, but it does not skip (1, 2, 2, 3). No gap.
Pick ROW_NUMBER when you need a unique number, RANK/DENSE_RANK when ties should genuinely tie.
Running totals with SUM() OVER
Add ORDER BY inside the window and SUM becomes a running total (each row adds itself to everything before it):
SELECT
id,
total,
SUM(total) OVER (ORDER BY created_at) AS running_total
FROM orders;
id total running_total
1 50 50
2 30 80 (50 + 30)
3 20 100 (80 + 20)
Every row is still here, and now each one knows the total so far. Try doing that with GROUP BY. You cannot, because GROUP BY would have crushed these into one row.
Peeking at neighbors: LAG and LEAD
LAG() looks at the previous row, LEAD() looks at the next one. Perfect for "compare this order to the user's last order."
SELECT
user_id,
total,
LAG(total) OVER (PARTITION BY user_id ORDER BY created_at) AS previous_total
FROM orders;
user_id total previous_total
7 50 NULL (first order, nobody before)
7 80 50 (look back one row)
7 60 80
Now subtract total - previous_total and you instantly see if a user is spending more or less than last time.
The killer real case: top 3 orders per user
Here is the request from the top of the post: rank each user's orders, biggest first. ROW_NUMBER with PARTITION BY nails it:
SELECT
user_id,
total,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn
FROM orders;
Same trick ranks products by price inside each category:
SELECT
category,
name,
price,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products;
Notice the difference from GROUP BY:
GROUP BY category --> ONE row per category (you lose the products)
ROW_NUMBER() OVER (...) --> EVERY product stays, plus a rank column
That is the whole point. The window keeps everybody and adds a column.
Gotcha: you cannot filter a window result in WHERE
This is the big one. Window functions run after WHERE. So you cannot say WHERE price_rank <= 3, the rank does not exist yet when WHERE runs.
-- WRONG: price_rank is not known during WHERE
SELECT category, name,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products
WHERE price_rank <= 3; -- error!
The fix is to wrap it in a subquery or a CTE (remember Part 10?), then filter on the outside:
-- RIGHT: compute the rank inside, filter it outside
WITH ranked AS (
SELECT category, name,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM products
)
SELECT category, name
FROM ranked
WHERE price_rank <= 3; -- now it works, the column exists out here
That CTE wrapper is the standard "top N per group" pattern. Memorize it.
Gotchas juniors hit
-
Filtering on a window in WHERE. It is not there yet. Wrap it in a CTE or subquery and filter outside.
-
Forgetting PARTITION BY. Without it, your "rank per user" becomes "rank across the whole table." The partition is what resets the count for each group.
-
Mixing up RANK and DENSE_RANK on ties.
RANK leaves gaps (1, 2, 2, 4), DENSE_RANK does not (1, 2, 2, 3). Pick on purpose.
Recap
- A window function keeps every row and adds a value computed from nearby rows.
GROUP BY collapses rows; windows do not.
- Shape:
function OVER (PARTITION BY ... ORDER BY ...).
-
ROW_NUMBER, RANK, DENSE_RANK number rows and differ on ties. SUM() OVER (ORDER BY ...) gives running totals. LAG/LEAD peek at neighbors.
- "Top N per group" =
ROW_NUMBER() inside a CTE, then filter in the outer WHERE.
- Windows run after
WHERE, so filter their results in a wrapper, not in WHERE.
Your turn
Write a query that gives each user only their single most expensive order (use ROW_NUMBER() and a CTE wrapper). Which column do you PARTITION BY, and what do you keep where rn = 1? Explain it to a friend and the ninja headband is yours.
And that is the series. You went from "what even is a database" all the way to window functions, the stuff a lot of working devs still find scary. You can query, filter, join, group, change data safely, model relationships, use subqueries and CTEs, index for speed, wrap things in transactions, block SQL injection, and now slice data with windows. That is a real, job-ready SQL toolkit.
So go build something. A tiny dashboard, a leaderboard, a "top products" report, anything that makes you write real queries against real data. That is how it sticks. Want to revisit a topic or share the series with a friend? Head back to the README for the full roadmap. Congratulations, ninja. You finished.