I hear it constantly: window functions intimidate people.
The name doesn't help, and neither does the syntax; most people see OVER() for the first time and assume something complicated is happening. Window functions clicked for me when I started thinking about them like Excel: running totals, row-to-row comparisons, ranked lists, but on tables with millions of rows.
This post walks through two of the most useful ones — ROW_NUMBER() and LAG() — with real examples from a dataset of Pixar films and their IMDb scores.
What Makes a Window Function Different from a Regular Function
A regular SQL function (YEAR(), FLOOR(), COUNT()) applies a transformation to one row at a time, or collapses many rows into one aggregate value. You get either a row-level result or a single summary number.
A window function does something in between: it performs a calculation across a set of rows without collapsing them. Every row keeps its identity in the output. The function just adds a new column.
Aggregates destroy rows. Window functions preserve them.
ROW_NUMBER(): The Simplest Window Function
The simplest example is ROW_NUMBER(), which assigns a sequential integer to each row. Say you've joined a films table and a reviews table and the original ID column has gaps. ROW_NUMBER() gives you a clean new one:
SELECT
ROW_NUMBER() OVER() AS new_id,
film,
imdb_score
FROM films f
INNER JOIN reviews r ON f.film = r.film
The OVER() clause is what makes this a window function. An empty OVER() means: apply this to the entire result set as one window.
You can also control the window more precisely. PARTITION BY splits it into groups (like GROUP BY, but without collapsing rows). ORDER BY sorts the rows before the function runs. A window function with PARTITION BY decade ORDER BY imdb_score DESC lets you rank movies within each decade without losing any rows from the output.
But even the basic version is already useful. Start there.
LAG(): Comparing a Row to the Previous Row
One of the most useful window functions for analysts is LAG(), which pulls a value from the previous row in the result set. This lets you calculate period-over-period differences without a self-join.
To find the difference in IMDb score between each Pixar movie and the one released before it:
SELECT
ROW_NUMBER() OVER() AS new_id,
film,
imdb_score,
LAG(imdb_score) OVER() AS imdb_score_prior,
ROUND(imdb_score - LAG(imdb_score) OVER(), 1) AS imdb_diff
FROM films f
INNER JOIN reviews r ON f.film = r.film
The result shows each film's score, the previous film's score, and the difference between them. You can see Cars dropped sharply from what came before it. Coco rebounded. The franchise arc is right there in the data.
One thing to know: the first row will have NULL in the prior score column. There's no previous movie for the first entry. That NULL becomes relevant as soon as you try to filter the results.
The Problem That Window Functions Create (And How to Solve It)
Here's something that catches a lot of SQL writers off guard. After building the query above, you want to filter out that NULL row: WHERE imdb_diff IS NOT NULL. You write it, run it, and get an error: "Unknown column imdb_diff."
This happens because of execution order. SQL doesn't run clauses in the order you write them. The actual order is:
• FROM (tables are loaded and joined)
• WHERE (rows are filtered)
• GROUP BY
• HAVING
• SELECT (columns are named and computed)
• ORDER BY
WHERE runs before SELECT. So when WHERE tries to reference imdb_diff, that column hasn't been calculated yet. The fix is to wrap your query in a subquery or CTE so the alias becomes a real column you can then filter on.
The Mental Model That Makes Window Functions Stick
That's the real value of window functions. Running totals (SUM() OVER()), rankings within groups (RANK() OVER(PARTITION BY ...)), comparisons to the next row (LEAD()) — these are all window functions. The syntax varies, but the logic is always the same: define the window with OVER(), and get back a new column without losing any rows.
Work step by step. Add one window function at a time and run the query to check the output before adding the next. The first version is almost never the final one. First write working code, then clean it up.
Want to larn more? My Advanced SQL Querying course covers window functions in depth alongside subqueries, CTEs, and query optimization.

Alice Zhao
Lead Data Science Instructor
Alice Zhao is a seasoned data scientist and author of the book, SQL Pocket Guide, 4th Edition (O'Reilly). She's an adjunct lecturer for Northwestern University's Machine Learning and Data Science program, where she teaches Python, SQL, R, data warehousing and data visualization.
Frequently Asked Questions
What is a window function in SQL?
A window function performs a calculation across a set of rows related to the current row, without collapsing those rows into a single value. Every row keeps its identity in the output; the function just adds a new column. Common examples: ROW_NUMBER() assigns sequential integers, LAG() and LEAD() compare adjacent rows, RANK() and DENSE_RANK() rank within groups, SUM() OVER() creates running totals. All window functions use an OVER() clause to define the window.
What does the OVER() clause do in SQL?
OVER() defines the window, the set of rows the function operates on. An empty OVER() applies the function to the entire result set. OVER(PARTITION BY column) splits the result into groups and applies the function within each group independently. OVER(ORDER BY column) applies the function in a specific row order. These can be combined: OVER(PARTITION BY decade ORDER BY imdb_score DESC) ranks movies within each decade by score.
What does LAG() do in SQL?
LAG() returns the value of a specified column from a previous row in the result set. It's useful for period-over-period comparisons: a movie's score vs. the previous movie's score, a month's revenue vs. the prior month's, or any row-to-row difference. LAG(column) looks back one row by default. LAG(column, 2) looks back two rows. The first row always returns NULL since there's no prior row to reference. LEAD() is the forward-looking equivalent.
What is the difference between a window function and GROUP BY?
GROUP BY collapses rows into one row per group. A window function with PARTITION BY performs calculations within groups but keeps all the original rows. Use GROUP BY when you want aggregated totals (one row per category). Use a window function when you want to add a calculated column to each row without losing any rows from the result.
Why do I get an 'unknown column' error when I try to filter on a window function result?
Because WHERE runs before SELECT. Wrap the query in a subquery or CTE to fix it.




































