PostgreSQL - ROW_NUMBER Function
The PostgreSQL ROW_NUMBER function is a crucial part of window functions, enabling users to assign unique sequential integers to rows within a dataset. This function is invaluable for tasks such as ranking, pagination and identifying duplicates.
The ROW_NUMBER()
function generates a unique number for each row in a result set, starting from 1 for the first row. This numbering is reset whenever the partition changes (if used with the PARTITION BY
clause).
Features of ROW_NUMBER()
- Sequential Numbering: Each row in the result set is assigned a unique number based on the specified order.
- Partitioning: The function can partition the result set into subsets, allowing for separate numbering within each partition.
- Sorting: The order of the assigned numbers can be defined using the
ORDER BY
clause.
Syntax:
The syntax for the ROW_NUMBER()
function is as follows:
ROW_NUMBER() OVER ( [PARTITION BY partition_expression] ORDER BY order_expression )
Explanation of Syntax:
PARTITION BY partition_expression
: This clause is optional and divides the result set into partitions to which theROW_NUMBER()
function is applied. Each partition is numbered independently.ORDER BY order_expression
: This clause is mandatory and specifies the order in which rows are numbered within each partition.
When to Use ROW_NUMBER()
The ROW_NUMBER()
function is particularly useful in the following scenarios:
- Pagination: When displaying large datasets,
ROW_NUMBER()
can help in retrieving a specific range of records. - Ranking: It can be used to assign ranks to rows based on specific criteria.
- Removing Duplicates: It can help identify duplicates by numbering rows within groups.
Examples of PostgreSQL ROW_NUMBER()
Let's go through some practical examples to illustrate the use of the ROW_NUMBER()
function.
Example 1: Basic Usage of ROW_NUMBER()
Consider a table named employees
with the following data:
id | name | department | salary |
---|---|---|---|
1 | Alice | HR | 50000 |
2 | Bob | IT | 60000 |
3 | Charlie | HR | 70000 |
4 | David | IT | 80000 |
5 | Eva | Marketing | 55000 |
Query:
SELECT
id,
name,
department,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;
Output:
id | name | department | salary | row_num |
---|---|---|---|---|
4 | David | IT | 80000 | 1 |
3 | Charlie | HR | 70000 | 2 |
2 | Bob | IT | 60000 | 3 |
5 | Eva | Marketing | 55000 | 4 |
1 | Alice | HR | 50000 | 5 |
Explain this Example:
In this example, the ROW_NUMBER()
function assigns a unique number to each employee based on their salary in descending order.
Example 2: Using ROW_NUMBER() with PARTITION BY Clause
Now, let's say we want to assign row numbers within each department.
Query:
SELECT
id,
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
Output:
id | name | department | salary | row_num |
---|---|---|---|---|
3 | Charlie | HR | 70000 | 1 |
1 | Alice | HR | 50000 | 2 |
2 | Bob | IT | 60000 | 1 |
4 | David | IT | 80000 | 2 |
5 | Eva | Marketing | 55000 | 1 |
Explain this Example:
In this example, the ROW_NUMBER()
function partitions the employees by department and assigns a row number based on their salary within each department.
Example 3: Combining ROW_NUMBER() with Common Table Expressions (CTE)
Common Table Expressions (CTEs) can enhance the readability of complex queries. Let's say we want to find the highest-paid employee in each department.
Query:
WITH RankedEmployees AS (
SELECT
id,
name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees
)
SELECT *
FROM RankedEmployees
WHERE row_num = 1;
Output:
id | name | department | salary |
---|---|---|---|
3 | Charlie | HR | 70000 |
4 | David | IT | 80000 |
5 | Eva | Marketing | 55000 |
Explain this Example:
In this example, the CTE RankedEmployees
assigns row numbers to each employee within their respective departments, and the main query selects only those with the highest salary.
Features Of PostgreSQL ROW_NUMBER() Function
ROW_NUMBER()
is a window function, meaning it requires anOVER
clause to define the window of rows it operates on.- The
ROW_NUMBER()
function assigns a unique integer to each row starting from 1 without skipping any numbers, even if rows have the same values in the columns used for ordering. - The
PARTITION BY
clause divides the result set into partitions to which theROW_NUMBER()
function is applied independently. WithoutPARTITION BY
, the function treats the entire result set as a single partition. ROW_NUMBER
()
can be combined with other window functions such asRANK()
,DENSE_RANK()
, andNTILE()
to achieve different types of ranking and partitioning.