Introduction

Window functions are a powerful SQL feature for performing calculations across a set of rows related to the current row. Unlike aggregate functions used with GROUP BY, window functions do not collapse rows into a single output row; they return a result for every row while keeping the rest of the row's data available.

In this article, we’ll explore some commonly used SQL window functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG()) with examples.
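The contrast with aggregates is easiest to see side by side. The following is a minimal sketch rather than a reference implementation: it assumes Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) and a tiny hypothetical table named demo that is not part of the article's data.

```python
import sqlite3

# Hypothetical three-row table, used only to contrast the two query styles.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [("North", 1200), ("North", 800), ("South", 1300)],
)

# Aggregate: GROUP BY collapses each region into a single row.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM demo GROUP BY region ORDER BY region"
).fetchall()

# Window: every input row is kept, with the regional total alongside it.
windowed = conn.execute(
    "SELECT region, amount, SUM(amount) OVER (PARTITION BY region) AS region_total "
    "FROM demo ORDER BY region, amount DESC"
).fetchall()

print(grouped)   # one row per region
print(windowed)  # one row per original row
```

The grouped query returns two rows for three inputs, while the windowed query returns all three rows, each carrying its region's total.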

Sample table: sales data

We will use the following Sales table to illustrate the window functions:

SalesID  CustomerID  Product  Region  Amount  SaleDate
-------  ----------  -------  ------  ------  ----------
1        101         Laptop   North   1200    2023-01-05
2        102         Tablet   North   800     2023-02-15
3        103         Phone    North   800     2023-03-10
4        104         Tablet   North   500     2023-04-01
5        105         Laptop   South   1300    2023-05-05
6        106         Tablet   South   700     2023-06-20
7        107         Phone    West    900     2023-07-15
8        108         Laptop   East    1300    2023-08-10
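To follow along, the sample table can be loaded into any engine that supports window functions. The sketch below assumes Python's built-in sqlite3 module with an in-memory database (SQLite 3.25 or newer). The column types and the ISO date strings are assumptions, since the article does not give a schema.

```python
import sqlite3

# Rows from the sample table. Dates are ISO strings (an assumed format)
# so that they sort chronologically even when stored as plain text.
SALES = [
    (1, 101, "Laptop", "North", 1200, "2023-01-05"),
    (2, 102, "Tablet", "North", 800, "2023-02-15"),
    (3, 103, "Phone", "North", 800, "2023-03-10"),
    (4, 104, "Tablet", "North", 500, "2023-04-01"),
    (5, 105, "Laptop", "South", 1300, "2023-05-05"),
    (6, 106, "Tablet", "South", 700, "2023-06-20"),
    (7, 107, "Phone", "West", 900, "2023-07-15"),
    (8, 108, "Laptop", "East", 1300, "2023-08-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Sales (
        SalesID    INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        Product    TEXT,
        Region     TEXT,
        Amount     INTEGER,
        SaleDate   TEXT
    )
    """
)
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?, ?)", SALES)

# Quick sanity check that all rows were loaded.
row_count, total = conn.execute(
    "SELECT COUNT(*), SUM(Amount) FROM Sales"
).fetchone()
print(row_count, total)
```

With the table in place, each query in the sections that follow can be passed to conn.execute(...) unchanged, apart from any alias that clashes with a keyword in your database.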

1. ROW_NUMBER()

The ROW_NUMBER() function assigns a unique sequential number to each row within a partition, ordered by the specified column(s).

Task: Assign a unique row number to each sale within a region based on the sale amount (highest to lowest).

SELECT SalesID, Region, Amount,
       ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RowNum
FROM Sales;

Result:

SalesID  Region  Amount  RowNum
-------  ------  ------  ------
1        North   1200    1
2        North   800     2
3        North   800     3
4        North   500     4
5        South   1300    1
6        South   700     2
7        West    900     1
8        East    1300    1
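One caveat: sales 2 and 3 have the same amount, so the ORDER BY in the query above does not say which of them should be numbered 2 and which 3, and a database is free to pick either. A common remedy is to add a tiebreaker column. The sketch below is one way to do that, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer) and only the North rows of the sample data; using SalesID as the tiebreaker is an illustrative choice, not something the article prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SalesID INTEGER, Region TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, "North", 1200), (2, "North", 800), (3, "North", 800), (4, "North", 500)],
)

# SalesID as a secondary sort key makes the numbering of tied amounts repeatable.
rows = conn.execute(
    """
    SELECT SalesID, Amount,
           ROW_NUMBER() OVER (
               PARTITION BY Region
               ORDER BY Amount DESC, SalesID
           ) AS RowNum
    FROM Sales
    ORDER BY RowNum
    """
).fetchall()

for row in rows:
    print(row)
```

With the tiebreaker, sale 2 always receives row number 2 and sale 3 always receives row number 3.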

2. RANK()

The RANK() function assigns a rank to each row within a partition. Rows with equal values receive the same rank, and the ranks that would have followed are skipped.

Task: Rank sales within each region by amount (highest to lowest).

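SELECT SalesID, Region, Amount,
       RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS SalesRank
FROM Sales;

The alias is SalesRank rather than Rank because RANK is a reserved word in some databases (MySQL 8.0, for example).

Result:

SalesID  Region  Amount  SalesRank
-------  ------  ------  ---------
1        North   1200    1
2        North   800     2
3        North   800     2
4        North   500     4
5        South   1300    1
6        South   700     2
7        West    900     1
8        East    1300    1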

Key points:

  • In the North region, both rows with Amount = 800 share rank 2.
  • Rank 3 is skipped, so the next row (Amount = 500) gets rank 4.

3. DENSE_RANK()

The DENSE_RANK() function assigns ranks like RANK(), but does not skip ranks after ties.

Task: Assign dense ranks to sales within each region by amount (highest to lowest).

SELECT SalesID, Region, Amount,
       DENSE_RANK() OVER (PARTITION BY Region ORDER BY Amount DESC) AS DenseRank
FROM Sales;

Result:

SalesID  Region  Amount  DenseRank
-------  ------  ------  ---------
1        North   1200    1
2        North   800     2
3        North   800     2
4        North   500     3
5        South   1300    1
6        South   700     2
7        West    900     1
8        East    1300    1

Key points:

  • In the North region, both rows with Amount = 800 share rank 2.
  • The next row (Amount = 500) gets rank 3, with no gap in the sequence.
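The three numbering functions are easiest to compare when they run over the same rows. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or newer) and loads only the North region from the sample data, so PARTITION BY is left out. The SalesID tiebreaker on ROW_NUMBER() and the column aliases are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SalesID INTEGER, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, 1200), (2, 800), (3, 800), (4, 500)],  # North region only
)

rows = conn.execute(
    """
    SELECT Amount,
           ROW_NUMBER() OVER (ORDER BY Amount DESC, SalesID) AS row_num,
           RANK()       OVER (ORDER BY Amount DESC)          AS rank_val,
           DENSE_RANK() OVER (ORDER BY Amount DESC)          AS dense_val
    FROM Sales
    ORDER BY row_num
    """
).fetchall()

print("Amount  row_num  rank_val  dense_val")
for amount, row_num, rank_val, dense_val in rows:
    print(f"{amount:<7} {row_num:<8} {rank_val:<9} {dense_val}")
```

For the last row (Amount = 500) the three functions give 4, 4, and 3 respectively: ROW_NUMBER() never repeats a value, RANK() leaves a gap after the tie, and DENSE_RANK() does not.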

4. NTILE()

The NTILE(n) function divides the ordered rows into n groups of approximately equal size and returns the group number for each row.

Task: Divide all sales into 4 groups based on amount in descending order.

SELECT SalesID, Amount,
       NTILE(4) OVER (ORDER BY Amount DESC) AS Quartile
FROM Sales;

Result:

SalesID  Amount  Quartile
-------  ------  --------
5        1300    1
8        1300    1
1        1200    2
7        900     2
2        800     3
3        800     3
6        700     4
4        500     4
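"Approximately equal" matters when the row count is not a multiple of the number of groups. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or newer) and counts how many of the eight sample sales land in each group for NTILE(4) and NTILE(3). The helper bucket_sizes is a hypothetical name introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SalesID INTEGER, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, 1200), (2, 800), (3, 800), (4, 500),
     (5, 1300), (6, 700), (7, 900), (8, 1300)],
)

def bucket_sizes(n):
    """Return (bucket, row_count) pairs for NTILE(n) over Amount descending."""
    # n is coerced to int and written into the SQL text; it is not user input here.
    query = f"""
        SELECT bucket, COUNT(*)
        FROM (SELECT NTILE({int(n)}) OVER (ORDER BY Amount DESC) AS bucket
              FROM Sales)
        GROUP BY bucket
        ORDER BY bucket
    """
    return conn.execute(query).fetchall()

print(bucket_sizes(4))  # eight rows split evenly into four groups
print(bucket_sizes(3))  # eight rows cannot split evenly into three groups
```

With eight rows, NTILE(4) produces four groups of two, while NTILE(3) produces groups of three, three, and two: the extra rows go to the earliest groups.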

5. LEAD()

LEAD() retrieves a value from the next row within the same partition, or NULL when there is no next row.

Task: Compare each sale amount to the next sale amount, sorted by date of sale.

SELECT SalesID, Amount, 
       LEAD(Amount) OVER (ORDER BY SaleDate) AS NextAmount
FROM Sales;

Result:

SalesID  Amount  NextAmount
-------  ------  ----------
1        1200    800
2        800     800
3        800     500
4        500     1300
5        1300    700
6        700     900
7        900     1300
8        1300    NULL
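Since the task is to compare each sale with the next one, the difference can be computed directly in the query. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or newer) and ISO date strings, which sort chronologically as text. The ChangeToNext alias is a name chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SalesID INTEGER, Amount INTEGER, SaleDate TEXT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, 1200, "2023-01-05"), (2, 800, "2023-02-15"),
        (3, 800, "2023-03-10"), (4, 500, "2023-04-01"),
        (5, 1300, "2023-05-05"), (6, 700, "2023-06-20"),
        (7, 900, "2023-07-15"), (8, 1300, "2023-08-10"),
    ],
)

# Next amount minus current amount; NULL (None in Python) for the last sale,
# because arithmetic on NULL yields NULL.
rows = conn.execute(
    """
    SELECT SalesID,
           Amount,
           LEAD(Amount) OVER (ORDER BY SaleDate) - Amount AS ChangeToNext
    FROM Sales
    ORDER BY SaleDate
    """
).fetchall()

for row in rows:
    print(row)
```

The final sale has no successor, so its ChangeToNext is NULL rather than a number; wrap the expression in COALESCE if a default value is preferred.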

6. LAG()

LAG() retrieves a value from the previous row within the same partition, or NULL when there is no previous row.

Task: Compare each sale amount with the previous sale amount, sorted by date of sale.

SELECT SalesID, Amount, 
       LAG(Amount) OVER (ORDER BY SaleDate) AS PrevAmount
FROM Sales;

Result:

SalesID  Amount  PrevAmount
-------  ------  ----------
1        1200    NULL
2        800     1200
3        800     800
4        500     800
5        1300    500
6        700     1300
7        900     700
8        1300    900
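The query above has no PARTITION BY, so the whole table is treated as one partition. Adding PARTITION BY Region makes LAG() restart at the first sale of each region. The sketch below shows that variation, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer) and ISO date strings. The PrevInRegion alias is a name chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (SalesID INTEGER, Region TEXT, Amount INTEGER, SaleDate TEXT)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        (1, "North", 1200, "2023-01-05"), (2, "North", 800, "2023-02-15"),
        (3, "North", 800, "2023-03-10"), (4, "North", 500, "2023-04-01"),
        (5, "South", 1300, "2023-05-05"), (6, "South", 700, "2023-06-20"),
        (7, "West", 900, "2023-07-15"), (8, "East", 1300, "2023-08-10"),
    ],
)

# The previous amount is looked up only among earlier sales in the same region.
rows = conn.execute(
    """
    SELECT SalesID, Region, Amount,
           LAG(Amount) OVER (PARTITION BY Region ORDER BY SaleDate) AS PrevInRegion
    FROM Sales
    ORDER BY SalesID
    """
).fetchall()

prev_by_id = {sales_id: prev for sales_id, _region, _amount, prev in rows}
print(prev_by_id)
```

Sale 5 now has no previous amount, because it is the first sale in the South region, whereas the unpartitioned query paired it with sale 4 from the North.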

Conclusion

SQL window functions such as ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LEAD(), and LAG() provide powerful ways to analyze data within partitions.

Key takeaways:

  • ROW_NUMBER() assigns a unique sequential number to each row within its partition.
  • RANK() and DENSE_RANK() differ in how they handle ties (skipping vs. not skipping ranks).
  • NTILE() is useful for dividing rows into roughly equal groups such as quartiles.
  • LEAD() and LAG() allow comparisons with adjacent rows.

By mastering these functions, you can handle complex analytics and ranking tasks effectively!


Thank you for taking the time to explore data insights with me. I appreciate your participation. If you found this information helpful, I invite you to follow or connect with me on LinkedIn. Happy exploring!👋

By BBC
