Finding Highest-Paid Employee by Department

mediumSQL

Lesson

SQL Joins and Window Functions for Complex Aggregations

When working with relational databases, you often need to combine data from multiple tables and perform complex aggregations. This requires mastering two key SQL concepts: joins and window functions.

Joins allow you to combine related data from different tables based on common columns. The most common types are INNER JOIN (returns only matching records), LEFT JOIN (returns all records from the left table), and RIGHT JOIN (returns all records from the right table). When joining multiple tables, you typically start with your main entity and join related tables one by one.

Window functions are powerful tools that perform calculations across related rows without collapsing the result set like GROUP BY does. They use the OVER clause to define a "window" of rows for the calculation. The PARTITION BY clause divides the result set into groups, while ORDER BY determines the sequence within each partition.

Common window functions include:

  • ROW_NUMBER(): Assigns unique numbers to rows within each partition
  • RANK(): Assigns ranks with gaps for ties
  • DENSE_RANK(): Assigns ranks without gaps for ties

For finding "top N" records per group, you typically use ROW_NUMBER() or RANK() with PARTITION BY to identify the desired records, then filter the results. This pattern is extremely common in business analytics where you need to find the best performer in each category, most recent records per customer, or highest values per group.

Common Table Expressions (CTEs) using WITH clauses make complex queries more readable by breaking them into logical steps. This is especially helpful when you need to apply multiple transformations or aggregations.

Example
1-- Find the top 2 products by sales in each category 2WITH ranked_products AS ( 3 SELECT 4 category_name, 5 product_name, 6 total_sales, 7 ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY total_sales DESC) as sales_rank 8 FROM products p 9 JOIN categories c ON p.category_id = c.category_id 10) 11SELECT category_name, product_name, total_sales 12FROM ranked_products 13WHERE sales_rank <= 2 14ORDER BY category_name, sales_rank;
L2The CTE creates a temporary result set that we can reference in the main query
L6PARTITION BY creates separate ranking groups for each category
L6ORDER BY DESC ensures highest sales get rank 1
L11Filter the ranked results to get only top 2 per category

Key Takeaways

  • •Window functions with PARTITION BY let you rank or number rows within groups without collapsing the result set
  • •CTEs make complex queries more readable by breaking logic into named, reusable steps
  • •The pattern of ranking with window functions then filtering is essential for "top N per group" problems
Loading...