When working with relational databases, you often need to combine data from multiple tables and perform complex aggregations. This requires mastering two key SQL concepts: joins and window functions.
Joins allow you to combine related data from different tables based on common columns. The most common types are INNER JOIN (returns only matching records), LEFT JOIN (returns all records from the left table), and RIGHT JOIN (returns all records from the right table). When joining multiple tables, you typically start with your main entity and join related tables one by one.
Window functions are powerful tools that perform calculations across related rows without collapsing the result set like GROUP BY does. They use the OVER clause to define a "window" of rows for the calculation. The PARTITION BY clause divides the result set into groups, while ORDER BY determines the sequence within each partition.
Common window functions include:
ROW_NUMBER(): Assigns unique numbers to rows within each partitionRANK(): Assigns ranks with gaps for tiesDENSE_RANK(): Assigns ranks without gaps for tiesFor finding "top N" records per group, you typically use ROW_NUMBER() or RANK() with PARTITION BY to identify the desired records, then filter the results. This pattern is extremely common in business analytics where you need to find the best performer in each category, most recent records per customer, or highest values per group.
Common Table Expressions (CTEs) using WITH clauses make complex queries more readable by breaking them into logical steps. This is especially helpful when you need to apply multiple transformations or aggregations.
1-- Find the top 2 products by sales in each category
2WITH ranked_products AS (
3 SELECT
4 category_name,
5 product_name,
6 total_sales,
7 ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY total_sales DESC) as sales_rank
8 FROM products p
9 JOIN categories c ON p.category_id = c.category_id
10)
11SELECT category_name, product_name, total_sales
12FROM ranked_products
13WHERE sales_rank <= 2
14ORDER BY category_name, sales_rank;