Relational Division: Customers Who Ordered All Products in a Category

mediumSQL

Lesson

Understanding Relational Division in SQL

Relational division is one of the most challenging SQL operations to master, but it's incredibly powerful for answering "find all X that are related to every Y" questions. Unlike basic joins that combine data, relational division finds records that have complete relationships with all members of another set.

The Problem Pattern

Relational division problems follow this pattern: "Find all customers who bought every product in category X" or "Find all students who passed every required course." The key word is every - we're looking for complete coverage, not partial matches.

Why It's Tricky

SQL doesn't have a built-in division operator like it has for joins. We have to construct the logic ourselves using aggregation and counting. The challenge is ensuring we count correctly and handle edge cases like duplicate orders or empty categories.

The COUNT Approach

The most intuitive approach uses counting: if a customer ordered every product in a category, then COUNT(DISTINCT products they ordered in that category) should equal COUNT(DISTINCT all products in that category). This requires careful use of DISTINCT to handle duplicate orders.

Alternative Approaches

Advanced developers sometimes use double negation with NOT EXISTS: "Find customers where there does NOT EXIST a product in the category that the customer did NOT order." While logically elegant, this can be harder to read and optimize.

Relational division appears frequently in business applications: customer loyalty analysis, compliance checking (ensuring all requirements are met), and quality assurance (verifying all test cases pass).

Example
1-- Example: Find students who completed ALL required courses 2SELECT 3 s.student_id, 4 s.name 5FROM 6 students s 7 JOIN enrollments e ON s.student_id = e.student_id 8 JOIN courses c ON e.course_id = c.course_id 9WHERE 10 c.required = 1 11GROUP BY 12 s.student_id, s.name 13HAVING 14 COUNT(DISTINCT c.course_id) = ( 15 SELECT COUNT(*) 16 FROM courses 17 WHERE required = 1 18 );
L8Filter to only required courses before grouping
L12COUNT DISTINCT handles duplicate enrollments correctly
L13Subquery counts total required courses for comparison

Key Takeaways

  • •Relational division finds records with complete relationships to all members of another set
  • •Use COUNT(DISTINCT) with subqueries to compare actual coverage against required coverage
  • •Always handle duplicates and edge cases like empty categories in your division logic
Loading...