FroquizFroquiz
HomeQuizzesSenior ChallengeGet CertifiedBlogAbout
Sign InStart Quiz
Sign InStart Quiz
Froquiz

The most comprehensive quiz platform for software engineers. Test yourself with 10000+ questions and advance your career.

LinkedIn

Platform

  • Start Quizzes
  • Topics
  • Blog
  • My Profile
  • Sign In

About

  • About Us
  • Contact

Legal

  • Privacy Policy
  • Terms of Service

© 2026 Froquiz. All rights reserved.Built with passion for technology
Blog & Articles

SQL Joins and Aggregations Deep Dive: INNER, LEFT, CROSS, SELF, GROUP BY and HAVING

Go beyond basic SQL joins. Covers INNER, LEFT, RIGHT, FULL, CROSS, and SELF joins with practical examples, GROUP BY with ROLLUP, HAVING vs WHERE, and common aggregation patterns.

Yusuf SeyitoğluMarch 18, 20263 views9 min read

SQL Joins and Aggregations Deep Dive: INNER, LEFT, CROSS, SELF, GROUP BY and HAVING

Joins and aggregations are the core of relational data querying. Most developers know the basics, but interviews and real analytical queries require a deeper understanding: when FULL OUTER JOIN is the right tool, how SELF JOIN works, what ROLLUP adds, and when to use HAVING vs WHERE.

INNER JOIN

Returns only rows that have matching values in both tables:

sql
-- Orders with their customer names SELECT o.id AS order_id, c.name AS customer_name, o.total, o.created_at FROM orders o INNER JOIN customers c ON c.id = o.customer_id WHERE o.total > 100 ORDER BY o.created_at DESC; -- Customers excluded: those with no orders -- Orders excluded: those with no matching customer (orphaned rows)

LEFT JOIN (LEFT OUTER JOIN)

Returns all rows from the left table, and matching rows from the right. Non-matching right rows become NULL:

sql
-- All customers, whether or not they have orders SELECT c.name, COUNT(o.id) AS order_count, SUM(o.total) AS total_spent, MAX(o.created_at) AS last_order_date FROM customers c LEFT JOIN orders o ON o.customer_id = c.id GROUP BY c.id, c.name ORDER BY total_spent DESC NULLS LAST; -- Find customers who have NEVER placed an order SELECT c.name, c.email FROM customers c LEFT JOIN orders o ON o.customer_id = c.id WHERE o.id IS NULL; -- NULL on the right side = no match = no orders

RIGHT JOIN

Mirror of LEFT JOIN — returns all rows from the right table, matching rows from the left:

sql
-- All orders, even if the customer was deleted SELECT o.id, o.total, c.name FROM customers c RIGHT JOIN orders o ON c.id = o.customer_id; -- In practice, RIGHT JOIN is rarely used -- you can always rewrite as LEFT JOIN -- by swapping the table order. LEFT JOIN is more readable.

FULL OUTER JOIN

Returns all rows from both tables. Non-matching rows on either side get NULLs:

sql
-- Find all mismatches: orders without customers AND customers without orders SELECT c.id AS customer_id, c.name AS customer_name, o.id AS order_id, o.total FROM customers c FULL OUTER JOIN orders o ON c.id = o.customer_id WHERE c.id IS NULL OR o.id IS NULL; -- c.id IS NULL: orders with no matching customer (orphaned) -- o.id IS NULL: customers with no orders -- Reconcile two lists: find items in list A, list B, or both SELECT a.product_id AS in_warehouse_a, b.product_id AS in_warehouse_b, COALESCE(a.quantity, 0) + COALESCE(b.quantity, 0) AS total_quantity FROM warehouse_a a FULL OUTER JOIN warehouse_b b ON a.product_id = b.product_id;

CROSS JOIN

Returns the Cartesian product — every row in the left table combined with every row in the right:

sql
-- Generate all possible size/color combinations SELECT s.name AS size, c.name AS color, s.name || '-' || c.name AS sku_suffix FROM sizes s CROSS JOIN colors c; -- 5 sizes × 8 colors = 40 rows -- Generate a date range (using generate_series in PostgreSQL) SELECT d::date AS report_date, EXTRACT(DOW FROM d) AS day_of_week FROM generate_series( '2025-01-01'::date, '2025-12-31'::date, '1 day'::interval ) d; -- Fill gaps in time-series data using CROSS JOIN WITH dates AS ( SELECT generate_series( date_trunc('month', NOW()) - interval '11 months', date_trunc('month', NOW()), interval '1 month' )::date AS month ), users AS ( SELECT DISTINCT user_id FROM events ) SELECT u.user_id, d.month, COUNT(e.id) AS event_count FROM users u CROSS JOIN dates d LEFT JOIN events e ON e.user_id = u.user_id AND date_trunc('month', e.created_at) = d.month GROUP BY u.user_id, d.month ORDER BY u.user_id, d.month;

SELF JOIN

A table joined to itself. Used for hierarchical data and comparing rows within the same table:

sql
-- Employee-manager hierarchy SELECT e.name AS employee, m.name AS manager, e.department FROM employees e LEFT JOIN employees m ON m.id = e.manager_id; -- e is the employee, m is their manager (same table, different alias) -- Find employees earning more than their manager SELECT e.name AS employee, e.salary AS employee_salary, m.name AS manager, m.salary AS manager_salary FROM employees e JOIN employees m ON m.id = e.manager_id WHERE e.salary > m.salary; -- Find duplicate email addresses SELECT a.email, a.id AS id_1, b.id AS id_2 FROM users a JOIN users b ON a.email = b.email AND a.id < b.id; -- a.id < b.id prevents showing (1,2) and (2,1) -- each pair once

GROUP BY Deep Dive

sql
-- Basic aggregation SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary, MIN(salary) AS min_salary, MAX(salary) AS max_salary, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary, SUM(salary) AS total_payroll FROM employees GROUP BY department ORDER BY total_payroll DESC; -- GROUP BY multiple columns SELECT department, EXTRACT(YEAR FROM hire_date) AS hire_year, COUNT(*) AS hired FROM employees GROUP BY department, hire_year ORDER BY department, hire_year; -- Filter groups with HAVING (applied AFTER aggregation) SELECT customer_id, COUNT(*) AS order_count, SUM(total) AS total_spent FROM orders WHERE created_at >= NOW() - INTERVAL '1 year' -- filter rows BEFORE grouping GROUP BY customer_id HAVING COUNT(*) >= 5 -- filter groups AFTER aggregation AND SUM(total) > 500 ORDER BY total_spent DESC;

WHERE vs HAVING

sql
-- WHERE filters individual rows before grouping -- HAVING filters groups after aggregation -- This is WRONG (cannot use aggregate in WHERE): SELECT department, COUNT(*) FROM employees WHERE COUNT(*) > 10 -- ERROR: aggregate functions not allowed in WHERE GROUP BY department; -- Correct: use HAVING SELECT department, COUNT(*) AS headcount FROM employees GROUP BY department HAVING COUNT(*) > 10; -- You CAN use both: SELECT department, COUNT(*) AS headcount FROM employees WHERE salary > 50000 -- filter rows first (only high earners) GROUP BY department HAVING COUNT(*) > 5; -- then filter groups (departments with 5+)

ROLLUP and CUBE

Generate subtotals and grand totals automatically:

sql
-- ROLLUP: subtotals for each level of hierarchy SELECT COALESCE(department, 'All Departments') AS department, COALESCE(CAST(EXTRACT(YEAR FROM hire_date) AS TEXT), 'All Years') AS year, COUNT(*) AS headcount, SUM(salary) AS total_salary FROM employees GROUP BY ROLLUP(department, EXTRACT(YEAR FROM hire_date)) ORDER BY department, year; -- Output includes: -- (Engineering, 2023): subtotal for Engineering 2023 -- (Engineering, NULL): subtotal for all Engineering -- (NULL, NULL): grand total for all departments -- CUBE: all combinations of subtotals GROUP BY CUBE(department, region, role) -- Generates subtotals for every combination: all 8 combinations of 3 dimensions

Common Aggregation Patterns

Conditional aggregation (FILTER or CASE)

sql
-- Count by category in one query SELECT COUNT(*) FILTER (WHERE status = 'completed') AS completed, COUNT(*) FILTER (WHERE status = 'pending') AS pending, COUNT(*) FILTER (WHERE status = 'cancelled') AS cancelled, COUNT(*) AS total FROM orders; -- Equivalent with CASE WHEN (portable across databases) SELECT COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed, COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending, SUM(CASE WHEN status = 'completed' THEN total ELSE 0 END) AS completed_revenue FROM orders; -- Pivot: rows to columns SELECT product_id, SUM(CASE WHEN EXTRACT(QUARTER FROM sale_date) = 1 THEN amount ELSE 0 END) AS q1, SUM(CASE WHEN EXTRACT(QUARTER FROM sale_date) = 2 THEN amount ELSE 0 END) AS q2, SUM(CASE WHEN EXTRACT(QUARTER FROM sale_date) = 3 THEN amount ELSE 0 END) AS q3, SUM(CASE WHEN EXTRACT(QUARTER FROM sale_date) = 4 THEN amount ELSE 0 END) AS q4 FROM sales GROUP BY product_id;

Running totals

sql
SELECT order_date, daily_revenue, SUM(daily_revenue) OVER (ORDER BY order_date) AS cumulative_revenue FROM daily_sales;

Common Interview Questions

Q: What is the difference between WHERE and HAVING?

WHERE filters individual rows before they are grouped — it cannot reference aggregate functions. HAVING filters groups after GROUP BY aggregation — it can reference aggregate functions. Use WHERE to reduce the rows entering aggregation (more efficient), and HAVING to filter the resulting groups.

Q: What does a CROSS JOIN produce and when would you use it?

A CROSS JOIN produces the Cartesian product — every row in the first table paired with every row in the second table. N rows × M rows = N×M results. Use it to generate all combinations (product SKUs from sizes and colors), to fill time-series gaps by joining a date spine with entity lists, or to generate test data.

Q: How does a SELF JOIN work?

A SELF JOIN joins a table to itself by aliasing it with two different names. It is used for hierarchical data (employees and their managers), for finding relationships between rows in the same table (pairs of users in the same city), or for duplicate detection. The join condition typically compares different columns or adds a constraint like a.id < b.id to avoid duplicates.

Practice SQL on Froquiz

Joins and aggregations are core SQL interview topics at every level. Test your SQL knowledge on Froquiz — from basic queries to advanced window functions and CTEs.

Summary

  • INNER JOIN: only matching rows from both tables
  • LEFT JOIN: all left rows, matching right rows (NULLs for no match) — use WHERE right.id IS NULL to find unmatched
  • FULL OUTER JOIN: all rows from both tables — unmatched rows get NULLs on the other side
  • CROSS JOIN: Cartesian product — every row × every row — use for combinations and date spines
  • SELF JOIN: table joined to itself with two aliases — use for hierarchies and pair comparisons
  • WHERE filters rows before grouping; HAVING filters groups after aggregation
  • ROLLUP generates hierarchical subtotals; CUBE generates all combinations
  • COUNT(*) FILTER (WHERE condition) is the cleanest conditional aggregation syntax

About Author

Yusuf Seyitoğlu

Author →

Other Posts

  • Python Concurrency: Threading, Multiprocessing and the GIL ExplainedMar 18
  • Java Design Patterns: Singleton, Factory, Builder, Observer, Strategy and MoreMar 18
  • CSS Architecture: BEM, SMACSS, Utility-First and Modern ApproachesMar 18
All Blogs