The GROUP BY
clause is used to:
Group rows that have the same values in specified columns into summary rows — like totals, counts, averages, etc.
✅ It’s most commonly used with aggregate functions like:
SUM()
COUNT()
AVG()
MAX()
MIN()
Why Use GROUP BY
?
Without GROUP BY
, aggregate functions summarize the entire table.
With GROUP BY
, you can:
- Summarize data by categories.
- Get subtotals for each group.
- Analyze data in a more meaningful way.
Simple Example
Table: Sales
sale_id | customer_id | amount |
---|---|---|
1 | 101 | 100 |
2 | 101 | 150 |
3 | 102 | 200 |
4 | 103 | 250 |
Without GROUP BY
Total sales:
SELECT SUM(amount) FROM Sales;
➡️ 100 + 150 + 200 + 250 = 700
With GROUP BY
Total sales per customer:
SELECT customer_id, SUM(amount)
FROM Sales
GROUP BY customer_id;
Result:
customer_id | SUM(amount) |
---|---|
101 | 250 |
102 | 200 |
103 | 250 |
✅ Here:
- It groups rows by
customer_id
. - For each group, it calculates the sum of
amount
.
Key Points
Feature | Description |
---|---|
Groups data | Rows with the same value in the group-by columns are combined. |
Works with aggregates | Calculates SUM() , COUNT() , etc., per group. |
Reduces number of rows | Returns one row per group, not per original row. |
Columns rule | All selected columns must either be grouped or aggregated. |
What Happens Internally?
Step-by-step:
- Scan all the rows.
- Group rows with the same
customer_id
. - Apply the
SUM(amount)
function to each group. - Return one row per group.
⚡ Golden Rule
When you use GROUP BY
, every column in the SELECT
must be:
- In the
GROUP BY
list OR - Be an aggregate function.
❌ This is wrong:
SELECT customer_id, amount FROM Sales GROUP BY customer_id;
Because amount
is not aggregated.
✅ Correct:
SELECT customer_id, SUM(amount) FROM Sales GROUP BY customer_id;
Real-Life Analogy
Imagine you have a list of purchases:
GROUP BY
is like grouping all purchases by customer before totaling them.- It’s like summarizing all receipts per customer.
In Short
GROUP BY = Group similar rows together and apply aggregation on each group.