PostgreSQL - GROUP BY clause

The GROUP BY clause in PostgreSQL is used to group rows that have the same values in specified columns into summary rows, much like the way the GROUP BY clause works in other SQL databases. It's often used with aggregate functions like COUNT(), MAX(), MIN(), SUM(), AVG(), etc., to perform an operation on each group of rows.

Syntax:

SELECT column1, column2, aggregate_function(column3)
FROM table_name
WHERE condition
GROUP BY column1, column2
ORDER BY column1, column2;

Examples:

Grouping by a Single Column:
Assume we have a table sales with columns product and amount_sold.
To find out the total amount sold for each product:
```
SELECT product, SUM(amount_sold) AS total_sold
FROM sales
GROUP BY product;
```
Grouping by Multiple Columns:
If the sales table also has a date column, and you want to know the total amount sold for each product on each date:
```
SELECT date, product, SUM(amount_sold) AS total_sold
FROM sales
GROUP BY date, product
ORDER BY date, product;
```
Using GROUP BY with HAVING:
The HAVING clause is used to filter the results after grouping. For instance, to find products that have sold more than 100 units in total:
```
SELECT product, SUM(amount_sold) AS total_sold
FROM sales
GROUP BY product
HAVING SUM(amount_sold) > 100;
```
Note: The key difference between WHERE and HAVING is that WHERE filters rows before they are grouped, while HAVING filters after the grouping.
Grouping by Expressions:
You can also group by the result of an expression. For instance, if you want to group sales by the year:
```
SELECT EXTRACT(YEAR FROM date) AS sales_year, SUM(amount_sold) AS total_sold
FROM sales
GROUP BY EXTRACT(YEAR FROM date);
```

Tips:

Every column in the SELECT clause that isn't used in an aggregate function must be listed in the GROUP BY clause. Otherwise, PostgreSQL will raise an error.
Be cautious when using GROUP BY on columns that can have null values. All null values are considered equal, and they will be grouped together.
Using GROUP BY can impact performance, especially on large tables. Always ensure your columns used in the GROUP BY clause are indexed if they are frequently used in queries, and monitor the performance of your queries.

How to use GROUP BY in PostgreSQL:
- Description: Group rows based on specified columns.
- Code:
```
SELECT department, COUNT(*) FROM employees GROUP BY department;
```
PostgreSQL GROUP BY example:
- Description: Basic example of using GROUP BY.
- Code:
```
SELECT country, AVG(population) FROM cities GROUP BY country;
```
Aggregation functions with GROUP BY in PostgreSQL:
- Description: Utilize aggregation functions like SUM, AVG, MAX, MIN with GROUP BY.
- Code:
```
SELECT department, SUM(salary) FROM employees GROUP BY department;
```
Multiple columns in GROUP BY in PostgreSQL:
- Description: Group by multiple columns simultaneously.
- Code:
```
SELECT department, job_title, AVG(salary) FROM employees GROUP BY department, job_title;
```
HAVING clause with GROUP BY in PostgreSQL:
- Description: Apply conditions after grouping using HAVING.
- Code:
```
SELECT department, AVG(salary) FROM employees GROUP BY department HAVING AVG(salary) > 50000;
```
GROUP BY vs. DISTINCT in PostgreSQL:
- Description: Understand the difference between GROUP BY and DISTINCT.
- Code:
```
SELECT DISTINCT department FROM employees;
```

ORDER BY with GROUP BY in PostgreSQL:

Description: Sort the result set of a GROUP BY query.

Code:

SELECT department, AVG(salary) FROM employees GROUP BY department ORDER BY AVG(salary) DESC;

Nested GROUP BY in PostgreSQL:
- Description: Perform grouping within grouped results.
- Code:
```
SELECT country, city, AVG(population) FROM cities GROUP BY country, city;
```

GROUP BY and JOIN in PostgreSQL:

Description: Combine GROUP BY with JOIN for more complex queries.

Code:

SELECT departments.name, AVG(employees.salary)
FROM departments
JOIN employees ON departments.id = employees.department_id
GROUP BY departments.name;

Using GROUP BY for date and time aggregations in PostgreSQL:

Description: Aggregate data based on date and time.

Code:

SELECT date_trunc('month', order_date) AS month, SUM(total_amount)
FROM orders
GROUP BY date_trunc('month', order_date);

GROUP BY with ROLLUP and CUBE in PostgreSQL:
- Description: Explore subtotals and grand totals with ROLLUP and CUBE.
- Code:
```
SELECT department, job_title, AVG(salary)
FROM employees
GROUP BY ROLLUP (department, job_title);
```

GROUP BY and window functions in PostgreSQL:

Description: Combine GROUP BY with window functions for advanced analytics.

Code:

SELECT department, job_title, AVG(salary),
       RANK() OVER (PARTITION BY department ORDER BY AVG(salary) DESC) AS rank_within_department
FROM employees
GROUP BY department, job_title;

GROUP BY with CASE statement in PostgreSQL:

Description: Use CASE within GROUP BY for conditional aggregations.

Code:

SELECT CASE WHEN age < 30 THEN 'Young'
            WHEN age BETWEEN 30 AND 50 THEN 'Middle-aged'
            ELSE 'Senior' END AS age_group,
       AVG(salary)
FROM employees
GROUP BY age_group;

Conditional aggregations with GROUP BY in PostgreSQL:

Description: Conditionally aggregate data based on specific criteria.

Code:

SELECT department,
       AVG(CASE WHEN job_title = 'Manager' THEN salary ELSE 0 END) AS avg_manager_salary,
       AVG(CASE WHEN job_title = 'Developer' THEN salary ELSE 0 END) AS avg_dev_salary
FROM employees
GROUP BY department;

GROUP BY and subqueries in PostgreSQL:

Description: Use subqueries with GROUP BY for more complex analyses.

Code:

SELECT department, AVG(subquery.total_sales)
FROM employees
JOIN (SELECT employee_id, SUM(sale_amount) AS total_sales
      FROM sales
      GROUP BY employee_id) AS subquery
ON employees.id = subquery.employee_id
GROUP BY department;

Combining GROUP BY with other clauses in PostgreSQL:
- Description: Mix GROUP BY with WHERE, JOIN, and other clauses for comprehensive queries.
- Code:
```
SELECT department, AVG(salary)
FROM employees
WHERE hire_date > '2022-01-01'
GROUP BY department;
```