SQL Tutorial
SQL Clauses / Operators
SQL-Injection
SQL Functions
SQL Queries
PL/SQL
MySQL
SQL Server
Misc
A correlated subquery is a type of subquery that uses values from the outer query. For every row processed by the outer query, the subquery is executed once, often leading to more resource-intensive operations compared to a non-correlated subquery. The correlated subquery is dependent on the outer query, hence the term "correlated."
To illustrate, consider two tables: employees
and orders
. Let's say you want to find all employees who have at least one order:
SELECT e.employee_name FROM employees e WHERE EXISTS ( SELECT 1 FROM orders o WHERE o.employee_id = e.employee_id );
In this example, for each employee in the employees
table, the subquery checks the orders
table to see if an order exists for that employee. The subquery is correlated because it references the e.employee_id
from the outer query.
Execution Order: The outer query is executed first, but only up to the point where the correlated subquery needs to be executed. Then, for each row in the outer query, the correlated subquery is executed.
Performance: Since the correlated subquery might be executed multiple times (once for each row selected by the outer query), it can be slower than a non-correlated subquery. Proper indexing and optimization are crucial.
Reference to Outer Query: A defining characteristic of correlated subqueries is that they reference columns from the outer query. This is what differentiates them from standard subqueries.
SELECT e.employee_name, e.salary FROM employees e WHERE e.salary > ( SELECT AVG(e2.salary) FROM employees e2 WHERE e2.department_id = e.department_id );
SELECT e.employee_name, o.order_amount FROM employees e JOIN orders o ON e.employee_id = o.employee_id WHERE o.order_amount = ( SELECT MAX(o2.order_amount) FROM orders o2 WHERE o2.employee_id = e.employee_id );
Optimization: Correlated subqueries can often be rewritten as joins, which can be more performant in some cases. Analyze the query execution plan to determine if the correlated subquery is the most efficient approach.
Avoid Overuse: Because correlated subqueries can be resource-intensive, avoid using them excessively or without justification. Always look for alternative approaches and test for performance.
RDBMS Differences: The support and optimization of correlated subqueries can vary among RDBMSs. Always refer to the documentation of the specific system you're using and be aware of its behavior and limitations regarding correlated subqueries.
How to Use Correlated Subqueries: Correlated subqueries refer to an outer query from within the inner subquery.
SELECT column1, (SELECT MAX(column2) FROM your_table t2 WHERE t1.column1 = t2.column1) AS max_value FROM your_table t1;
SQL Correlated Subquery vs. Regular Subquery:
SELECT column1, (SELECT MAX(column2) FROM your_table) AS max_value FROM your_table;
SELECT column1, (SELECT MAX(column2) FROM your_table t2 WHERE t1.column1 = t2.column1) AS max_value FROM your_table t1;
Correlated Subqueries in WHERE Clause:
SELECT column1 FROM your_table t1 WHERE column2 > (SELECT AVG(column2) FROM your_table t2 WHERE t1.column1 = t2.column1);
Examples of Correlated Subqueries in SELECT Statement:
SELECT column1, (SELECT COUNT(*) FROM your_table t2 WHERE t1.column1 = t2.column1) AS count_per_group FROM your_table t1;
SQL Correlated Subquery in FROM Clause:
SELECT t1.column1, t2.avg_value FROM your_table t1 JOIN (SELECT column1, AVG(column2) AS avg_value FROM your_table GROUP BY column1) t2 ON t1.column1 = t2.column1;
Correlated Subqueries with EXISTS Keyword:
SELECT column1 FROM your_table t1 WHERE EXISTS (SELECT 1 FROM another_table t2 WHERE t1.column1 = t2.column1);
Nested Correlated Subqueries in SQL:
SELECT column1, (SELECT MAX(column2) FROM (SELECT * FROM your_table) t2 WHERE t1.column1 = t2.column1) AS max_value FROM your_table t1;
Handling NULL Values in Correlated Subqueries:
SELECT column1, (SELECT COALESCE(MAX(column2), 0) FROM your_table t2 WHERE t1.column1 = t2.column1) AS max_value FROM your_table t1;
Correlated Subqueries with Aggregates (SUM, AVG, COUNT):
SELECT column1, (SELECT AVG(column2) FROM your_table t2 WHERE t1.column1 = t2.column1) AS avg_per_group FROM your_table t1;