SQL | Distinct Clause

The DISTINCT clause in SQL is used to remove duplicates from the result set of a query. It ensures that the output consists only of unique values for the specified columns.

Syntax:

SELECT DISTINCT column1, column2, ...
FROM table_name;

Examples:

Selecting Unique Values from a Single Column:
If you have a users table and you want to know all the unique cities where your users are from, you would use:
```
SELECT DISTINCT city FROM users;
```
Selecting Unique Combinations from Multiple Columns:
If you wanted to know all the unique combinations of city and state from the same users table:
```
SELECT DISTINCT city, state FROM users;
```
This would list all unique city-state pairs.
Using DISTINCT with COUNT:
If you wanted to know how many unique cities are represented in the users table:
```
SELECT COUNT(DISTINCT city) FROM users;
```

Things to Note:

Performance: Using DISTINCT can be resource-intensive on large datasets because the database has to compare each row against all others to eliminate duplicates. Always consider if there's a more efficient way to retrieve the data you need, especially when working with large tables.
Order of Columns: In scenarios where you are selecting distinct values from multiple columns, the order of columns can matter. For example, SELECT DISTINCT city, state might produce different results than SELECT DISTINCT state, city in terms of the order of the output, but the combinations will be the same.
Null Values: If there are null values in the columns, they are considered by the DISTINCT clause. Two rows with null values in the same column will be considered duplicate rows.
Alternative to DISTINCT: Sometimes, you can achieve the same effect as DISTINCT using a GROUP BY clause, especially when you need aggregate functions for each unique value or combination.

In essence, the DISTINCT clause is a powerful tool for data retrieval, but like all SQL commands, it should be used judiciously, keeping performance considerations in mind.

How to Use DISTINCT in SQL:
- Description: The DISTINCT clause is used to retrieve unique values from a specific column in a result set.
- Example:
```
SELECT DISTINCT column_name
FROM table_name;
```
Selecting Unique Values with DISTINCT in SQL:
- Description: Retrieves distinct (unique) values from a specified column.
- Example:
```
SELECT DISTINCT department
FROM employees;
```
DISTINCT vs. GROUP BY in SQL:
- Description: DISTINCT is used to retrieve unique values from one or more columns, while GROUP BY is used for aggregating data.
- Example (DISTINCT):
```
SELECT DISTINCT department
FROM employees;
```
- Example (GROUP BY):
```
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
```
Using DISTINCT with Multiple Columns in SQL:
- Description: Retrieves unique combinations of values from multiple columns.
- Example:
```
SELECT DISTINCT department, location
FROM employees;
```
DISTINCT and ORDER BY in SQL:
- Description: Retrieves unique values and orders them based on a specified column.
- Example:
```
SELECT DISTINCT department
FROM employees
ORDER BY department DESC;
```

DISTINCT with JOIN in SQL:

Description: Retrieves unique values after performing a JOIN operation.

Example:

SELECT DISTINCT customers.customer_id, customers.customer_name
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id;

Combining WHERE and DISTINCT in SQL:
- Description: Retrieves unique values based on specified conditions.
- Example:
```
SELECT DISTINCT department
FROM employees
WHERE salary > 50000;
```
DISTINCT and Aggregate Functions in SQL:
- Description: Retrieves unique values along with aggregate functions.
- Example:
```
SELECT DISTINCT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
```
Handling NULL Values with DISTINCT in SQL:
- Description: NULL values are considered equal and collapse into a single unique value.
- Example:
```
SELECT DISTINCT manager_id
FROM employees;
```
Limitations of DISTINCT Clause in SQL:
- Description: DISTINCT does not work well with large datasets, and it might affect performance.
- Example:
```
SELECT DISTINCT column_name
FROM large_table;
```

DISTINCT in Subqueries in SQL:

Description: Retrieves unique values from a subquery.

Example:

SELECT employee_id
FROM employees
WHERE department_id IN (SELECT DISTINCT department_id FROM departments WHERE location = 'New York');

Alternatives to DISTINCT in SQL:

Alternative 1 (GROUP BY):

SELECT column_name
FROM table_name
GROUP BY column_name;

Alternative 2 (Window Functions):

SELECT DISTINCT column_name, ROW_NUMBER() OVER (PARTITION BY column_name ORDER BY some_column) AS row_num
FROM table_name
WHERE row_num = 1;