Sunday, August 11, 2024

Correlated subqueries - interview questions

 Correlated subqueries are a common topic in SQL interviews because they demonstrate a candidate's ability to understand complex query structures. Here are some interview questions related to correlated subqueries:

1. Basic Understanding:

  • Question: What is a correlated subquery, and how does it differ from a regular subquery?
  • Answer: A correlated subquery is a subquery that references columns from the outer query, meaning the subquery is executed once for each row processed by the outer query. A regular (non-correlated) subquery is independent of the outer query and is executed only once.

2. Execution Process:

  • Question: How does the SQL engine execute a correlated subquery?
  • Answer: In a correlated subquery, the SQL engine first processes a row from the outer query, then executes the subquery using that row's data. This process is repeated for each row in the outer query, meaning the subquery can be executed multiple times.

3. Use Cases:

  • Question: When would you use a correlated subquery instead of a JOIN or a regular subquery?
  • Answer: Correlated subqueries are useful when you need to compare each row in the outer query with a specific subset of data from another table, especially when the comparison involves aggregations or complex conditions that are difficult to express with JOINs.

4. Example Creation:

  • Question: Write an SQL query using a correlated subquery to find employees whose salaries are above the average salary in their department.
  • Answer:
SELECT employee_id, salary
FROM employees e
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department_id = e.department_id
);

This query compares each employee's salary to the average salary of their respective department.

5. Performance Considerations:

  • Question: What are the performance implications of using correlated subqueries, and how can they be optimized?
  • Answer: Correlated subqueries can be slower than other methods because they are executed once for each row in the outer query. Optimization strategies include rewriting the query using JOINs or common table expressions (CTEs), indexing the columns involved, or ensuring that the subquery returns a limited number of rows.

6. Complex Conditions:

  • Question: How would you write a correlated subquery to find all customers who have placed more orders than the average number of orders for all customers?
  • Answer:
SELECT customer_id
FROM orders o1
WHERE (
    SELECT COUNT(*)
    FROM orders o2
    WHERE o2.customer_id = o1.customer_id
) > (
    SELECT AVG(order_count)
    FROM (
        SELECT COUNT(*) AS order_count
        FROM orders
        GROUP BY customer_id
    ) avg_orders
);

This query finds customers whose order count exceeds the average order count.

7. Comparing Multiple Rows:

  • Question: Can a correlated subquery return multiple rows? How would you handle this situation?
  • Answer: Yes, a correlated subquery can return multiple rows. If the outer query expects a single value, you must use IN, ANY, or ALL to handle multiple rows, or aggregate the results of the subquery using functions like MAX, MIN, or SUM.

8. Correlated Subqueries in UPDATE Statements:

  • Question: How can you use a correlated subquery in an UPDATE statement?
  • Answer:
UPDATE employees e1
SET salary = salary * 1.10
WHERE salary < (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.department_id = e1.department_id
);

This query increases the salary of employees whose salary is below the average salary of their department.

9. Avoiding Common Pitfalls:

  • Question: What are some common pitfalls when using correlated subqueries, and how can you avoid them?
  • Answer: Common pitfalls include performance degradation due to multiple executions of the subquery and incorrect results due to misunderstanding the relationship between the outer and inner queries. To avoid these, ensure the subquery logic is correct, consider alternative query structures, and use indexing to improve performance.

10. Combining with Other SQL Features:

  • Question: How would you use a correlated subquery with the EXISTS predicate? Provide an example.
  • Answer:
SELECT e1.employee_id, e1.first_name
FROM employees e1
WHERE EXISTS (
    SELECT 1
    FROM employees e2
    WHERE e2.manager_id = e1.employee_id
);

This query finds all employees who are managers by checking if any other employees report to them.

No comments:

Post a Comment