fbpx
Wikipedia

Correlated subquery

In a SQL database query, a correlated subquery (also known as a synchronized subquery) is a subquery (a query nested inside another query) that uses values from the outer query. Because the subquery may be evaluated once for each row processed by the outer query, it can be slow.

Examples edit

Correlated subqueries in the WHERE clause edit

Here is an example for a typical correlated subquery. In this example, the objective is to find all employees whose salary is above average for their department.

 SELECT employee_number, name  FROM employees emp  WHERE salary > (  SELECT AVG(salary)  FROM employees  WHERE department = emp.department); 

In the above query the outer query is

 SELECT employee_number, name  FROM employees emp  WHERE salary > ... 

and the inner query (the correlated subquery) is

 SELECT AVG(salary)  FROM employees  WHERE department = emp.department 

In the above nested query the inner query has to be re-executed for each employee. (A sufficiently smart implementation may cache the inner query's result on a department-by-department basis, but even in the best case the inner query must be executed once per department.)

Correlated subqueries in the SELECT clause edit

Correlated subqueries may appear elsewhere besides the WHERE clause; for example, this query uses a correlated subquery in the SELECT clause to print the entire list of employees alongside the average salary for each employee's department. Again, because the subquery is correlated with a column of the outer query, it must be re-executed for each row of the result.[citation needed]

 SELECT employee_number,  name,  (SELECT AVG(salary)   FROM employees  WHERE department = emp.department) AS department_average  FROM employees emp 

Correlated subqueries in the FROM clause edit

It is generally meaningless to have a correlated subquery in the FROM clause because the table in the FROM clause is needed to evaluate the outer query, but the correlated subquery in the FROM clause can't be evaluated before the outer query is evaluated, causing a chicken-and-egg problem. Specifically, MariaDB lists this as a limitation in its documentation.[1]

However, in some database systems, it is allowed to use correlated subqueries while joining in the FROM clause, referencing the tables listed before the join using a specified keyword, producing a number of rows in the correlated subquery and joining it to the table on the left. For example, in PostgreSQL, adding the keyword LATERAL before the right-hand subquery,[2] or in Microsoft SQL Server, using the keyword CROSS APPLY or OUTER APPLY instead of JOIN[3] achieves the effect.

Computation of correlated subqueries edit

A commonly used computational method for a correlated subquery is to rewrite it into an equivalent flat query[4] (a process known as flattening[5][6][7][8]). The algorithm development in this direction has an advantage of low complexity. Because this is a customized approach, existing database systems cannot flatten arbitrary correlated subqueries by following certain general rules. In addition, this approach requires high engineering efforts to implement flattening algorithms into a database engine. A general computational approach is to directly execute the nested loop by iterating all tuples of the correlated columns from the outer query block and executing the subquery as many times as the number of outer-loop tuples.[9] This simple approach has an advantage of general-purpose because it is not affected by the type of correlated operators or subquery structures. However, it has a high computational complexity. A GPU acceleration approach is used to significantly improve the performance of the nested method of high algorithmic complexity by exploiting massive parallelism and device memory locality on GPU,[10] which accomplishes the goal for both general-purpose software design and implementation and high performance in subquery processing.

References edit

  1. ^ "Subquery Limitations". MariaDB Knowledgebase. Retrieved 2020-12-24.
  2. ^ "Table Expressions - LATERAL Subqueries". postgresql.org. Retrieved 2023-01-21.
  3. ^ "FROM clause plus JOIN, APPLY, PIVOT (Transact-SQL)". docs.microsoft.com. 2019-06-01. Retrieved 2020-12-24.
  4. ^ Kim, Won (September 1982). "On Optimizing an SQL-like Nested Query" (pdf). ACM Transactions on Database Systems. 7 (3): 443–469. doi:10.1145/319732.319745. S2CID 4374300.
  5. ^ "The SQLite Query Optimizer Overview - 11. Subquery Flattening". SQLite. Retrieved 2023-01-21. When a subquery occurs in the FROM clause of a SELECT, the simplest behavior is to evaluate the subquery into a transient table, then run the outer SELECT against the transient table. Such a plan can be suboptimal since the transient table will not have any indexes and the outer query (which is likely a join) will be forced to do a full table scan on the transient table. To overcome this problem, SQLite attempts to flatten subqueries in the FROM clause of a SELECT. This involves inserting the FROM clause of the subquery into the FROM clause of the outer query and rewriting expressions in the outer query that refer to the result set of the subquery. ...
  6. ^ "Flattening FROM Clause Subqueries". Vertica. Retrieved 2023-01-21. FROM clause subqueries are always evaluated before their containing query. In some cases, the optimizer flattens FROM clause subqueries so the query can execute more efficiently.
  7. ^ "Flattening a subquery into a normal join". Apache Derby. Retrieved 2023-01-21. Statements that include such subqueries can be flattened into joins only if the subquery does not introduce any duplicates into the result set...
  8. ^ "Chapter 15: Abstract Query Plan Guide - Flattened subqueries". Sybase. Retrieved 2023-01-21. Some subqueries can be flattened into joins. ...
  9. ^ Selinger, P. Griffiths; Astrahan, M. M.; Chamberlin, D. D.; Lorie, R. A.; Price, T. G. (1979). Access Path Selection in a Relational Database Management System (pdf). Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data, Boston, Massachusetts. SIGMOD '79. New York, NY, USA: Association for Computing Machinery. pp. 23–34. doi:10.1145/582095.582099. ISBN 089791001X. 10.1145/582095.582099.
  10. ^ Floratos, Sofoklis; Xiao, Mengbai; Wang, Hao; Guo, Chengxin; Yuan, Yuan; Lee, Rubao; Zhang, Xiaodong (2021). NestGPU: Nested Query Processing on GPU. IEEE 37th International Conference on Data Engineering. pp. 1008–1019.

External links edit

  • Correlated subquery with examples

correlated, subquery, database, query, correlated, subquery, also, known, synchronized, subquery, subquery, query, nested, inside, another, query, that, uses, values, from, outer, query, because, subquery, evaluated, once, each, processed, outer, query, slow, . In a SQL database query a correlated subquery also known as a synchronized subquery is a subquery a query nested inside another query that uses values from the outer query Because the subquery may be evaluated once for each row processed by the outer query it can be slow Contents 1 Examples 1 1 Correlated subqueries in the WHERE clause 1 2 Correlated subqueries in the SELECT clause 1 3 Correlated subqueries in the FROM clause 2 Computation of correlated subqueries 3 References 4 External linksExamples editCorrelated subqueries in the WHERE clause edit Here is an example for a typical correlated subquery In this example the objective is to find all employees whose salary is above average for their department SELECT employee number name FROM employees emp WHERE salary gt SELECT AVG salary FROM employees WHERE department emp department In the above query the outer query is SELECT employee number name FROM employees emp WHERE salary gt and the inner query the correlated subquery is SELECT AVG salary FROM employees WHERE department emp department In the above nested query the inner query has to be re executed for each employee A sufficiently smart implementation may cache the inner query s result on a department by department basis but even in the best case the inner query must be executed once per department Correlated subqueries in the SELECT clause edit Correlated subqueries may appear elsewhere besides the WHERE clause for example this query uses a correlated subquery in the SELECT clause to print the entire list of employees alongside the average salary for each employee s department Again because the subquery is correlated with a column of the outer query it must be re executed for each row of the result citation needed SELECT employee number name SELECT AVG salary FROM employees WHERE department emp department AS department average FROM employees emp Correlated subqueries in the FROM clause edit It is generally meaningless to have a correlated subquery in the FROM clause because the table in the FROM clause is needed to evaluate the outer query but the correlated subquery in the FROM clause can t be evaluated before the outer query is evaluated causing a chicken and egg problem Specifically MariaDB lists this as a limitation in its documentation 1 However in some database systems it is allowed to use correlated subqueries while joining in the FROM clause referencing the tables listed before the join using a specified keyword producing a number of rows in the correlated subquery and joining it to the table on the left For example in PostgreSQL adding the keyword LATERAL before the right hand subquery 2 or in Microsoft SQL Server using the keyword CROSS APPLY or OUTER APPLY instead of JOIN 3 achieves the effect Computation of correlated subqueries editA commonly used computational method for a correlated subquery is to rewrite it into an equivalent flat query 4 a process known as flattening 5 6 7 8 The algorithm development in this direction has an advantage of low complexity Because this is a customized approach existing database systems cannot flatten arbitrary correlated subqueries by following certain general rules In addition this approach requires high engineering efforts to implement flattening algorithms into a database engine A general computational approach is to directly execute the nested loop by iterating all tuples of the correlated columns from the outer query block and executing the subquery as many times as the number of outer loop tuples 9 This simple approach has an advantage of general purpose because it is not affected by the type of correlated operators or subquery structures However it has a high computational complexity A GPU acceleration approach is used to significantly improve the performance of the nested method of high algorithmic complexity by exploiting massive parallelism and device memory locality on GPU 10 which accomplishes the goal for both general purpose software design and implementation and high performance in subquery processing References edit Subquery Limitations MariaDB Knowledgebase Retrieved 2020 12 24 Table Expressions LATERAL Subqueries postgresql org Retrieved 2023 01 21 FROM clause plus JOIN APPLY PIVOT Transact SQL docs microsoft com 2019 06 01 Retrieved 2020 12 24 Kim Won September 1982 On Optimizing an SQL like Nested Query pdf ACM Transactions on Database Systems 7 3 443 469 doi 10 1145 319732 319745 S2CID 4374300 The SQLite Query Optimizer Overview 11 Subquery Flattening SQLite Retrieved 2023 01 21 When a subquery occurs in the FROM clause of a SELECT the simplest behavior is to evaluate the subquery into a transient table then run the outer SELECT against the transient table Such a plan can be suboptimal since the transient table will not have any indexes and the outer query which is likely a join will be forced to do a full table scan on the transient table To overcome this problem SQLite attempts to flatten subqueries in the FROM clause of a SELECT This involves inserting the FROM clause of the subquery into the FROM clause of the outer query and rewriting expressions in the outer query that refer to the result set of the subquery Flattening FROM Clause Subqueries Vertica Retrieved 2023 01 21 FROM clause subqueries are always evaluated before their containing query In some cases the optimizer flattens FROM clause subqueries so the query can execute more efficiently Flattening a subquery into a normal join Apache Derby Retrieved 2023 01 21 Statements that include such subqueries can be flattened into joins only if the subquery does not introduce any duplicates into the result set Chapter 15 Abstract Query Plan Guide Flattened subqueries Sybase Retrieved 2023 01 21 Some subqueries can be flattened into joins Selinger P Griffiths Astrahan M M Chamberlin D D Lorie R A Price T G 1979 Access Path Selection in a Relational Database Management System pdf Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data Boston Massachusetts SIGMOD 79 New York NY USA Association for Computing Machinery pp 23 34 doi 10 1145 582095 582099 ISBN 089791001X 10 1145 582095 582099 Floratos Sofoklis Xiao Mengbai Wang Hao Guo Chengxin Yuan Yuan Lee Rubao Zhang Xiaodong 2021 NestGPU Nested Query Processing on GPU IEEE 37th International Conference on Data Engineering pp 1008 1019 External links editCorrelated subquery with examples Retrieved from https en wikipedia org w index php title Correlated subquery amp oldid 1210065523, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.