WL#12534: Support LIMIT in recursive common table expression
Affects: Server-8.0
—
Status: Complete
with recursive qn as ( select cast("x" as char(100)) as a from dual union all select concat("x",qn.a) from qn where length(qn.a)<10 limit 3 ) select * from qn;ERROR 1235 (42000): This version of MySQL doesn't yet support 'ORDER BY / LIMIT over UNION in recursive Common Table Expression' We have had a request that LIMIT would terminate the filling of the CTE (above, when "qn" contains 3 rows in total). It is a limit on the total number of rows in the final CTE, not on the number of rows that each iteration may produce. While it was already possible to put LIMIT on the outermost SELECT, putting it inside the definition of the CTE has the advantage that it stops the row generation algorithm as soon as it has produced the requested number of rows; it's thus more efficient. With LIMIT, the content of the recursive CTE is a subset of what it would have been without LIMIT. Which rows end up in this subset depends on the order of the generation of rows, which is implementation-defined (and thus, not precisely specified). This order cannot be influenced by adding ORDER BY as this is forbidden in a recursive CTE. The order is loosely defined here: https://dev.mysql.com/doc/refman/8.0/en/with.html#common-table-expressions- recursive So, this feature is intended to be used as "give me N rows, no matter exactly which ones". Like "give me 10 descendents of this root ancestor". Given our algorithm, it may usually be a breadth-first ordering, but there's no guarantee. Even less if the recursive part contains a JOIN to another table. The used indexes may also influence that. This feature can also be used as a precaution against a runaway recursive CTE; instead of hitting the @cte_max_recursion_depth maximum and stopping with error, the query with LIMIT will stop, without error, after producing that many rows. OFFSET should be supported too. We will still not allow ORDER BY over the UNION. Note that: with recursive qn as ( select cast("x" as char(100)) as a from dual union all (select concat("x",qn.a) from qn where length(qn.a)<10 limit 3) ) select * from qn; will continue throwing: ERROR 1235 (42000): This version of MySQL doesn't yet support 'ORDER BY / LIMIT / SELECT DISTINCT in recursive query block of Common Table Expression' This WL has a side-effect for a non-recursive CTE or derived table: if it has UNION DISTINCT and LIMIT/OFFSET we now do direct materialization i.e. instead of having one tmp table for storing the result of UNION and then another tmp table for storing the result of LIMIT (forming the derived table or CTE), the first tmp table is not created anymore: we have the UNION rows flowing directly into the LIMIT filter and ending up stored into the final result tmp table (the derived table or CTE). This will probably improve performance. See also BUG#92857 Support LIMIT inside definition of recursive CTE
F-1 In a recursive CTE, which is made of a UNION, a global "LIMIT N" should be allowed. It should make the CTE algorithm stop after N rows have been generated; the CTE should contain those rows. The LIMIT clause may contain an OFFSET; for LIMIT N OFFSET M, the CTE algorithm should stop after M+N rows have been generated, and the CTE should contain the last N.
Reminder: "direct materialization" is the ability, when we have a derived table containing a UNION, to use a single tmp table for both the derived table and the UNION; precisely, the UNION is a set of iterators, and the derived table's materialization runs them to fill itself. In this WL, we make direct materialization work even if the UNION has LIMIT/OFFSET. While this is necessary for the WL, this particular change isn't limited to recursive CTEs, as it also applies to non-recursive CTEs and derived tables. It may help performance for them. Assuming there is LIMIT N OFFSET M: - we pass limit=M+N to the constructor of MaterializeIterator; this way the materialization stops after writing M+N rows. - If there is OFFSET, we add a LimitOffsetIterator over MaterializeIterator, to eliminate the first M rows.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.