WL#13000: Iterator UNION
Affects: Server-8.0 — Status: Complete
Implement UNION, UNION ALL, WITH RECURSIVE and unqualified SELECT COUNT(*) in the iterator executor.
No (new) functional requirements. All queries should return the same results as before. Performance schema results should be the same, except that examined_rows for zero-row tables could change by one. Some warnings may end up on different line numbers, and the row order of unsorted queries could change because BNL is turned off.

No (new) non-functional requirements. Performance should not regress.
This worklog will move the iterator executor up from the level of a single JOIN to the level of a query block, including support for UNION / UNION ALL. This includes:

- Optimization no longer special-cases non-unions (“simple queries”).
- Execution no longer distinguishes between unions and non-unions.
- Subselect engines no longer distinguish between unions and non-unions.

New iterators that we need (most of this functionality is moved away from hard-coded initialization in the executor):

- Query blocks that are known to return zero rows will get a new zero-row iterator.
- COUNT(*) is now handled using an iterator.
- Old-style information schema tables (those not coming from the data dictionary) are filled in using a new iterator.

UNION handling
==============

MaterializeIterator is significantly extended. It can now materialize multiple query blocks into the same table (implementing UNION / UNION ALL). There is also a new StreamingIterator that doesn't actually materialize; it just does the field copying between the JOINs as materialization would. Together with AppendIterator (which simply calls Read() on each of several iterators in sequence), this allows for implementing UNION ALL through iterators (instead of using query results).

Furthermore, UNIONs can be materialized directly into a derived table without going through an extra hop; SELECT_LEX_UNIT (query expressions) can return just a set of sub-iterators which the derived table can then give directly to MaterializeIterator.

Mixed UNION DISTINCT / UNION ALL queries are usually handled through a combination of MaterializeIterator and StreamingIterator/AppendIterator (i.e., some parts materialized and some parts just streamed through). However, for direct materialization, everything needs to be materialized anyway, so we do deduplication by hash field and turn it off once we get to the UNION ALL query blocks.
(The old behavior of turning off indexes to get the same effect is removed, as it is not safe when there are other indexes on the derived table.)

Recursive CTE
=============

This is pretty much the same execution strategy as in the existing executor, except that we don't have the added complication of having to move the rows from the temporary table into the derived table during execution (since we no longer have a separate temporary table for these queries). We add some new helpers in MaterializeIterator to run repeated materialization of repeated query blocks, and add a new basic row iterator called FollowTailIterator that deals with reading only new rows and counting CTE iterations.

There's an issue here: if the parent query expression of a recursive CTE wanted to use BNL/BKA, we could end up in a situation where the query expression would have to run in the pre-iterator executor, but that executor would not be capable of running the iterator-planned recursive CTE. To break this deadlock, we turn off join buffering (BNL/BKA) globally for a query if we see a recursive CTE; this restriction can be lifted later.
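The tail-following idea behind the recursive CTE strategy can be illustrated with a small standalone sketch. This is not the server's code: Table, TailReader, MaterializeRecursive and the max_iterations limit are hypothetical names chosen for illustration, and rows are plain integers rather than table records. The sketch assumes an append-only table, a reader that returns only rows it has not seen yet, and an iteration counter that advances each time the reader crosses the row count recorded at the start of the previous iteration.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <stdexcept>
#include <vector>

// Toy stand-in for the materialized table: rows are only ever appended.
using Table = std::vector<int>;

// Hypothetical tail-following reader (an illustration, not FollowTailIterator).
class TailReader {
 public:
  TailReader(const Table &table, std::size_t max_iterations)
      : table_(table), max_iterations_(max_iterations) {}

  // Returns the next unread row, or nullopt when there are no new rows.
  std::optional<int> Read() {
    if (pos_ == table_.size()) return std::nullopt;
    if (pos_ == end_of_iteration_) {
      // All rows of the previous iteration are consumed, so the rows from
      // here up to the current end of the table form a new iteration.
      if (++iterations_ > max_iterations_)
        throw std::runtime_error("recursion limit exceeded");
      end_of_iteration_ = table_.size();
    }
    return table_[pos_++];
  }

  std::size_t iterations() const { return iterations_; }

 private:
  const Table &table_;
  std::size_t max_iterations_;
  std::size_t pos_ = 0;               // Next row to read.
  std::size_t end_of_iteration_ = 0;  // Row count when this iteration began.
  std::size_t iterations_ = 0;
};

// Materializes the seed rows, then repeatedly applies the recursive step to
// each newly read row and appends its output to the same table.
Table MaterializeRecursive(
    const std::vector<int> &seed,
    const std::function<std::optional<int>(int)> &step,
    std::size_t max_iterations) {
  Table table(seed);
  TailReader reader(table, max_iterations);
  while (std::optional<int> row = reader.Read()) {
    if (std::optional<int> next = step(*row)) table.push_back(*next);
  }
  return table;
}
```

Called with the seed {1} and a step that returns n + 1 while n < 5, this mirrors a query such as WITH RECURSIVE cte(n) AS (SELECT 1 UNION ALL SELECT n + 1 FROM cte WHERE n < 5) and yields the rows 1 through 5. Because the reader and the writer share one table, no rows have to be copied between a temporary table and the result, which is the simplification the text above describes.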
Copyright (c) 2000, 2023, Oracle Corporation and/or its affiliates. All rights reserved.