WL#12788: Iterator executor analytics queries
Affects: Server-8.0
Status: Complete
Building on the work in WL#12074 and WL#12470, this worklog broadens the scope of the analytics queries the iterator executor can handle by supporting window functions, rollup, and final deduplication (used for DISTINCT queries that cannot deduplicate directly in the final temporary table).
Functional requirements:
No new functional requirements. All query results should remain the same; even performance schema results should be the same, though some warnings may end up on different line numbers.

Non-functional requirements:
No new non-functional requirements. Performance should not regress.
The new features are:

Window functions: Two new iterators, WindowingIterator and BufferingWindowingIterator. These work much like the existing aggregate iterators, except that the aggregates are window aggregates; they reuse most of the existing window function machinery behind the scenes. (A minimal sketch of the streaming case is given at the end of this section.)

Rollup: The biggest challenge with rollup is that we need to replace the grouping items with NULLs in the rollup rows, even for non-nullable fields. In the old executor, which was push-based, this was solved by switching the field list when pushing the additional rows. With the current iterator design, which is pull-based and has no explicit field list (arguably a shortcoming, but not one that is easily fixed), we cannot easily do this. Instead, we replace the field list with a set of Item_refs, which can then be pointed to different places when rollup rows are to be returned. (See the indirection sketch at the end of this section.)

Final deduplication: Normally, deduplication (SELECT DISTINCT) happens by adding a unique index to the final temporary table. However, in the cases where we aggregate directly into a temporary table, we cannot use such an index, since rows change during query execution and thus cannot be deduplicated on the fly. (E.g., consider SELECT DISTINCT COUNT(*) FROM t1 GROUP BY f1.) The old executor solves this by adding a post-pass that actually deletes rows from the temporary table; for small tables, it uses a hash table to deduplicate, but for larger ones, it falls back to an O(n^2) algorithm based on pairwise comparison, which is extremely slow. Neither fits well in an iterator design, so we replace this step with a filesort, which is consistently O(n log n) with a small constant factor. Filesort needs to be extended with support for deduplicating rows, which is done as part of this work. (The sort-then-skip idea is sketched at the end of this section.)

As part of this worklog, we are also finally removing the concept of execution stages (creating sort index, sorting for group, sorting for order, sorting result, removing duplicates); all of execution will now simply show as "executing". Execution stages have not given out meaningful information for many, many years, as execution phases have been overlapping; with iterators, they overlap even more, so it is meaningless to say that we are in one stage or another.
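To make the streaming window case concrete, here is a minimal, hypothetical sketch of how a pull-based iterator can compute a running aggregate one row at a time, with no buffering. All names below (SourceIterator, StreamingWindowIterator) are illustrative stand-ins, not the server's actual RowIterator or WindowingIterator classes, and the interface is heavily simplified.

  #include <cstdio>
  #include <vector>

  // Illustrative stand-in for a pull-based row source; the server's real
  // iterator interface is richer (Init(), error handling, etc.).
  struct SourceIterator {
    const std::vector<int> &rows;
    size_t pos = 0;
    bool Read(int *out) {  // Returns false at end-of-stream.
      if (pos == rows.size()) return false;
      *out = rows[pos++];
      return true;
    }
  };

  // SUM(x) OVER (ROWS UNBOUNDED PRECEDING) can stream: read one input row,
  // update the aggregate state, emit one output row immediately. Frames
  // that need to look ahead would instead require the buffering variant.
  struct StreamingWindowIterator {
    SourceIterator source;
    long long running_sum = 0;
    bool Read(int *value, long long *sum_so_far) {
      if (!source.Read(value)) return false;
      running_sum += *value;  // Update the window aggregate.
      *sum_so_far = running_sum;
      return true;
    }
  };

  int main() {
    std::vector<int> data = {3, 1, 4, 1, 5};
    StreamingWindowIterator it{{data}};
    int x;
    long long sum;
    while (it.Read(&x, &sum)) std::printf("x=%d running_sum=%lld\n", x, sum);
    return 0;
  }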
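The Item_ref indirection for rollup can be illustrated with a tiny, hypothetical stand-in: the output expression is a reference that normally points at the group value, and is re-pointed at a shared NULL item when a rollup row is produced, leaving the (possibly non-nullable) field itself untouched. Value and Ref below are illustrative, not the server's Item classes.

  #include <cstdio>
  #include <optional>

  struct Value {               // Stand-in for an Item's current value.
    std::optional<int> v;      // std::nullopt models SQL NULL.
  };

  struct Ref {                 // Stand-in for Item_ref: one indirection level.
    const Value *target;
  };

  static void print_row(const Ref &select_item) {
    if (select_item.target->v)
      std::printf("%d\n", *select_item.target->v);
    else
      std::printf("NULL\n");
  }

  int main() {
    Value group_col{42};               // The group value for this group.
    Value null_item{std::nullopt};     // Shared NULL used in rollup rows.
    Ref select_item{&group_col};

    print_row(select_item);            // Normal aggregated row: prints 42.

    // Rollup row: re-point the reference at NULL. The underlying field is
    // not modified, so it can remain declared non-nullable.
    select_item.target = &null_item;
    print_row(select_item);            // Rollup row: prints NULL.
    return 0;
  }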
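The replacement for the O(n^2) deduplication post-pass is, in spirit, just sort-then-skip-adjacent: sort the rows in O(n log n), then emit a row only when it differs from its predecessor. The snippet below shows the idea on plain integers; the actual work extends the server's filesort to drop duplicate rows, which is not shown here.

  #include <algorithm>
  #include <cstdio>
  #include <vector>

  int main() {
    // Stand-in for the rows of the final temporary table.
    std::vector<int> rows = {3, 1, 3, 2, 1};

    std::sort(rows.begin(), rows.end());  // The "filesort" step: O(n log n).

    // Deduplication falls out of the sort: duplicates are now adjacent,
    // so each row is emitted only if it differs from the previous one.
    for (size_t i = 0; i < rows.size(); ++i) {
      if (i > 0 && rows[i] == rows[i - 1]) continue;
      std::printf("%d\n", rows[i]);
    }
    return 0;
  }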