WL#3985: Subquery optimization: smart choice between semi-join and materialization
Affects: Server-6.0
—
Status: Complete
At the moment, subqueries that can be handled by both semi-join and materialization are always handled using some semi-join strategy. The goal of this WL entry is to do some cost or heuristics-based choice between semi-join and materialization.
1. Task setting 1.1. Considerations about when each strategy is better 2. Fitting materialization into join optimization and execution 2.1 Changes in the join executor 2.1.1 Plan refinement step changes 2.2 Changes in the join optimizer 2.2.1 Cost of materialization 2.2.2 Fitting materialization into the join optimizer 2.2.3 Materialization and join buffering 3. User interface 3.1 EXPLAIN output changes 3.2 Control from SQL level 3.2.1 @@optimizer_switch 3.2.1 Subquery hints 4. Extras and Todos 4.1 Correlation wrt constant tables 4.2 Igor's idea: check for subquery "late" At the moment, subqueries that can be handled by both semi-join and materialization are always handled using some semi-join strategy. The goal of this WL entry is to do some cost or heuristics-based choice between semi-join and materialization. Semi-join vs materialization **************************** 1. Task setting =============== We consider subqueries that can be handled by both semi-join and materialization strategies. This means that: * the subquery is an IN-subquery * the subquery a [top-level] AND-part of the WHERE/ON clause - it satisifes all semi-join's requirements, i.e. it is a single select w/o grouping, aggregates, etc. * The subquery's parent join has enough join bits so that it is possible to convert subquery into semi-join. * the subquery is uncorrelated (this is required by materialization) At the moment, semi-join is always chosen over materialization. However, there are cases where materialization is faster than semi-join. The goal of this WL is to develop a code to recognize such cases. 1.1. Considerations about when each strategy is better ------------------------------------------------------ Semi-join is better when the subquery is 'strongly correlated', and there are indexes which will allow us to scan a small fraction of total subquery's record combinations. If that is not the case, semi-join execution will cause repeated enumeration of the same big set of record combinations, and in that case it will pay off if we do materialization: run the subquery once, create the lookup index, a and do index lookups instead of re-running the subquery. psergey-todo: do these words matter anymore: (thesis about "it makes no sense to do interleaving with uncorrelated subquery" and IN-equality based counterexample) All of the above should be automatically taken into account by cost-based materialization-aware join optimizer. 2. Fitting materialization into join optimization and execution =============================================================== The idea is to delay the choice until the join optimizer. This can be done as follows: * Convert the subquery into semi-join nest, mark the semi-join nest as "suitable for materialization" (note: according to Monty, the limit on number of bits in table_map should not be an issue. We'll be able to switch to bigger table_map if the need arises. But so far we haven't seen any demand for this) * Add materialization as another way to execute suitable semi-join nests. Actually, this means we're introducing bushy join execution plans. 2.1 Changes in the join executor -------------------------------- Subquery materialization will be handled in a manner similar to join buffering. Subquery tables will occupy a continuous range in the join order, i.e. the join order will have form: ot1 ... otK (( it1 it2 ... itN )) ot{k+1} ... When the NL-join execution reaches table it1 for the first time, regular join execution will be interrupted to perform materialization. The procedure of materialization is to enumerate all (it1, ..., itN) record combinations and put them into a temporary table. After that we'll make a lookup in the temporary table which will show whether the execution should continue to ot{k+1} or return to otK. For all subsequent cases, whenever NL-join execution reaches table it1, NL-join execution in the ((it1; itN)) range will be buypassed and replaced with a single lookup in the materialized temporary table. 2.1.1 Plan refinement step changes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to prepare for the execution step, we'll need perform these actions at plan refinement step: * Create the materialization temporary table * Set up JOIN_TAB::next_select functions to do subquery materializaion and index lookup. 2.2 Changes in the join optimizer --------------------------------- 2.2.1 Cost of materialization ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Without materialization, cost and fanout of a partial join order are calculated using these recursive formulas (taken from best_extension_by_limited_search()): current_record_count= record_count * join->positions[idx].records_read; current_read_time= read_time + join->positions[idx].read_time; read_time= record_count * cost(table_access) or, when the table uses join buffering, read_time= number_of_times_join_buffer_is_refilled * cost(table_access) Materialization is different, for it we have: current_record_count= record_count; // fanout==1 always mat_read_time= materialize_cost + record_count * lookup_cost materialize_cost is a cost to materialize subquery's result: materialize_cost= subquery_run_cost + // cost to execute the subquery storage_cost // cost to populate the temp. table subquery_run_cost is calculated as cost of running a join of subquery tables. storage_cost is the cost of populating the temporary table. It should depend on * number of records we expect to have in the table * record size * whether the table is expected to be HEAP or MyISAM table. The table will be MyISAM if it is larger than max. heap table size. The formula should be rougly as follows: storage_cost= (record_size * #records < max_heap_table_size)? TEMP_HEAP_TABLE_RECORD_COST * #records : TEMP_DISK_TABLE_RECORD_COST * #records 2.2.2 Fitting materialization into the join optimizer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We need values of outer columns before we can make a lookup. This means that subquery tables must be located after all of the outer correlated tables (psergey-todo: put a note re how this works with equality propagation) NOTE: Except for the strategy where we scan the materialized table instead of making lookups. In that case subquery tables may be located before the outer tables. Within our current cost model, a lookup in the temporary table that gives at most one match is the best possible access method (its cost is 1.0, regardless of the table size). We also always have fanout >= 1.0, which means that we ignore the fact the join might have no outputs and we will never have to materialize the subquery. As a consequence, we have this heuristic: * Materialization tables should be put right after the last outer correlated table (psergey-todo: this rule may become more complicated if we take into account equality propagation). psergey-todo: detailed look at Materialization vs FirstMatch costings. psergey-todo: Take testcase BUG#28257 and BUG#11254 and check whether the fastest strategy will be chosen with the formulas we have here. Let's also note that if there is a join order that can be executed using materialization, that join order can be also executed with first match strategy, and the choice between the two depends on the available access methods and costs. This hints that we should use the "fork" apporach: best_extension_by_limited_search() { ... bool considered_materialization= (semi_join_map)0; for (JOIN_TAB **pos= join->best_ref + idx ; (s= *pos) ; pos++) { loop_start: considering_materialization_now= FALSE; /* At the location where we would call best_access_path(): */ if (next table belongs to materialization-suitable semi-join nest SJM && !(considered_materialization.bit_set(SJM.sj_bit) && no tables of SJM are in the join order yet && all outer tables used in SJM's IN-equality are in the join order prefix) { // Consider executing SJM using materialization: put all subquery's tables into the join order prefix; add subquery cost; // this is instead of best_access_path /* Save this so that when we have a join prefix of (ot1,ot2) and possible extensions of (it1, it2), we don't consider materialization two times for ot1 o2 materialize(it1, it2), and ot1 o2 materialize(it2, it1) */ considered_materialization.set_bit(SJM.sj_bit); // This is to restore things correctly at the end of the loop iteration considering_materialization_now= TRUE; } else best_access_path(...) ... Here goes the heuristic pruning code and the recursive best_extension_by_limited_search call ... if (considering_materialization_now) { remove subquery's tables from the join order prefix; // Now consider the same extension but w/o materialization considering_materialization_now= FALSE; goto loop_start; } else { // Regular loop iteration end restore_prev_nj_state(s); restore_prev_sj_state(remaining_tables, s); } } ... } psergey-todo: What if the number of tables in the subquery exceeds the depth of the heuristics? Check if everything is ok in that case: do comparisons of of partial plans like join_prefix(prefix_tables, materialization(#subquery_tables)) vs join_prefix(prefix_tables, number-less-than-subquery_tables) make sense? Or we should adjust for different numbers of tables? In general case the join optimizer will consider many join orders that use materialization, so it makes sense to not re-calculate the optimal way to execute a subquery every time. We will pre-calculate subquery's cost and output cardinality before starting the main join optimization run. The pre-calculation step is just an invocation of the join optimizer for the contents of the subquery's semi-join nest. (we might need to make certain adjustments if the subquery's semi-join nest is contained within some outer join nest. That is to be covered in the LLD) 2.2.3 Materialization and join buffering ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It is possible to use join buffering for subquery tables, except for the first one. However, we must take into account that the join buffer should contain only record combinations of the subquery tables (and not record combinations of all preceding tables). TODO ^ the above is possible only because subquery is run only once, right? Materialization strategy doesn't prevent use of join buffering for the outer tables. There is however a difference that outer tables' join buffers must not contain subquery's tables rows, as they is not needed and will only waste join buffer space. 3. User interface ================= 3.1 EXPLAIN output changes -------------------------- Let the subquery's tables have select_type='MATERIALIZE'. This doesn't make EXPLAIN output any wider. (TODO: check if the above guarantees unambiguous interpretation of EXPLAIN output when there are several nested subqueries) 3.2 Control from SQL level -------------------------- 3.2.1 @@optimizer_switch ~~~~~~~~~~~~~~~~~~~~~~~~ Join optimizer will not consider materialization if @@optimizer_switch has 'no_materialization' flag set. If @@Optimizer_switch includes 'no_semijoin' flag, subquery will not be converted into a semi-join nest at all. (NOTE: the other option would be to convert the subquery into semi-join nest but then force a join order that uses semi-join materialization strategy. ATM there is no meaningful difference between the two but in the future there might be.) 3.2.1 Subquery hints ~~~~~~~~~~~~~~~~~~~~ Subquery may have: SELECT [ /* MATERIALIZE */ | /* NO_MATERIALIZE */ | /* FLATTEN */ ] ... or something like that. This is low priority. 4. Extras and Todos =================== 4.1 Correlation wrt constant tables ----------------------------------- Suppose a subquery satisfies the semi-join criteria but doesn't meet materialization's requirements because it is correlated. If we later find that it is correlated only wrt constant tables, then the subquery can be considered uncorrelated and handled with materalization. Should we bother detecting/handling this case? (will need to check how expensive this is when we get the LLD draft) 4.2 Igor's idea: check for subquery "late" ------------------------------------------ The idea is that the check whether we should do materialization should be done "late", i.e. it should have the following form: in best_extension_by_limited_search(), right after best_access_path() call: if all of the subquery tables form a suffix in the current join order prefix, consider also materialization. TODO: relative merits of both approaches? Late materialization detection - doesn't that try the same plan too many times? that is, we'll consider materialization for every permutation of subquery's tables. (and we cant' try materialization only for some particular permutation as it might occur that that permutation is not considered because of heuristic pruning). + less query plans to consider. At the moment SergeyP doesn't think late detection is worth it.
The following is a structured todo list. Each item is a milestone, implementation is to be performed in the order items are listed M1. Materialization detector --------------------------- Let subquery-predicate -> semi-join nest conversion code mark the semi-join nests that can be handled by materialization, or create an appropriate bool TABLE_LIST::is_materializable_sj_nest() function (The criteria are easy - just check whether the original subquery predicate was uncorrelated). M2. Make the join optimizer support materialization -------------------------------------------------- M2.1 Choose join sub-order for materialization join nests -------------------------------------------------------- M2.2 Consider/choose materialization in join optimization -------------------------------------------------------- Make the decision between early and late materialization join order detection (at the moment leaning towards early detection) and implement it in the join optimizer. The result should be that we produce a marked join order but it is still executed with duplicate elimination. M3. Make join executor support materialization --------------------------------------------- Create a Next_select_func-compatible function that will implement the materialization strategy. M3.1 Support for EXPLAIN ------------------------ Let EXPLAIN show the materialization strategy. This is to be done together with M3. M4. Support sequential scans of materialized table -------------------------------------------------- This is the last milestone where we will add support for * Performing the materialization step before we've read the outer tables * Sequentially scanning the temporary table. M4A. Insideout order for sj-materialization ------------------------------------------- (this part was discovered and added later in development. To be done as the last milestone). Optimization ~~~~~~~~~~~~ Make the at_sjm_pos() check also check if we could use inside-out-sjm. The criteria are the same except that we don't require the outer-correlated tables to be in the join prefix. The cost of inside-out-sjm is composed of: * The cost to populate the temp. table, once per join execution * The cost to do a full temp. table scan. Since this is a full scan, it can be used together with join buffering. . Note: for tuple-based subqueries we could also do index prefix lookups in the materialization table, i.e. we could have a subquery (tbl1.a, tbl2) IN ( SELECT ie1, ie2 FROM inner_table WHERE), a join order of tbl1, SJM(inner_table), tbl2 and have SJM do lookups using materialization_temptable.keypart1=tbl1.a. We will not implement this. Execution ~~~~~~~~~ Execution stage will: * Do materialization in the way SJ-Materialization code does it * Then use full scans for subquery table. The said scan may be done using join buffering. Details: * same sjm-materialize function, but using a flag which will tell whether to do an index scan or full scan. NO. Not exactly same as this may be using join buffering. Q: Why do that? Do we really need join buffering here? AA: ok may disable the join buffering for the 1st implementation. * If a subsequent table will use join buffering, its join_init_cache() call must take sjm-scan into account and put the materialized column into join buffer. Q: what if the materialized column is not base table column but a function? A: disable this case for now. InsideOut orders for non-semi-join subqueries ============================================= Consider this case: a subquery is an AND-part of the WHERE but can't be executed by semi-join runtime because it has GROUP BY/ORDER BY .. LIMIT/etc: SELECT ... FROM ot WHERE oe IN (SELECT ie FROM it WHERE ... GROUP BY ...) It is possible to use a variant of insideout join order for such queries, and it can be a great advantage to do so. It is not possible to use semi-join materialization code for such subqueries though, so we'll use special handling: Optimizer part -------------- The subquery will be represented by a JOIN_TAB element of a special kind. The element will allow two kinds of accesses: * lookup (with cost and all characteristics of an access method) * full scan (this is what we're mainly after) Join optimizer will be able to pick position for the join tab so the cost is minimized. This will map to the execution strategies described further. Note: don't do all this to subqueries that are within outer joins (is that possible?) Executioner part ---------------- * lookup. Done in a similar way to semi-join lookup. * full scan: we'll need to use subselect_hash_engine code to produce the temporary table and then we'll be able to scan it as usual. Other possible subquery materialization gotchas =============================================== * insideout + range access MUST use HA_MRR_SORTED. Unordered DS-MRR or cluster scan will screw up everything. * range+insideout. Why chosen so rarely?? * Timour's subqueries must show "MATERIALIZE" to be consistent with our EXPLAIN lines. WL#3985 Additional specification ================================ This document describes the differences between the original WL#3985 spec and the committed code.1. Join optimizer changes 1.1. First Match optimization 1.1.1 Problems when using FirstMatch and late detection 1.1.2 Solution: forks 1.3 SJ-Materialization 1.4 Sj-Materialization scan 1.5 DuplicateWeedout 1. Join optimizer changes ========================= In order to make a cost-based choice between semi-join and materialization, we need to calculate the costs and fanout of every subquery semi-join strategy. 1.1. First Match optimization ----------------------------- FirstMatch strategy can be used to resolve one or several interleaved semi-join nests, and its applicability criteria are: 1. The join order has form (ot|nt)+ (it|nt)+ ot* ^ ^ (1) (2) 2. Tables between points (1) and (2) do not use join buffering There is no extra cost in using FirstMatch strategy. The fanout produced by sj-inner tables is removed at point (2). 1.1.1 Problems when using FirstMatch and late detection ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The requirement not to use join buffering presents a problem - use of join buffering affects the access costs and optimal order for tables within the (1)...(2) range, so if we implement "late detection" we can get into scenarios like this: * We run the join optimizer and allow to use join buffering * At some point we figure we're at location (2) // and where the // corresponding point (1) is * We want to have correct join cost value, so we go back to (1) and re-run the join optimizer with join buffering disabled. The problem is that if we diable join buffering we may figure that there exist tables in the (1)-(2) join suborder that should not have been put there. In other words: if we denote best_extension(join_prefix, N) = {ta1, ... taN} best_extension(join_prefix, N, disable_join_buffer=TRUE)= {tb1, ... tbN} then list_compare({ta1, ... taN}, {tb1, ... tbN}) != 0 and even set_compare({ta1, ... taN}, {tb1, ... tbN}) != 0 This makes it very difficult to use "late detection" for FirstMatch and have correct cost value and not introduce limitations that could cause best join order to be never considered. 1.1.2 Solution: forks ~~~~~~~~~~~~~~~~~~~~~ Solution to the above problem is to use early detection and a "fork": * whenever we discover we are at position (1), create a separate branch in join order enumeration. The branch will be used to check the FirstMatch strategy. Within the branch - join order extension is limited so that FirstMatch strategy's requirements are met (don't add inner tables which do not have all of their outer tables in the prefix before point (1)). - join buffering is disabled until point (2) is reached. Introduction of branches increases the number of different combinations enumerated by the join optimizer. Each semi-join nest causes an increase equivalent to addition of one more join table. 1.2 LooseScan optimization --------------------------- LooseScan is similar to FirstMatch: - it doesn't allow to use join buffering within it's range - it places limits on what tables are allowed in its duplicate-generating range. - In addition, it uses a special access method for the 1st table. All of the above can be addressed by forking, so we use forks approach for this case, too. FirstMatch and LooseScan's applicability criteria are mutually exclusive so we don't ever get into two forks at once. 1.3 SJ-Materialization ---------------------- SJ-Materialization is handled with pre-optimization and "late detection". Pre-optimization means that we compute the optimal sub-join order for materialization before the main join optimization start. Late detection works as described in the main WL#3985 spec: ot1 ... otN (it1 .. itN) ^ ^ (1) (2) Once we have detected we are at position (2), we adjust the join prefix output cardinality and cost values accordingly. Since SJ-Materialization ignores the results of join optimizer's work in the (1)-(2) range, we don't need any join optimizer forks. 1.4 Sj-Materialization scan --------------------------- SJM-scan has additional complication with fanout. Consider a join order with SJM-SCAN: SJM-SCAN(it1 .. itN) ot1 ... otN nt1 .. ^ ^ (1) (2) At position (1) join prefix output cardinality is a product of fanouts of tables it1 ... itN. In general case, it is greater than 1. On the other hand, at point (2), join prefix cardinality is a product of outer tables only. This can get even more complicated if there are two SJM scans. Consider the query and the join order SELECT ... FROM otA1, otB1, otB1, otB2 WHERE expr(otA1, otA2) IN (SELECT ... FROM itA1 .. itAN) AND expr(otB1, otB2) IN (SELECT ... FROM itB1 .. itBK) SJM-SCAN(itA1 .. itAN) otA1 SJM-SCAN(itB1 .. itBK) otB1 otA2 otB1 ^ ^ (1) (2) Here the fanout introduced by (itA1 .. itAN) is removed at point (1) and fanout introduced by (itB1 .. itBK) 1.5 DuplicateWeedout -------------------- DuplicateWeedout does not create additional forks. It is detected late and it has a special property that it is the default catch-all method. ... ot1 it1 .... itN x======================x -- duplicate generation area (2) \_ consider duplicate weedout when we get here The point when we consider DuplicateWeedout is the latest point where we consider any semi-join optimization strategy. This is useful - if we see that we have arrived at point (2) and there are duplicates that are not handled by any semi-join strategy, we unconditionally use duplicate weedout (and add its cost). Otherwise, we compare the cost of duplicate weedout with costs of other strategies.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.