WL#9158: Join Order Hints
Affects: Server-8.0
—
Status: Complete
This task is about hints what allows to control table order for the join execution. Currently MySQL has limited possibility to do this. STRAIGHT_JOIN hint can be used to force the optimizer to join the tables is the order in which they are listed in the FROM clause. The goal of this task is to add more join order hints using new hint infrastructure. Proposed hints are: JOIN_FIXED_ORDER similar to existing STRAIGHT_JOIN hint, will be used as replacement of the old hint later. JOIN_ORDER instructs the optimizer to use the specified table order. JOIN_PREFIX instructs the optimizer to use the specified table order for the first tables of the join execution plan. JOIN_SUFFIX instructs the optimizer to use the specified table order for the last tables of the join execution plan.
User Documentation
Functional requirements:
F-1: Hints must be enclosed into /*+ */ comment.
F-2: Hints must be specified after SELECT|INSERT|REPLACE|UPDATE|DELETE key words.
F-3: EXPLAIN must understand hints.
F-4: Active hints must be printed in EXPLAIN warning.
F-5: Subsequent conflicting/duplicating hints are ignored with warning.
F-6: Unresolved hints cause warning.
Non-Functional requirements:
NONE
1. JOIN_FIXED_ORDER JOIN_FIXED_ORDER([@QB_NAME]) Hint forces the optimizer to join the tables is the order in which they are listed in the FROM clause. 2. JOIN_ORDER hint. Instructs the optimizer to use the specified table order. The hint can be applied to a set of tables. Tables that are not specified may be placed anywhere in the join order, even in between specified tables. Specification: JOIN_ORDER([@QB_NAME] table[, table]...) 3. JOIN_PREFIX hint. Instructs the optimizer to use the specified table order for the first tables of the join execution plan. The hint can be applied to a set of tables. All other join tables will come after the given tables. Specification: JOIN_PREFIX([@QB_NAME] table[, table]...) 4. JOIN_SUFFIX hint. Instructs the optimizer to use the specified table order for the last tables of the join execution plan. The hint can be applied to a set of tables. All other join tables will come before the given tables. Specification: JOIN_SUFFIX([@QB_NAME] table[, table]...) Note that hints should be able to control behavior of the semi-join tables which are merged to the outer query block. Example: SELECT /*+ JOIN_PREFIX(t2, t5@subq2, t4@subq1) JOIN_ORDER(t4@subq1, t3) JOIN_SUFFIX(t1) */ count(*) FROM t1 JOIN t2 JOIN t3 WHERE t1.f1 IN (SELECT /*+ QB_NAME(subq1) */ f1 FROM t4) AND t2.f1 IN (SELECT /*+ QB_NAME(subq2) */ f1 FROM t5); If sub-queries subq1, subq2 are converted into semi-join, tables t4@subq1, t5@subq2 are merged to the outer query block. In this case, the hint specified in the outer query block should be able to control behavior of t4@subq1, t5@subq2 tables. Resolving conflicting hints. 1. Conflicting hints In some cases hints can conflict. For example if JOIN_ORDER and JOIN_PREFIX have table orders what are impossible to apply at the same time: SELECT /*+ JOIN_ORDER(t1, t2) JOIN_PREFIX(t2, t1) */ ... FROM t1,t2; In this case, the first specified hint is applied and next conflicting hints are ignored without warning. If it's impossible to apply valid hint, it will be silently ignored without any warning. 2. Only one JOIN_PREFIX and JOIN_SUFFIX hint of each type will be applied. Second specified hint of the same type is ignored with warning. JOIN_ORDER can be specified several times. For example, /*+ JOIN_PREFIX(t1) JOIN_PREFIX(t2) */ Second JOIN_PREFIX is ignored with warning. /*+ JOIN_PREFIX(t1) JOIN_SUFFIX(t2) */ Both hints are applicable, no warning in this case. /*+ JOIN_ORDER(t1, t2) JOIN_ORDER(t2, t3) */ Both hints are applicable, no warning in this case. 3. Ignored hints. Hint is ignored if a table specified in the hint has a circular dependency. For example: /*+ JOIN_ORDER(t1, t2) JOIN_PREFIX(t2, t1) */ JOIN_ORDER hint sets table 't2' dependent on t1. JOIN_PREFIX is ignored since it table 't1' can not be dependent on 't2'. Ignored hints are not printed in in EXPLAIN message. 4. Interaction with const tables. MySQL optimizer places const tables first in the table order, and the position of a const table can not be affected by hints. References to const tables in join order hints will be ignored. Hint is still applicable. For example: JOIN_ORDER(t1, CONST_T, t2) is equivalent to JOIN_ORDER(t1, t2). Note that the accepted hints shown in the warning message of EXPLAIN, will include const tables as they were specified. 5. Interaction with the types of JOINs. MySQL supports several type of joins, LEFT|RIGHT|INNER|CROSS|STRAIGHT_JOIN joins. If the hint conflicts with the specified type of JOIN, it is ignored without warning. For example: SELECT /*+ JOIN_PREFIX(t1, t2) */FROM t2 LEFT JOIN t1; Here is conflict between requested table order in the hint and JOIN order(LEFT JOIN).
Setting of the necessary table order is carried out by changing the table dependencies. Optimizer selects requested order using updated dependencies. Table dependencies are stored in JOIN_TAB::dependent field. See sql_select.h /** The set of tables that this table depends on. Used for outer join and straight join dependencies. */ table_map dependent; Dependencies are set in JOIN::make_join_plan() method, see JOIN::init_planner_arrays(), JOIN::propagate_dependencies() methods. How it works at the current moment. Consider we have following statement: SELECT * FROM t1 LEFT JOIN t2 ON t1.f1 = t2.f1 LEFT JOIN (t3 JOIN t4 ON t3.f1 = t4.f1) ON t1.f1 = t3.f1; Dependencies are set according to join type: 1. t1 has no dependencies 2. t2 is dependent on t1, JOIN_TAB(t2)::dependent is set to 1(first table in the join). 3. t3 is dependent on t2 and indirectly on t1(since t2 dependent on t1) JOIN_TAB(t3)::dependent is set to 3(first and second tables). 4. t4 is dependent on t2 and indirectly on t1 JOIN_TAB(t4)::dependent is set to 3(first and second tables). Note that JOIN_TAB::dependent value has sum of directly and indirectly dependent tables. JOIN_PREFIX modifies table dependencies(if possible) in such a way that tables specified in the hint are set first in the join order. Consider we have following statement: SELECT /*+ JOIN_PREFIX(t3, t2)*/ * FROM t1 JOIN t2 JOIN t3; Before applying of the hint tables, t1, t2, t3 have no dependencies. After applying: 1. t3 has no dependencies. 2. t2 is dependent on t3, JOIN_TAB(t2)::dependent is set to 4(third table in the join) 3. t1 should be last, so JOIN_TAB(t1)::dependent is set to 6 (second and third tables in the join). Note that we should check that hint tables have no dependencies on other tables, otherwise they can not be set first and the hint is ignored. JOIN_SUFFIX modifies table dependencies(if possible) in such a way that tables specified in the hint are set last in the join order. Consider we have following statement: SELECT /*+ JOIN_SUFFIX(t3, t2)*/ * FROM t1 JOIN t2 JOIN t3; Before applying of the hint tables, t1, t2, t3 have no dependencies. After applying: 1. t1 has no dependencies. 2. t3 is dependent on t1, JOIN_TAB(t3)::dependent is set to 1(first table in the join) 3. t2 should be last, so JOIN_TAB(t2)::dependent is set to 5 (first and third tables in the join). Note that we should check that tables not specified in the hint are not dependent on hint tables. In this case hint tables can not be set last and the hint is ignored. JOIN_ORDER modifies table dependencies(if possible) in such a way that tables specified in the hint are set in the specified order. Consider we have following statement: SELECT /*+ JOIN_ORDER(t3, t2)*/ * FROM t1 JOIN t2 JOIN t3; Before applying of the hint tables, t1, t2, t3 have no dependencies. After applying: 1. t1 has no dependencies. 2. t3 has no dependencies. 3. t2 is dependent on t3, JOIN_TAB(t2)::dependent is set to 4(third table in the join) Note that we should check that previous hint table are not dependent on subsequent, otherwise requested order can not be set and the hint is ignored. If table specified in the hint belongs to nested join, we need to update dependencies of all tables of the nested join with the same dependency as for the hint table. It's necessary since interleaving of nested join tables should satisfy condition checked in the Optimize_table_order::check_interleaving_with_nj() method(see comment for this method for more detailed explanation). Procedure of applying hints is called from JOIN::make_join_plan() after methods which update dependency information. Code changes: sql/lex.h Added to 'static const SYMBOL symbols' { SYM_H("JOIN_PREFIX", JOIN_PREFIX_HINT)}, { SYM_H("JOIN_SUFFIX", JOIN_SUFFIX_HINT)}, { SYM_H("JOIN_ORDER", JOIN_ORDER_HINT)}, { SYM_H("JOIN_FIXED_ORDER", JOIN_FIXED_ORDER_HINT)}, sql/sql_lex_hints.cc Added to void Hint_scanner::add_hint_token_digest() following cases: case JOIN_PREFIX_HINT: case JOIN_SUFFIX_HINT: case JOIN_ORDER_HINT: case JOIN_FIXED_ORDER_HINT: sql/sql_hints.yy Added grammar for JOIN_PREFIX, JOIN_SUFFIX, JOIN_ORDER, JOIN_FIXED_ORDER hints. sql/sql_optimizer.cc Added Opt_hints_qb::apply_join_order_hints(JOIN *join) method call. This function resolves hint table names, checks and updates table dependencies according to the hint requirements. Method JOIN::propagate_dependencies() is split on two methods: JOIN::propagate_dependencies() - propagate table dependencies and check circular dependencies. JOIN::init_key_dependencies() - Initialize key dependencies for join tables. sql/sql_planner.cc: Optimize_table_order::advance_sj_state(): added a change to prevent that inner tables of different semijoin nests are interleaved for MatScan. sql/parse_tree_hints.h PT_qb_level_hint class: Added field: Hint_param_table_list table_list; // hint tables Added constructor: PT_qb_level_hint(const LEX_CSTRING qb_name_arg, bool switch_state_arg, enum opt_hints_enum hint_type_arg, const Hint_param_table_list &table_list_arg) : PT_hint(hint_type_arg, switch_state_arg), qb_name(qb_name_arg), args(0), table_list(table_list_arg) {} Added function: virtual Hint_param_table_list *get_table_list() { return &table_list; } sql/parse_tree_hints.cc: Modified: PT_qb_level_hint::contextualize(Parse_context *pc) Added JOIN_PREFIX_HINT_ENUM, JOIN_SUFFIX_HINT_ENUM, JOIN_ORDER_HINT_ENUM, JOIN_FIXED_ORDER_HINT_ENUM cases. Modified: PT_qb_level_hint::append_args() Added processing of JOIN_PREFIX_HINT_ENUM, JOIN_SUFFIX_HINT_ENUM, JOIN_ORDER_HINT_ENUM. sql/opt_hints.h enum opt_hints_enum: Added JOIN_PREFIX_HINT_ENUM, JOIN_SUFFIX_HINT_ENUM, JOIN_ORDER_HINT_ENUM, JOIN_FIXED_ORDER_HINT_ENUM. struct st_opt_hint_info: Added new field bool irregular_hint; ///< true if hint requires some special handling. Class Opt_hints: Added method /** Function prints hints which are non-standard and don't fit into existing hint infrastructure. @param thd pointer to THD object. @param str pointer to result String object. */ virtual void print_irregular_hints(THD *thd, String *str) Main problems is that join order hints have several non-standard dependencies: 1. Hint is depending on the place where it was specified, i.e. second specified hint is depending on the first one: /*+ JOIN_SUFFIX(t1, t2) JOIN_ORDER(t3, t4) */ in this case, JOIN_ORDER is applicable only if tables t3, t4 are not dependent on t2, t1. Join order hints should be processed in the specified order. So we need to place all specified hints into array and process them in a sequence. 2. Order of specified tables is important too since we should set table dependencies according to this order. We can not put tables in child (table level) array of existing hint infrastructure. We loose the order in this case. So join order hints(PT_qb_level_hint object) are stored in Opt_hints_qb::join_order_hints array and tables of the hint are stored in PT_qb_level_hint::table_list array. Class Opt_hints_qb: Added new fields // Array of join order hints Mem_root_array<PT_qb_level_hint*, true> join_order_hints; ulonglong join_order_hints_ignored; // ignored join order hints /** Function checks if join order hints are applicable and records table dependencies if possible. @param join JOIN object */ void apply_join_order_hints(JOIN *join); Simplified algorithm is following: Save original table dependencies; For each join_order_hints element { For each hint table { Find table in JOIN::tables; if (table is found) { if (table is const) continue; // ignore const table if (possible to set necessary dependencies) { // Note that we check dependencies only for hint tables // and skip dependency check for other tables. set dependencies; set dependencies for nested join tables; } else { hint is conflicting; restore original table dependencies; } } } Add dependencies that are related to non-hint tables; // Check if not specified tables have correct dependencies; if (JOIN::propagate_dependencies()) restore original table dependencies; } void register_join_order_hint(PT_qb_level_hint* hint_arg); Added new function: /** Append table name. @param thd pointer to THD object @param str pointer to String object */ virtual void append_table_name(THD *thd, String *str); sql/opt_hints.cc struct st_opt_hint_info opt_hint_info[] Added {"JOIN_PREFIX", false, false, true}, {"JOIN_SUFFIX", false, false, true}, {"JOIN_ORDER", false, false, true}, {"JOIN_FIXED_ORDER", false, true, false}, Opt_hints::print(THD *thd, String *str, enum_query_type query_type); Function changed to allow print non standard hints. Added new function: /** Function compares hint table name and TABLE_LIST table name. @param hint_table hint table name @param table Pointer to TABLE_LIST object @return true if table names are equal, false otherwise. */ static bool compare_table_name(const Hint_param_table *hint_table, const TABLE_LIST* table) Added new functions: /** Function updates dependencies for nested joins. If table specified in the hint belongs to nested join, we need to update dependencies of all tables of the nested join with the same dependency as for the hint table. @param JOIN pointer to JOIN object @param table pointer to TABLE_LIST object @param hint_tab_map map of the tables, specified in the hint */ static void update_nested_join_deps(JOIN *join, const TABLE_LIST* table, table_map hint_tab_map) /** Function resolves hint tables, checks and sets table dependencies according to the hint. If the hint is ignored due to circular table dependencies, original dependencies are restored. @param JOIN pointer to JOIN object @param hint_table_list hint table list @param type hint type @return false if hint is applied, true otherwise. */ static bool set_join_hint_deps(JOIN *join, const Hint_param_table_list *hint_table_list, opt_hints_enum type) /** Function returns dependencies used for updating table dependencies depending on hint type. @param type hint type @param hint_tab_map hint table map @param table_map table map @return table dependencies. */ static table_map get_other_dep(opt_hints_enum type, table_map hint_tab_map, uint table_map); Added new class: /** Auxiliary class is used to save/restore table dependencies. */ class Join_order_hint_handler : public Sql_alloc
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.