MySQL 8.4.0
Source Code Documentation
JOIN Class Reference

#include <sql_optimizer.h>

Classes

struct  TemporaryTableToCleanup
 

Public Types

enum class  RollupState { NONE , INITED , READY }
 
enum  { ORDERED_INDEX_VOID , ORDERED_INDEX_GROUP_BY , ORDERED_INDEX_ORDER_BY }
 
enum  enum_plan_state { NO_PLAN , ZERO_RESULT , NO_TABLES , PLAN_READY }
 State of execution plan. Currently used only for EXPLAIN. More...
 
using Override_executor_func = bool(*)(JOIN *, Query_result *)
 A hook that secondary storage engines can use to override the executor completely. More...
 

Public Member Functions

 JOIN (THD *thd_arg, Query_block *select)
 
 JOIN (const JOIN &rhs)=delete
 
JOIN & operator= (const JOIN &rhs)=delete
 
Query_expression * query_expression () const
 Query expression referring to this query block. More...
 
bool plan_is_const () const
 True if plan is const, ie it will return zero or one rows. More...
 
bool plan_is_single_table ()
 True if plan contains one non-const primary table (ie not including tables taking part in semi-join materialization). More...
 
bool contains_non_aggregated_fts () const
 Returns true if any of the items in JOIN::fields contains a call to the full-text search function MATCH, which is not wrapped in an aggregation function. More...
 
bool optimize (bool finalize_access_paths)
 Optimizes one query block into a query execution plan (QEP.) More...
 
void reset ()
 Reset the state of this join object so that it is ready for a new execution. More...
 
bool prepare_result ()
 Prepare join result. More...
 
void destroy ()
 Clean up and destroy join object. More...
 
bool alloc_func_list ()
 Make an array of pointers to sum_functions to speed up sum_func calculation. More...
 
bool make_sum_func_list (const mem_root_deque< Item * > &fields, bool before_group_by, bool recompute=false)
 Initialize 'sum_funcs' array with all Item_sum objects. More...
 
void copy_ref_item_slice (uint dst_slice, uint src_slice)
 Overwrites one slice of ref_items with the contents of another slice. More...
 
void copy_ref_item_slice (Ref_item_array dst_arr, Ref_item_array src_arr)
 
bool alloc_ref_item_slice (THD *thd_arg, int sliceno)
 Allocate a ref_item slice, assume that slice size is in ref_items[0]. More...
 
void set_ref_item_slice (uint sliceno)
 Overwrite the base slice of ref_items with the slice supplied as argument. More...
 
uint get_ref_item_slice () const
 
mem_root_deque< Item * > * get_current_fields ()
 Returns the clone of fields_list which is appropriate for evaluating expressions at the current stage of execution; which stage is denoted by the value of current_ref_item_slice. More...
 
bool optimize_rollup ()
 Optimize rollup specification. More...
 
bool finalize_table_conditions (THD *thd)
 Remove redundant predicates and cache constant expressions. More...
 
void join_free ()
 Release memory and, if possible, the open tables held by this execution plan (and nested plans). More...
 
void cleanup ()
 Cleanup this JOIN. More...
 
bool clear_fields (table_map *save_nullinfo)
 Set all column values from all input tables to NULL. More...
 
void restore_fields (table_map save_nullinfo)
 Restore all result fields for all tables specified in save_nullinfo. More...
 
bool generate_derived_keys ()
 Add keys to derived tables'/views' result tables in a list. More...
 
void finalize_derived_keys ()
 For each materialized derived table/view, informs every TABLE of the key it will (not) use, segregates used keys from unused keys in TABLE::key_info, and eliminates unused keys. More...
 
bool get_best_combination ()
 Set up JOIN_TAB structs according to the picked join order in best_positions. More...
 
bool attach_join_conditions (plan_idx last_tab)
 Attach outer join conditions to generated table conditions in an optimal way. More...
 
bool update_equalities_for_sjm ()
 Update equalities and keyuse references after semi-join materialization strategy is chosen. More...
 
bool add_sorting_to_table (uint idx, ORDER_with_src *order, bool sort_before_group)
 Add Filesort object to the given table to sort it with filesort. More...
 
bool decide_subquery_strategy ()
 Decides between EXISTS and materialization; performs last steps to set up the chosen strategy. More...
 
void refine_best_rowcount ()
 Refine the best_rowcount estimation based on what happens after tables have been joined: LIMIT and type of result sink. More...
 
table_map calculate_deps_of_remaining_lateral_derived_tables (table_map plan_tables, uint idx) const
 Finds the dependencies of the remaining lateral derived tables. More...
 
bool clear_sj_tmp_tables ()
 Remove all rows from all temp tables used by NL-semijoin runtime. More...
 
bool clear_corr_derived_tmp_tables ()
 Empties all correlated materialized derived tables. More...
 
void clear_hash_tables ()
 
void mark_const_table (JOIN_TAB *table, Key_use *key)
 Move const tables first in the position array. More...
 
enum_plan_state get_plan_state () const
 See enum_plan_state. More...
 
bool is_optimized () const
 
void set_optimized ()
 
bool is_executed () const
 
void set_executed ()
 
const Cost_model_server * cost_model () const
 Retrieve the cost model object to be used for this join. More...
 
bool fts_index_access (JOIN_TAB *tab)
 Check if FTS index only access is possible. More...
 
QEP_TAB::enum_op_type get_end_select_func ()
 
bool propagate_dependencies ()
 Propagate dependencies between tables due to outer join relations. More...
 
bool push_to_engines ()
 Handle offloading of query parts to the underlying engines, when such is supported by their implementation. More...
 
AccessPath * root_access_path () const
 
void set_root_access_path (AccessPath *path)
 
void change_to_access_path_without_in2exists ()
 If this query block was planned twice, once with and once without conditions added by in2exists, changes the root access path to the one without in2exists. More...
 
void refresh_base_slice ()
 In the case of rollup (only): After the base slice list was made, we may have modified the field list to add rollup group items and sum switchers, but there may be Items with refs that refer to the base slice. More...
 
AccessPath * create_access_paths_for_zero_rows () const
 Create access paths with the knowledge that there are going to be zero rows coming from tables (before aggregation); typically because we know that all of them would be filtered away by WHERE (e.g. More...
 

Public Attributes

Query_block *const query_block
 Query block that is optimized and executed using this JOIN. More...
 
THD *const thd
 Thread handler. More...
 
JOIN_TAB * join_tab {nullptr}
 Optimal query execution plan. More...
 
QEP_TAB * qep_tab {nullptr}
 Array of QEP_TABs. More...
 
JOIN_TAB ** best_ref {nullptr}
 Array of plan operators representing the current (partial) best plan. More...
 
JOIN_TAB ** map2table {nullptr}
 mapping between table indexes and JOIN_TABs More...
 
TABLE * sort_by_table {nullptr}
 
Prealloced_array< TemporaryTableToCleanup, 1 > temp_tables
 
Prealloced_array< Filesort *, 1 > filesorts_to_cleanup {PSI_NOT_INSTRUMENTED}
 
uint tables {0}
 Before plan has been created, "tables" denote number of input tables in the query block and "primary_tables" is equal to "tables". More...
 
uint primary_tables {0}
 Number of primary input tables in query block. More...
 
uint const_tables {0}
 Number of primary tables deemed constant. More...
 
uint tmp_tables {0}
 Number of temporary tables used by query. More...
 
uint send_group_parts {0}
 
bool streaming_aggregation {false}
 Indicates that the data will be aggregated (typically GROUP BY), and that it is already processed in an order that is compatible with the grouping in use (e.g. More...
 
bool grouped
 If query contains GROUP BY clause. More...
 
bool do_send_rows {true}
 If true, send produced rows using query_result. More...
 
table_map all_table_map {0}
 Set of tables contained in query. More...
 
table_map const_table_map
 Set of tables found to be const. More...
 
table_map found_const_table_map
 Const tables which are either: More...
 
table_map deps_of_remaining_lateral_derived_tables {0}
 This is the bitmap of all tables which are dependencies of lateral derived tables which are not (yet) part of the partial plan. More...
 
ha_rows send_records {0}
 
ha_rows found_records {0}
 
ha_rows examined_rows {0}
 
ha_rows row_limit {0}
 
ha_rows m_select_limit {0}
 
ha_rows fetch_limit {HA_POS_ERROR}
 Used to fetch no more than given amount of rows per one fetch operation of server side cursor. More...
 
POSITION * best_positions {nullptr}
 This is the result of join optimization. More...
 
POSITION * positions {nullptr}
 
Override_executor_func override_executor_func = nullptr
 
double best_read {0.0}
 The cost of best complete join plan found so far during optimization, after optimization phase - cost of picked join order (not taking into account the changes made by test_if_skip_sort_order()). More...
 
ha_rows best_rowcount {0}
 The estimated row count of the plan with best read time (see above). More...
 
double sort_cost {0.0}
 Expected cost of filesort. More...
 
double windowing_cost {0.0}
 Expected cost of windowing. More...
 
mem_root_deque< Item * > * fields
 
List< Cached_item > group_fields {}
 
List< Cached_item > group_fields_cache {}
 
List< Cached_item > semijoin_deduplication_fields {}
 
Item_sum ** sum_funcs {nullptr}
 
Temp_table_param tmp_table_param
 Describes a temporary table. More...
 
MYSQL_LOCK * lock
 
RollupState rollup_state
 
bool implicit_grouping
 True if aggregated but no GROUP BY. More...
 
bool select_distinct
 At construction time, set if SELECT DISTINCT. More...
 
bool group_optimized_away {false}
 If we have the GROUP BY statement in the query, but the group_list was emptied by optimizer, this flag is true. More...
 
bool simple_order {false}
 
bool simple_group {false}
 
enum JOIN:: { ... }  ORDERED_INDEX_VOID
 
bool skip_sort_order {false}
 Is set if we have a GROUP BY and we have ORDER BY on a constant or when sorting isn't required. More...
 
bool need_tmp_before_win {false}
 If true we need a temporary table on the result set before any windowing steps, e.g. More...
 
bool has_lateral {false}
 If JOIN has lateral derived tables (is set at start of planning) More...
 
Key_use_array keyuse_array
 Used and updated by JOIN::make_join_plan() and optimize_keyuse() More...
 
mem_root_deque< Item * > * tmp_fields = nullptr
 Array of pointers to lists of expressions. More...
 
int error {0}
 set in optimize(), exec(), prepare_result() More...
 
uint64_t hash_table_generation {0}
 Incremented each time clear_hash_tables() is run, signaling to HashJoinIterators that they cannot keep their hash tables anymore (since outer references may have changed). More...
 
ORDER_with_src order
 ORDER BY and GROUP BY lists, to transform with prepare,optimize and exec. More...
 
ORDER_with_src group_list
 
Prealloced_array< Item_rollup_group_item *, 4 > rollup_group_items
 
Prealloced_array< Item_rollup_sum_switcher *, 4 > rollup_sums
 
List< Window > m_windows
 Any window definitions. More...
 
bool m_windows_sort {false}
 True if a window requires a certain order of rows, which implies that any order of rows coming out of the pre-window join will be disturbed. More...
 
bool m_windowing_steps {false}
 If we have set up tmp tables for windowing,. More...
 
Explain_format_flags explain_flags {}
 Buffer to gather GROUP BY, ORDER BY and DISTINCT QEP details for EXPLAIN. More...
 
Item * where_cond
 JOIN::having_cond is initially equal to query_block->having_cond, but may later be changed by optimizations performed by JOIN. More...
 
Item * having_cond
 Optimized HAVING clause item tree (valid for one single execution). More...
 
Item * having_for_explain
 Saved optimized HAVING for EXPLAIN. More...
 
Table_ref * tables_list
 Pointer set to query_block->get_table_list() at the start of optimization. More...
 
COND_EQUAL * cond_equal {nullptr}
 
plan_idx return_tab {0}
 
Ref_item_array * ref_items
 ref_items is an array of 4+ slices, each containing an array of Item pointers. More...
 
uint current_ref_item_slice
 The slice currently stored in ref_items[0]. More...
 
uint recursive_iteration_count {0}
 Used only if this query block is recursive. More...
 
const char * zero_result_cause {nullptr}
 <> NULL if optimization has determined that execution will produce an empty result before aggregation, contains a textual explanation on why result is empty. More...
 
bool child_subquery_can_materialize {false}
 True if, at this stage of processing, subquery materialization is allowed for children subqueries of this JOIN (those in the SELECT list, in WHERE, etc). More...
 
bool allow_outer_refs {false}
 True if plan search is allowed to use references to expressions outer to this JOIN (for example may set up a 'ref' access looking up an outer expression in the index, etc). More...
 
List< TABLE > sj_tmp_tables {}
 
List< Semijoin_mat_exec > sjm_exec_list {}
 
bool group_sent {false}
 Exec time only: true <=> current group has been sent. More...
 
bool calc_found_rows {false}
 If true, calculate found rows for this query block. More...
 
bool with_json_agg
 This will force tmp table to NOT use index + update for group operation as it'll cause [de]serialization for each json aggregated value and is very ineffective (times worse). More...
 
bool needs_finalize {false}
 Whether this query block needs finalization (see FinalizePlanForQueryBlock()) before it can be actually used. More...
 
bool select_count {false}
 

Private Member Functions

bool send_row_on_empty_set () const
 Return whether the caller should send a row even if the join produced no rows if: More...
 
bool attach_join_condition_to_nest (plan_idx first_inner, plan_idx last_tab, Item *join_cond, bool is_sj_mat_cond)
 Helper for JOIN::attach_join_conditions(). More...
 
bool create_intermediate_table (QEP_TAB *tab, const mem_root_deque< Item * > &tmp_table_fields, ORDER_with_src &tmp_table_group, bool save_sum_fields)
 Create a temporary table to be used for processing DISTINCT/ORDER BY/GROUP BY. More...
 
void optimize_distinct ()
 Optimize distinct when used on a subset of the tables. More...
 
bool optimize_fts_query ()
 Function sets FT hints, initializes FT handlers and checks if FT index can be used as covered. More...
 
bool check_access_path_with_fts () const
 Checks if the chosen plan suffers from a problem related to full-text search and streaming aggregation, which is likely to cause wrong results or make the query misbehave in other ways, and raises an error if so. More...
 
bool prune_table_partitions ()
 Prune partitions for all tables of a join (query block). More...
 
void init_key_dependencies ()
 Initialize key dependencies for join tables. More...
 
void set_prefix_tables ()
 Assign set of available (prefix) tables to all tables in query block. More...
 
void cleanup_item_list (const mem_root_deque< Item * > &items) const
 
void set_semijoin_embedding ()
 Set semi-join embedding join nest pointers. More...
 
bool make_join_plan ()
 Calculate best possible join order and initialize the join structure. More...
 
bool init_planner_arrays ()
 Initialize scratch arrays for the join order optimization. More...
 
bool extract_const_tables ()
 Extract const tables based on row counts. More...
 
bool extract_func_dependent_tables ()
 Extract const tables based on functional dependencies. More...
 
void update_sargable_from_const (SARGABLE_PARAM *sargables)
 Update info on indexes that can be used for search lookups as reading const tables may have added new sargable predicates. More...
 
bool estimate_rowcount ()
 Estimate the number of matched rows for each joined table. More...
 
void optimize_keyuse ()
 Update some values in keyuse for faster choose_table_order() loop. More...
 
void set_semijoin_info ()
 Set the first_sj_inner_tab and last_sj_inner_tab fields for all tables inside the semijoin nests of the query. More...
 
void adjust_access_methods ()
 A utility function - apply heuristics and optimize access methods to tables. More...
 
void update_depend_map ()
 Update the dependency map for the tables. More...
 
void update_depend_map (ORDER *order)
 Update the dependency map for the sort order. More...
 
void make_outerjoin_info ()
 Fill in outer join related info for the execution plan structure. More...
 
bool init_ref_access ()
 Initialize ref access for all tables that use it. More...
 
bool alloc_qep (uint n)
 
void unplug_join_tabs ()
 
bool setup_semijoin_materialized_table (JOIN_TAB *tab, uint tableno, POSITION *inner_pos, POSITION *sjm_pos)
 Setup the materialized table for a semi-join nest. More...
 
bool add_having_as_tmp_table_cond (uint curr_tmp_table)
 Add having condition as a filter condition, which is applied when reading from the temp table. More...
 
bool make_tmp_tables_info ()
 Init tmp tables usage info. More...
 
void set_plan_state (enum_plan_state plan_state_arg)
 Sets the plan's state of the JOIN. More...
 
bool compare_costs_of_subquery_strategies (Subquery_strategy *method)
 Tells what is the cheapest between IN->EXISTS and subquery materialization, in terms of cost, for the subquery's JOIN. More...
 
ORDER * remove_const (ORDER *first_order, Item *cond, bool change_list, bool *simple_order, bool group_by)
 Remove all constants and check if ORDER only contains simple expressions. More...
 
int replace_index_subquery ()
 Check whether this is a subquery that can be evaluated by index look-ups. More...
 
bool optimize_distinct_group_order ()
 Optimize DISTINCT, GROUP BY, ORDER BY clauses. More...
 
void test_skip_sort ()
 Test if an index could be used to replace filesort for ORDER BY/GROUP BY. More...
 
bool alloc_indirection_slices ()
 
void create_access_paths ()
 Convert the executor structures to a set of access paths, storing the result in m_root_access_path. More...
 
void create_access_paths_for_index_subquery ()
 
AccessPath * create_root_access_path_for_join ()
 
AccessPath * attach_access_paths_for_having_and_limit (AccessPath *path) const
 
AccessPath * attach_access_path_for_update_or_delete (AccessPath *path) const
 

Private Attributes

bool optimized {false}
 flag to avoid double optimization in EXPLAIN More...
 
bool executed {false}
 Set by exec(), reset by reset(). More...
 
enum_plan_state plan_state {NO_PLAN}
 Final execution plan state. Currently used only for EXPLAIN. More...
 
AccessPath * m_root_access_path = nullptr
 An access path you can read from to get all records for this query (after you create an iterator from it). More...
 
AccessPath * m_root_access_path_no_in2exists = nullptr
 If this query block contains conditions synthesized during IN-to-EXISTS conversion: A second query plan with all such conditions removed. More...
 

Member Typedef Documentation

◆ Override_executor_func

A hook that secondary storage engines can use to override the executor completely.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
Enumerator
ORDERED_INDEX_VOID 
ORDERED_INDEX_GROUP_BY 
ORDERED_INDEX_ORDER_BY 

◆ enum_plan_state

State of execution plan. Currently used only for EXPLAIN.

Enumerator
NO_PLAN 

No plan is ready yet.

ZERO_RESULT 

Zero result cause is set.

NO_TABLES 

Plan has no tables.

PLAN_READY 

Plan is ready.

◆ RollupState

enum class JOIN::RollupState
strong
Enumerator
NONE 
INITED 
READY 

Constructor & Destructor Documentation

◆ JOIN()

JOIN::JOIN ( const JOIN & rhs)
delete

Member Function Documentation

◆ attach_join_condition_to_nest()

bool JOIN::attach_join_condition_to_nest ( plan_idx  first_inner,
plan_idx  last_tab,
Item * join_cond,
bool  is_sj_mat_cond 
)
private

Helper for JOIN::attach_join_conditions().

Attaches bits of 'join_cond' to each table in the range [first_inner, last_tab], with proper guards. If 'is_sj_mat_cond' is true, we do not see first_inner (and tables on the same level as it) as inner to anything, as they are at the top from the point of view of the materialization of the tmp table. So, if the SJ-mat nest is A LJ B, A will get a part of the condition without any guard; B will get another part with a guard on A->found_match. It's like pushing a WHERE.

◆ attach_join_conditions()

bool JOIN::attach_join_conditions ( plan_idx  last_tab)

Attach outer join conditions to generated table conditions in an optimal way.

Parameters
last_tab  Last table that has been added to the current plan. Pre-condition: If this is the last inner table of an outer join operation, a join condition is attached to the first inner table of that outer join operation.
Returns
false if success, true if error.

Outer join conditions are attached to individual tables, but we can analyze those conditions only when reaching the last inner table of an outer join operation. Notice also that a table can be last within several outer join nests, hence the outer for() loop of this function.

Example: SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.a=t3.a) ON t1.a=t2.a

Table t3 is last both in the join nest (t2 - t3) and in (t1 - (t2 - t3)). Thus, join conditions for both join nests will be evaluated when reaching this table.

For each outer join operation processed, the join condition is split optimally over the inner tables of the outer join. The split-out conditions are later referred to as table conditions (but note that several table conditions stemming from different join operations may be combined into a composite table condition).

Example: Consider the above query once more. The predicate t1.a=t2.a can be evaluated when rows from t1 and t2 are ready, ie at table t2. The predicate t2.a=t3.a can be evaluated at table t3.

Each non-constant split-out table condition is guarded by a match variable that enables it only when a matching row is found for all the embedded outer join operations.

Each split-out table condition is guarded by a variable that turns the condition off just before a null-complemented row for the outer join operation is formed. Thus, the join condition will not be checked for the null-complemented row.
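The guarding described above can be pictured with a small standalone model. This is only a sketch under stated assumptions: `GuardedCondition`, `guard_enabled`, and `demo_guarded_condition` are hypothetical names for illustration and are not the server's classes. It shows a predicate that is skipped (and therefore treated as passing) while its guard is switched off for a null-complemented row.

```cpp
#include <functional>

// Hypothetical model of a guarded table condition (not MySQL's actual types).
struct GuardedCondition {
  const bool *guard_enabled;        // points at the guard variable
  std::function<bool()> predicate;  // the split-out join condition

  // When the guard is off, the condition is not checked and the row passes.
  bool evaluate() const {
    if (!*guard_enabled) return true;
    return predicate();
  }
};

// Evaluates t2.a = t3.a for a regular row and then with the guard turned off,
// as would happen just before a null-complemented row is formed.
// Returns true if the regular row is rejected and the guarded-off row passes.
inline bool demo_guarded_condition() {
  bool guard = true;
  int t2_a = 1;
  int t3_a = 2;
  GuardedCondition cond{&guard, [&] { return t2_a == t3_a; }};

  const bool regular_row_passes = cond.evaluate();  // 1 == 2 is false
  guard = false;  // condition is switched off for the null-complemented row
  const bool null_complemented_passes = cond.evaluate();

  return !regular_row_passes && null_complemented_passes;
}
```

Keeping the guard as a pointer to a shared flag lets one flag switch several split-out conditions on or off at once, which is the property the description relies on.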

◆ clear_hash_tables()

void JOIN::clear_hash_tables ( )
inline

◆ compare_costs_of_subquery_strategies()

bool JOIN::compare_costs_of_subquery_strategies ( Subquery_strategy * method)
private

Tells what is the cheapest between IN->EXISTS and subquery materialization, in terms of cost, for the subquery's JOIN.

Input:

  • join->{best_positions,best_read,best_rowcount} must contain the execution plan of EXISTS (where 'join' is the subquery's JOIN)
  • join2->{best_positions,best_read,best_rowcount} must be correctly set (where 'join2' is the parent join, the grandparent join, etc).

Output:

  • join->{best_positions,best_read,best_rowcount} contain the cheapest execution plan (where 'join' is the subquery's JOIN).

This plan choice has to happen before calling functions which set up execution structures, like JOIN::get_best_combination().

Parameters
[out] method  chosen method (EXISTS or materialization) will be put here.
Returns
false if success

◆ contains_non_aggregated_fts()

bool JOIN::contains_non_aggregated_fts ( ) const

Returns true if any of the items in JOIN::fields contains a call to the full-text search function MATCH, which is not wrapped in an aggregation function.

◆ copy_ref_item_slice() [1/2]

void JOIN::copy_ref_item_slice ( Ref_item_array  dst_arr,
Ref_item_array  src_arr 
)
inline

◆ copy_ref_item_slice() [2/2]

void JOIN::copy_ref_item_slice ( uint  dst_slice,
uint  src_slice 
)
inline

Overwrites one slice of ref_items with the contents of another slice.

In the normal case, dst and src have the same size(). However: the rollup slices may have smaller size than slice_sz.
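A minimal sketch of such a slice copy, assuming the shorter slice is the source: a plain `std::vector<int>` stands in for the array of Item pointers, and `Slice`, `copy_slice`, and `demo_copy_slice` are hypothetical names, not the server's types.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for one slice of ref_items. All names here are hypothetical.
using Slice = std::vector<int>;

// Copies src over the front of dst. Normally the sizes match; a shorter
// source leaves the tail of dst untouched.
inline void copy_slice(Slice &dst, const Slice &src) {
  assert(dst.size() >= src.size());
  std::copy(src.begin(), src.end(), dst.begin());
}

inline Slice demo_copy_slice() {
  Slice base = {10, 20, 30, 40};
  Slice shorter = {1, 2};  // a slice smaller than the full slice size
  copy_slice(base, shorter);
  return base;
}
```

In this model `demo_copy_slice()` leaves the last two elements of the destination as they were, which is the case the size remark above is about.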

◆ cost_model()

const Cost_model_server * JOIN::cost_model ( ) const

Retrieve the cost model object to be used for this join.

Returns
Cost model object for the join

◆ decide_subquery_strategy()

bool JOIN::decide_subquery_strategy ( )

Decides between EXISTS and materialization; performs last steps to set up the chosen strategy.

Returns
'false' if no error
Note
If UNION this is called on each contained JOIN.

◆ finalize_derived_keys()

void JOIN::finalize_derived_keys ( )

For each materialized derived table/view, informs every TABLE of the key it will (not) use, segregates used keys from unused keys in TABLE::key_info, and eliminates unused keys.

◆ finalize_table_conditions()

bool JOIN::finalize_table_conditions ( THD * thd)

Remove redundant predicates and cache constant expressions.

Do a final round on pushed down table conditions and HAVING clause. Optimize them for faster execution by removing predicates being obsolete due to the access path selected for the table. Constant expressions are also cached to avoid evaluating them for each row being compared.

Parameters
thd  thread handler
Returns
false if success, true if error
Note
This function is run after conditions have been pushed down to individual tables, so transformation is applied to JOIN_TAB::condition and not to the WHERE condition.

◆ fts_index_access()

bool JOIN::fts_index_access ( JOIN_TAB * tab)

Check if FTS index only access is possible.

Parameters
tab  pointer to JOIN_TAB structure.
Returns
true if index only access is possible, false otherwise.

◆ generate_derived_keys()

bool JOIN::generate_derived_keys ( )

Add keys to derived tables'/views' result tables in a list.

This function generates keys for all derived tables/views of the query_block to which this join corresponds, with the help of the Table_ref::generate_keys function.

Returns
false all keys were successfully added.
true OOM error

◆ get_current_fields()

mem_root_deque< Item * > * JOIN::get_current_fields ( )

Returns the clone of fields_list which is appropriate for evaluating expressions at the current stage of execution; which stage is denoted by the value of current_ref_item_slice.

◆ get_plan_state()

enum_plan_state JOIN::get_plan_state ( ) const
inline

See enum_plan_state.

◆ get_ref_item_slice()

uint JOIN::get_ref_item_slice ( ) const
inline
Note
Also consider Switch_ref_item_slice.

◆ init_key_dependencies()

void JOIN::init_key_dependencies ( )
inline private

Initialize key dependencies for join tables.

TODO figure out necessity of this method. Current test suite passed without this initialization.

◆ is_executed()

bool JOIN::is_executed ( ) const
inline

◆ is_optimized()

bool JOIN::is_optimized ( ) const
inline

◆ make_outerjoin_info()

void JOIN::make_outerjoin_info ( )
private

Fill in outer join related info for the execution plan structure.

For each outer join operation left after simplification of the original query, the function sets up the following pointers in the linear structure join->join_tab representing the selected execution plan:

  • The first inner table t0 for the operation is set to refer to the last inner table tk through the field t0->last_inner.
  • Every inner table ti for the operation is set to refer to the first inner table through the field ti->first_inner.
  • The first inner table t0 for the operation is set to refer to the first inner table of the embedding outer join operation, if there is any, through the field t0->first_upper.
  • The ON expression for the outer join operation is attached to the corresponding first inner table through the field t0->on_expr_ref.

Here ti are structures of the JOIN_TAB type.

EXAMPLE. For the query:

SELECT * FROM t1
LEFT JOIN
(t2, t3 LEFT JOIN t4 ON t3.a=t4.a)
ON (t1.a=t2.a AND t1.b=t3.b)
WHERE t1.c > 5,

given that the execution plan with the table order t1,t2,t3,t4 is selected, the following references will be set: t4->last_inner=[t4], t4->first_inner=[t4], t4->first_upper=[t2], t2->last_inner=[t4], t2->first_inner=t3->first_inner=[t2]. The ON expression (t1.a=t2.a AND t1.b=t3.b) will be attached to *t2->on_expr_ref, while t3.a=t4.a will be attached to *t4->on_expr_ref.

Note
The function assumes that the simplification procedure has been already applied to the join query (see simplify_joins). This function can be called only after the execution plan has been chosen.
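The references listed for the example plan can be reproduced with a small standalone model. This is a sketch only: `Tab`, `Nest`, `fill_outer_join_info`, and the use of plan positions instead of pointers are assumptions made for illustration, and the ON-expression attachment is left out.

```cpp
#include <vector>

constexpr int kNone = -1;

// Hypothetical, simplified stand-in for JOIN_TAB: references are stored as
// positions in the plan rather than as pointers.
struct Tab {
  int first_inner = kNone;
  int last_inner = kNone;
  int first_upper = kNone;
};

// Inner tables of one outer join operation, given as plan positions.
struct Nest {
  int first;   // first inner table
  int last;    // last inner table
  int parent;  // index of the embedding nest in the nest list, or kNone
};

// Nests must be listed outermost first, so that an embedded nest overwrites
// first_inner for its own tables.
inline std::vector<Tab> fill_outer_join_info(int table_count,
                                             const std::vector<Nest> &nests) {
  std::vector<Tab> tabs(table_count);
  for (const Nest &n : nests) {
    for (int i = n.first; i <= n.last; ++i) tabs[i].first_inner = n.first;
    tabs[n.first].last_inner = n.last;
    if (n.parent != kNone) tabs[n.first].first_upper = nests[n.parent].first;
  }
  return tabs;
}

// Plan order t1,t2,t3,t4 maps to positions 0,1,2,3.
// Outer nest: (t2, t3, t4); embedded nest: (t4).
inline std::vector<Tab> demo_outer_join_info() {
  return fill_outer_join_info(4, {{1, 3, kNone}, {3, 3, 0}});
}
```

With positions 1 and 3 standing for t2 and t4, the result has t2 and t3 pointing at t2 as first inner table, t2 pointing at t4 as last inner table, and t4 pointing at itself for both while naming t2 as its first upper table.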

◆ mark_const_table()

void JOIN::mark_const_table ( JOIN_TAB * tab,
Key_use * key 
)

Move const tables first in the position array.

Increment the number of const tables and set some basic properties for the const table. A const table looked up by a key has type JT_CONST. A const table with a single row has type JT_SYSTEM.

Parameters
tab  Table that is designated as a const table
key  The key definition to use for this table (NULL if table scan)
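The idea of moving a const table to the front and bumping the counter can be sketched as follows. `MiniPlan`, `mark_const`, and the use of table names as strings are hypothetical simplifications; the access-type bookkeeping (JT_CONST versus JT_SYSTEM) is not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of the planner state; not the server's data layout.
struct MiniPlan {
  std::vector<std::string> order;  // tables in position order
  std::size_t const_tables = 0;    // the first const_tables entries are const

  // Moves `name` into the const prefix and increments the counter.
  // Returns false if the table is unknown or already marked const.
  bool mark_const(const std::string &name) {
    auto first_non_const =
        order.begin() + static_cast<std::ptrdiff_t>(const_tables);
    auto it = std::find(first_non_const, order.end(), name);
    if (it == order.end()) return false;
    std::iter_swap(first_non_const, it);
    ++const_tables;
    return true;
  }
};

inline MiniPlan demo_mark_const() {
  MiniPlan plan{{"t1", "t2", "t3"}};
  plan.mark_const("t3");
  return plan;
}
```

After `demo_mark_const()` the table marked const occupies the first position and `const_tables` is 1, so later planning steps in this model only need to consider positions from `const_tables` onward.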

◆ operator=()

JOIN & JOIN::operator= ( const JOIN & rhs)
delete

◆ optimize_fts_query()

bool JOIN::optimize_fts_query ( )
private

Function sets FT hints, initializes FT handlers and checks if FT index can be used as covered.

◆ optimize_keyuse()

void JOIN::optimize_keyuse ( )
private

Update some values in keyuse for faster choose_table_order() loop.

◆ optimize_rollup()

bool JOIN::optimize_rollup ( )

Optimize rollup specification.

Allocate objects needed for rollup processing.

Returns
false if success, true if error.

◆ plan_is_const()

bool JOIN::plan_is_const ( ) const
inline

True if plan is const, ie it will return zero or one rows.

◆ plan_is_single_table()

bool JOIN::plan_is_single_table ( )
inline

True if plan contains one non-const primary table (ie not including tables taking part in semi-join materialization).

◆ query_expression()

Query_expression * JOIN::query_expression ( ) const
inline

Query expression referring to this query block.

◆ refine_best_rowcount()

void JOIN::refine_best_rowcount ( )

Refine the best_rowcount estimation based on what happens after tables have been joined: LIMIT and type of result sink.

◆ remove_const()

ORDER * JOIN::remove_const ( ORDER * first_order,
Item * cond,
bool  change,
bool *  simple_order,
bool  group_by 
)
private

Remove all constants and check if ORDER only contains simple expressions.

simple_order is set to true if sort_order only uses fields from head table and the head table is not a LEFT JOIN table.

Parameters
first_order  List of GROUP BY or ORDER BY sub-clauses.
cond  WHERE condition.
change  If true, remove sub-clauses that need not be evaluated. If this is not set, then only simple_order is calculated.
[out] simple_order  Set to true if we are only using simple expressions.
group_by  True if first_order represents a grouping operation.
Returns
new sort order, after const elimination (when change is true).
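A rough sketch of the two outputs described above, under heavy simplification: `OrderElem`, `RemoveConstResult`, and `remove_const_sketch` are hypothetical, each sort expression is reduced to "constant or reads from one table", and the WHERE condition and grouping flag are ignored entirely.

```cpp
#include <vector>

// Hypothetical, simplified ORDER element.
struct OrderElem {
  bool is_constant;  // expression is constant for this query
  int table_no;      // table the expression reads from (ignored if constant)
};

struct RemoveConstResult {
  std::vector<OrderElem> order;  // resulting sort order
  bool simple_order;             // only head-table fields are used
};

// Drops constant elements when `change` is true, and reports whether the
// remaining elements only use the head table, which must not be outer-joined.
inline RemoveConstResult remove_const_sketch(const std::vector<OrderElem> &in,
                                             int head_table,
                                             bool head_is_outer_joined,
                                             bool change) {
  RemoveConstResult result{{}, !head_is_outer_joined};
  for (const OrderElem &e : in) {
    if (e.is_constant) {
      if (!change) result.order.push_back(e);  // list is left untouched
      continue;
    }
    if (e.table_no != head_table) result.simple_order = false;
    result.order.push_back(e);
  }
  return result;
}
```

Calling it with `change` set to false keeps every element and only computes the flag, which mirrors the behaviour documented for the `change` parameter.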

◆ root_access_path()

AccessPath * JOIN::root_access_path ( ) const
inline

◆ send_row_on_empty_set()

bool JOIN::send_row_on_empty_set ( ) const
inline private

Returns whether the caller should send a row even though the join produced no rows. This is the case if:

  • there is an aggregate function (sum_func_count!=0), and
  • the query is not grouped, and
  • a possible HAVING clause evaluates to TRUE.
Note
If there is a HAVING clause, it must be evaluated before returning the row.
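The three conditions can be written as a single predicate. The sketch below assumes a flattened input struct; `EmptySetInputs` and `should_send_row_on_empty_set` are hypothetical names, and the real member function reads its inputs from the JOIN object rather than from such a struct.

```cpp
// Hypothetical flattened view of the inputs named in the list above.
struct EmptySetInputs {
  unsigned sum_func_count;  // number of aggregate functions
  bool grouped;             // query is grouped
  bool has_having;          // a HAVING clause exists
  bool having_is_true;      // result of evaluating HAVING, if there is one
};

// True when a row should be sent although the join produced no rows.
inline bool should_send_row_on_empty_set(const EmptySetInputs &in) {
  const bool having_ok = !in.has_having || in.having_is_true;
  return in.sum_func_count != 0 && !in.grouped && having_ok;
}
```

For example, an ungrouped query with one aggregate function and no HAVING clause yields true in this model, while the same query with GROUP BY yields false.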

◆ set_executed()

void JOIN::set_executed ( )
inline

◆ set_optimized()

void JOIN::set_optimized ( )
inline

◆ set_ref_item_slice()

void JOIN::set_ref_item_slice ( uint  sliceno)
inline

Overwrite the base slice of ref_items with the slice supplied as argument.

Parameters
sliceno  number to overwrite the base slice with, must be 1-4 or 4 + windowno.

◆ set_root_access_path()

void JOIN::set_root_access_path ( AccessPath * path)
inline

Member Data Documentation

◆ all_table_map

table_map JOIN::all_table_map {0}

Set of tables contained in query.

◆ allow_outer_refs

bool JOIN::allow_outer_refs {false}

True if plan search is allowed to use references to expressions outer to this JOIN (for example may set up a 'ref' access looking up an outer expression in the index, etc).

◆ best_positions

POSITION* JOIN::best_positions {nullptr}

This is the result of join optimization.

Note
This is a scratch array, not used after get_best_combination().

◆ best_read

double JOIN::best_read {0.0}

The cost of best complete join plan found so far during optimization, after optimization phase - cost of picked join order (not taking into account the changes made by test_if_skip_sort_order()).

◆ best_ref

JOIN_TAB** JOIN::best_ref {nullptr}

Array of plan operators representing the current (partial) best plan.

The array is allocated in JOIN::make_join_plan() and is valid only inside this function. Initially (*best_ref[i]) == join_tab[i]. The optimizer reorders best_ref.

◆ best_rowcount

ha_rows JOIN::best_rowcount {0}

The estimated row count of the plan with best read time (see above).

◆ calc_found_rows

bool JOIN::calc_found_rows {false}

If true, calculate found rows for this query block.

◆ child_subquery_can_materialize

bool JOIN::child_subquery_can_materialize {false}

True if, at this stage of processing, subquery materialization is allowed for child subqueries of this JOIN (those in the SELECT list, in WHERE, etc).

If false, and we have to evaluate a subquery at this stage, then we must choose EXISTS.

◆ cond_equal

COND_EQUAL* JOIN::cond_equal {nullptr}

◆ const_table_map

table_map JOIN::const_table_map

Set of tables found to be const.

◆ const_tables

uint JOIN::const_tables {0}

Number of primary tables deemed constant.

◆ current_ref_item_slice

uint JOIN::current_ref_item_slice

The slice currently stored in ref_items[0].

Used to restore the base ref_items slice from the "save" slice after it has been overwritten by another slice (1-3).

◆ deps_of_remaining_lateral_derived_tables

table_map JOIN::deps_of_remaining_lateral_derived_tables {0}

This is the bitmap of all tables which are dependencies of lateral derived tables which are not (yet) part of the partial plan.

(The value is a logical 'or' of zero or more Table_ref.map() values.)

When we are building the join order, there is a partial plan (an ordered sequence of JOIN_TABs), and an unordered set of JOIN_TABs not yet added to the plan. Due to backtracking, the partial plan may both grow and shrink. When we add a new table to the plan, we may wish to set up join buffering, so that rows from the preceding table are buffered. If any of the remaining tables are derived tables that depend on any of the predecessors of the table we are adding (i.e. a lateral dependency), join buffering would be inefficient.

See also
setup_join_buffering() for a detailed explanation of why this is so.

For this reason we need to maintain this table_map of lateral dependencies of tables not yet in the plan. Whenever we add a new table to the plan, we update the map by calling Optimize_table_order::recalculate_lateral_deps_incrementally(). And when we remove a table, we restore the previous map value using a Table_map_restorer object.

As an example, assume that we join four tables, t1, t2, t3 and d1, where d1 is a derived table that depends on t1:

SELECT * FROM t1 JOIN t2 ON t1.a=t2.b JOIN t3 ON t2.c=t3.d JOIN LATERAL (SELECT DISTINCT e AS x FROM t4 WHERE t4.f=t1.c) AS d1 ON t3.e=d1.x;

Now, if our partial plan is t1->t2, the map (of lateral dependencies of the remaining tables) will contain t1. This tells us that we should not use join buffering when joining t1 with t2. But if the partial plan is t1->d1->t2, the map will be empty. We may thus use join buffering when joining d1 with t2.
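
A minimal sketch of how such a map could be computed for the example above, assuming one bit per table. The names table_map_t, TableInfo and DepsOfRemaining are hypothetical and not part of the server, and this version recomputes the map from scratch instead of updating it incrementally.

```cpp
#include <cstdint>
#include <vector>

using table_map_t = std::uint64_t;  // stand-in for table_map: one bit per table

struct TableInfo {
  table_map_t map;           // this table's own bit
  table_map_t lateral_deps;  // tables a lateral derived table depends on, else 0
};

// OR together the lateral dependencies of every table that is not yet
// part of the partial plan.
inline table_map_t DepsOfRemaining(const std::vector<TableInfo> &all,
                                   table_map_t in_plan) {
  table_map_t deps = 0;
  for (const TableInfo &t : all) {
    if ((t.map & in_plan) == 0) deps |= t.lateral_deps;
  }
  return deps;
}
```

With t1, t2, t3 and d1 on bits 0 to 3 and d1 depending on t1, the partial plan t1->t2 yields the bit of t1, while t1->d1->t2 yields an empty map.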

◆ do_send_rows

bool JOIN::do_send_rows {true}

If true, send produced rows using query_result.

◆ error

int JOIN::error {0}

Set in optimize(), exec() and prepare_result().

◆ examined_rows

ha_rows JOIN::examined_rows {0}

◆ executed

bool JOIN::executed {false}
private

Set by exec(), reset by reset().

Note that this needs to be set during the query (not only when it's done executing), or the dynamic range optimizer will not understand which tables have been read.

◆ explain_flags

Explain_format_flags JOIN::explain_flags {}

Buffer to gather GROUP BY, ORDER BY and DISTINCT QEP details for EXPLAIN.

◆ fetch_limit

ha_rows JOIN::fetch_limit {HA_POS_ERROR}

Limits the number of rows returned by a single fetch operation of a server-side cursor.

The value is checked in end_send and end_send_group, in a fashion similar to offset_limit_cnt:

  • fetch_limit = HA_POS_ERROR if there is no cursor.
  • When we open a cursor, we set fetch_limit to 0.
  • On each fetch iteration, we add the number of rows to fetch (num_rows) to fetch_limit.
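
The bookkeeping described above might look as follows. This is a simplified sketch, not the server code: CursorState, kNoLimit and the method names are hypothetical, and kNoLimit is assumed to play the role of HA_POS_ERROR as the largest representable row count.

```cpp
#include <cstdint>
#include <limits>

using row_count_t = std::uint64_t;  // stand-in for ha_rows

// Assumed stand-in for HA_POS_ERROR: the largest row count value.
constexpr row_count_t kNoLimit = std::numeric_limits<row_count_t>::max();

struct CursorState {
  row_count_t fetch_limit = kNoLimit;  // no cursor: effectively unlimited
  row_count_t sent_rows = 0;           // rows sent so far

  // Opening a cursor starts with nothing allowed to be sent.
  void Open() { fetch_limit = 0; }

  // Each fetch raises the cumulative limit by the requested row count.
  void BeginFetch(row_count_t num_rows) { fetch_limit += num_rows; }

  // Returns true if the row was sent, false if the limit was reached.
  bool TrySendRow() {
    if (sent_rows >= fetch_limit) return false;
    ++sent_rows;
    return true;
  }
};
```

Because the limit is cumulative, a second fetch of one row after a first fetch of two rows lets exactly one more row through.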

◆ fields

mem_root_deque<Item *>* JOIN::fields

◆ filesorts_to_cleanup

Prealloced_array<Filesort *, 1> JOIN::filesorts_to_cleanup {PSI_NOT_INSTRUMENTED}

◆ found_const_table_map

table_map JOIN::found_const_table_map

Const tables which are either:

  • not empty
  • empty but inner to a LEFT JOIN, thus "considered" not empty for the rest of execution (a NULL-complemented row will be used).

◆ found_records

ha_rows JOIN::found_records {0}

◆ group_fields

List<Cached_item> JOIN::group_fields {}

◆ group_fields_cache

List<Cached_item> JOIN::group_fields_cache {}

◆ group_list

ORDER_with_src JOIN::group_list

◆ group_optimized_away

bool JOIN::group_optimized_away {false}

True if the query has a GROUP BY clause but group_list was emptied by the optimizer.

This happens when the fields in the GROUP BY come from constant tables.

◆ group_sent

bool JOIN::group_sent {false}

Exec time only: true <=> current group has been sent.

◆ grouped

bool JOIN::grouped

True if the query contains a GROUP BY clause.

◆ has_lateral

bool JOIN::has_lateral {false}

True if the JOIN has lateral derived tables (set at the start of planning).

◆ hash_table_generation

uint64_t JOIN::hash_table_generation {0}

Incremented each time clear_hash_tables() is run, signaling to HashJoinIterators that they cannot keep their hash tables anymore (since outer references may have changed).
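
The generation counter follows a common cache invalidation pattern, sketched below. JoinState, CachedHashTable and EnsureFresh are hypothetical names used only for illustration; the actual iterator code is not shown in this reference.

```cpp
#include <cstdint>

// Hypothetical owner of the generation counter.
struct JoinState {
  std::uint64_t hash_table_generation = 0;
  void ClearHashTables() { ++hash_table_generation; }
};

// Hypothetical consumer that remembers which generation it was built for.
struct CachedHashTable {
  std::uint64_t built_for_generation = 0;
  bool built = false;

  // Rebuilds if never built or if the generation has changed.
  // Returns true if a rebuild happened.
  bool EnsureFresh(const JoinState &join) {
    if (built && built_for_generation == join.hash_table_generation) {
      return false;  // cached table is still valid
    }
    built = true;
    built_for_generation = join.hash_table_generation;
    return true;
  }
};
```

The consumer never needs to be notified directly: comparing its stored generation with the current one is enough to detect that the cached table is stale.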

◆ having_cond

Item* JOIN::having_cond

Optimized HAVING clause item tree (valid for one single execution).

Used in JOIN execution, as last "row filtering" step. With one exception: may be pushed to the JOIN_TABs of temporary tables used in DISTINCT / GROUP BY (see JOIN::make_tmp_tables_info()); in that case having_cond is set to NULL, but is first saved to having_for_explain so that EXPLAIN EXTENDED can still print it. Initialized by Query_block::get_optimizable_conditions().

◆ having_for_explain

Item* JOIN::having_for_explain

Saved optimized HAVING for EXPLAIN.

◆ implicit_grouping

bool JOIN::implicit_grouping

True if aggregated but no GROUP BY.

◆ join_tab

JOIN_TAB* JOIN::join_tab {nullptr}

Optimal query execution plan.

Initialized with a tentative plan in JOIN::make_join_plan() and later replaced with the optimal plan in get_best_combination().

◆ keyuse_array

Key_use_array JOIN::keyuse_array

Used and updated by JOIN::make_join_plan() and optimize_keyuse()

◆ lock

MYSQL_LOCK* JOIN::lock

◆ m_root_access_path

AccessPath* JOIN::m_root_access_path = nullptr
private

An access path you can read from to get all records for this query (after you create an iterator from it).

◆ m_root_access_path_no_in2exists

AccessPath* JOIN::m_root_access_path_no_in2exists = nullptr
private

If this query block contains conditions synthesized during IN-to-EXISTS conversion: A second query plan with all such conditions removed.

See comments in JOIN::optimize().

◆ m_select_limit

ha_rows JOIN::m_select_limit {0}

◆ m_windowing_steps

bool JOIN::m_windowing_steps {false}

True if we have set up tmp tables for windowing.

See also
make_tmp_tables_info

◆ m_windows

List<Window> JOIN::m_windows

Any window definitions.

◆ m_windows_sort

bool JOIN::m_windows_sort {false}

True if a window requires a certain order of rows, which implies that any order of rows coming out of the pre-window join will be disturbed.

◆ map2table

JOIN_TAB** JOIN::map2table {nullptr}

Mapping between table indexes and JOIN_TABs.

◆ need_tmp_before_win

bool JOIN::need_tmp_before_win {false}

If true, we need a temporary table on the result set before any windowing steps, e.g. for DISTINCT or when the query has an ORDER BY. See details in JOIN::optimize.

◆ needs_finalize

bool JOIN::needs_finalize {false}

Whether this query block needs finalization (see FinalizePlanForQueryBlock()) before it can be actually used.

This only happens when using the hypergraph join optimizer.

◆ optimized

bool JOIN::optimized {false}
private

Flag to avoid double optimization in EXPLAIN.

◆ order

ORDER_with_src JOIN::order

ORDER BY and GROUP BY lists, to be transformed by prepare, optimize and exec.

◆ 

enum { ... } JOIN::ORDERED_INDEX_VOID

◆ override_executor_func

Override_executor_func JOIN::override_executor_func = nullptr

◆ plan_state

enum_plan_state JOIN::plan_state {NO_PLAN}
private

Final execution plan state. Currently used only for EXPLAIN.

◆ positions

POSITION* JOIN::positions {nullptr}

◆ primary_tables

uint JOIN::primary_tables {0}

Number of primary input tables in query block.

◆ qep_tab

QEP_TAB* JOIN::qep_tab {nullptr}

Array of QEP_TABs.

◆ query_block

Query_block* const JOIN::query_block

Query block that is optimized and executed using this JOIN.

◆ recursive_iteration_count

uint JOIN::recursive_iteration_count {0}

Used only if this query block is recursive.

Contains count of all executions of this recursive query block, since the last this->reset().

◆ ref_items

Ref_item_array* JOIN::ref_items {nullptr}

ref_items is an array of 4+ slices, each containing an array of Item pointers.

ref_items is used in different phases of query execution.

  • slice 0 is initially the same as Query_block::base_ref_items, ie it is the set of items referencing fields from base tables. During optimization and execution it may be temporarily overwritten by slice 1-3.
  • slice 1 is a representation of the used items when being read from the first temporary table.
  • slice 2 is a representation of the used items when being read from the second temporary table.
  • slice 3 is a copy of the original slice 0. It is created if slice overwriting is necessary, and it is used to restore original values in slice 0 after having been overwritten.
  • slices 4 -> N are used by windowing, one slice for each window's out tmp table. With two windows, for example, slice 4 is window 1's out table and slice 5 is window 2's out table, and so on.

Slice 0 is allocated for the lifetime of a statement, whereas slices 1-3 are associated with a single optimization. The size of slice 0 determines the slice size used when allocating the other slices.
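
The slice layout and the overwrite/restore cycle can be modelled with a small sketch. Everything here is an assumption made for illustration: the enumerator names, RefItems, SaveBase, Set and WindowSlice are hypothetical, integers stand in for Item pointers, and the window index is taken to be zero-based so that the first window maps to slice 4.

```cpp
#include <vector>

// Hypothetical names for the slice numbers described above.
enum : unsigned { kBase = 0, kTmp1 = 1, kTmp2 = 2, kSavedBase = 3, kFirstWindow = 4 };

// Slice holding the out table of the window with the given zero-based index.
inline unsigned WindowSlice(unsigned window_index) {
  return kFirstWindow + window_index;
}

struct RefItems {
  std::vector<std::vector<int>> slices;  // ints stand in for Item pointers
  unsigned current = kBase;              // mirrors current_ref_item_slice

  // Every slice gets the same size as the base slice.
  RefItems(std::vector<int> base, unsigned num_windows)
      : slices(kFirstWindow + num_windows, std::vector<int>(base.size())) {
    slices[kBase] = base;
  }

  // Keep a copy of the original base slice so it can be restored later.
  void SaveBase() { slices[kSavedBase] = slices[kBase]; }

  // Overwrite the base slice with the contents of slice sliceno.
  void Set(unsigned sliceno) {
    slices[kBase] = slices[sliceno];
    current = sliceno;
  }
};
```

In this model, restoring the original items after reading from a temporary table is simply Set(kSavedBase).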

◆ return_tab

plan_idx JOIN::return_tab {0}

◆ rollup_group_items

Prealloced_array<Item_rollup_group_item *, 4> JOIN::rollup_group_items {PSI_NOT_INSTRUMENTED}

◆ rollup_state

RollupState JOIN::rollup_state

◆ rollup_sums

Prealloced_array<Item_rollup_sum_switcher *, 4> JOIN::rollup_sums
Initial value:

◆ row_limit

ha_rows JOIN::row_limit {0}

◆ select_count

bool JOIN::select_count {false}

◆ select_distinct

bool JOIN::select_distinct

At construction time, set if SELECT DISTINCT.

May be reset to false later, when we set up a temporary table operation that deduplicates for us.

◆ semijoin_deduplication_fields

List<Cached_item> JOIN::semijoin_deduplication_fields {}

◆ send_group_parts

uint JOIN::send_group_parts {0}

◆ send_records

ha_rows JOIN::send_records {0}

◆ simple_group

bool JOIN::simple_group {false}

◆ simple_order

bool JOIN::simple_order {false}

◆ sj_tmp_tables

List<TABLE> JOIN::sj_tmp_tables {}

◆ sjm_exec_list

List<Semijoin_mat_exec> JOIN::sjm_exec_list {}

◆ skip_sort_order

bool JOIN::skip_sort_order {false}

Is set if we have a GROUP BY and we have ORDER BY on a constant or when sorting isn't required.

◆ sort_by_table

TABLE* JOIN::sort_by_table {nullptr}

◆ sort_cost

double JOIN::sort_cost {0.0}

Expected cost of filesort.

◆ streaming_aggregation

bool JOIN::streaming_aggregation {false}

Indicates that the data will be aggregated (typically GROUP BY), and that it is already processed in an order that is compatible with the grouping in use (e.g. because we are scanning along an index, or because an earlier step sorted the data in a group-compatible order).

Note that this flag changes value at multiple points during optimization; if it's set when a temporary table is created, this means we aggregate into said temporary table (end_write_group is chosen instead of end_write), but if it's set later, it means that we can aggregate as we go, just before sending the data to the client (end_send_group is chosen instead of end_send).

See also
make_group_fields, alloc_group_fields, JOIN::exec
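
The choice between the four end functions named in the note can be summarized as a two-by-two decision. This sketch only restates that paragraph: ChooseEndFunction is a hypothetical helper, it returns the function names as strings, and it ignores any other end functions or conditions the executor may consider.

```cpp
#include <string_view>

// Pick the end function name from the two flags discussed above.
constexpr std::string_view ChooseEndFunction(bool writing_to_tmp_table,
                                             bool streaming_aggregation) {
  if (writing_to_tmp_table) {
    // Aggregate into the temporary table, or just write rows to it.
    return streaming_aggregation ? "end_write_group" : "end_write";
  }
  // Aggregate just before sending to the client, or just send rows.
  return streaming_aggregation ? "end_send_group" : "end_send";
}
```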

◆ sum_funcs

Item_sum** JOIN::sum_funcs {nullptr}

◆ tables

uint JOIN::tables {0}

Before plan has been created, "tables" denote number of input tables in the query block and "primary_tables" is equal to "tables".

After plan has been created (after JOIN::get_best_combination()), the JOIN_TAB objects are enumerated as follows:

  • "tables" gives the total number of allocated JOIN_TAB objects
  • "primary_tables" gives the number of input tables, including materialized temporary tables from semi-join operation.
  • "const_tables" are those tables among primary_tables that are detected to be constant.
  • "tmp_tables" is 0, 1 or 2 (more if windows) and counts the maximum possible number of intermediate tables in post-processing (ie sorting and duplicate removal). Later, tmp_tables will be adjusted to the correct number of intermediate tables (see JOIN::make_tmp_tables_info).
  • The remaining tables (ie. tables - primary_tables - tmp_tables) are input tables to materialized semi-join operations.

The tables are ordered as follows in the join_tab array:
    1. const primary table
    2. non-const primary tables
    3. intermediate sort/group tables
    4. possible holes in array
    5. semi-joined tables used with materialization strategy

Total number of tables in query block.
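
The relationship between the counters can be written out as simple arithmetic. TableCounts and the two helper functions are hypothetical and exist only to make the bookkeeping above concrete.

```cpp
// Hypothetical snapshot of the counters after the plan has been created.
struct TableCounts {
  unsigned tables;          // total number of allocated JOIN_TAB objects
  unsigned primary_tables;  // input tables, incl. materialized semi-join tmp tables
  unsigned const_tables;    // primary tables detected to be constant
  unsigned tmp_tables;      // intermediate sort/group tables
};

// Input tables to materialized semi-join operations.
inline unsigned SemiJoinMaterializedInputs(const TableCounts &c) {
  return c.tables - c.primary_tables - c.tmp_tables;
}

// Primary tables that are not constant.
inline unsigned NonConstPrimaryTables(const TableCounts &c) {
  return c.primary_tables - c.const_tables;
}
```

For instance, with 7 JOIN_TAB objects, 4 primary tables (1 of them const) and 1 tmp table, 2 tables are inputs to materialized semi-joins and 3 primary tables are non-const.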

◆ tables_list

Table_ref* JOIN::tables_list

Pointer set to query_block->get_table_list() at the start of optimization.

May be changed (to NULL) only if optimize_aggregated_query() optimizes tables away.

◆ temp_tables

Initial value:

◆ thd

THD* const JOIN::thd

Thread handler.

◆ tmp_fields

mem_root_deque<Item *>* JOIN::tmp_fields = nullptr

Array of pointers to lists of expressions.

Each list represents the SELECT list at a certain stage of execution, and also contains necessary extras: expressions added for ORDER BY, GROUP BY, window clauses, underlying items of split items. This array is only used when the query makes use of tmp tables: after writing to tmp table (e.g. for GROUP BY), if this write also does a function's calculation (e.g. of SUM), after the write the function's value is in a column of the tmp table. If a SELECT list expression is the SUM, and we now want to read that materialized SUM and send it forward, a new expression (Item_field type instead of Item_sum), is needed. The new expressions are listed in JOIN::tmp_fields_list[x]; 'x' is a number (REF_SLICE_).

See also
JOIN::make_tmp_tables_info()

◆ tmp_table_param

Temp_table_param JOIN::tmp_table_param

Describes a temporary table.

Each tmp table has its own tmp_table_param. The one here is transiently used as a model by create_intermediate_table(), to build the tmp table's own tmp_table_param.

◆ tmp_tables

uint JOIN::tmp_tables {0}

Number of temporary tables used by query.

◆ where_cond

Item* JOIN::where_cond

JOIN::having_cond is initially equal to query_block->having_cond, but may later be changed by optimizations performed by JOIN.

The JOIN::having_cond condition is tied to the associated variable query_block->having_value, which can be:

  • COND_UNDEF if a having clause was not specified in the query or if it has not been optimized yet
  • COND_TRUE if the having clause is always true, in which case JOIN::having_cond is set to NULL.
  • COND_FALSE if the having clause is impossible, in which case JOIN::having_cond is set to NULL
  • COND_OK otherwise, meaning that the having clause needs to be further evaluated

All of the above also applies to the where_cond/query_block->cond_value pair.

Optimized WHERE clause item tree (valid for one single execution). Used in JOIN execution if no tables. Otherwise, attached in pieces to JOIN_TABs and then not used in JOIN execution. Printed by EXPLAIN EXTENDED. Initialized by Query_block::get_optimizable_conditions().

◆ windowing_cost

double JOIN::windowing_cost {0.0}

Expected cost of windowing.

◆ with_json_agg

bool JOIN::with_json_agg

Forces the tmp table to NOT use index + update for the group operation, as that would cause [de]serialization of each JSON aggregated value and is very inefficient (several times worse).

Server should use filesort, or tmp table + filesort to resolve GROUP BY with JSON aggregate functions.

◆ zero_result_cause

const char* JOIN::zero_result_cause {nullptr}

Non-NULL if optimization has determined that execution will produce an empty result before aggregation; in that case it contains a textual explanation of why the result is empty.

Implicitly grouped queries may still produce an aggregation row.


The documentation for this class was generated from the following files: