WL#6526: FORCE INDEX to avoid index dives when possible
Affects: Server-8.0
—
Status: Complete
Currently when user specifies FORCE INDEX, optimizer still always calculates cost in order to pick access method and join order using index dives. Under some circumstances it's possible to avoid doing this and this could speed up execution by 40%. The goal of this WL is to detect if access methods for tables and join order could be derived from the query and produce query plan without doing index dives in such cases.
F1) For a single table query, index dives are skipped during execution if: a) FORCE INDEX applies to a single index. b) No subquery is present. c) Fulltext Index is not involved. d) No GROUP-BY or DISTINCT clause. e) No ORDER-BY clause. F2) Not applicable to multi-table query. F3) i) EXPLAIN format=traditional FOR CONNECTION will see the following changes: a) No. of "rows" will change to NULL for the table where index dive is skipped. b) The value for "filtered" will be NULL. ii) EXPLAIN format=JSON FOR CONNECTION will see the following changes: a) "rows_examined_per_scan" and "rows_produced_per_join" will not be mentioned when index dive is skipped. b) '"skip_index_dive_due_to_force": true' for the table where index dive is skipped. c) Cost calculations will not be accurate since index dive was skipped. iii) No change in output of EXPLAIN format=traditional and EXPLAIN format=json NF1) Queries that qualify the restrictions should see a reduction in execution time since an index dive will be skipped. NF2) No change for queries that don't qualify.
As explained in http://jorgenloland.blogspot.in/2012/04/on-queries-with-many-values-in-in.html "For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan (QEP) for the query. Unfortunately, it involves an index dive for each range. " When a query contains FORCE INDEX hint, it implicitly indicates that the user knows the best index for this query. If the decision over which index will be used is a foregone conclusion, there is no point in doing index dives for cost calcuation. So optimizer should detect such scenarios and avoid performing the dive.
Index dives are mainly called from "handler::multi_range_read_info_const". So the check over whether an index dive is needed can be done here. Changes: 1) TABLE related checks are performed in : bool TABLE::check_skip_dive(THD *thd) { skipped_index_dive= (force_index && pos_in_table_list->index_hints->elements == 1 && !thd->lex->describe && thd->lex->current_select()->qualifies_skip_dive()); return skipped_index_dive; } 2) Query related checks are performed in : bool SELECT_LEX::qualifies_skip_dive() { return (parent_lex->is_single_level_stmt() && single_table_or_straight_join() && !(has_ft_funcs() || is_grouped() || is_distinct() || is_ordered())); } 3) The flag TABLE->skipped_index_dive is used to keep track of whether the index dive was skipped. This will help in providing optimizer trace and explain output.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.