WL#7052: Make best_access_path() more readable
Status: Complete
The responsibility of Optimize_table_order::best_access_path() is to pick the cheapest table access method. It is a large function (700 lines long) that basically does two things: fine-grained cost calculation for possible 'ref' accesses and coarse-grained cost based selection of access method. The function consists of very long and deeply nested if() blocks in combination with variable names that have little or no meaning (like "double best", "double tmp" etc). All this makes the function almost impossible to read. This WL is for refactoring best_access_path() to improve readability. User Documentation ================== Code refactoring. No user docs required.
Functional requirements: ------------------------ F1: best_access_path() readability shall improve by moving ref access cost calculations to a separate member function that will be called from best_access_path(). This will both reduce the length of best_access_path() and make the abstraction level of it more consistent. F2: best_access_path() readability shall improve by giving variables meaningful names. Non-Functional requirements: ---------------------------- NF1: The functionality of best_access_path() shall remain unchanged. Bug reports will be created for any bug found during the work on this WL. NF2: Performance degradations are not accepted.
The overall responsibility of Optimize_table_order::best_access_path() is to pick the cheapest possible table access method. It does this by calculating the cost of all possible 'ref' access methods and compare this to the (mostly) precalculated costs for tables scan/range scan/index scan. While the overall responsibility is easy to understand ("find the cheapest access method"), it is very hard to fully grasp how Optimize_table_order::best_access_path() comes to its conclusion. We have seen a multitude of bugs in this function earlier. The main pain-points are: ------------------------- P1) There is a mismatch between the fine-grained cost calculations done for 'ref' access and the coarse-grained comparison done between different access methods. The table scan and range scan access method data fetch costs are e.g. already calculated when best_access_path() is called. P2) The function is very long (700 lines) P3) The function consists of huge nested if() blocks, especially the part of it that calculates 'ref' access cost. This makes the function hard to read since you can't simply see whether the code you currently read is inside the 'ref' code block or after it. P4) A great number of variables either have names that carry no meaning or names that can easily be confused with something else. P5) Variables are reused in different settings, making it hard to understand what the variable is used for at a particular place in the code. P6) There are hidden bugs because variables are used slightly different than the implementer thought. P7) There are four if-blocks with heuristics that skip calculation of scan costs if "scan cannot be better". But the logic behind these heuristics are not well understood. This WL will fix best_access_path by: ------------------------------------- F1) Moving the 'ref' access cost calculations to a separate member function. This will reduce the size of best_access_path() to half it's size. In addition, the remaining cost calculations and comparisons in best_access_path() will be at the same level of detail. Fixes: P1, P2 F2) Variables will be renamed and new variables introduced so that it's easier to understand the code. Variables will no longer be reused for different purposes. Fixes: P4, P5, P6 F3) New variable names with well-defined values will make it easier to grasp the intention behind the four if-blocks with heuristics for skipping scan cost calculations. Together with improved code comments, this will fix: P7. If they are found to contain bugs, bug reports or separate WLs will be created to fix them. Will not be handled by this WL: ------------------------------- Pain-point P3 (huge if() blocks in the 'ref' access part of the function) will not be fixed by this WL. The 'ref' access part will be factored out to it's own function but a further breakdown into sub-functions will not be done. However, this part of the code will also be easier to read because F2 (variable renaming/new variables) will also be done for this part of the code.
New function created by WL: --------------------------- /** Find the best index to do 'ref' access on for a table. @param tab the table to be joined by the function @param remaining_tables set of tables not included in the partial plan yet. @param idx the index in join->position[] where 'tab' is added to the partial plan. @param found_condition [out] whether or not there exists a condition that filters away rows for this table @param prefix_rowcount estimate for the number of records returned by the partial plan @param ref_depend_map [out] tables the best ref access depends on Unmodified if no 'ref' access is found. @param used_key_parts [out] Number of key parts 'ref' access uses. Unmodified if no 'ref' access is found. @return pointer to Key_use for the index with best 'ref' access, NULL if no 'ref' access method is found. */ Key_use* Optimize_table_order::find_best_ref(JOIN_TAB *tab, table_map remaining_tables, uint idx, bool *found_condition, double prefix_rowcount, table_map *ref_depends_map, uint *used_key_parts) This member function will replace the whole 'ref' analysis part of best_access_path(), i.e., the code inside the following if block: /* This isn't unlikely at all, but unlikely() cuts 6% CPU time on a 20-table search when s->keyuse==0, and has no cost when s->keyuse!=0. */ if(unlikely(s->keyuse != NULL)) { // THIS } Changes to Optimize_table_order::best_access_path() --------------------------------------------------- * Parameter JOIN_TAB *s is renamed: JOIN_TAB *tab * The big if() block for ref access analysis is moved to Optimize_table_order::find_best_ref(). See above * Parameter renaming: double record_count => prefix_rowcount Rationale: make it clearer that this is the estimated number of rows produced by tables earlier in the join sequence. Also make consistent use of row instead of record. * A bunch of variables with no meaningful names are renamed, and multiple usages of generic variables (e.g., "tmp" was used for many different things) have been removed by introducing new variables with meaningful names. Changes to class Key_use: ------------------------- * Renaming: double rowcount => double fanout Rationale: As evident from the documentation of this variable, this is the estimated fanout by using 'ref' access on the given index * Renaming: double cost => double read_cost Rationale: This is the cost of fetching rows from storage engines and does not include server side processing. I.e., ROW_EVALUATION_COST is not reflected in this variable. Changes to struct st_position: ------------------------------ * Renaming: double records_read => double fanout Rationale: same as for Key_use::rowcount * Renaming: double read_time => double read_cost Rationale: Consistency of variable names (Key_use::read_cost) Found bugs: ----------- * BUG#17303649
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.