WL#7052: Make best_access_path() more readable

Status: Complete   —   Priority: Medium

The responsibility of Optimize_table_order::best_access_path() is to pick the
cheapest table access method. It is a large function (700 lines long) that
basically does two things: fine-grained cost calculation for possible 'ref'
accesses and coarse-grained cost based selection of access method. The function
consists of very long and deeply nested if() blocks in combination with variable
names that have little or no meaning (like "double best", "double tmp" etc). All
this makes the function almost impossible to read.

This WL is for refactoring best_access_path() to improve readability.

User Documentation

Code refactoring. No user docs required.
Functional requirements:
F1: best_access_path() readability shall improve by moving ref access cost
    calculations to a separate member function that will be called from 
    best_access_path(). This will both reduce the length of best_access_path()
    and make the abstraction level of it more consistent.
F2: best_access_path() readability shall improve by giving variables 
    meaningful names.

Non-Functional requirements:
NF1: The functionality of best_access_path() shall remain unchanged. Bug 
     reports will be created for any bug found during the work on this WL.
NF2: Performance degradations are not accepted.
The overall responsibility of Optimize_table_order::best_access_path() is to
pick the cheapest possible table access method. It does this by calculating the
cost of all possible 'ref' access methods and compare this to the (mostly)
precalculated costs for tables scan/range scan/index scan.

While the overall responsibility is easy to understand ("find the cheapest
access method"), it is very hard to fully grasp how
Optimize_table_order::best_access_path() comes to its conclusion. We have seen
a multitude of bugs in this function earlier.

The main pain-points are:
P1) There is a mismatch between the fine-grained cost calculations done for 'ref'
    access and the coarse-grained comparison done between different access 
    methods. The table scan and range scan access method data fetch costs 
    are e.g. already calculated when best_access_path() is called. 
P2) The function is very long (700 lines)
P3) The function consists of huge nested if() blocks, especially the part of 
    it that calculates 'ref' access cost. This makes the function hard to 
    read since you can't simply see whether the code you currently read is 
    inside the 'ref' code block or after it.
P4) A great number of variables either have names that carry no meaning or names
    that can easily be confused with something else.
P5) Variables are reused in different settings, making it hard to understand what
    the variable is used for at a particular place in the code.
P6) There are hidden bugs because variables are used slightly different than the
    implementer thought.
P7) There are four if-blocks with heuristics that skip calculation of scan
    costs if "scan cannot be better". But the logic behind these heuristics
    are not well understood.

This WL will fix best_access_path by:
F1) Moving the 'ref' access cost calculations to a separate member function.
    This will reduce the size of best_access_path() to half it's size. In
    addition, the remaining cost calculations and comparisons in
    best_access_path() will be at the same level of detail.  Fixes: P1, P2
F2) Variables will be renamed and new variables introduced so that it's easier
    to understand the code. Variables will no longer be reused for different 
    purposes. Fixes: P4, P5, P6
F3) New variable names with well-defined values will make it easier to grasp
    the intention behind the four if-blocks with heuristics for skipping
    scan cost calculations. Together with improved code comments, this will
    fix: P7. If they are found to contain bugs, bug reports or separate WLs
    will be created to fix them.

Will not be handled by this WL:
Pain-point P3 (huge if() blocks in the 'ref' access part of the function) 
will not be fixed by this WL. The 'ref' access part will be factored out to
it's own function but a further breakdown into sub-functions will not be done.
However, this part of the code will also be easier to read because F2 
(variable renaming/new variables) will also be done for this part of the code.
New function created by WL:

  Find the best index to do 'ref' access on for a table.

  @param tab                     the table to be joined by the function
  @param remaining_tables        set of tables not included in the
                                 partial plan yet.
  @param idx                     the index in join->position[] where 'tab'
                                 is added to the partial plan.
  @param found_condition [out]   whether or not there exists a condition
                                 that filters away rows for this table
  @param prefix_rowcount         estimate for the number of records returned
                                 by the partial plan
  @param ref_depend_map [out]    tables the best ref access depends on
                                 Unmodified if no 'ref' access is found.
  @param used_key_parts [out]    Number of key parts 'ref' access uses. 
                                 Unmodified if no 'ref' access is found.

  @return pointer to Key_use for the index with best 'ref' access, NULL if
          no 'ref' access method is found.
Optimize_table_order::find_best_ref(JOIN_TAB  *tab,
                                    table_map remaining_tables,
                                    uint      idx,
                                    bool      *found_condition,
                                    double    prefix_rowcount,
                                    table_map *ref_depends_map,
                                    uint      *used_key_parts)

This member function will replace the whole 'ref' analysis part of 
best_access_path(), i.e., the code inside the following if block:

    This isn't unlikely at all, but unlikely() cuts 6% CPU time on a 20-table
    search when s->keyuse==0, and has no cost when s->keyuse!=0.
  if(unlikely(s->keyuse != NULL))
     // THIS

Changes to Optimize_table_order::best_access_path()
* Parameter JOIN_TAB *s is renamed: JOIN_TAB *tab
* The big if() block for ref access analysis is moved to 
  Optimize_table_order::find_best_ref(). See above
* Parameter renaming: double record_count => prefix_rowcount
  Rationale: make it clearer that this is the estimated number of
  rows produced by tables earlier in the join sequence. Also make 
  consistent use of row instead of record.
* A bunch of variables with no meaningful names are renamed, and
  multiple usages of generic variables (e.g., "tmp" was used for many 
  different things) have been removed by introducing new variables with
  meaningful names.

Changes to class Key_use:
* Renaming: double rowcount => double fanout
  Rationale: As evident from the documentation of this variable, this 
  is the estimated fanout by using 'ref' access on the given index
* Renaming: double cost     => double read_cost
  Rationale: This is the cost of fetching rows from storage engines and 
  does not include server side processing. I.e., ROW_EVALUATION_COST is
  not reflected in this variable.

Changes to struct st_position:
* Renaming: double records_read => double fanout
  Rationale: same as for Key_use::rowcount
* Renaming: double read_time    => double read_cost
  Rationale: Consistency of variable names (Key_use::read_cost)

Found bugs:
* BUG#17303649