WL#6016: Consolidate plan AM decisions in a single function

Status: Complete   —   Priority: Medium

Currently, the cost estimate on which access method (AM) to use for a
particular table is made in the best_access_path function, but the decision
which AM actually will be used are made in several functions:
make_join_statistics, get_best_combination, create_ref_for_key,
make_join_readinfo, make_join_select, test_if_skip_sort_order.
The goal of this WL is to move all decisions on AM in a single
function and after decision is made it shouldn't change.
Exceptions are:
  make_join_statistics - as it pre-reads const/system tables thus reducing
  set of tables for greedy search.
  test_if_skip_sort_order - which is a subject of WL#5558.

The second goal is to provide clear separation of AM.
For example: currently range index scan is marked as full table scan and
distinguished from the latter only by presence of quick select. Another example
is index scan (the JT_NEXT AM type). Such approach adds confusion and is harder
to follow.

Completing these goals will make optimizer/executor code more clear and ease
development of new features.

User Documentation

No user documentation required.
Since there is no any new user-visible functionality, there is no functional
requirements also.

Non-functional requirements (list of expected side-effects):
R 1) ICP should be used more often
R 2) Index-only reads ("Using index" in extra) should be used more often
R 3) optimizer trace should correctly show picked 'ref' access when it's used
(shown by the explain)
R 4) No other user-visible changes, i.e. queries results and plans should be the 

Main goals of this WL is to split JOIN_TAB in two
parts. This is achieved by implementing following tasks:
1) Settle join plan at a single point, avoid changing access methods (AM)
somewhere in the middle of optimizer (only partially achieved).
2) Clear separation between AM
3) Initialize execution time part of JOIN_TAB as late as possible.

Settle join plan at a single point
make_join_statistics only picks join order. Despite AMs are set in several
cases, code that follows overrides these decisions. This is very
inconvenient as this doesn't allow to see when the plan is ready and
execution part of JOIN_TAB could be initialized. Also this messes the
process of using certain optimizations, e.g. in some cases ICP wasn't used
only because it was tested on a premature plan and got rejected later,
without trying to use them on the final plan.
Due to this, JOIN::get_best_combination should do as much as possible to pick
final execution plan that doesn't need to be adjusted later.
Despite this effort, there are some exceptions left due to
*) test_if_skip_sort_order still can change AM after the plan is settled.
*) make_join_select could switch from JT_ALL to JT_RANGE, or to 
   'range checked for each record'.
*) optimize_fts_query could change from JT_FT to JT_ALL due to insufficient
   number of rows to meet specified LIMIT.

Clear separation between AM
Before this WL JT_ALL type was representing 4 different AMs: table scan,
index scan, range scan, index merge. AMs were distinguished by different
attributes, e.g presence of quick select of a particular type.
This approach seriously messed the code and leaving vast opportunities for
all kinds of mistakes.
JOIN::get_best_combination should set correct AM type for each AM when
finalizing join plan. Types have to be used instead of attributes to determine
the AM being used.

Initialize execution time part of JOIN_TAB as late as possible
The main part of this is done by the make_join_readinfo function. This
function should be called right before JOIN::make_tmp_table_info. This will
allow us to allocate execution part of JOIN_TAB only when the plan
is completely settled and doesn't need to be adjusted further. It leaves
smaller room for error when preparing execution plan, making it more
stable and bug-proof.
Changes are grouped by the goal achieved.

Settle join plan at a single point
*) Index only read could be disabled by UPDATE/DELETE, so in order to let
JOIN::get_best_combination make the correct choice between JT_ALL and
JT_INDEX_SCAN, select_result should check whether a table is used in a
non-select operation. This is done by means of new
select_result::is_non_select_table() function, which is overloaded in
select_result_interceptor class.
UPDATE/DELETE code collects bitmaps of tables that can't be used for index
scan in select_result_interceptor::non_select_tables.

*) When both REF and RANGE AM could be used on the same index, but RANGE uses
longer part of the key, make_join_select could switch to use RANGE over REF.
This code is moved JOIN::get_best_combination.

*) make_join_select was checking for an impossible range found by range
optimizer, but in some cases this info is already available after greedy
search. So in order to bail out in such cases earlier this code was moved to

*) To indicate reversed ref/index scan, test_if_skip_sort_order should set a new
JOIN_TAB::reverse_scan flag when needed, instead of initializing
JOIN_TAB::read_first_record. Also it doesn't cancel already pushed conditions
as they aren't pushed yet at this point (see changes to make_join_readinfo).

Clear separation between AM
*) Since after make_join_statistics we would have almost final plan (with few
exceptions) AM type could be trusted. Thus in many cases attributes check
(e.g. type == JT_ALL && select && select->quick) was replaced by type check
(e.g. type ==JT_RANGE)

*) SQL_SELECT::test_quick_select should maintain correct AM type: when quick is
reset to NULL, AM type is set to JT_ALL, if a new quick is created the type
is set to the type appropriate to the quick created.

*) pick_table_access_method is extended to initialize reading info explicitly
to all types of AMs.

Initialize execution time part of JOIN_TAB as late as possible
This is the biggest change done by this WL. In order to initialize execution
part of plan when the plan is ready and finalized, make_join_readinfo is moved
to the very end of JOIN::optimize, right before call to

*) In order to let make_join_readinfo be moved to the end of JOIN::optimize,
join buffering setup was split in two parts:
First part checks applicability of join buffer, picks buffering strategy
and remembers it in JOIN_TAB::use_join_cache, along with MRR flags (if needed)
in JOIN_TAB::join_cache_flags. This part is still called at the same place as
before - after optimization for index subquery.
Second part is actually allocates selected kind of JOIN_CACHE and sets
JOIN_TAB::next_select to the appropriate value. This part is called from
make_join_readinfo at the end of JOIN:optimize.

*) calls to TABLE::prepare_for_position() and pick_table_access_method() were
moved to make_join_readinfo in order to gather initialization of execution
part of plan in a single place.

*) make_join_readinfo should be the only place that initializes table reading
functions. The rest of the code uses AM type and JOIN_TAB::reverse_scan to
pick the required AM.

*) Setup for AM type ALL would be split in two parts. The first one is
somewhat combined setup for JT_ALL and JT_INDEX_SCAN (first part), and
the second one is the combined setup for JT_RANGE and JT_INDEX_MERGE.
This made code clearer and allowed to correctly report AM type in optimizer
trace ("ref" wasn't reported at all, "index_merge" was reported as "range").