WL#7338: Interface for improved records per key estimates

Affects: Server-5.7   —   Status: Complete   —   Priority: Medium

The optimizer uses the information in rec_per_key to estimate the number of
records to read when doing ref access. rec_per_key is currently an integer value
and for small rec_per_key values there are some undesired effects due to
rounding to nearest integer value. To fix the issue with using integer values,
this worklog will introduce a new record per key estimate that uses float as
data type. An interface that the storage engine will use for providing the
estimates and the optimizer for using the new estimates is defined. This
interface will initially be an alternative to use the current rec_per_key. If
the storage engine provides records per key estimates as float values using the
new interface, these will be used by the optimizer. If not, then the current
rec_per_key values will be used. 

This worklog implements the API and the data structures for storing records per
key estimates as float. 

User Documentation
==================

Internal code changes only. No user implications. No user documentation required.
This worklog will not introduce any functional or non-functional changes. It
just provides an interface that storage engines can use for providing improved
records per key estimates and for the optimizer to use the improved estimates.
This worklog implements support for storage engines to provide records per key
estimates as a float value and an interface that the optimizer will use to
access the more exact records per key estimate.

Data structure for storing records per key estimate:
====================================================

The current records per key estimates are stored in the array named rec_per_key
that is part of the KEY object. For storing the corresponding estimate as a
float value, we add a new array to the KEY struct:

/**
  If this value is found in any of KEY::rec_per_key_float[] elements, then
  this means that the storage engine has not provided a value and it is
  unknown.
*/
#define REC_PER_KEY_UNKNOWN -1.0

  /**
    Array of AVG(#records with the  same field value) for 1st ... Nth key part.
    For temporary heap tables this member is NULL.
    This is the same information as stored in the above rec_per_key array 
    but using float values instead of integer values. If the storage
    engine has supplied values in this array, these will be used. Otherwise
    the value in rec_per_key will be used.
  */
  float *rec_per_key_float;

It will be the responsibility for the storage engines to fill in the new field
when handler::info(HA_STATUS_CONST) is called. If the storage engine does not
provide the records per key estimate in this new array, the estimate from the
existing rec_per_key array will be used by the optimizer.

Interface for storage engines to use when setting the records per key estimate
==============================================================================

Storage engines should use the following member function on the KEY struct
when setting the records per key estimate for a key part:

  /**
    Set the records per key estimate for a key part.

    @param key_part_no     the number of key parts that the estimate includes,
                           must be in [0, KEY::actual_key_parts)
    @param rec_per_key_est new records per key estimate
  */

  void set_records_per_key(uint key_part_no, float rec_per_key_est);

Interface for the optimizer to access records per key estimates:
================================================================

The KEY struct will be extended with the following two member functions that the
optimizer will use for getting information about records per key estimates:

  /** 
    Check if records per key estimate is available for given key part.

    @param key_part_no key part number, must be in [0, KEY::actual_key_parts)

    @return true if records per key estimate is available, false otherwise
  */
  bool has_records_per_key(uint key_part_no) const;

  /**
    Retrieve an estimate for the average number of records per distinct value,
    when looking only at the first key_part_no+1 columns.

    If no record per key estimate is available for this key part
    REC_PER_KEY_UNKNOWN is returned.

    @param key_part_no key part number, must be in [0, KEY::actual_key_parts)

    @return average number of records per distinct value
      @retval REC_PER_KEY_UNKNOWN no records per key estimate available
      @retval != REC_PER_KEY_UNKNOWN record per key estimate
  */
  float records_per_key(uint key_part_no) const;

Alternatives:
=============

The following alternatives were considered when deciding how to best implement
this change.

1. Change the data type for the current rec_per_key values.
===========================================================

Solution: just change the data type used for the current rec_per_key array from
storing ulong values to float values.

Pros: Very simple change. No need to change the current code in the server nor
in the storage engines.

Cons: The storage engines today produce integer estimates. Just inserting these
into a float value would most likely work but it would most likely not provide
more accurate estimates. 

So to ensure that code in storage engines is adjusted to provide
estimates using float, it might be better to let them continue to give the
current estimate until the code in the storage engine is updated and validated
to produce correct estimates.

2. Storage engines could return cardinality estimate instead of rec_per_key
estimates.
===========================================================================

Solution: Instead of having the storage engine return the records per key
estimate, it could return the number of distinct values for each key part. Using
this number the optimizer can compute the records per key part estimate when it
is needed.

The reasons for not suggesting this as the solution for getting more exact
records per key estimates are:

a) In order to use this as a basis for calculating the records per key estimate,
the optimizer code will need to access both the handler (to get the
ha_statistics.records estimate) and the KEY (or key part) object's "number of
distinct records" information. This would make it a bit more costly to compute
and use records per key estimates.

b) There is a difference in how often information about the number of records in
a table is updated and records per key is updated. The records per key estimates
is updated every time the table object is created while the number of records in
a table is updated on the start of every query. In order to use the number of
distinct records for computing records per key estimate, this number had to be
updated every time we start a new query since it will be used together with the
table's record estimate for computing the records per key estimate. Increasing
the frequency for updating this number would make it more costly to use.

c) It would be hard to support the some of the storage engine specific
configuration settings that influence how records per key estimates should be
calculated. For instance, InnoDB has the innodb_stats_method configuration
variable that provides three different ways for how NULL values in an index
should be taken into account when calculating the records per key estimate.
The implementation of the proposed interface: see patch sent to the commit
list on 14:17 CST January 29 2014.