WL#2985: Perform Partition Pruning of Range conditions

Affects: Server-5.1 — Status: Complete

This is part of the solution to the problem of accessing only
the necessary partitions in a query over partitioned tables.

The problem is: given any WHERE clause over a set of tables,
come up with a set of bitmaps (one bitmap per table) where
each bitmap specifies all partitions to be used for that table.
(Naturally, bitmaps are needed only for partitioned tables.)

This is another way of phrasing the problem defined in
WL #2537 and WL #2538.


Background
----------
From an email from Mikael Ronstrom to Trudy Pelzer, 2005-11-17:
> After discussing with Mark Matthews I went ahead and made a slight 
> reorganisation of the WL tasks for handling optimisations of 
> partitioning that enables more parallelism in the development.
>
> The split is to implement WL #2537 and WL #2538 by WL#2985, 2986 
> and 2987 instead.
>
> WL 2985 is the Server part of the optimisation, WL 2986 is the 
> partition handler part of the tasks and WL 2987 is the NDB handler 
> part of the optimisation.
>
Cancelled entries are WL#2537 and 2538; see their descriptions
for further details.
-- Trudy Pelzer, 2005-11-23

This document contains:

  - an overview of methods of table partitioning supported by MySQL server
  - a formal specification of the problem of query partition pruning
  - a description of a procedure to perform partition pruning


Partition Tables
----------------

We say that table T is partitioned if there is a function PF that maps rows of 
table T into [1,...,n].
This mapping effectively splits the set of rows in table T into n disjoint 
sets. Each of these sets has the same structure as table T, and we can 
consider it as a separate table called a partition.

Thus for a partitioned table T we have a set of partition tables P1,...,Pn such 
that for any given state of the table R(T) the following is true:
R(T) = R(P1) UNION ... UNION R(Pn),
where R(Pi) INTERSECT R(Pj) is empty for any i != j.
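
The definition above can be illustrated with a minimal sketch. The table
contents and the partition function PF(r) = (r mod 3) + 1 are hypothetical and
serve only to show that the partitions are disjoint and together cover the
table.

```python
from collections import defaultdict

def split_into_partitions(rows, pf, n):
    """Split rows into partitions 1..n using the partition function pf."""
    parts = defaultdict(list)
    for row in rows:
        pid = pf(row)
        if not 1 <= pid <= n:
            raise ValueError(f"partition function returned {pid}, expected 1..{n}")
        parts[pid].append(row)
    # Return every partition, including empty ones.
    return {i: parts[i] for i in range(1, n + 1)}

# Hypothetical table T with a single integer column.
rows = [0, 1, 2, 3, 4, 5, 6]
partitions = split_into_partitions(rows, lambda r: r % 3 + 1, 3)

# Each row lands in exactly one partition, so the union of the
# partitions is the whole table and no row appears twice.
all_rows = sorted(r for part in partitions.values() for r in part)
```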

Usually the following types of partition map functions are considered:

  - range partitioning
  - hash partitioning
  - combined partitioning

Range partitioning is defined with the help of partition index I. A sequence of 
disjoint index ranges R1,...,Rn that covers all index values is specified. A 
row r belongs to the partition Pi iff I(r) in Ri.
(In MySQL a partition index over columns C1,...,Ck of a table T is defined by 
an expression E over columns C1,...Ck that returns a scalar value).

The correspondence between ranges and partitions need not be one-to-one. 
Ranges can be divided into groups of ranges RG1,...,RGm, where each group
RGi is a set of ranges {Ri1,...,Rik}. In this case we have:
(r IN Pi) iff RGi includes a range Ril that contains I(r).

The so-called list partitioning supported in MySQL is an example of 
such a mapping. List partitioning specifies groups of values that select 
different partitions. 

For the following considerations it does not matter whether partitions are 
defined by single ranges or by groups of ranges.

Hash partitioning is defined with the help of a hash function
H with distinct values H1,...,Hn. A row r belongs to the partition Pi iff H(r)=Hi.

With combined partitioning we have range partitioning on the upper level and 
hash sub-partitioning for each range partition.

The fact that table T is partitioned into partitions P1,...,Pn will be denoted 
as T[P1,...,Pn].

Let Pi1,...,Pik be a subset of partitions P1,...,Pn.
A projection of a set of rows RS from table T into partitions Pi1,...Pik will 
be denoted as RS[Pi1,...,Pik].

Thus the set of all rows from partitions Pi1,...,Pik can be denoted as
T[Pi1,...,Pik].


The Problem of Partition Pruning
--------------------------------

For each query Q over a partitioned table T[P1,...Pn] and some other, possibly 
partitioned tables T1,...,Tm we can consider an operation of partition pruning 
for partitions of the table T. 

If Q is a query over a table and R is a set of rows from this table then Q(R) 
will denote the result set for the query evaluated for the table T populated 
only with rows from the subset R.

In its most general form the operation of partition pruning for a query Q can 
be defined as a function PP that returns a subset of partitions Pi1,...Pik such 
that
   Q(T,T1,...,Tm) = Q(T[Pi1,...,Pik],T1,...Tm).

If this subset is minimal then we say that the function performs full partition 
pruning.

It is obvious that if two different subsets of partitions P={Pi1,...,Pik} and 
P'={Pj1,...,Pjl} can be returned as results of partition pruning then the 
intersection of these subsets, P INTERSECT P', can be returned as a result of 
partition pruning as well.
In other words:
(Q(T,T1,...,Tm) = Q(T[Pi1,...,Pik],T1,...Tm) AND
 Q(T,T1,...,Tm) = Q(T[Pj1,...,Pjl],T1,...Tm)) =>
 Q(T,T1,...,Tm) = Q(T[P INTERSECT P'],T1,...Tm).
This fact lets us talk about the full partition pruning: the minimal subset
is unique.
 

A procedure of partition pruning
--------------------------------

An efficient procedure of full partition pruning can be constructed for a wide 
class of queries.

Let C1,...,Ck be the columns that form the partition index.
Define a regular index Idx(C1,...,Ck) on table T and perform the procedure that 
extracts a range condition RC for the query Q.

This condition can be represented by a sequence of disjoint intervals (ranges) 
over index Idx. The intervals can be open, closed or half-open.

Here is an example of such a sequence for a single component index over an 
integer column C:
(2, 4), [10, 17], (21, 32], [100, ).

The corresponding RC condition is:
(2 < C AND C < 4 ) OR (10 <= C AND C <=17) OR
(21 < C AND C <= 32) OR (100 <= C).
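
As an illustration only, the sequence of intervals above can be modelled as
follows. The `Interval` class and the `rc` helper are hypothetical names
introduced for this sketch; they are not part of the server code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Interval:
    lo: Optional[int]        # None means unbounded below
    hi: Optional[int]        # None means unbounded above
    lo_closed: bool = False
    hi_closed: bool = False

    def contains(self, x):
        if self.lo is not None:
            if x < self.lo or (x == self.lo and not self.lo_closed):
                return False
        if self.hi is not None:
            if x > self.hi or (x == self.hi and not self.hi_closed):
                return False
        return True

# (2, 4), [10, 17], (21, 32], [100, )
S_RC = [
    Interval(2, 4),
    Interval(10, 17, True, True),
    Interval(21, 32, False, True),
    Interval(100, None, True, False),
]

def rc(c):
    """Evaluate the range condition RC for a value of column C."""
    return any(iv.contains(c) for iv in S_RC)
```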

For any range condition RC extracted by this procedure we have:
  (r IN Q(R)) => RC(proj[Idx](r))=true.

Here proj[Idx](r) denotes the projection of row r on the columns of index Idx. 

The sequence of intervals corresponding to a range condition RC will be denoted 
by S(RC). 

It should be noted that an extraction of the RC sequence of intervals for an 
index is not a trivial operation in a general case even for single component 
indexes.

Let's consider the following query:
  SELECT * FROM t1,t2 WHERE
     (t1.a < 5 OR t1.a > 10) AND t1.b=t2.b AND
     (t1.a <= 20 AND t1.c=t2.c AND t1.a > 0)
with column 'a' forming the partition index for partitioned table 't1'.

The conjuncts t1.b=t2.b and t1.c=t2.c put no constant bounds on t1.a and are
dropped. Expanding the rest into disjunctive form gives the formula:
(t1.a < 5  AND t1.a <= 20 AND t1.a > 0 OR
 t1.a > 10 AND t1.a <= 20 AND t1.a > 0)

This formula will be simplified to:
(t1.a > 10 AND t1.a <= 20) OR (t1.a > 0 AND t1.a < 5)

This formula specifies the following intervals for the partition index:
(0, 5) (10, 20].  
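
A minimal sketch of how such an interval sequence can be computed by
intersecting interval lists. It takes the range formula
(t1.a < 5 OR t1.a > 10) AND t1.a <= 20 AND t1.a > 0 as input, which yields the
intervals listed above. The tuple layout (lo, lo_closed, hi, hi_closed) and
the function names are assumptions of this sketch, and the general OR case
(merging overlapping intervals) is left out.

```python
import math

INF = math.inf

def and_intervals(a, b):
    """Intersect two intervals (lo, lo_closed, hi, hi_closed); None if empty."""
    # The tighter lower bound is the larger one; on a tie, open beats closed.
    lo, lo_open = max((a[0], not a[1]), (b[0], not b[1]))
    # The tighter upper bound is the smaller one; on a tie, open beats closed.
    hi, hi_closed = min((a[2], a[3]), (b[2], b[3]))
    if lo > hi or (lo == hi and (lo_open or not hi_closed)):
        return None
    return (lo, not lo_open, hi, hi_closed)

def and_lists(xs, ys):
    """AND of two lists of disjoint intervals: all non-empty pairwise intersections."""
    out = [and_intervals(x, y) for x in xs for y in ys]
    return sorted(iv for iv in out if iv is not None)

a_lt_5_or_gt_10 = [(-INF, False, 5, False), (10, False, INF, False)]
a_le_20 = [(-INF, False, 20, True)]
a_gt_0 = [(0, False, INF, False)]

s_rc = and_lists(and_lists(a_lt_5_or_gt_10, a_le_20), a_gt_0)
```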

If the query Q is such that:
  - it contains a WHERE condition C
  - it does not contain any grouping or aggregate function
  - the pushdown condition for table T extracted from C is an AND-OR formula
    over comparison predicates
then the extracted range condition RC is guaranteed to be maximal, i.e.
there is no other range condition RC' that is strictly stronger than RC
(RC'(proj[Idx](r)) => RC(proj[Idx](r)), but not vice versa) and for which
(r IN Q(R)) => RC'(proj[Idx](r))=true still holds for any instance of table T.

Here by comparison predicates we understand the predicates that use the 
following operators:
=, <>, <, <=, >, >=, BETWEEN, IN, IS NULL, IS NOT NULL.

Let table T be defined over range partitions as follows:
CREATE TABLE T (
 ...
)
PARTITION BY RANGE (C) (
    PARTITION P1 VALUES LESS THAN (L1),
    PARTITION P2 VALUES LESS THAN (L2),
...
    PARTITION Pn VALUES LESS THAN (Ln)
)
where L1,...,Ln are constants such that L1 < L2 < ... < Ln.
The partition ranges here are (,L1),[L1,L2),...,[Ln-1,Ln).

It is easy to find the subsequence Ri1,...,Rik of this sequence of intervals
such that each Rij in it intersects S(RC), i.e. (Rij INTERSECT S(RC) != 0).
This subsequence yields the result of partition pruning.

Clearly, if RC is a maximal range condition, the described procedure performs 
full partition pruning.
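
The selection of partition ranges that intersect S(RC) can be sketched as
follows. Intervals are tuples (lo, lo_closed, hi, hi_closed) over the reals,
and `limits` holds the VALUES LESS THAN constants; these representations and
the function names are assumptions made for illustration. Treating an integer
column as real-valued is conservative: it may keep a partition that a tighter
analysis would prune, but never drops a needed one.

```python
import math

def overlaps(a, b):
    """True if intervals a and b share at least one point."""
    lo, lo_open = max((a[0], not a[1]), (b[0], not b[1]))
    hi, hi_closed = min((a[2], a[3]), (b[2], b[3]))
    return lo < hi or (lo == hi and not lo_open and hi_closed)

def prune_range_partitions(limits, s_rc):
    """Return the numbers (1-based) of partitions whose range meets S(RC)."""
    used = []
    lower = -math.inf
    for pid, upper in enumerate(limits, start=1):
        part = (lower, True, upper, False)   # partition range [lower, upper)
        if any(overlaps(part, iv) for iv in s_rc):
            used.append(pid)
        lower = upper
    return used

# Four partitions: (,10), [10,20), [20,30), [30,40)
limits = [10, 20, 30, 40]
used = prune_range_partitions(limits, [(0, False, 5, False), (10, False, 20, True)])
```

Here the value 20 belongs to the third partition, so the half-open interval
(10, 20] keeps it, while the open interval (10, 20) would not.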

If range partitions are defined with the help of a monotonically increasing function F:
CREATE TABLE T (
 ...
)
PARTITION BY RANGE (F(C)) (
    PARTITION P1 VALUES LESS THAN (L1),
    PARTITION P2 VALUES LESS THAN (L2),
...
    PARTITION Pn VALUES LESS THAN (Ln)
)
then to find the partitions to be used we have to:
  1. compute the ranges F(S(RC))
  2. select the subsequence of partition ranges that intersect
     F(S(RC)).

The same procedure can be used if F is an arbitrary function but S(RC) contains 
only single point intervals.

If hash partitioning is defined for table T and the extracted sequence range 
condition S(RC) consists only of single-point intervals we still can perform 
partition pruning: H(S(RC)) will yield the numbers of the partitions to be used.
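
A minimal sketch of the hash case, assuming integer values, closed intervals
given as (lo, hi) pairs, partitions numbered from 0 and H(v) = v mod n. All of
these are assumptions of the example, not statements about the server code.

```python
def prune_hash_partitions(s_rc, num_partitions):
    """Return the sorted partition numbers selected by H(S(RC)).

    Returns None when S(RC) contains an interval that is not a single
    point, i.e. when no pruning can be done this way.
    """
    used = set()
    for lo, hi in s_rc:
        if lo != hi:
            return None
        used.add(lo % num_partitions)   # assumed hash function H
    return sorted(used)
```

For example, with 4 partitions the points 1 and 6 select partitions 1 and 2,
while the interval [1, 5] gives no pruning information.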

Implementation of partition pruning
-----------------------------------

In the pseudo-code below list partitioning is considered as a special case of 
range group partitioning (see comments on this in the first section).

As a result of the actions presented by this pseudo-code some partitions and 
sub-partitions of a partition table T are pruned for a given query Q and those 
that have escaped pruning become specially marked.  

If, after:
  NOT elimination,
  HAVING merge,
  outer join elimination,
  equality propagation
there are no ON conditions left in the join being processed, then:
for each partitioned table T 
{ 
  1. Find the components that form the partition index
     [for combined partitioning an index is created where the
      columns used for hash partitioning follow the columns used for range 
      partitioning];
  2. Build internal structures for the partition index to extract a range
     condition RC for it;
  3. Extract the range condition RC;
  4. If (RC is empty) 
       Mark all partitions and optional sub-partitions;
     Else
     {
       If (this is range partitioning and 
           (a monotone function F is used or
            all RC ranges are singletons))
         Mark partitions for ranges intersected
         with F(S(RC));
       Else If (this is hash partitioning and
                S(RC) is a sequence of single-point ranges)
         Mark partitions with indexes H(S(RC));
       Else If (this is combined partitioning and 
                a monotone function F is used for range partitions) 
       {
         Mark all sub-partitions of partitions for ranges
         intersected with F(S(RC));
         If all RC ranges intersected for such a range
         are singletons S1,...,Sk then unmark
         sub-partitions that are not covered by
         H(S1),...,H(Sk);
       }
       Else
         Mark all partitions and optional sub-partitions;
     }
}

Note.
To handle the case where no partitions but some sub-partitions can be pruned, we 
would additionally have to apply range analysis to an index consisting only of 
the columns used for sub-partitioning.

The easiest way to mark partitions to be used after pruning is probably to add 
a new flag member to the st_table structure used by any handler.
To speed up iterations over marked partitions it also makes sense to add a 
field that links these marked partitions into a chain.

WL#2985 Partition Pruning - LLD
===============================

CONTENTS

1. Where partition pruning is invoked
2. Passing info about pruned partitions to the table handler
3. The pruning function
3.1 Construction of partition index description
3.2 Changes in the code invoked from get_mm_tree()
3.2.1 Don't call field->optimize_range() when doing partition pruning
3.2.2 Allow construction of "index merge" tree for single index
3.3 Analysis of the produced SEL_TREE
3.3.1 A natural property: no redundant "partitions-in-interval analysis" calls
3.3.2 Do have redundant "subpartitions-in-interval analysis" calls though
3.4 [Sub]partitions-in-interval analysis
3.4.1 The generic case
3.4.2 The special case: use Partitioning interval analysis
3.4.2.1 Partitioning Interval analysis - interval mapping
3.4.2.1.1 Interval mapping: RANGE partitioning implementation
3.4.2.1.2 Interval mapping: LIST partitioning implementation
3.4.2.1.3 Interval mapping: Detection of monotonically increasing functions
3.4.2.2 Partitioning Interval analysis - interval walking
4. Other notes

1. Where partition pruning is invoked
-------------------------------------
Partition pruning will be invoked from JOIN::optimize(), before the GROUP BY
optimization (opt_sum_query) is applied.

The reason for doing pruning before the opt_sum_query() call is that
opt_sum_query may access the query table(s), and we want to access only
non-pruned away partitions in these table accesses.


Another possible place to invoke partition pruning would be in
make_join_statistics(), right before the get_quick_record_count() call. As
opposed to the "before opt_sum_query()" part, this part of the code is 
executed after the "const" tables have been read, so partition pruning will 
have a broader applicability when invoked here. 

Possible todo for the future:
We could make two partition pruning steps for cases where that would allow
us to prune away more partitions. The hypothesis is that if we've got a
SEL_TREE of type MAYBE (e.g. "t1.key < t2.key2") or KEY_SMALLER (e.g. 
"t1.key < 10 AND t1.key < t2.key2") when performing partition pruning before 
the opt_sum_query() call, and later we got some tables marked as "const" 
tables, then we would be able to determine (in O(1) time) if it would make 
sense to run partition pruning again (for some particular remaining
non-const table).


2. Passing info about pruned partitions to the table handler
------------------------------------------------------------
Information about used (i.e. non-pruned-away) partitions will be stored
in the partition_info::used_partitions bitmap:

bitmap_is_set(&used_partitions, X) <=> "partition with partition id X
                                        is used".

When the partition pruning procedure is invoked, all partitions are assumed to
be unused.

The prune_partitions() function will modify the bitmap, so after its return
partition_info::used_partitions will indicate which partitions are used for
the query.
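
The bitmap contract can be sketched as below. `PartitionBitmap` is a
hypothetical stand-in for the server's bitmap type, written only to show the
"start with nothing used, then set bits" convention described above.

```python
class PartitionBitmap:
    """Toy model of a used-partitions bitmap (not the server implementation)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.bits = 0                      # all partitions start out unused

    def set(self, part_id):
        self.bits |= 1 << part_id

    def set_all(self):
        self.bits = (1 << self.num_partitions) - 1

    def is_set(self, part_id):
        return bool((self.bits >> part_id) & 1)

    def used(self):
        return [i for i in range(self.num_partitions) if self.is_set(i)]

bm = PartitionBitmap(4)
before = bm.used()       # nothing is used when pruning starts
bm.set(1)
bm.set(3)
after = bm.used()
```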

3. The pruning function
-----------------------
The prune_partitions() function will do the following:

prune_partitions()
{
  construct partition index description;
  call get_mm_tree();
  analyze the produced SEL_TREE;
}


3.1 Construction of partition index description
-----------------------------------------------

table->part_info->[sub]part_field_array holds a duplicate-free array of table 
fields used in [sub]partitioning.

From these two arrays we create a description of a single partitioning index:

  partition_index(partition_fields, [subpartition_fields]).
 
Here 
KEY_PART::length=       field->pack_length_in_rec()
KEY_PART::store_length= { calculate it the same way as it is done in 
                          open_binary_frm() code }


If [sub]partition_fields contains a GEOMETRY field then [sub]partition_fields
is not included in the partition index description. The reason is that we
can't process geometry intervals (they have different semantics to which our
processing logic doesn't apply).
We'll do the same for ENUM fields to be on the safe side (which is probably
redundant)


3.2 Changes in the code invoked from get_mm_tree()
--------------------------------------------------

The following changes are required:

3.2.1 Don't call field->optimize_range() when doing partition pruning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Currently get_mm_leaf() calls field->optimize_range() to determine if the
index supports scans on non-singlepoint intervals (currently all table engines
support it except HEAP tables with HASH indexes).


Field::optimize_range() calls handler::index_flags(HA_READ_RANGE). The
partitioning index is only a description built for pruning, not a real index
known to the handler, so we must not make field->optimize_range() calls when
doing partition pruning.


*perhaps*, when doing partition pruning, instead of the
field->optimize_range() call we should call the following function:

bool optimize_range_for_partitioning_index()
{
  if (this keypart of partitioning index refers to a type of partitioning that
      doesn't allow partition pruning on non-singlepoint intervals)
  {
    /* 
      example when we get here: we try to construct a "t.field < const"
     interval for "PARTITION BY HASH(somefunc(t.field))" 
    */
    return FALSE; 
  }
  else
    return TRUE;
}

However, I'm not sure if this is worth doing: 
 * We'll not be able to do partition pruning for cases like 
   "t.key >= 10 AND t.key <= 10", where several non-singlepoint intervals form 
   one singlepoint interval.
 * I can't immediately tell with 100% assurance that using the above function
   guarantees that no non-singlepoint ranges will be generated. (There
   is a second filtering step in the records_in_range() implementation:
   ha_heap::records_in_range() checks whether it is invoked for a
   singlepoint interval. The presence of that check implies that the range
   analysis code can't guarantee that all constructed intervals will be
   singlepoint.)

Considering the amount of time needed to fully figure this out, we'll do this:

when range analysis is invoked from prune_partitions(), let get_mm_leaf() 
assume that field->optimize_range() has returned TRUE. The "analyze the
produced SEL_TREE" step in prune_partitions() will check if the produced
intervals are singlepoint when that is required.

In the worst case this will create only a relatively minor inefficiency.


3.2.2 Allow construction of "index merge" tree for single index
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We've recently discovered this property of the range analyzer: it can create
"index merge" trees for one, single index. This is only done when it is not
possible to create a single range tree at all. Here are simple examples:

"keypart1=c1 OR keypart2=c2",  "keypart1<c1 OR keypart2>c2"

Here the range optimizer will create an index_merge tree with two subtrees.

This did not cause invalid index_merge plans to be generated because such
single-index index_merge trees were discarded at a later stage: one of the
trees doesn't cover the first keypart and is not suitable for record retrieval.

For the partition pruning case, this property provides some advantage: it
allows partition pruning to be performed for a broader set of conditions.

Consider an example:
let the table be partitioned by 
  PARTITION BY RANGE(t1.a) SUBPARTITION BY HASH(t1.b),
and let the condition be "t1.a = 1 OR t1.b = 2".  Here the range analyzer will
construct an index_merge of two trees; we'll be able to perform partition
pruning for both and produce a union of used partitions.

Code-wise this translates into the following:

We'll introduce a range optimizer parameter that will control the creation of 
index merge trees where the merged trees do not represent valid index scans.

The parameter will be set to FALSE (don't create) when the range analyzer is
invoked from range/index_merge optimizer.

The parameter will be set to TRUE (do create) when the range analyzer is
invoked from prune_partitions().

The code for "analyze the produced SEL_TREE" step in prune_partitions() will 
be able to handle all 3 cases:

  1. The produced SEL_TREE has one SEL_ARG* graph that represents a list of
     intervals. The analysis of SEL_TREE objects of this type is described in
     the next section.
     Example: "part_field1 = c1 OR part_field1 = c2" 
     
  2. The produced SEL_TREE is an "index merge" SEL_TREE, i.e. it represents
       "tree1 OR tree2 OR ... treeN" 
     where tree{i} is a tree like in #1. In this case we'll produce a union of 
     partitions used by each of the trees.
     Example: "part_field1 = c1 OR subpart_field1 = c2" 

  3. The produced SEL_TREE is a list of "index merge" SEL_TREEs, i.e. it
     represents "merge_tree1 AND merge_tree2 ... AND merge_treeN",
     where each merge_tree{i} is of type described in #2.
     In this case we'll produce a set of used partitions for each of
     merge_tree{i} and compute an intersection.
     Example: "(part_field1 < c1 OR subpart_field1 = c2) AND 
               (part_field1 > c3 OR subpart_field1 = c4)" 
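
The way cases 2 and 3 combine per-tree results can be sketched with plain
sets. The `PER_TREE` table is made-up data standing in for the single-tree
analysis of case 1, and all names here are hypothetical.

```python
def partitions_for_merge(merge, partitions_for_tree):
    """Case 2: tree1 OR ... OR treeN gives a union of used partitions."""
    used = set()
    for tree in merge:
        used |= partitions_for_tree(tree)
    return used

def partitions_for_merge_list(merges, partitions_for_tree, all_partitions):
    """Case 3: merge_tree1 AND ... AND merge_treeN gives an intersection."""
    used = set(all_partitions)
    for merge in merges:
        used &= partitions_for_merge(merge, partitions_for_tree)
    return used

# Made-up results of the single-tree analysis for four conditions.
PER_TREE = {
    "part_field1 < c1": {0, 1},
    "subpart_field1 = c2": {1, 3},
    "part_field1 > c3": {2, 3},
    "subpart_field1 = c4": {0, 2},
}
ALL_PARTITIONS = {0, 1, 2, 3}
lookup = PER_TREE.__getitem__

case2 = partitions_for_merge(["part_field1 < c1", "subpart_field1 = c2"], lookup)
case3 = partitions_for_merge_list(
    [["part_field1 < c1", "subpart_field1 = c2"],
     ["part_field1 > c3", "subpart_field1 = c4"]],
    lookup, ALL_PARTITIONS)
```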


3.3 Analysis of the produced SEL_TREE
-------------------------------------

See the previous section for the description of analysis of SEL_TREE objects
that represents merges of several scans. Now we'll describe analysis of
SEL_TREE object that represents a single list of intervals (referred to as
case #1 in the previous section).

The analysis function will be modeled after the check_quick_keys() function
that does a similar job for the range optimizer.

A SEL_ARG graph has 2 "dimensions":  left/right and next_key_part.
The function will traverse the dimensions via self-recursion.

The following picture demonstrates a possible SEL_ARG graph (with lots of
edges removed for simplicity).

The up-down connections are connections via SEL_ARG::left and SEL_ARG::right.
A horizontal connection to the right is the SEL_ARG::next_key_part connection.

(start)
 |                           $
 |   Partitioning keyparts   $  subpartitioning keyparts
 |                           $
 |     ...          ...      $
 |      |            |       $
 | +---------+  +---------+  $  +-----------+  +-----------+
 \-| par1=c1 |--| par2=c2 |-----| subpar1=c3|--| subpar2=c5|    (**)
   +---------+  +---------+  $  +-----------+  +-----------+
        |            |       $        |             |
        |            |       $        |        +-----------+ 
        |            |       $        |        | subpar2=c6|
        |            |       $        |        +-----------+ 
        |            |       $        |
        |            |       $  +-----------+  +-----------+
        |            |       $  | subpar1=c4|--| subpar2=c8|
        |            |       $  +-----------+  +-----------+
        |            |       $         
        |            |       $         
        |       +---------+  $  +-----------+
        |       | par2=c9 |-----| subpar1=c9|
        |       +---------+  $  +-----------+
        |            |       $
        |           ...      $
        |                    $
        |                    $
   +---------+               $  +------------+ 
   | par1>c2 |------------------| subpar1=c10|--...            (***)
   +---------+               $  +------------+
        |                    $
       ...                   $
                             $
(c{i} are marked arbitrarily and don't convey any meaning)

The traversal of left/right pointers will be performed via head/tail recursion
(Mikael has pointed out that we don't actually need recursion here: we can use
a loop instead and remove the possibility of stack overrun when processing 
very long conditions like "key=c1 OR key=c2 OR ... OR key=c1M". For now, we'll
stick with recursion for the sake of code simplicity. The same kind of 
recursion is done in check_quick_keys() and there have been no complaints so far).
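
For reference, the loop-based alternative mentioned above could look roughly
like this. The `Node` class is a hypothetical stand-in for the left/right
dimension of SEL_ARG, and the function is only a sketch of an in-order walk
with an explicit stack, not the code that was adopted.

```python
class Node:
    """Hypothetical binary tree node with left/right links only."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def walk_iterative(root):
    """In-order traversal that uses an explicit stack instead of recursion."""
    out, stack, node = [], [], root
    while stack or node is not None:
        while node is not None:        # descend along the left links
            stack.append(node)
            node = node.left
        node = stack.pop()
        out.append(node.value)         # visit the interval
        node = node.right              # then continue with the right subtree
    return out

# A degenerate chain of 100000 nodes linked via 'right' would overflow the
# default recursion limit of a naive recursive walk, but not this loop.
chain = None
for v in reversed(range(100000)):
    chain = Node(v, right=chain)
```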

The traversal of SEL_ARG::next_key_part axis will be performed via recursion as
well.
From the above picture it is apparent that this traversal will proceed 
according to the following scenario:

1. Get the constraints on partitioning fields. 
   Depending on what SEL_ARG graph we've got, we may get constraints for all
   partitioning fields (e.g. see line marked with (**) on the picture), 
   some of them (see line (***) on the picture), or none at all (not displayed
   on the pic.)
   
   If we get suitable (we'll describe below what exactly is suitable)
   constraints for all partitioning fields, (this can be checked when we cross 
   the $-signs line on the picture) we perform the "partitions-in-interval 
   analysis" - obtain a set of partitions that may contain records that match
   the partitioning fields constraints, and proceed.

   If we don't get suitable constraints for all partitioning fields, we set
   the set of used partitions to be all-table-partitions, and proceed anyway.

2. Get the constraints on subpartitioning fields.
   If we get suitable constraints for all subpartitioning fields (this can be 
   fully checked when we reach the right ends of the horizontal chains in the 
   picture), we perform "subpartitions-in-interval analysis" - obtain a set of 
   subpartitions that may contain records that match the subpartitioning 
   fields constraints.
   Note that this set of subpartitions refers to every partition, i.e. the
   result of "subpartitions-in-interval analysis" can be expressed like 
   "within each partition, subpartition X must be used".
   
   If we don't get suitable constraints for all subpartitioning fields, we set
   the set of used subpartitions to be all-subpartitions.
   

3. Having completed steps 1 and 2, we have
   * a set of used partitions,
   * a set of used subpartitions.
   Now we can do this:
   
   for each used partition P
     for each used subpartition SP
      mark P_SP as used;
   
 
The SEL_ARG::left/right and SEL_ARG::next_key_part recursion causes the
steps 1-2-3 to be done for every possible 
 (*start)->next_key_part->... ->next_key_part path in the SEL_ARG graph.
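
Step 3 and its repetition over several paths can be sketched as follows. The
numbering of a subpartition as part_id * num_subparts + subpart_id is an
assumption of this example, as is the use of a Python set in place of the
used-partitions bitmap.

```python
def mark_used(bitmap, used_parts, used_subparts, num_subparts):
    """Mark every (partition, subpartition) combination as used."""
    for p in used_parts:
        for sp in used_subparts:
            bitmap.add(p * num_subparts + sp)   # assumed id layout

used = set()
# Results of steps 1 and 2 for two different paths through the graph:
mark_used(used, {0, 2}, {1}, 3)       # path 1: partitions 0 and 2, subpartition 1
mark_used(used, {1}, {0, 2}, 3)       # path 2: partition 1, subpartitions 0 and 2
```

Because marks only accumulate, the final set is the union over all paths.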

3.3.1 A natural property: no redundant "partitions-in-interval analysis" calls
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note that the "partitions-in-interval analysis" will be done the least possible
number of times. An example for the above picture: for the two paths

path1:
  +---------+  +---------+  $  +-----------+  +-----------+
 -| par1=c1 |--| par2=c2 |-----| subpar1=c3|--| subpar2=c5|
  +---------+  +---------+  $  +-----------+  +-----------+
  
and path2:
  +---------+  +---------+  $  +-----------+  +-----------+
 -| par1=c1 |--| par2=c2 |-----| subpar1=c4|--| subpar2=c8|
  +---------+  +---------+  $  +-----------+  +-----------+

we cross the $-line only once, and therefore "partitions-in-interval analysis"
will be performed only once. This is a natural optimization as we're using 
recursion.

3.3.2 Do have redundant "subpartitions-in-interval analysis" calls though
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The "subpartitions-in-interval analysis" will be performed more times than
strictly necessary.

A SEL_ARG graph may be constructed in a way that the restrictions on
subpartitions are shared. For example, in the picture the "par2=c9" could have
next_key_part pointing to "subpar1=c3". In that case we'll perform
subpartitioning analysis for "subpar1=c3 AND subpar2=c5" two times.

We could use some tricks to avoid that but in my opinion that is not worth 
doing at the current level.


3.4 [Sub]partitions-in-interval analysis
----------------------------------------
This section describes the [sub]partitioning interval analysis.

Task setting: We've got constraints on all fields used in [sub]partitioning,
i.e. we've got an interval.

Goal: Find the partitions that may contain records that satisfy this set of
constraints.

3.4.1 The generic case
~~~~~~~~~~~~~~~~~~~~~~
For any type of [sub]partitioning, the following analysis may be performed:

if (the interval is a singlepoint interval)
{
  /* Ok, the interval has form "field1=const1 AND ... AND fieldN=constN" */
  Save the field constants into the record buffer;

  /*
    Now find into which partition the rows with
    "(field1, ... fieldN) = (const1, ..., constN)"
    will go.
  */
  partition_id= get_part_partition_id(); // or get_subpartition_id()
}
else
{
  /* Can't infer anything */
}

Note that the above code uses only parts of the partitioning interface that
are already intentionally exposed as public
(partition_info::get_part_partition_id, 
partition_info::get_subpartition_id).
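
A minimal sketch of the generic case. The `get_partition_id` callback is a
hypothetical stand-in for the public partitioning functions named above, and
the interval is given by its two end points as tuples of field constants.

```python
def partitions_for_interval(lo, hi, get_partition_id, all_partitions):
    """Return the set of partitions that may hold rows inside [lo, hi]."""
    if lo == hi:
        # Singlepoint interval: "field1=const1 AND ... AND fieldN=constN".
        return {get_partition_id(lo)}
    # Can't infer anything: every partition may contain matching rows.
    return set(all_partitions)

# Hypothetical partitioning function over two integer fields, 4 partitions.
def by_sum(fields):
    return sum(fields) % 4
```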

3.4.2 The special case: use Partitioning interval analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For some types of partitioning it is possible to analyse a broader set of
restrictions than specified in section 3.4.1.
To perform such analysis, we introduce a concept of "Partitioning interval
analysis":

"Partitioning interval analysis" is applicable for any type of 
[sub]partitioning done on the value of a single field fieldX.

Task setting:
<<
  Given an interval
    "const1 <=? fieldX <=? const2"  (I)
  find a set of partitions that may contain records that have value of fieldX
  contained within the given interval.
>>

The implementation of Partitioning Interval Analysis varies depending on 
partitioning type and partitioning function, and so it will be encapsulated 
within sql_partition.cc.

The interface for Partitioning Interval Analysis will be as follows:

/*
  A type of function that does Partitioning Interval Analysis

  SYNOPSIS
    get_partitions_in_range_iterator()
      interval    IN   Description of interval over partitioning field.
      part_iter   OUT  Initialized iterator that allows to enumerate
                       partitions that cover the given interval.
*/
...

In partition_info class, we'll add two function pointers:

class partition_info 
{
public:
  ...
  /* 
    Pointer to function that performs Partitioning Interval Analysis 
    for partitioning, or NULL if analysis is not possible.
  */
  get_partitions_in_range_iterator get_part_iter_for_interval;
  
  /* Same as above but for subpartitioning */
  get_partitions_in_range_iterator get_subpart_iter_for_interval;

  ...
};

Following sections describe the implementations.

3.4.2.1 Partitioning Interval analysis - interval mapping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Interval mapping can be used when partitioning is done using 
  
   PARTITION BY  unary_increasing_func(fieldX)

Here for the interval (I)

   "const1 CMP1 t.fieldX CMP2 const2"     (I)

we can obtain corresponding interval (II)

  part_func(const1) CMP1X part_func(t.fieldX) CMP2X part_func(const2)   (II)

by "mapping" the interval edges. In both intervals, CMPxx are '<' or '<='.
The conversion rules for comparison operators are as follows:
For strictly increasing functions, CMP1X == CMP1, CMP2X == CMP2.
For non-strictly increasing functions, CMP1X == CMP2X == '<='.
(example: 
  t.fieldX < '2005-12-29'  maps to  YEAR(t.fieldX) <= YEAR('2005-12-29')
)
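
The edge mapping rules can be sketched as below. The function name, the
argument layout and the use of Python's `date.year` in place of YEAR() are
assumptions made for illustration.

```python
from datetime import date

def map_interval(lo, lo_closed, hi, hi_closed, part_func, strict):
    """Map interval (I) over fieldX to interval (II) over part_func(fieldX).

    For a strictly increasing part_func the comparison operators are kept;
    otherwise both become '<=' (i.e. both edges become closed).
    """
    if strict:
        return part_func(lo), lo_closed, part_func(hi), hi_closed
    return part_func(lo), True, part_func(hi), True

def year(d):
    return d.year

# '2004-03-01' < fieldX < '2005-12-29'  maps to  2004 <= YEAR(fieldX) <= 2005
mapped = map_interval(date(2004, 3, 1), False, date(2005, 12, 29), False,
                      year, strict=False)
```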

Having obtained the edges of interval (II), we can get the corresponding
partition ids:

3.4.2.1.1 Interval mapping: RANGE partitioning implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We'll use the partition_info::range_int_array. This is an ordered array where 
range_int_array[N] describes the interval of partition number N:

(range_int_array[i-1] <= part_func(rowX) < range_int_array[i]) => rowX goes to
partition #i.

Having performed the interval mapping, we can obtain (with two binary 
searches) first and last array elements that have non-empty intersection 
with interval (II).  If these two elements have indexes idx1 and idx2, then 
that means that partitions with partition id within [idx1, idx2] range are 
the partitions we'll need to use.
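
A minimal sketch of the two binary searches, using Python's `bisect` module.
It assumes 0-based partition ids, integer values, an interval (II) already
normalized to closed bounds [lo, hi] with lo <= hi, and no partition for
values at or above the last array element.

```python
from bisect import bisect_right

def range_partitions_for_interval(range_int_array, lo, hi):
    """Return the ids of RANGE partitions that intersect [lo, hi]."""
    n = len(range_int_array)
    # Partition i holds values v with range_int_array[i-1] <= v < range_int_array[i],
    # so the partition of v is the number of array elements that are <= v.
    idx1 = bisect_right(range_int_array, lo)
    idx2 = min(bisect_right(range_int_array, hi), n - 1)
    return list(range(idx1, idx2 + 1))

# Three partitions: (,10), [10,20), [20,30)
limits = [10, 20, 30]
```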

3.4.2.1.2 Interval mapping: LIST partitioning implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
partition_info::list_array is an array of (value, partition_id) pairs, ordered
by the value field.

Each array element describes a singlepoint interval:

  part_func(rowX) == list_array[i].value => rowX goes to the partition with id
  list_array[i].partition_id.

Like in RANGE partitioning, we can find two edge indexes idx1, idx2. Then the 
set of used partitions can be obtained by walking from idx1 to idx2 and
collecting partition ids from list_array[i].partition_id 
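
The walk from idx1 to idx2 can be sketched like this, with `list_array`
modelled as a Python list of (value, partition_id) tuples sorted by value and
the mapped interval given as closed bounds [lo, hi]. Both are assumptions of
the example.

```python
from bisect import bisect_left, bisect_right

def list_partitions_for_interval(list_array, lo, hi):
    """Return the ids of LIST partitions that have a value inside [lo, hi]."""
    values = [value for value, _ in list_array]
    idx1 = bisect_left(values, lo)     # first element with value >= lo
    idx2 = bisect_right(values, hi)    # one past the last element with value <= hi
    return {part_id for _, part_id in list_array[idx1:idx2]}

# Made-up list partitioning: partition 0 holds {1, 5}, 1 holds {3, 9}, 2 holds {8}.
list_array = [(1, 0), (3, 1), (5, 0), (8, 2), (9, 1)]
```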

3.4.2.1.3 Interval mapping: Detection of monotonically increasing functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To check if Interval Mapping is applicable, we'll need to be able to check if
a given Item* tree represents a unary monotonically increasing function.

This will be achieved by doing the following:

Add an enum:
typedef enum monotonicity_info 
{
   NON_MONOTONIC,              /* none of the below holds */
   MONOTONIC_INCREASING,       /* F() is unary and (x < y) => (F(x) <= F(y)) */
   MONOTONIC_STRICT_INCREASING /* F() is unary and (x < y) => (F(x) <  F(y)) */
} enum_monotonicity_info;

Add new virtual method in class Item:

virtual enum_monotonicity_info Item::get_monotonicity_info() const 
{ return NON_MONOTONIC; }

Add non-default implementations for 
 * Item_func_year (that function is MONOTONIC_INCREASING).
 * Item_func_to_days (that function is MONOTONIC_INCREASING for DATETIME
   values and is MONOTONIC_STRICT_INCREASING for DATE values)
 * Item_field (assume it to be always MONOTONIC_STRICT_INCREASING, see
               concerns in section #4)
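
The same design, a default in the base class overridden by the few items that
are known to be monotonic, can be modelled in a few lines. The Python class
names below only mirror the C++ ones for illustration, and the
`arg_is_date` flag and `interval_mapping_applicable` helper are hypothetical.

```python
from enum import Enum

class Monotonicity(Enum):
    NON_MONOTONIC = 0
    MONOTONIC_INCREASING = 1
    MONOTONIC_STRICT_INCREASING = 2

class Item:
    def get_monotonicity_info(self):
        # Default: nothing is known, so interval mapping is not applicable.
        return Monotonicity.NON_MONOTONIC

class ItemField(Item):
    def get_monotonicity_info(self):
        return Monotonicity.MONOTONIC_STRICT_INCREASING

class ItemFuncYear(Item):
    def get_monotonicity_info(self):
        return Monotonicity.MONOTONIC_INCREASING

class ItemFuncToDays(Item):
    def __init__(self, arg_is_date):
        self.arg_is_date = arg_is_date   # True for DATE, False for DATETIME

    def get_monotonicity_info(self):
        if self.arg_is_date:
            return Monotonicity.MONOTONIC_STRICT_INCREASING
        return Monotonicity.MONOTONIC_INCREASING

def interval_mapping_applicable(item):
    return item.get_monotonicity_info() is not Monotonicity.NON_MONOTONIC
```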


3.4.2.2 Partitioning Interval analysis - interval walking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This implementation is applicable for any type of partitioning when the
partitioning field has integer type. The idea is that interval (I):

   "const1 <=? t.fieldX <=? const2"

can be replaced with a sequence of singlepoint intervals:

    t.fieldX = const1      OR 
    t.fieldX = const1 + 1  OR
    t.fieldX = const1 + 2  OR
     ...                   OR
    t.fieldX = const2

For each of the singlepoint intervals, we can find the corresponding partition
(see section 3.4.1).

So, the interval analysis is performed by "walking" from const1 to const2 and
collecting partition ids.

It is apparent that this method will pay off if the "walk" is short enough.
For now, we've adopted the following definition of "short enough":

 * number of values to walk must be less than number of [sub]partitions, and 
 * it must be less than some predefined constant MAX_WALK_INTERVAL.

There can be cases when both interval walking and interval mapping are
applicable. It is obvious that interval mapping is less CPU-intensive, so
we'll use interval walking only if interval mapping is not applicable.
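
Interval walking with the "short enough" check can be sketched as follows.
The value chosen for MAX_WALK_INTERVAL and the `get_partition_id` callback are
assumptions of this example; the interval is taken as closed integer bounds
[lo, hi].

```python
MAX_WALK_INTERVAL = 32   # hypothetical value of the predefined constant

def walk_interval(lo, hi, num_partitions, get_partition_id):
    """Collect partition ids for every value in [lo, hi].

    Returns None when the walk is not short enough, in which case the
    caller has to fall back to using all partitions.
    """
    n_values = hi - lo + 1
    if n_values >= num_partitions or n_values >= MAX_WALK_INTERVAL:
        return None
    return {get_partition_id(v) for v in range(lo, hi + 1)}

# Hypothetical hash partitioning into 8 partitions.
def by_mod_8(v):
    return v % 8
```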

4. Other notes
--------------
* We assume that BUG#15447 has been fixed and RANGE partitioning code assumes
  that "NULL < -inf"

* We also ignore the problems with BIGINT UNSIGNED fields, as they are not
  handled correctly by partitioning code itself (see BUG#16002)

Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.