MySQL Internals Manual  /  ...  /  The Index Merge Join Type

7.2.2.5 The Index Merge Join Type

7.2.2.5.1 Overview

Index Merge is used when table condition can be converted to form:

cond_1 OR cond_2 ... OR cond_N

The conditions for conversion are that each cond_i can be used for a range scan, and no pair (cond_i, cond_j) uses the same index. (If cond_i and cond_j use the same index, then cond_i OR cond_j can be combined into a single range scan and no merging is necessary.)

For example, Index Merge can be used for the following queries:

SELECT * FROM t WHERE key1=c1 OR key2<c2 OR key3 IN (c3,c4);

SELECT * FROM t WHERE (key1=c1 OR key2<c2) AND nonkey=c3;

Index Merge is implemented as a container for range key scans constructed from cond_i conditions. When doing Index Merge, MySQL retrieves rows for each of the keyscans and then runs them through a duplicate elimination procedure. Currently the Unique class is used for duplicate elimination.

7.2.2.5.2 Index Merge Optimizer

A single SEL_TREE object cannot be constructed for conditions that have different members of keys in the OR clause, like in condition:

key1 < c1 OR key2 < c2

Beginning with MySQL 5.0, these conditions are handled with the Index Merge method, and its range optimizer structure, class SEL_IMERGE. SEL_IMERGE represents a disjunction of several SEL_TREE objects, which can be expressed as:

sel_imerge_cond = (t_1 OR t_1 OR ... OR t_n)

where each of t_i stands for a SEL_TREE object, and no pair (t_i, t_j) of distinct SEL_TREE objects can be combined into single SEL_TREE object.

The current implementation builds SEL_IMERGE only if no single SEL_TREE object can be built for the part of the query condition it has analyzed, and discards SEL_TREE immediately if it discovers that a single SEL_TREE object can be constructed. This is actually a limitation, and can cause worse row retrieval strategy to be used. E.g. for query:

SELECT * FROM t WHERE (goodkey1=c1 OR goodkey1=c2) AND badkey=c3

scan on badkey will be chosen even if Index Merge on (goodkey1, goodkey) would be faster.

The Index Merge optimizer collects a list of possible ways to access rows with Index Merge. This list of SEL_IMERGE structures represents the following condition:

  (t_11 OR t_12 OR ... OR t_1k) AND
  (t_21 OR t_22 OR ... OR t_2l) AND
   ...                          AND
  (t_M1 OR t_M2 OR ... OR t_mp)

where t_ij is one SEL_TREE and one line is for one SEL_IMERGE object.

The SEL_IMERGE object with minimal cost is used for row retrieval.

In sql/opt_range.cc, see imerge_list_and_list(), imerge_list_or_list(), and SEL_IMERGE class member functions for more details of Index Merge construction.

See the get_index_merge_params function in the same file for Index Merge cost calculation algorithm.

7.2.2.5.3 The range Optimizer

For range queries, the MySQL optimizer builds a SEL_TREE object which represents a condition in this form:

range_cond = (cond_key_1 AND cond_key_2 AND ... AND cond_key_N)

Each of cond_key_i is a condition that refers to components of one key. MySQL creates a cond_key_i condition for each of the usable keys. Then the cheapest condition cond_key_i is used for doing range scan.

A single cond_key_i condition is represented by a pointer-linked network of SEL_ARG objects. Each SEL_ARG object refers to particular part of the key and represents the following condition:

  sel_arg_cond= (inf_val < key_part_n AND key_part_n < sup_val) (1)
                AND next_key_part_sel_arg_cond                  (2)
                OR left_sel_arg_cond                            (3)
                OR right_sel_arg_cond                           (4)
  1. is for an interval, possibly without upper or lower bound, either including or not including boundary values.

  2. is for a SEL_ARG object with condition on next key component.

  3. is for a SEL_ARG object with an interval on the same field as this SEL_ARG object. Intervals between the current and left objects are disjoint and left_sel_arg_cond.sup_val <= inf_val.

  4. is for a SEL_ARG object with an interval on the same field as this SEL_ARG object. Intervals between the current and right objects are disjoint and left_sel_arg_cond.min_val >= max_val.

MySQL is able to convert arbitrary-depth nested AND-OR conditions to the above conjunctive form.

7.2.2.5.4 Row Retrieval Algorithm

Index Merge works in two steps:

Preparation step:

activate 'index only';
foreach key_i in (key_scans \ clustered_pk_scan)
{
  while (retrieve next (key, rowid) pair from key_i)
  {
    if (no clustered PK scan ||
        row doesn't match clustered PK scan condition)
      put rowid into Unique;
  }
}
deactivate 'index only';

Row retrieval step:

for each rowid in Unique
{
  retrieve row and pass it to output;
}
if (clustered_pk_scan)
{
  while (retrieve next row for clustered_pk_scan)
   pass row to output;
}

See: sql/opt_range.cc, QUICK_INDEX_MERGE_SELECT class members for Index Merge row retrieval code.


User Comments
User comments in this section are, as the name implies, provided by MySQL users. The MySQL documentation team is not responsible for, nor do they endorse, any of the information provided here.
Sign Up Login You must be logged in to post a comment.