Index Merge
is used when table condition
can be converted to form:
cond_1 OR cond_2 ... OR cond_N
The conditions for conversion are that each
cond_i
can be used for a range scan, and
no pair (cond_i
,
cond_j
) uses the same index. (If
cond_i
and cond_j
use
the same index, then cond_i OR cond_j
can
be combined into a single range scan and no merging is
necessary.)
For example, Index Merge
can be used for
the following queries:
SELECT * FROM t WHERE key1=c1 OR key2<c2 OR key3 IN (c3,c4);
SELECT * FROM t WHERE (key1=c1 OR key2<c2) AND nonkey=c3;
Index Merge
is implemented as a
“container” for range key scans constructed
from cond_i
conditions. When doing
Index Merge
, MySQL retrieves rows for
each of the keyscans and then runs them through a duplicate
elimination procedure. Currently the
Unique
class is used for duplicate
elimination.
A single SEL_TREE
object cannot be
constructed for conditions that have different members of
keys in the OR
clause, like in condition:
key1 < c1 OR key2 < c2
Beginning with MySQL 5.0, these conditions are handled with
the Index Merge
method, and its range
optimizer structure, class SEL_IMERGE
.
SEL_IMERGE
represents a disjunction of
several SEL_TREE
objects, which can be
expressed as:
sel_imerge_cond = (t_1 OR t_1 OR ... OR t_n)
where each of t_i
stands for a
SEL_TREE
object, and no pair
(t_i
, t_j
) of distinct
SEL_TREE
objects can be combined into
single SEL_TREE
object.
The current implementation builds
SEL_IMERGE
only if no single
SEL_TREE
object can be built for the part
of the query condition it has analyzed, and discards
SEL_TREE
immediately if it discovers that
a single SEL_TREE
object can be
constructed. This is actually a limitation, and can cause
worse row retrieval strategy to be used. E.g. for query:
SELECT * FROM t WHERE (goodkey1=c1 OR goodkey1=c2) AND badkey=c3
scan on badkey
will be chosen even if
Index Merge
on
(goodkey1
, goodkey
)
would be faster.
The Index Merge
optimizer collects a list
of possible ways to access rows with Index
Merge
. This list of SEL_IMERGE
structures represents the following condition:
(t_11 OR t_12 OR ... OR t_1k) AND
(t_21 OR t_22 OR ... OR t_2l) AND
... AND
(t_M1 OR t_M2 OR ... OR t_mp)
where t_ij
is one
SEL_TREE
and one line is for one
SEL_IMERGE
object.
The SEL_IMERGE
object with
minimal cost is used for row retrieval.
In sql/opt_range.cc
, see
imerge_list_and_list()
,
imerge_list_or_list()
, and
SEL_IMERGE
class member functions for
more details of Index Merge
construction.
See the get_index_merge_params
function
in the same file for Index Merge
cost
calculation algorithm.
For range
queries, the MySQL optimizer
builds a SEL_TREE
object which represents
a condition in this form:
range_cond = (cond_key_1 AND cond_key_2 AND ... AND cond_key_N)
Each of cond_key_i
is a condition that
refers to components of one key. MySQL creates a
cond_key_i
condition for each of the
usable keys. Then the cheapest condition
cond_key_i
is used for doing range scan.
A single cond_key_i
condition is
represented by a pointer-linked network of
SEL_ARG
objects. Each
SEL_ARG
object refers to particular part
of the key and represents the following condition:
sel_arg_cond= (inf_val < key_part_n AND key_part_n < sup_val) (1)
AND next_key_part_sel_arg_cond (2)
OR left_sel_arg_cond (3)
OR right_sel_arg_cond (4)
is for an interval, possibly without upper or lower bound, either including or not including boundary values.
is for a
SEL_ARG
object with condition on next key component.is for a
SEL_ARG
object with an interval on the same field as thisSEL_ARG
object. Intervals between the current and “left” objects are disjoint andleft_sel_arg_cond.sup_val <= inf_val
.is for a
SEL_ARG
object with an interval on the same field as thisSEL_ARG
object. Intervals between the current and “right” objects are disjoint andleft_sel_arg_cond.min_val >= max_val
.
MySQL is able to convert arbitrary-depth nested AND-OR conditions to the above conjunctive form.
Index Merge
works in two steps:
Preparation step:
activate 'index only';
foreach key_i in (key_scans \ clustered_pk_scan)
{
while (retrieve next (key, rowid) pair from key_i)
{
if (no clustered PK scan ||
row doesn't match clustered PK scan condition)
put rowid into Unique;
}
}
deactivate 'index only';
Row retrieval step:
for each rowid in Unique
{
retrieve row and pass it to output;
}
if (clustered_pk_scan)
{
while (retrieve next row for clustered_pk_scan)
pass row to output;
}
See:
sql/opt_range.cc
,
QUICK_INDEX_MERGE_SELECT
class members
for Index Merge
row retrieval code.