MySQL 5.6.3 implements a method of joining tables called the Batched Key Access (BKA) join algorithm. BKA can be applied when there is an index access to the table produced by the second join operand. Like the BNL join algorithm, the BKA join algorithm employs a join buffer to accumulate the interesting columns of the rows produced by the first operand of the join operation. Then the BKA algorithm builds keys to access the table to be joined for all rows in the buffer and submits these keys in a batch to the database engine for index lookups. The keys are submitted to the engine through the Multi-Range Read (MRR) interface (see Section 8.13.11, “Multi-Range Read Optimization”). After submission of the keys, the MRR engine functions perform lookups in the index in an optimal way, fetching the rows of the joined table found by these keys, and starts feeding the BKA join algorithm with matching rows. Each matching row is coupled with a reference to a row in the join buffer.
For BKA to be used, the
flag of the
system variable must be set to
on. BKA uses
MRR, so the
mrr flag must also be
on. Currently, the cost estimation for MRR
is too pessimistic. Hence, it is also necessary for
mrr_cost_based to be
for BKA to be used. The following setting enables BKA:
There are two scenarios by which MRR functions execute:
The first scenario is used for conventional disk-based
storage engines such as
MyISAM. For these engines,
usually the keys for all rows from the join buffer are
submitted to the MRR interface at once. Engine-specific
MRR functions perform index lookups for the submitted
keys, get row IDs (or primary keys) from them, and then
fetch rows for all these selected row IDs one by one by
request from BKA algorithm. Every row is returned with an
association reference that enables access to the matched
row in the join buffer. The rows are fetched by the MRR
functions in an optimal way: They are fetched in the row
ID (primary key) order. This improves performance because
reads are in disk order rather than random order.
The second scenario is used for remote storage engines
NDB. A package of
keys for a portion of rows from the join buffer, together
with their associations, is sent by a MySQL Server (SQL
node) to MySQL Cluster data nodes. In return, the SQL node
receives a package (or several packages) of matching rows
coupled with corresponding associations. The BKA join
algorithm takes these rows and builds new joined rows.
Then a new set of keys is sent to the data nodes and the
rows from the returned packages are used to build new
joined rows. The process continues until the last keys
from the join buffer are sent to the data nodes, and the
SQL node has received and joined all rows matching these
keys. This improves performance because fewer key-bearing
packages sent by the SQL node to the data nodes means
fewer round trips between it and the data nodes to perform
the join operation.
With the first scenario, a portion of the join buffer is reserved to store row IDs (primary keys) selected by index lookups and passed as a parameter to the MRR functions.
There is no special buffer to store keys built for rows from the join buffer. Instead, a function that builds the key for the next row in the buffer is passed as a parameter to the MRR functions.