When a replica using row-based replication format applies an
UPDATE
or
DELETE
operation, it must search
the relevant table for the matching rows. The algorithm used to
carry out this process uses one of the table's indexes to carry
out the search as the first choice, and a hash table if there
are no suitable indexes.
The algorithm first assesses the available indexes in the table definition to see if there is any suitable index to use, and if there are multiple possibilities, which index is the best fit for the operation. The algorithm ignores the following types of index:
Fulltext indexes.
Hidden indexes.
Generated indexes.
Multi-valued indexes.
Any index where the before-image of the row event does not contain all the columns of the index.
If there are no suitable indexes after ruling out these index types, the algorithm does not use an index for the search. If there are suitable indexes, one index is selected from the candidates, in the following priority order:
A primary key.
A unique index where every column in the index has a NOT NULL attribute. If more than one such index is available, the algorithm chooses the leftmost of these indexes.
Any other index. If more than one such index is available, the algorithm chooses the leftmost of these indexes.
If the algorithm is able to select a primary key or a unique
index where every column in the index has a NOT
NULL
attribute, it uses this index to iterate over the
rows in the UPDATE
or
DELETE
operation. For each row in
the row event, the algorithm looks up the row in the index to
locate the table record to update. If no matching record is
found, it returns the error
ER_KEY_NOT_FOUND and stops the
replication applier thread.
If the algorithm was not able to find a suitable index, or was
only able to find an index that was non-unique or contained
nulls, a hash table is used to assist in identifying the table
records. The algorithm creates a hash table containing the rows
in the UPDATE
or
DELETE
operation, with the key as
the full before-image of the row. The algorithm then iterates
over all the records in the target table, using the selected
index if it found one, or else performing a full table scan. For
each record in the target table, it determines whether that row
exists in the hash table. If the row is found in the hash table,
the record in the target table is updated, and the row is
deleted from the hash table. When all the records in the target
table have been checked, the algorithm verifies whether the hash
table is now empty. If there are any unmatched rows remaining in
the hash table, the algorithm returns the error
ER_KEY_NOT_FOUND and stops the
replication applier thread.
The
slave_rows_search_algorithms
system variable was previously used to control how rows are
searched for matches. The use of this system variable is now
deprecated, because the default setting, which uses an index
scan followed by a hash scan as described above, is optimal for
performance and works correctly in all scenarios.