Pre-General Availability Draft: 2018-02-20
Inconsistencies in the sequence of transactions that have been executed from the relay log can occur depending on your replication configuration. This section explains how to avoid inconsistencies and solve any problems they cause.
The following types of inconsistencies can exist:
Half-applied transactions. A transaction which updates non-transactional tables has applied some but not all of its changes.
Gaps. A gap is a transaction that has not been (fully) applied, even though some later transaction has been applied. Gaps can only appear when using a multi-threaded slave. To avoid gaps occurring, set
slave_preserve_commit_order=1, which requires
slave_parallel_type=LOGICAL_CLOCK, and that binary logging (the
log_binsystem variable) and slave update logging (the
--log-slave-updates) are also enabled.
Gap-free low-watermark position. Even in the absence of gaps, it is possible that transactions after
Exec_master_log_poshave been applied. That is, all transactions up to point
Nhave been applied, and no transactions after
Nhave been applied, but
Exec_master_log_poshas a value smaller than
N.This can only happen on multi-threaded slaves. Enabling
slave_preserve_commit_orderdoes not prevent gap-free low-watermark positions.
The following scenarios are relevant to the existence of half-applied transactions, gaps, and gap-free low-watermark position inconsistencies:
While slave threads are running, there may be gaps and half-applied transactions.
mysqld shuts down. Both clean and unclean shutdown abort ongoing transactions and may leave gaps and half-applied transactions.
KILLof replication threads (the SQL thread when using a single-threaded slave, the coordinator thread when using a multi-threaded slave). This aborts ongoing transactions and may leave gaps and half-applied transactions.
Error in applier threads. This may leave gaps. If the error is in a mixed transaction, that transaction is half-applied. When using a multi-threaded slave, workers which have not received an error complete their queues, so it may take time to stop all threads.
STOP SLAVEwhen using a multi-threaded slave. After issuing
STOP SLAVE, the slave waits for any gaps to be filled and then updates
Exec_master_log_pos. This ensures it never leaves gaps or gap-free low-watermark positions, unless any of the cases above applies (in other words, before
STOP SLAVEcompletes, either an error happens, or another thread issues
KILL, or the server restarts. In these cases,
STOP SLAVEreturns successfully.)
If the last transaction in the relay log is only half-received and the multi-threaded slave coordinator has started to schedule the transaction to a worker, then
STOP SLAVEwaits up to 60 seconds for the transaction to be received. After this timeout, the coordinator gives up and aborts the transaction. If the transaction is mixed, it may be left half-completed.
STOP SLAVEwhen using a single-threaded slave. If the ongoing transaction only updates transactional tables, it is rolled back and
STOP SLAVEstops immediately. If the ongoing transaction is mixed,
STOP SLAVEwaits up to 60 seconds for the transaction to complete. After this timeout, it aborts the transaction, so it may be left half-completed.
The global variable
unrelated to the process of stopping the replication threads. It
only makes the client that issues
SLAVE return to the client, but the replication
threads continue to try to stop.
If a replication channel has gaps, it has the following consequences:
The slave database is in a state that may never have existed on the master.
SHOW SLAVE STATUSis only a "low-watermark". In other words, transactions appearing before the position are guaranteed to have committed, but transactions after the position may have committed or not.
After applying the dump on another server, and starting the replication threads, transactions appearing after the position are replicated again. Note that this is harmless if GTIDs are enabled (however, in that case it is not recommended to use
If a replication channel has a gap-free low-watermark position, cases 2 to 5 above apply, but case 1 does not.
The gap-free low-watermark position information is persisted in
binary format in the internal table
[SQL_THREAD] always consults this information so that
it applies only the correct transactions. This remains true even
been changed to 0 before
SLAVE, and even if
SLAVE is used with
START SLAVE UNTIL
SQL_AFTER_MTS_GAPS only applies as many transactions
as needed in order to fill in the gaps. If
START SLAVE is used with
UNTIL clauses that tell it to stop before it
has consumed all the gaps, then it leaves remaining gaps.
RESET SLAVE removes the relay
logs and resets the replication position. Thus issuing
RESET SLAVE on a slave with
gaps means the slave loses any information about the gaps,
without correcting the gaps.
ensures that there are no gaps. However, it is still possible
Exec_master_log_pos is just a gap-free
low-watermark position in scenarios 1 to 4 above. That is, there
may be transactions after
which have been applied. Therefore the cases numbered 2 to 5
above (but not case 1) apply, even when