The global value of the system variable
replica_transaction_retries
sets the maximum number of times for applier threads on a
single-threaded or multithreaded replica to automatically retry
failed transactions before stopping. Transactions are
automatically retried when the SQL thread fails to execute them
because of an InnoDB
deadlock, or when the
transaction's execution time exceeds the
InnoDB
innodb_lock_wait_timeout
value.
If a transaction has a non-temporary error that prevents it from
succeeding, it is not retried.
The default setting for
replica_transaction_retries
is
10, meaning that a failing transaction with an apparently
temporary error is retried 10 times before the applier thread
stops. Setting the variable to 0 disables automatic retrying of
transactions. On a multithreaded replica, the specified number
of transaction retries can take place on all applier threads of
all channels. The Performance Schema table
replication_applier_status
shows
the total number of transaction retries that took place on each
replication channel, in the
COUNT_TRANSACTIONS_RETRIES
column.
The process of retrying transactions can cause lag on a replica
or on a Group Replication group member, which can be configured
as a single-threaded or multithreaded replica. The Performance
Schema table
replication_applier_status_by_worker
shows detailed information on transaction retries by the applier
threads on a single-threaded or multithreaded replica. This data
includes timestamps showing how long it took the applier thread
to apply the last transaction from start to finish (and when the
transaction currently in progress was started), and how long
this was after the commit on the original source and the
immediate source. The data also shows the number of retries for
the last transaction and the transaction currently in progress,
and enables you to identify the transient errors that caused the
transactions to be retried. You can use this information to see
whether transaction retries are the cause of replication lag,
and investigate the root cause of the failures that led to the
retries.