Documentation Home
MySQL Replication
Related Documentation Download this Excerpt
PDF (US Ltr) - 1.5Mb
PDF (A4) - 1.5Mb
HTML Download (TGZ) - 384.5Kb
HTML Download (Zip) - 398.8Kb

MySQL Replication  /  ...  /  Replication Retries and Timeouts

4.1.31 Replication Retries and Timeouts

The global system variable slave_transaction_retries sets the maximum number of times for applier threads on a single-threaded or multithreaded replication slave to automatically retry failed transactions before stopping. Transactions are automatically retried when the SQL thread fails to execute them because of an InnoDB deadlock, or when the transaction's execution time exceeds the InnoDB innodb_lock_wait_timeout value. If a transaction has a non-temporary error that will prevent it from ever succeeding, it is not retried.

The default setting for slave_transaction_retries is 10, meaning that a failing transaction with an apparently temporary error is retried 10 times before the applier thread stops. Setting the variable to 0 disables automatic retrying of transactions. On a multithreaded slave, the specified number of transaction retries can take place on all applier threads of all channels. The Performance Schema table replication_applier_status shows the total number of transaction retries that took place on each replication channel, in the COUNT_TRANSACTIONS_RETRIES column.

The process of retrying transactions can cause lag on a replication slave or on a Group Replication group member, which can be configured as a single-threaded or multithreaded slave. The Performance Schema table replication_applier_status_by_worker shows detailed information on transaction retries by the applier threads on a single-threaded or multithreaded slave. This data includes timestamps showing how long it took the applier thread to apply the last transaction from start to finish (and when the transaction currently in progress was started), and how long this was after the commit on the original master and the immediate master. The data also shows the number of retries for the last transaction and the transaction currently in progress, and enables you to identify the transient errors that caused the transactions to be retried. You can use this information to see whether transaction retries are the cause of replication lag, and investigate the root cause of the failures that led to the retries.