Documentation Home
MySQL 5.7 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 37.9Mb
PDF (A4) - 38.0Mb
PDF (RPM) - 37.3Mb
HTML Download (TGZ) - 10.3Mb
HTML Download (Zip) - 10.3Mb
HTML Download (RPM) - 8.9Mb
Man Pages (TGZ) - 216.8Kb
Man Pages (Zip) - 329.5Kb
Info (Gzip) - 3.4Mb
Info (Zip) - 3.4Mb
Excerpts from this Manual

MySQL 5.7 Reference Manual  /  ...  /  Handling an Unexpected Halt of a Replication Slave

16.3.2 Handling an Unexpected Halt of a Replication Slave

In order for replication to be resilient to unexpected halts of the server (sometimes described as crash-safe) it must be possible for the slave to recover its state before halting. This section describes the impact of an unexpected halt of a slave during replication and how to configure a slave for the best chance of recovery to continue replication.

After an unexpected halt of a slave, upon restart the I/O thread must recover the information about which transactions have been received, and the SQL thread must recover which transactions have been executed already. For information on the slave logs required for recovery, see Section 16.2.4, “Replication Relay and Status Logs”. The information required for recovery was traditionally stored in files, which had the risk of losing synchrony with the master depending at which stage of processing a transaction the slave halted at, or even corruption of the files themselves. In MySQL 5.7 you can instead use tables to store this information. These tables are created using InnoDB, and by using this transactional storage engine the information is always recoverable upon restart. To configure MySQL 5.7 to store the replication information in tables, set relay_log_info_repository and master_info_repository to TABLE. The server then stores information required for the recovery of the I/O thread in the mysql.slave_master_info table and information required for the recovery of the SQL thread in the mysql.slave_relay_log_info table.

Exactly how a replication slave recovers from an unexpected halt is influenced by the chosen method of replication, whether the slave is single-threaded or multi-threaded, the setting of variables such as relay_log_recovery, and whether features such as MASTER_AUTO_POSITION are being used.

The following table shows the impact of these different factors on how a single-threaded slave recovers from an unexpected halt.

Table 16.5 Factors Influencing Single-threaded Replication Slave Recovery

GTID

MASTER_AUTO_POSITION

relay_log_recovery

relay_log_info_repository

Crash type

Recovery guaranteed

Relay log impact

OFF

Any

1

TABLE

Any

Yes

Lost

OFF

Any

1

TABLE

Server

Yes

Lost

OFF

Any

1

Any

OS

No

Lost

OFF

Any

0

TABLE

Server

Yes

Remains

OFF

Any

0

TABLE

OS

No

Remains

ON

ON

Any

Any

Any

Yes

Lost

ON

OFF

0

TABLE

Server

Yes

Remains

ON

OFF

0

Any

OS

No

Remains


As the table shows, when using a single-threaded slave the following configurations are most resilient to unexpected halts:

The following table shows the impact of these different factors on how a multi-threaded slave recovers from an unexpected halt.

Table 16.6 Factors Influencing Multi-threaded Replication Slave Recovery

GTID

sync_relay_log

MASTER_AUTO_POSITION

relay_log_recovery

relay_log_info_repository

Crash type

Recovery guaranteed

Relay log impact

OFF

1

Any

1

TABLE

Any

Yes

Lost

OFF

>1

Any

1

TABLE

Server

Yes

Lost

OFF

>1

Any

1

Any

OS

No

Lost

OFF

1

Any

0

TABLE

Server

Yes

Remains

OFF

1

Any

0

TABLE

OS

No

Remains

ON

Any

ON

Any

Any

Any

Yes

Lost

ON

1

OFF

0

TABLE

Server

Yes

Remains

ON

1

OFF

0

Any

OS

No

Remains


As the table shows, when using a multi-threaded slave the following configurations are most resilient to unexpected halts:

It is important to note the impact of sync_relay_log=1, which requires a write of to the relay log per transaction. Although this setting is the most resilient to an unexpected halt, with at most one unwritten transaction being lost, it also has the potential to greatly increase the load on storage. Without sync_relay_log=1, the effect of an unexpected halt depends on how the relay log is handled by the operating system. Also note that when relay_log_recovery=0, the next time the slave is started after an unexpected halt the relay log is processed as part of recovery. After this process completes, the relay log is deleted.

An unexpected halt of a multi-threaded replication slave using the recommended file position based replication configuration above may result in a relay log with transaction inconsistencies (gaps in the sequence of transactions) caused by the unexpected halt. See Section 16.4.1.34, “Replication and Transaction Inconsistencies”. In MySQL 5.7.13 and later, if the relay log recovery process encounters such transaction inconsistencies they are filled and the recovery process continues automatically. In MySQL versions prior to MySQL 5.7.13, this process is not automatic and requires starting the server with relay_log_recovery=0, starting the slave with START SLAVE UNTIL SQL_AFTER_MTS_GAPS to fix any transaction inconsistencies and then restarting the slave with relay_log_recovery=1.

When you are using multi-source replication and relay_log_recovery=1, after restarting due to an unexpected halt all replication channels go through the relay log recovery process. Any inconsistencies found in the relay log due to an unexpected halt of a multi-threaded slave are filled.


User Comments
Sign Up Login You must be logged in to post a comment.