It is safe to shut down a replication source server and restart
it later. When a replica loses its connection to the source, the
replica tries to reconnect immediately and retries periodically
if that fails. The default is to retry every 60 seconds. This
may be changed with the CHANGE REPLICATION
SOURCE TO
statement (from MySQL 8.0.23) or
CHANGE MASTER TO
statement
(before MySQL 8.0.23). A replica also is able to deal with
network connectivity outages. However, the replica notices the
network outage only after receiving no data from the source for
replica_net_timeout
or
slave_net_timeout
seconds. If
your outages are short, you may want to decrease the value of
replica_net_timeout
or
slave_net_timeout
. See
Section 19.4.2, “Handling an Unexpected Halt of a Replica”.
An unclean shutdown (for example, a crash) on the source side
can result in the source's binary log having a final position
less than the most recent position read by the replica, due to
the source's binary log file not being flushed. This can cause
the replica not to be able to replicate when the source comes
back up. Setting sync_binlog=1
in the source server's my.cnf
file helps to
minimize this problem because it causes the source to flush its
binary log more frequently. For the greatest possible durability
and consistency in a replication setup using
InnoDB
with transactions, you should also set
innodb_flush_log_at_trx_commit=1
.
With this setting, the contents of the InnoDB
redo log buffer are written out to the log file at each
transaction commit and the log file is flushed to disk. Note
that the durability of transactions is still not guaranteed with
this setting, because operating systems or disk hardware may
tell mysqld that the flush-to-disk operation
has taken place, even though it has not.
Shutting down a replica cleanly is safe because it keeps track of where it left off. However, be careful that the replica does not have temporary tables open; see Section 19.5.1.31, “Replication and Temporary Tables”. Unclean shutdowns might produce problems, especially if the disk cache was not flushed to disk before the problem occurred:
For transactions, the replica commits and then updates
relay-log.info
. If an unexpected exit occurs between these two operations, relay log processing proceeds further than the information file indicates and the replica re-executes the events from the last transaction in the relay log after it has been restarted.A similar problem can occur if the replica updates
relay-log.info
but the server host crashes before the write has been flushed to disk. To minimize the chance of this occurring, setsync_relay_log_info=1
in the replicamy.cnf
file. Settingsync_relay_log_info
to 0 causes no writes to be forced to disk and the server relies on the operating system to flush the file from time to time.
The fault tolerance of your system for these types of problems is greatly increased if you have a good uninterruptible power supply.