Several aspects of Group Replication's distributed recovery process can be configured to suit your system.
For state transfer from the binary log, Group Replication limits the number of attempts a joining member makes when trying to connect to a donor from the pool of donors. If the connection retry limit is reached without a successful connection, the distributed recovery procedure terminates with an error. Note that this limit specifies the total number of attempts that the joining member makes to connect to a donor. For example, if 2 group members are suitable donors, and the connection retry limit is set to 4, the joining member makes 2 attempts to connect to each of the donors before reaching the limit.
The default connection retry limit is 10. You can configure
this setting using the
group_replication_recovery_retry_count
system variable. The following statement sets the maximum
number of attempts to connect to a donor to 5:
mysql> SET GLOBAL group_replication_recovery_retry_count= 5;
For remote cloning operations, this limit does not apply. Group Replication makes only one connection attempt to each suitable donor for cloning, before starting to attempt state transfer from the binary log.
For state transfer from the binary log, the
group_replication_recovery_reconnect_interval
system variable defines how much time the distributed recovery
process should sleep between donor connection attempts. Note
that distributed recovery does not sleep after every donor
connection attempt. As the joining member is connecting to
different servers and not to the same one repeatedly, it can
assume that the problem that affects server A does not affect
server B. Distributed recovery therefore suspends only when it
has gone through all the possible donors. Once the server
joining the group has made one attempt to connect to each of
the suitable donors in the group, the distributed recovery
process sleeps for the number of seconds configured by the
group_replication_recovery_reconnect_interval
system variable. For example, if 2 group members are suitable
donors, and the connection retry limit is set to 4, the
joining member makes one attempt to connect to each of the
donors, then sleeps for the connection retry interval, then
makes one further attempt to connect to each of the donors
before reaching the limit.
The default connection retry interval is 60 seconds, and you can change this value dynamically. The following statement sets the distributed recovery donor connection retry interval to 120 seconds:
mysql> SET GLOBAL group_replication_recovery_reconnect_interval= 120;
For remote cloning operations, this interval does not apply. Group Replication makes only one connection attempt to each suitable donor for cloning, before starting to attempt state transfer from the binary log.
When distributed recovery has successfully completed state
transfer from the donor to the joining member, the joining
member can be marked as online in the group and ready to
participate. By default, this is done after the joining member
has received and applied all the transactions that it was
missing. Optionally, you can allow a joining member to be
marked as online when it has received and certified (that is,
completed conflict detection for) all the transactions that it
was missing, but before it has applied them. If you want to do
this, use the
group_replication_recovery_complete_at
system variable to specify the alternative setting
TRANSACTIONS_CERTIFIED
.