Documentation Home
MySQL 8.0 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 47.0Mb
PDF (A4) - 47.1Mb
PDF (RPM) - 42.4Mb
HTML Download (TGZ) - 10.8Mb
HTML Download (Zip) - 10.9Mb
HTML Download (RPM) - 9.4Mb
Man Pages (TGZ) - 226.8Kb
Man Pages (Zip) - 333.5Kb
Info (Gzip) - 4.2Mb
Info (Zip) - 4.2Mb
Excerpts from this Manual

MySQL 8.0 Reference Manual  /  ...  /  Configuring Distributed Recovery

18.4.3.2 Configuring Distributed Recovery

Several aspects of Group Replication's distributed recovery process can be configured to suit your system.

Replication User for Distributed Recovery

Distributed recovery requires a replication user that has the correct permissions so that Group Replication can establish direct member-to-member replication channels. The replication user must also have the correct permissions to act as the clone user on the donor for a remote cloning operation. For instructions to set up this replication user, see Section 18.2.1.3, “User Credentials”.

Manual State Transfer

State transfer from the binary log is Group Replication's base mechanism for distributed recovery, and if the donors and joining members in your replication group are not set up to support cloning, this is the only available option. As state transfer from the binary log is based on classic asynchronous replication, it might take a very long time if the server joining the group does not have the group's data at all, or has data taken from a very old backup image. In this situation, it is therefore recommended that before adding a server to the group, you should set it up with the group's data by transferring a fairly recent snapshot of a server already in the group. This minimizes the time taken for distributed recovery, and reduces the impact on donor servers, since they have to retain and transfer fewer binary log files.

Number of Connection Attempts

For state transfer from the binary log, Group Replication limits the number of attempts a joining member makes when trying to connect to a donor from the pool of donors. If the connection retry limit is reached without a successful connection, the distributed recovery procedure terminates with an error. Note that this limit specifies the total number of attempts that the joining member makes to connect to a donor. For example, if 2 group members are suitable donors, and the connection retry limit is set to 4, the joining member makes 2 attempts to connect to each of the donors before reaching the limit.

The default connection retry limit is 10. You can configure this setting using the group_replication_recovery_retry_count system variable. The following command sets the maximum number of attempts to connect to a donor to 5:

mysql> SET GLOBAL group_replication_recovery_retry_count= 5;

For remote cloning operations, this limit does not apply. Group Replication makes only one connection attempt to each suitable donor for cloning, before starting to attempt state transfer from the binary log.

Sleep Interval for Connection Attempts

For state transfer from the binary log, the group_replication_recovery_reconnect_interval system variable defines how much time the distributed recovery process should sleep between donor connection attempts. Note that distributed recovery does not sleep after every donor connection attempt. As the joining member is connecting to different servers and not to the same one repeatedly, it can assume that the problem that affects server A does not affect server B. Distributed recovery therefore suspends only when it has gone through all the possible donors. Once the server joining the group has made one attempt to connect to each of the suitable donors in the group, the distributed recovery process sleeps for the number of seconds configured by the group_replication_recovery_reconnect_interval system variable. For example, if 2 group members are suitable donors, and the connection retry limit is set to 4, the joining member makes one attempt to connect to each of the donors, then sleeps for the connection retry interval, then makes one further attempt to connect to each of the donors before reaching the limit.

The default connection retry interval is 60 seconds, and you can change this value dynamically. The following command sets the distributed recovery donor connection retry interval to 120 seconds:

mysql> SET GLOBAL group_replication_recovery_reconnect_interval= 120;

For remote cloning operations, this interval does not apply. Group Replication makes only one connection attempt to each suitable donor for cloning, before starting to attempt state transfer from the binary log.

Marking the Joining Member Online

When distributed recovery has successfully completed state transfer from the donor to the joining member, the joining member can be marked as online in the group and ready to participate. By default, this is done after the joining member has received and applied all the transactions that it was missing. Optionally, you can allow a joining member to be marked as online when it has received and certified (that is, completed conflict detection for) all the transactions that it was missing, but before it has applied them. If you want to do this, use the group_replication_recovery_complete_at system variable to specify the alternative setting TRANSACTIONS_CERTIFIED.

SSL and Authentication for Distributed Recovery

You can optionally use SSL for distributed recovery connections between group members. SSL for distributed recovery is configured separately from SSL for normal group communications, which is determined by the server's SSL settings and the group_replication_ssl_mode system variable. For distributed recovery connections, dedicated Group Replication distributed recovery SSL system variables are available to configure the use of certificates and ciphers specifically for distributed recovery.

By default, SSL is not used for distributed recovery connections. To activate this, set group_replication_recovery_use_ssl=ON, and configure the Group Replication distributed recovery SSL system variables as described in Section 18.5.2, “Group Replication Secure Socket Layer (SSL) Support”. You need a replication user that is set up to use SSL.

When distributed recovery is configured to use SSL, Group Replication applies this setting for remote cloning operations, as well as for state transfer from a donor's binary log. Group Replication automatically configures the settings for the clone SSL options (clone_ssl_ca, clone_ssl_cert, and clone_ssl_key) to match your settings for the corresponding Group Replication distributed recovery options (group_replication_recovery_ssl_ca, group_replication_recovery_ssl_cert, and group_replication_recovery_ssl_key).

If you are not using SSL for distributed recovery (so group_replication_recovery_use_ssl is set to OFF), and the replication user account for Group Replication authenticates with the caching_sha2_password plugin (which is the default in MySQL 8.0) or the sha256_password plugin, RSA key-pairs are used for password exchange. In this case, either use the group_replication_recovery_public_key_path system variable to specify the RSA public key file, or use the group_replication_recovery_get_public_key system variable to request the public key from the master, as described in Using Group Replication and the Caching SHA-2 User Credentials Plugin.

Compression for Distributed Recovery

From MySQL 8.0.18, you can optionally configure compression for distributed recovery by the method of state transfer from a donor's binary log. Compression can benefit distributed recovery where network bandwidth is limited and the donor has to transfer many transactions to the joining member. The group_replication_recovery_compression_algorithm and group_replication_recovery_zstd_compression_level system variables configure permitted compression algorithms, and the zstd compression level, used when carrying out state transfer from a donor's binary log. For more information, see Section 4.2.6, “Connection Compression Control”.

Note that these compression settings do not apply for remote cloning operations. When a remote cloning operation is used for distributed recovery, the clone plugin's clone_enable_compression setting applies.