Documentation Home
MySQL 9.1 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 40.3Mb
PDF (A4) - 40.5Mb
Man Pages (TGZ) - 259.3Kb
Man Pages (Zip) - 366.4Kb
Info (Gzip) - 4.0Mb
Info (Zip) - 4.0Mb


MySQL 9.1 Reference Manual  /  ...  /  Connections for Distributed Recovery

20.5.4.1 Connections for Distributed Recovery

When a joining member connects to an online existing member for state transfer during distributed recovery, the joining member acts as a client on the connection and the existing member acts as a server. When state transfer from the donor's binary log is in progress over this connection (using the asynchronous replication channel group_replication_recovery), the joining member acts as the replica and the existing member acts as the source. When a remote cloning operation is in progress over this connection, the joining member acts as a recipient and the existing member acts as a donor. Configuration settings that apply to those roles outside the Group Replication context can apply for Group Replication also, unless they are overridden by a Group Replication-specific configuration setting or behavior.

The connection that an existing member offers to a joining member for distributed recovery is not the same connection that is used by Group Replication for communication between online members of the group.

  • The connection used by the group communication engine for Group Replication (XCom, a Paxos variant) for TCP communication between remote XCom instances is specified by the group_replication_local_address system variable. This connection is used for TCP/IP messages between online members. Communication with the local instance is over an input channel using shared memory.

  • For distributed recovery, by default, group members offer their standard SQL client connection to joining members, as specified by hostname and port. If an alternative port number is specified by report_port, that one is used instead.

  • Group members may instead advertise an alternative list of distributed recovery endpoints as dedicated client connections for joining members, allowing you to control distributed recovery traffic separately from connections by regular client users of the member. A member transmits the list of distributed recovery endpoints specified by group_replication_advertise_recovery_endpoints to the group when it joins. By default, the member continues to offer the standard SQL client connection as in earlier releases.

Important

Distributed recovery can fail if a joining member cannot correctly identify the other members using the host name as defined by MySQL Server's hostname system variable. It is recommended that operating systems running MySQL have a properly configured unique host name, either using DNS or local settings. The host name that the server is using for SQL client connections can be verified in the Member_host column of the Performance Schema table replication_group_members. If multiple group members externalize a default host name set by the operating system, there is a chance of the joining member not resolving it to the correct member address and not being able to connect for distributed recovery. In this situation you can use MySQL Server's report_host system variable to configure a unique host name to be externalized by each of the servers.

The steps for a joining member to establish a connection for distributed recovery are as follows:

  1. When the member joins the group, it connects with one of the seed members included in the list in its group_replication_group_seeds system variable, initially using the group_replication_local_address connection as specified in that list. The seed members might be a subset of the group.

  2. Over this connection, the seed member uses Group Replication's membership service to provide the joining member with a list of all the members that are online in the group, in the form of a view. The membership information includes the details of the distributed recovery endpoints or standard SQL client connection offered by each member for distributed recovery.

  3. The joining member selects a suitable group member from this list to be its donor for distributed recovery, following the behaviors described in Section 20.5.4.4, “Fault Tolerance for Distributed Recovery”.

  4. The joining member then attempts to connect to the donor using the donor's advertised distributed recovery endpoints, trying each in turn in the order they are specified in the list. If the donor provides no endpoints, the joining member attempts to connect using the donor's standard SQL client connection. The SSL requirements for the connection are as specified by the group_replication_recovery_ssl_* options described in Section 20.5.4.1.4, “SSL and Authentication for Distributed Recovery”.

  5. If the joining member is not able to connect to the selected donor, it retries with other suitable donors, following the behaviors described in Section 20.5.4.4, “Fault Tolerance for Distributed Recovery”. Note that if the joining member exhausts the list of advertised endpoints without making a connection, it does not fall back to the donor's standard SQL client connection, but switches to another donor.

  6. When the joining member establishes a distributed recovery connection with a donor, it uses that connection for state transfer as described in Section 20.5.4, “Distributed Recovery”. The host and port for the connection that is used are shown in the joining member's log. Note that if a remote cloning operation is used, when the joining member has restarted at the end of the operation, it establishes a connection with a new donor for state transfer from the binary log. This might be a connection to a different member from the original donor used for the remote cloning operation, or it might be a different connection to the original donor. In any case, the distributed recovery process continues in the same way as it would have with the original donor.

20.5.4.1.1 Selecting addresses for distributed recovery endpoints

IP addresses supplied by the group_replication_advertise_recovery_endpoints system variable as distributed recovery endpoints do not have to be configured for MySQL Server (that is, they do not have to be specified by the admin_address system variable or in the list for the bind_address system variable). They do have to be assigned to the server. Any host names used must resolve to a local IP address. IPv4 and IPv6 addresses can be used.

The ports supplied for the distributed recovery endpoints do have to be configured for MySQL Server, so they must be specified by the port, report_port, or admin_port system variable. The server must listen for TCP/IP connections on these ports. If you specify the admin_port, the replication user for distributed recovery needs the SERVICE_CONNECTION_ADMIN privilege to connect. Selecting the admin_port keeps distributed recovery connections separate from regular MySQL client connections.

Joining members try each of the endpoints in turn in the order they are specified on the list. If group_replication_advertise_recovery_endpoints is set to DEFAULT rather than a list of endpoints, the standard SQL client connection is offered. Note that the standard SQL client connection is not automatically included on a list of distributed recovery endpoints, and is not offered as a fallback if the donor's list of endpoints is exhausted without a connection. If you want to offer the standard SQL client connection as one of a number of distributed recovery endpoints, you must include it explicitly in the list specified by group_replication_advertise_recovery_endpoints. You can put it in the last place so that it acts as a last resort for connection.

A group member's distributed recovery endpoints (or standard SQL client connection if endpoints are not provided) do not need to be added to the Group Replication allowlist specified by the group_replication_ip_allowlist system variable. The allowlist is only for the address specified by group_replication_local_address for each member. A joining member must have its initial connection to the group permitted by the allowlist in order to retrieve the address or addresses for distributed recovery.

The distributed recovery endpoints that you list are validated when the system variable is set and when a START GROUP_REPLICATION statement has been issued. If the list cannot be parsed correctly, or if any of the endpoints cannot be accessed on the host because the server is not listening on them, Group Replication logs an error and does not start.

20.5.4.1.2 Compression for Distributed Recovery

You can optionally configure compression for distributed recovery by the method of state transfer from a donor's binary log. Compression can benefit distributed recovery where network bandwidth is limited and the donor has to transfer many transactions to the joining member. The group_replication_recovery_compression_algorithms and group_replication_recovery_zstd_compression_level system variables determine permitted compression algorithms, and the zstd compression level used when carrying out state transfer from a donor's binary log. For more information, see Section 6.2.8, “Connection Compression Control”.

These compression settings do not apply to remote cloning operations. When a remote cloning operation is used for distributed recovery, the clone plugin's setting for clone_enable_compression applies.

20.5.4.1.3 Replication User for Distributed Recovery

Distributed recovery requires a replication user that has the correct permissions so that Group Replication can establish direct member-to-member replication channels. The replication user must also have the correct permissions to act as the clone user on the donor for a remote cloning operation. The same replication user must be used for distributed recovery on every group member. For instructions to set up this replication user, see Section 20.2.1.3, “User Credentials For Distributed Recovery”. For instructions to secure the replication user credentials, see Section 20.6.3.1, “Secure User Credentials for Distributed Recovery”.

20.5.4.1.4 SSL and Authentication for Distributed Recovery

SSL for distributed recovery is configured separately from SSL for normal group communications, which is determined by the server's SSL settings and the group_replication_ssl_mode system variable. For distributed recovery connections, dedicated Group Replication distributed recovery SSL system variables are available to configure the use of certificates and ciphers specifically for distributed recovery.

By default, SSL is not used for distributed recovery connections. To activate it, set group_replication_recovery_use_ssl=ON, and configure the Group Replication distributed recovery SSL system variables as described in Section 20.6.3, “Securing Distributed Recovery Connections”. You need a replication user that is set up to use SSL.

When distributed recovery is configured to use SSL, Group Replication applies this setting for remote cloning operations, as well as for state transfer from a donor's binary log. Group Replication automatically configures the settings for the clone SSL options (clone_ssl_ca, clone_ssl_cert, and clone_ssl_key) to match your settings for the corresponding Group Replication distributed recovery options (group_replication_recovery_ssl_ca, group_replication_recovery_ssl_cert, and group_replication_recovery_ssl_key).

If you are not using SSL for distributed recovery (so group_replication_recovery_use_ssl is set to OFF), and the replication user account for Group Replication authenticates with the caching_sha2_password plugin (the default) or the sha256_password plugin (deprecated), RSA key pairs are used for password exchange. In this case, either use the group_replication_recovery_public_key_path system variable to specify the RSA public key file, or use the group_replication_recovery_get_public_key system variable to request the public key from the source, as described in Section 20.6.3.1.1, “Replication User With The Caching SHA-2 Authentication Plugin”.