WL#13767: Group Replication: specify through which endpoints can recovery traffic flow

Affects: Server-8.0   —   Status: Complete

Executive Summary
=================

This worklog implements a mechanism to specify which IP addresses and ports a
donor shall advertise as its recovery endpoints. A joiner shall then try to
connect to them, in order to pull binary logs, during distributed recovery.

User/Dev Stories
================

As a MySQL Operator I want to specify through which interfaces
Group Replication recovery can take place for a given member,
so that I can restrict where recovery traffic flows in my network
infrastructure.

Scope
=====

The work described in this document does:

- implement the Group Replication variable
  `group_replication_advertise_recovery_endpoints`, which holds the recovery
  endpoints available on this donor
- implement validation of the variable
  `group_replication_advertise_recovery_endpoints`: each endpoint shall
  be one that the donor is listening on
- implement on the joiner an iteration mechanism over recovery endpoints to
  execute recovery/clone from the first endpoint available

This worklog shall not implement:

- support for disabling recovery on some hosts (e.g., donor
  advertising an empty list of recovery endpoints).
- advertising recovery endpoints that are not part of the
  donor's host IP:ports

High Level Description
======================

When a member joins a group in Group Replication, it goes through
distributed recovery to fetch the missing transactions to fill in
the gap between its state and the global state of the group.
Distributed recovery establishes clone and asynchronous
replication connections with a member of the group to get the data.
The host and port of that member are taken from the member
information that all group members hold, which among other things
contains the host and port of all members. Each member advertises its
own recovery socket address (IP:port) when it joins the group.

The admin_address[1] is a bind address for administrative access.
It can be used to split connector traffic from replication
(internal) traffic, that is, client connections are accepted on
bind_address[2] whereas internal traffic, like replication, is accepted on
admin_address. This separation of context allows the DBA to better
secure the network, for instance, to enforce throttling on client
connections and keep replication unbounded.

On top of that, bind_address[2] can be configured with multiple addresses or
a wildcard address.

A Group Replication donor will transmit a string containing either DEFAULT or a
list of recovery endpoints on which a member can perform the recovery process.

During distributed recovery the donor address and port will be selected
following these rules:
  if the donor version is <= 8.0.20:
    1) use the values listed in the performance_schema.replication_group_members
       table.
  if the donor has group_replication_advertise_recovery_endpoints (8.0.21+):
    1) if the value is DEFAULT, use the values listed in the
       performance_schema.replication_group_members table.
    2) otherwise, iterate over the endpoints to connect and execute
       recovery/clone.
[1] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_admin_address
[2] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_bind_address

Functional Requirements
=======================

- FR1: If starting group replication on boot and the configuration
       is invalid, then it shall abort and one of the following errors shall be
       written to the error log:

   ER_GRP_RPL_RECOVERY_ENDPOINT_FORMAT: "Invalid input value for recovery
              socket endpoints '%s'. Please, provide a valid, comma separated,
              list of endpoints (IP:port)".

   ER_GRP_RPL_RECOVERY_ENDPOINT_INVALID: "The server is not listening on
              endpoint '%s'. Only endpoints that the server is listening on are
              valid recovery endpoints."

- FR2: If starting group replication through the command line,
       i.e. START GROUP_REPLICATION, and the configuration is invalid, then it
       shall abort and one of the following errors shall be emitted to the
       client session:

   ER_DA_GRP_RPL_RECOVERY_ENDPOINT_FORMAT: "Invalid input value for recovery
              socket endpoints '%s'. Please, provide a valid, comma separated,
              list of endpoints (IP:port)".

   ER_DA_GRP_RPL_RECOVERY_ENDPOINT_INVALID: "The server is not listening on
              endpoint '%s'. Only endpoints that the server is listening on are
              valid recovery endpoints."

- FR3: When setting this variable at GLOBAL scope through an SQL session,
       all IPs and ports in `group_replication_advertise_recovery_endpoints`
       shall be valid IP addresses and ports on that host. Otherwise,
       ER_WRONG_VALUE_FOR_VAR_PLUS_ACTIONABLE_PART shall be emitted to the
       client session.

- FR4: A server that supports `group_replication_advertise_recovery_endpoints`
       shall be able to recover from a group whose members do not support it.

- FR5: All log messages related to a server recovery/clone shall print the host
       and port from `group_replication_advertise_recovery_endpoints` when they
       are used to connect to the donor.

- FR6: When `group_replication_advertise_recovery_endpoints` is set on a donor,
       the joiner will only attempt recovery from the list of advertised
       endpoints.

- FR7: A `group_replication_advertise_recovery_endpoints` value that
       specifies a network namespace shall be rejected and
       ER_WRONG_VALUE_FOR_VAR_PLUS_ACTIONABLE_PART shall be emitted to the
       error log.


Non-Functional Requirements
===========================

- NFR1: Recovery performance shall not be impacted.

Definitions
===========
 * **group_replication_advertise_recovery_endpoints**: list of endpoints to which
                           a member can establish a connection to execute recovery
 * **recovery_address**: An IP address on which the server listens for TCP/IP
                         connections, retrieved from
                         `group_replication_advertise_recovery_endpoints`
 * **recovery_port**: A TCP/IP port on which the server listens for TCP/IP
                      connections, retrieved from
                      `group_replication_advertise_recovery_endpoints`
 * **regular hostname**: Host name of the server that runs MySQL
 * **regular port**: The number of the port on which the server listens for
                      TCP/IP connections.

Summary of the approach
=======================

- Donors can specify a list of advertised recovery endpoints
  through group_replication_advertise_recovery_endpoints.

- The list is transmitted to the group in the state exchange
  (or handshake) when the donor originally joins the group.

- When a new server joins the group, it will receive a list of
  specific recovery endpoints from those donors that have the
  feature implemented in this worklog and have specified the
  list. Otherwise, it receives the default endpoint.

Security context
================

The use of `group_replication_advertise_recovery_endpoints` will not introduce
any changes to security.

Some endpoint connections may need additional privileges. For example,
connections established to `admin_port` require the SERVICE_CONNECTION_ADMIN
privilege.

Upgrade/downgrade and cross-version replication
===============================================

There will be no impact on upgrade/downgrade: the new member will connect to
the regular host name and regular port when
`group_replication_advertise_recovery_endpoints` is empty.

User interface
==============

In this worklog we will introduce an option to specify a list of endpoints
that are advertised to a joining member. The joiner will iterate over the
list, establish a connection to one endpoint and execute recovery/clone over it.

An option will be added to the server to specify the advertised recovery
endpoints.

 - NAME: group_replication_advertise_recovery_endpoints
 - VALUES: { DEFAULT, RECOVERY_ENDPOINTS=host:port[,host:port] }
 - DEFAULT: DEFAULT
 - SCOPE: global
 - DYNAMIC: yes
 - REPLICATED (written to the binary log): no
 - PERSIST: PERSIST, PERSIST_ONLY
 - PRIVILEGES REQUIRED: SYSTEM_VARIABLES_ADMIN
 - DESCRIPTION: DEFAULT is the classic endpoint for recovery and
                RECOVERY_ENDPOINTS is a list of endpoints from which a
                recovering member can retrieve data.

DEFAULT means the server that is executing recovery/clone will use the regular
host and port to establish the connection.

RECOVERY_ENDPOINTS is a string with a comma-separated list of IP:port pairs,
over which the joiner iterates to start the recovery/clone process.
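
As an illustration of the list format, the following minimal sketch splits a
`host:port[,host:port]` string into endpoints. It is not the plugin's parser:
`Endpoint` and `parse_recovery_endpoints` are hypothetical names, the accepted
port range (1 to 65535) is an assumption, and the caller is assumed to handle
the DEFAULT value before calling it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Endpoint {
  std::string host;
  unsigned int port;
};

// Parses "host:port[,host:port]". Returns false on malformed input (the case
// the *_RECOVERY_ENDPOINT_FORMAT errors describe); *out is then left untouched.
bool parse_recovery_endpoints(const std::string &value,
                              std::vector<Endpoint> *out) {
  std::vector<Endpoint> parsed;
  std::size_t start = 0;
  while (true) {
    const std::size_t comma = value.find(',', start);
    const std::string item =
        value.substr(start, comma == std::string::npos ? comma : comma - start);
    // Split on the last ':' so the host part may itself contain ':'.
    const std::size_t colon = item.rfind(':');
    if (colon == std::string::npos || colon == 0) return false;
    const std::string port_text = item.substr(colon + 1);
    if (port_text.empty() || port_text.size() > 5) return false;
    unsigned int port = 0;
    for (char c : port_text) {
      if (c < '0' || c > '9') return false;  // digits only
      port = port * 10 + static_cast<unsigned int>(c - '0');
    }
    if (port == 0 || port > 65535) return false;
    parsed.push_back({item.substr(0, colon), port});
    if (comma == std::string::npos) break;  // last item consumed
    start = comma + 1;
  }
  *out = parsed;
  return true;
}
```

An empty string, a missing port, a non-numeric port and a trailing comma are
all rejected by this sketch.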

API
===

The value of `group_replication_advertise_recovery_endpoints` of each member
will be added to the member info structure.

Observability
=============

If a server is using an IP and port from
`group_replication_advertise_recovery_endpoints` for the recovery/clone process,
every command that logs information about host and port will display
`recover_ip` and `recover_port`.

When updating the `group_replication_advertise_recovery_endpoints` variable, if
the configuration is invalid, an error is returned to the client:

* ER_WRONG_VALUE_FOR_VAR_PLUS_ACTIONABLE_PART: "Variable '%-.64s' cannot be set to the value of
             '%-.200s'. %-200s"

When Group Replication starts on boot, it validates the
`group_replication_advertise_recovery_endpoints` variable. If the configuration
is invalid, it aborts and logs one of the following errors:

* ER_GRP_RPL_RECOVERY_ENDPOINT_FORMAT: "Invalid input value for recovery
             socket endpoints '%s'. Please, provide a valid, comma separated,
             list of endpoints (IP:port)".
* ER_GRP_RPL_RECOVERY_ENDPOINT_INVALID: "The server is not listening on
             endpoint '%s'. Only endpoints that the server is listening on are
             valid recovery endpoints."

If the failure happens in a session when executing START GROUP_REPLICATION,
the Group Replication plugin will abort the start and return one of the
following errors to the client:

* ER_DA_GRP_RPL_RECOVERY_ENDPOINT_FORMAT: "Invalid input value for recovery
            socket endpoints '%s'. Please, provide a valid, comma separated,
            list of endpoints (IP:port)".

* ER_DA_GRP_RPL_RECOVERY_ENDPOINT_INVALID: "The server is not listening on
            endpoint '%s'. Only endpoints that the server is listening on are
            valid recovery endpoints."
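
The "not listening" check and the choice of error symbol per entry point can be
sketched as below. This is a simplified illustration: `Origin`,
`first_not_listening` and `invalid_endpoint_error` are hypothetical names, and
the set of endpoints the server listens on is taken as an input because how the
server computes it is not described here.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

using Endpoint = std::pair<std::string, unsigned int>;  // (IP, port)

// Where the validation was triggered from.
enum class Origin { BOOT, START_COMMAND, SET_VARIABLE };

// Returns the first advertised endpoint that the server is not listening on,
// or nullptr when every advertised endpoint is valid.
const Endpoint *first_not_listening(const std::vector<Endpoint> &advertised,
                                    const std::set<Endpoint> &listening) {
  for (const Endpoint &endpoint : advertised) {
    if (listening.count(endpoint) == 0) return &endpoint;
  }
  return nullptr;
}

// Error symbol reported for an endpoint the server is not listening on.
std::string invalid_endpoint_error(Origin origin) {
  switch (origin) {
    case Origin::BOOT:  // written to the error log
      return "ER_GRP_RPL_RECOVERY_ENDPOINT_INVALID";
    case Origin::START_COMMAND:  // returned to the client session
      return "ER_DA_GRP_RPL_RECOVERY_ENDPOINT_INVALID";
    case Origin::SET_VARIABLE:  // returned to the client session
      return "ER_WRONG_VALUE_FOR_VAR_PLUS_ACTIONABLE_PART";
  }
  return "";
}
```

The returned pointer refers to an element of `advertised`, so it can be used to
fill the `'%s'` placeholder of the message as long as that vector is alive.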

Deployment and installation
===========================

This worklog does not introduce any changes to plugin deployment or
installation.

New deployments that want to specify recovery endpoints other than the
automatic one need to configure the donor recovery endpoints specifically,
according to each setup.

Protocol
========

The member info, when transmitted, will now contain the
`group_replication_advertise_recovery_endpoints` of the member.

When a server receives information from a donor where
`group_replication_advertise_recovery_endpoints` is undefined, the connection
will use the regular host and port.
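
To illustrate why an absent item can fall back to the regular host and port,
here is a minimal sketch of a typed, length-prefixed payload item. The wire
layout used here (2-byte type, 8-byte length, little endian) and the helper
names `encode_item` and `decode_recovery_endpoints` are assumptions made for
the example; only the type code 20 comes from this worklog.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Item type code as listed for PIT_RECOVERY_ENDPOINTS in this worklog.
const std::uint16_t PIT_RECOVERY_ENDPOINTS = 20;

// Appends one item: type, length, then the raw bytes (layout is an assumption).
void encode_item(std::vector<unsigned char> *buf, std::uint16_t type,
                 const std::string &value) {
  for (int i = 0; i < 2; ++i)
    buf->push_back(static_cast<unsigned char>((type >> (8 * i)) & 0xff));
  const std::uint64_t len = value.size();
  for (int i = 0; i < 8; ++i)
    buf->push_back(static_cast<unsigned char>((len >> (8 * i)) & 0xff));
  buf->insert(buf->end(), value.begin(), value.end());
}

// Scans all items and returns the advertised endpoints. Returns "DEFAULT"
// when the item is absent, e.g. when the sender predates this feature.
std::string decode_recovery_endpoints(const std::vector<unsigned char> &buf) {
  std::size_t pos = 0;
  while (pos + 10 <= buf.size()) {
    const std::uint16_t type =
        static_cast<std::uint16_t>(buf[pos] | (buf[pos + 1] << 8));
    std::uint64_t len = 0;
    for (int i = 0; i < 8; ++i)
      len |= static_cast<std::uint64_t>(buf[pos + 2 + i]) << (8 * i);
    pos += 10;
    if (len > buf.size() - pos) break;  // truncated item, stop scanning
    if (type == PIT_RECOVERY_ENDPOINTS)
      return std::string(buf.begin() + pos, buf.begin() + pos + len);
    pos += static_cast<std::size_t>(len);  // skip items of other types
  }
  return "DEFAULT";
}
```

Because each item carries its own length, a reader can skip item types it does
not know, and a message without the item decodes to the default behaviour.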


Failure model specification
===========================

There will be no modifications to the failure model specification.

Summary of Changes
==================

The server will transmit its `group_replication_advertise_recovery_endpoints`
in the member info structure
so that the recovery process can use them.

That information will be sent in a `Plugin_gcs_message`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
class Group_member_info : public Plugin_gcs_message {
 public:
  enum enum_payload_item_type {
    /* ... existing payload item types ... */

    // Length of the payload item: variable
    PIT_RECOVERY_ENDPOINTS = 20,

    // No valid type codes can appear after this one.
    PIT_MAX = 21
  };

  /* ...  */
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A server that is doing recovery will check whether the donor advertised
`group_replication_advertise_recovery_endpoints` and, if so,
use them to start the recovery process.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      // pseudo code
      if (member->recovery_endpoints() != "DEFAULT") {
        // Try each advertised endpoint in order; stop at the first success.
        for endpoint in member->recovery_endpoints() {
          if (execute_recovery(endpoint->host(), endpoint->port()))
            return;
        }
      } else {
        // DEFAULT: connect to the donor's regular host name and port.
        hostname.assign(member->get_hostname());
        port.assign(std::to_string(member->get_port()));
      }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
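
A compilable variant of the iteration in the pseudo code might look like the
sketch below. It is only an illustration: `Endpoint`, `TryRecovery` and
`recover_from_first_available` are hypothetical names, and the actual
connection attempt is abstracted behind a callback supplied by the caller.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Endpoint {
  std::string host;
  unsigned int port;
};

// Hypothetical callback: returns true when recovery/clone over the given
// endpoint succeeded.
using TryRecovery = std::function<bool(const Endpoint &)>;

// Tries the endpoints in order and returns the index of the first one that
// worked, or -1 when none of them did.
int recover_from_first_available(const std::vector<Endpoint> &endpoints,
                                 const TryRecovery &try_recovery) {
  for (std::size_t i = 0; i < endpoints.size(); ++i) {
    if (try_recovery(endpoints[i])) return static_cast<int>(i);
  }
  return -1;
}
```

Returning the index lets the caller log which host and port were used, in line
with FR5. A result of -1 means this donor could not be used; in line with FR6,
the sketch does not fall back to the regular host and port when endpoints are
advertised.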