WL#14019: Automatic connection failover for Async Replication Channels - Step II: Automatic senders list

Affects: Server-8.0   —   Status: Complete

EXECUTIVE SUMMARY

This worklog is second step of 'WL#12649: Automatic connection failover for Async Replication Channels'. It will focus on keeping receiver's sender list in sync with the Group Replication member's changes especially its state and role.

After this worklog is implemented the list of senders as potential replication connection failover targets is automatically updated, for the senders that are in a group (group replication). The list of targets (senders) is dynamic and updated according to the group membership. The receiver will update its list of target senders after there is a membership change in the group. Also after this worklog is implemented receiver will always stay connected to source having highest failover weight through asynchronous replication channel, even when existing asynchronous replication connection isn't failed/disconnected. This will be useful where User/DBA who want receiver to always stay connected to the primary of the group by always keeping higher failover weight for primary, to keep binary logs on receiver in sync with minimum delay.

TERMINOLOGY

Let us first define the terminologies which will be used in this worklog.

Asynchronous Replication terminologies:

The different roles that servers are playing while engaging in asynchronous replication activities.

  • sender: endpoint that sends data.
  • receiver: endpoint that receives data from the source.

  • sender list: a list of multiple senders in which each item contains sender connection details and a priority. The user would add this sender list so that receiver can connect to new sender in case existing sender fails. The sender with higher priority would be choosen to connect next.

sender list                               sender list
 [sender1] -----> [receiver]               [sender1] ---x--> [receiver]
                                                                ^
                             --------->                         |
 [sender2]                                 [sender2] -----------|


       sender1 fails, receiver connects to  sender2
       --------------------------------------------

  • weight (priority): When more than one sender are present for the same (failover weight) channel, then next sender would be selected with highest weight (priority) for the same channel. The weight is a number between 1 and 100, where 1 means lowest priority, and 100 means highest priority. To avoid confusion with group_replication_member_weight it will be also called 'failover weight' in this design document.

  • Group Configuration: To assign different weights to primary and secondary member of the group so that User/DBA can control whom they want receiver to stay connected through asynchronous replication channel.

The configuration weight will contain these data:

  • primary weight: which will be assigned to the primary of the group
  • secondary weight: which will be assigned the rest of the members of the group having quorum.

MOTIVATION

The main driver of this worklog is to keep group membership changes at sender's end in sync with stored sender list on receiver, thereby eliminating the need for user to manually update sender list (on receiver) everytime a member joins or leaves a group.

Also in some scenarios User/DBA may want receiver to always stay connected to sender having highest failover weight. And that can be useful for primary of the group to keep binary logs in sync with receiver's with minimum delay, by assigning primary always highest failover. In this worklog we would provide changes that would keep receiver always stay connected to higher sender weight and way through which user/DBA can assign highest failover weight to primary.

USER STORIES

  • As a MySQL DBA I want to setup asynchronous replication between two servers, S1 (sender) who is member of the group (group replication) to R1 (receiver), and in case group replication membership changes on S1, I want it to get automatically sync with stored sender list on R1, so that I do not have to manually update sender list (on receiver) everytime.

New member joins group:
-----------------------

sender list [S1, S2]                      sender list [S1, S2, S3]
[S1:50] ---> [R1]                         [S1:50] ------> [R1]

                       ------->
[S2:50]              S3 join group        [S2:50]

                                          [S3:50]


       S3 joins sender group, sender list on R1 (receiver)
       which initially had [S1, S2] is updated and now
       contains [S1, S2, S3].
       ------------------------------------------------


Existing member leaves group:
-----------------------------

sender list [S1, S2, S3]                  sender list [S1, S2]
[S1:50] ---> [R1]                         [S1:50] ------> [R1]

                       ------->
[S2:50]              S3 leaves group      [S2:50]


[S3:50]


       S3 leaves sender group, sender list on R1 (receiver)
       which initially had [S1, S2, S3] is updated and now
       contains [S1, S2].
       ------------------------------------------------

  • As a MySQL DBA I want to setup asynchronous replication between two servers, S1 (sender) who is member of the group (group replication) to R1 (receiver), and in case S1 loses quorum I want asynchronous replication to automatically failover the replication connection to one of S3, S4 or S5, according to highest weight.

sender list [S1,S2,S3,S4,S5]           sender list [S1,S2,S3,S4,S5]
-----------                            -----------
| [S1:50]-|----> [R1]                  | [S1:50]-| ---x--> [R1]
|         |                            | [S2:50] |          ^
|         |              ------->      -----------          |
| [S2:50] |            Majority Lost                        |
| [S3:50] |                            -----------          |
| [S4:50] |                            | [S3:50]-|-----------
| [S5:50] |                            | [S4:50] |
-----------                            | [S5:50] |
                                       -----------


       S3, S4 and S5 are not reachable from S1 and S2 and S1 and S2
       has lost majority. The asynchronous replication will get
       connected to one of the S3, S4 and S5.
       ------------------------------------------------

  • As a MySQL DBA I want to setup asynchronous replication between two servers, S1 (sender) to R1 (receiver), and in case a new sender S3 is added with higher failover weight then S1, I want asynchronous replication to automatically switch to S3, as its sender with higher weight.

sender list [S1, S2]              sender list [S1, S2, S3]
[S1:60] ---> [R1]                         [S1:60]          [R1]
                                                            ^
                      ------->                              |
[S2:50]             S3 added with         [S2:50]           |
                    higher weight                           |
                                                            |
                                          [S3:70] ----------|


    S3 is added in sender list with higher weight as compared to
    S1 and S2. R1 gets re-connected to S3.
    -------------------------------------------------------------

  • As a MySQL DBA I want to setup asynchronous replication between two servers, S1 (sender) who is primary member of the group (group replication) to R1 (receiver), and in case primary member changes i.e. S2 becomes PRIMARY and S1 becomes SECONDARY, I want asynchronous replication to automatically switch the replication connection to S2.

sender list [S1:PRI, S2, S3]              sender list [S1, S2:PRI, S3]
[S1:60] ---> [R1]                         [S1:50]          [R1]
                                                            ^
                       ------->                             |
[S2:50]             S2 becomes PRIMARY    [S2:60] ----------|


[S3:50]                                   [S3:50]


       The primary changes from S1 to S2. The weight is also changed as
       per mode and the asynchronous replication gets connected to S2.
       ----------------------------------------------------------------

Functional and Non-Functional Requirements

FUNCTIONAL REQUIREMENTS:

  • FR1. For Managed_type 'GroupReplication', when the monitoring thread is connected to a sender which belongs to the majority of Managed_name, it updates the current state of each sender listed on mysql.replication_asynchronous_connection_failover table, including adding new members and removing members that left.

  • FR2. For Managed_type 'GroupReplication', when a sender leaves a group (group replication state is OFFLINE/ERROR), its entry in the Sender List on receiver also MUST be removed with logged warning ER_RPL_ASYNC_SENDER_REMOVED. This is true if after the member leaving there the majority of the group is working. If this is the last member leaving, it will not be removed from the Sender List.

  • FR3. For Managed_type 'GroupReplication', and a new member joins a sender group, then the connection details (host, port, Managed_type, Managed_name) and weight with value either of primary or secondary weight provided in group's Configuration value of the mysql.replication_asynchronous_connection_failover_managed for the channel for this new member MUST be added to the Sender List on the receiver with logged warning ER_RPL_ASYNC_SENDER_ADDED. The new members to be added MUST have one of the following member state: ONLINE, RECOVERING or UNREACHABLE.

  • FR4. For Managed_type 'GroupReplication' and the monitoring worker notices that the sender, to which the receiver is connected to, is not in the primary partition (lost quorum) then it SHALL automatically stop the receiver, issue an entry in the error log (ER_RPL_ASYNC_CHANNEL_STOPPED_QUORUM_LOST). As a consequence the receiver (IO thread) SHALL also automatically failover to another sender with highest weight.

    The QUORUM for the group member is calculated using below query:

 SELECT IF(
   ((SELECT COUNT(*) FROM
     performance_schema.replication_group_members
     WHERE MEMBER_STATE != 'ONLINE' AND MEMBER_STATE != 'RECOVERY')
    >= ((SELECT COUNT(*) FROM
         performance_schema.replication_group_members)/2)=0),1,0);

  • FR5. When replica IO thread connects a source and replication asynchronous connection failover feature is enabled on receiver and Managed_type is 'GroupReplication', then it MUST check on the source for quorum (when group replication is enabled). Otherwise it logs ER_RPL_ASYNC_CHANNEL_CANT_CONNECT_NO_QUORUM error and tries connecting to another source (sender) from the Sender List according to the highest weight. It keeps on retrying in an endless loop till either the IO thread is stopped using STOP REPLICA (IO_THREAD) or it finds a sender who if member of the group has majority or single server (group replication disabled).

  • FR6. For senders who are members of the group and the group Managed_type is 'GroupReplication', if there is an error connecting sender to fetch sender membership or quorum details , it does retry connecting MASTER_RETRY_COUNT times and if still not able to connect, the replication channel fails (if the same sender has channel connected) with ER_RPL_ASYNC_CHANNEL_CANT_CONNECT error, then the receiver will attempt failover to one of the sender from the sender list. But it does not remove sender from the Sender List, so that sender can be retried and allowed to recover.

  • FR7. For senders who are members of the group and the group Managed_type is 'GroupReplication', if there is an error executing queries to fetch sender membership or quorum details, the replication channel fails with (if the same sender has channel connected) ER_RPL_ASYNC_EXECUTING_QUERY error, then the receiver will attempt failover to one of the sender from the sender list. But it does not remove sender from the Sender List, so that sender can be retried and allowed to recover.

  • FR8. The replication asynchronous connection failover feature can be enabled or disabled on running channel (replica doesn't need to be stopped), and so Monitor IO thread MUST not update sender details or stop channel, for the channel which is no longer managed i.e. SOURCE_CONNECTION_AUTO_FAILOVER=0.

  • FR9. The IO thread MUST not connect to RECOVERING group member because recovery can take time and that would unnecessary delay receiver in getting binary logs.

  • FR10. When a sender changes group and the group Managed_type is 'GroupReplication', its entry in the Sender List remains only if there is some other member of its new group already in Sender List, otherwise its entry in the Sender List on receiver MUST be removed with logged warning ER_RPL_ASYNC_SENDER_REMOVED.

    When a sender changes group, then its entry in the Sender List is first removed, as shown in figure below:

       Sender List                     Sender List
     [S1,S2,S3,S4,S5,S6]            [S1,S2,S3,S4,S5,S6]
        -------------                  -------------
        |  S1:UUID1 |                  |  S1:UUID1 |
        |  S2:UUID1 |    ----->        |  S2:UUID1 |
        |  S3:UUID1 |   S3 changes     | (removed) |
        |  S4:UUID2 |    group         |  S4:UUID2 |
        |  S5:UUID2 |                  |  S5:UUID2 |
        |  S6:UUID2 |                  |  S6:UUID2 |
        -------------                  -------------

  And later if it appears in its new group member's membership table
  (performance_schema.replication_group_members), then its re-added back to
  receiver Sender List with new Managed_name, as shown below:

       Sender List                     Sender List
     [S1,S2,S3,S4,S5,S6]            [S1,S2,S3,S4,S5,S6]
        -------------                  -------------
        |  S1:UUID1 |                  |  S1:UUID1 |
        |  S2:UUID1 |    ----->        |  S2:UUID1 |
        |  S4:UUID2 |   S3 changes     |  S3:UUID2 |
        |  S5:UUID2 |    group         |  S4:UUID2 |
        |  S6:UUID2 |                  |  S5:UUID2 |
        -------------                  |  S6:UUID2 |
                                       -------------


  For a sender to be re-added back to receiver sender list, S3 in this
  example, there must be another member of the sender's new group. In the
  above example we already have S4, S5 and S6 in Sender List which belongs
  to same new group (UUID2) to which S3 joined. So S3 was removed and
  re-added to sender list after reading membership table from either of
  S4, S5 or S6.

  • FR11. If a sender with higher priority for the same channel becomes available, the receiver MUST reconnect to it immediately, irrespective of whether current asynchronous replication connection is still connected and haven't failed or disconnected.

  • FR12. Whenever a sender who has group replication enabled and Managed_type is 'GroupReplication', and its group replication role changes, it's failover weight on receiver MUST be assigned with primary or secondary weight provided in group's Configuration value of the mysql.replication_asynchronous_connection_failover_managed for the channel.

NON FUNCTIONAL REQUIREMENTS:

  • NFR1. The functioning of Monitor IO thread MUST not affect performance of applier threads.

  • NFR2. When Replication Asynchronous Failover feature is enabled, the MASTER_RETRY_COUNT and MASTER_CONNECT_RETRY options of CHANGE MASTER command should be changed to 3 and 10 respectively, as their default values are too high which would make IO thread switch to another source after lot of retries and time duration giving away main purpose of this feature and multiple failover sources.

1. INTRODUCTION

This worklog enhances Replication Asynchronous Connection Failover feature by implementing a mechanism which would automatically update the sender list and if required reconnects asynchronous replication on following changes:

State change of sender which is also a group member.

Eg: members joining or leaving a group, which would result in updation of sender list.

sender list [S1, S2]                      sender list [S1, S2, S3]
[S1:90] ---> [R1]                         [S1:90] ------> [R1]

                       ------->
[S2:80]              S3 join group        [S2:80]

                                          [S3:50]

S3 joins sender group, sender list on R1 (receiver) which initially
had [S1, S2] is updated and now contains [S1, S2, S3].

Failover weight change of Sender.

Eg: a sender with higher failover weight then current asynchronous replication connection is added/updated. This will result in asynchronous replication re-connection to sender with higher weight and also updation of sender list.

sender list [S1, S2]              sender list [S1, S2, S3]
[S1:60] ---> [R1]                         [S1:60]          [R1]
                                                            ^
                      ------->                              |
[S2:50]             S3 added with         [S2:50]           |
                    higher weight                           |
                                                            |
                                          [S3:70] ----------|


S3 is added in sender list with higher weight as compared to
S1 and S2. R1 gets re-connected to S3.

Role change of sender which is also a group member

Eg: Primary is switched, which would result in asynchronous replication re-connection.

sender list [S1, S2, S3] sender list [S1, S2, S3] Configuration [Primary Weight:80, Configuration [Primary Weight:80, Secondary Weight:40] Secondary Weight:40] [S1:80] ---> [R1] [S1:40] [R1] ^ -------> | [S2:40] R1 connects to [S2:80] ------------| new source [S3:40] [S3:40]

S2 becomes a new primary and so it gets assigns higher weight by Monitor
IO thread, and R1 switches from S1 to S2.

This enhancement is divided into two parts:

  1. A new monitoring thread (Monitor IO Thread) would be added on receiver which would connect to all the senders and timely monitor for group (group replication) membership changes.

  2. On any sender failover weight change or, state or role change on sender who is also group (group replication). So based on whether sender is added/removed or its role or failover weight is changed appropriate action like update sender list on the receiver, or re-connect to new senders according to higher failover weight, is taken.

2. REPLICATION ASYNCHRONOUS CONNECTION SENDER LIST

The Sender List stores following sender's connection details and weight to influence next sender selection :

  • asynchronous replication channel name
  • sender connection details i.e. host, port and network namespace
  • failover weight based on which next sender would be selected.
  • [new] managed name which determine the group sender belongs to. This field will be added in this worklog and will be used to determine if sender changed group.

2.1. SENDER LIST STORAGE

A new field Managed_name would be added to mysql.replication_asynchronous_connection_failover table to determine which group sender belongs to initially and if it changed the group.

CREATE TABLE IF NOT EXISTS
mysql.replication_asynchronous_connection_failover (
  Channel_name CHAR(64) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL
  COMMENT 'The replication channel name that connects source and replica.',
  Host CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The source\'s
  hostname that the replica will attempt to switch over the replication
  connection to in case of a failure.',
  Port INTEGER UNSIGNED NOT NULL COMMENT 'The source\'s port that the
  replica will attempt to switch over the replication connection to in
  case of a failure.',
  Network_namespace VARCHAR(64) DEFAULT '' COMMENT 'The source\'s network
  namespace that the replica will attempt to switch over the replication
  connection to in case of a failure. If its value is empty, connections use
  the default (global) namespace.',
  Weight TINYINT UNSIGNED NOT NULL DEFAULT 50 COMMENT 'The order in which
  the replica shall try to switch the connection over to when there are
  failures. Weights can be set to a number between 1 and 100, where 100 is
  the highest weight and 1 the lowest.',
  Managed_name CHAR(64) NOT NULL CHARACTER SET utf8 COLLATE utf8_bin
  DEFAULT '' COMMENT 'The name of the group which this server belongs to.',
CONSTRAINT CHECK (Weight BETWEEN 1 AND 100),
PRIMARY KEY(Channel_name, Host, Port, Network_namespace, Managed_name))
DEFAULT CHARSET=utf8 STATS_PERSISTENT=0
COMMENT 'The source configuration details'

A new table mysql.replication_asynchronous_connection_failover_managed would be added which would help Monitor IO thread to determine how User/DBA wants the sender's group to be managed.

CREATE TABLE IF NOT EXISTS
mysql.replication_asynchronous_connection_failover_managed (
  Channel_name CHAR(64) NOT NULL CHARACTER SET utf8 COLLATE utf8_general_ci
  COMMENT 'The replication channel name that connects source and replica.',
  Managed_name CHAR(64) NOT NULL CHARACTER SET utf8 COLLATE utf8_bin
  DEFAULT ''
  COMMENT 'The name of the managed group which needs to be managed.'
  Managed_type CHAR(64) NOT NULL CHARACTER SET utf8 COLLATE utf8_bin
  DEFAULT '' COMMENT 'Determines the manged group type.',
  Configuration JSON DEFAULT NOT NULL
  COMMENT 'The data to help manage group. For Managed_type =
  PRIMARY_GROUP_REPLICATION, Configuration value should contain
  {"Primary_weight": 80, "Secondary_weight": 60}, so that it
  assigns weight=80 to PRIMARY of the group, and weight=60 for rest of the
  members in mysql.replication_asynchronous_connection_failover table.',
PRIMARY KEY(Channel_name, Managed_name))
DEFAULT CHARSET=utf8 STATS_PERSISTENT=0
COMMENT 'The source group configuration details'

3. MONITOR IO THREAD

A new thread known as Monitor IO thread is started on receiver which monitors for group member's state, role and quorum changes on all the potential senders in the Sender List, and if it finds any changes or lost quorum it does automatic update of the sender list or if required attempt to failover to one of the sender from the sender list who if member of the group has majority or single server (group replication disabled).

3.1. Monitor IO thread: Single vs Multiple thread

3.1.1. Single Monitor IO thread:

One Monitor IO thread will be used to monitor all the senders of all replication channels.

The Single Monitor IO thread starts with: - START REPLICA when replication asynchronous connection failover feature is enabled and Single Monitor IO thread is not already running. - enabling replication asynchronous connection failover feature provided replica IO thread is already started, and Single Monitor IO thread is not already running.

The Single Monitor IO thread stops with: - stop of replica IO thread for last replica which has replication asynchronous connection failover enabled. - disabling replication asynchronous connection failover feature for last replica which had replica running and replication asynchronous connection failover feature enabled.

3.1.2. Multiple Monitor IO thread (thread per channel)

One Monitor IO thread will be used to monitor all the senders of replication channels i.e. there will be different Monitor IO thread for each channel.

The Monitor IO thread for channel starts with: - START REPLICA when replication asynchronous connection failover feature is enabled, - enabling replication asynchronous connection failover feature provided replica IO thread is already started.

The Monitor IO thread for the channel stops with: - stop of replica IO thread. - disabling replication asynchronous connection failover feature.

3.1.3. Comparison of Single vs Multiple Monitor IO thread

Single IO Thread Multiple IO thread
1. Single thread would use lesser CPU 1. Multiple Monitor IO thread in
and Memory bound resources although comparsion would consume more CPU
that should be marginal performance and Memory resources.
improvement as in usual environement
we won't be expecting too many
channels.
2. The Single Monitor IO thread will 2. The higher number of Monitor IO
have lesser mutex contention. threads would also bring more mutex
contention, as these threads would
share variables and updates to
replication_asynchronous_connection_failover
table.
3. Any failour or error with the 3. The Multiple Monitor IO thread would
single thread or for particular be better as any issue with execution
channel would stop functioning or blocking of one thread won't block
for all the channels. changes for other thread (channel).
4. For start of Monitor IO thread it 4. The start/stop of Multiple Monitor IO
it needs to check if Monitor IO thread is much simpler as compared to
thread is already running. Single IO thread as it doesn't need
and for stop it check if its last IO to check for last IO thread with
thread, and for both start and stop replication asynchronous connection
whether replication asynchronous failover feature enabled.
connection feature enabled.

The worklog will use Single Monitor IO thread as it seems better for performance and mutex contention issues.

3.2. Monitor IO thread start

Please check section 3.1.1.

3.3. Monitor IO thread stop

Please check section 3.1.2.

3.4. Monitor IO thread updates Sender List

The Monitor IO thread updates Sender List (replication_asynchronous_connection_failover table) for following changes to group (group replication) membership:

  • When sender leaves the group, then its entry in the Sender List is removed.

  • When a new member joins a group, then connection details (host, port, network_namespace, Managed_name) of this new members is added to the Sender List on the receiver.

  • When a sender changes group, then its entry in the Sender List is first removed, as shown in figure below:

    Sender List                     Sender List
  [S1,S2,S3,S4,S5,S6]            [S1,S2,S3,S4,S5,S6]
     -------------                  -------------
     |  S1:UUID1 |                  |  S1:UUID1 |
     |  S2:UUID1 |    ----->        |  S2:UUID1 |
     |  S3:UUID1 |   S3 changes     | (removed) |
     |  S4:UUID2 |    group         |  S4:UUID2 |
     |  S5:UUID2 |                  |  S5:UUID2 |
     |  S6:UUID2 |                  |  S6:UUID2 |
     -------------                  -------------

And later if it appears in its new group member's membership table
(performance_schema.replication_group_members), then its re-added back to
receiver Sender List with new Managed_name, as shown below:

    Sender List                     Sender List
  [S1,S2,S3,S4,S5,S6]            [S1,S2,S3,S4,S5,S6]
     -------------                  -------------
     |  S1:UUID1 |                  |  S1:UUID1 |
     |  S2:UUID1 |    ----->        |  S2:UUID1 |
     |  S4:UUID2 |   S3 changes     |  S3:UUID2 |
     |  S5:UUID2 |    group         |  S4:UUID2 |
     |  S6:UUID2 |                  |  S5:UUID2 |
     -------------                  |  S6:UUID2 |
                                    -------------

For a sender to be re-added back to receiver sender list, S3 in this
example, there must be another member of the sender's new group. In the
above example we already have S4, S5 and S6 in Sender List which belongs
to same new group (UUID2) to which S3 joined. So S3 was removed and
re-added to sender list after reading membership table from either of
S4, S5 or S6.

  • The replication asynchronous connection failover feature can be enabled or disabled on running channel (replica doesn't need to be stopped), so Monitor IO thread skips updating sender details or stopping channel for the channel which is no longer managed i.e. SOURCE_CONNECTION_AUTO_FAILOVER = 0.

  • When a sender who has group replication enabled, and its group replication role changes, it's failover weight on receiver must be assigned with primary or secondary weight provided in group's Configuration value of the mysql.replication_asynchronous_connection_failover_managed for the channel. The primary_weight would be assigned to failover weight of primary of the group, and secondary weight to rest of group members.

3.5. Monitor IO thread reconnects replication channel

  1. If the sender is connected to replication channel and group replication enabled, and if it loses quorum then Monitor IO thread on the receiver will attempt failover to one of the sender who if member of the group has majority or single server (without group replication enabled).

  2. While querying for membership changes or quorum if Monitor IO thread is not able to connect to the sender or the executed queries fails, and channel is connected to same sender where this failure happened then it will attempt failover to one of the sender who if member of the group has majority or single server (without group replication enabled).

  3. When a sender with higher failover weight is found, compared to sender to which asynchronous replication connection is currently connected to, then the asynchronous replication will reconnect to sender having higher failover weight for the channel.

4. USER INTERFACE

The existing UDF asynchronous_connection_failover_add_source() and asynchronous_connection_failover_delete_source() will be used to add/remove unmanaged senders, the senders User/DBA doesn't want to be managed/monitored for any changes to group replication role or state changes. Two new UDFs will be added with this worklog to add/delete sender details which needs to be managed/monitored.

4.1. Add managed source configuration details

The asynchronous_connection_failover_add_managed() UDF can be used to add managed source configuration details:

SELECT asynchronous_connection_failover_add_managed(channel, managed_type, managed_name, host, port, network_namespace, primary_weight, secondary_weight);

Example for the managed_type 'GroupReplication':

SELECT asynchronous_connection_failover_add_managed('ch1', 'GroupReplication', 'UUID', '127.0.0.1', 3310, 'blue', 80, 60);

+--------------------------------------------------------------------------------------------------------------------+
| asynchronous_connection_failover_add_managed('ch1', 'GroupReplication', 'UUID', '127.0.0.1', 3310, 'blue', 80, 60) |
+--------------------------------------------------------------------------------------------------------------------+
| Source managed configuration details successfully inserted.                                                        |
+--------------------------------------------------------------------------------------------------------------------+
1 row in set (0,01 sec)

The Monitor IO thread would assign primary_weight to failover weight of primary of the group, and secondary weight to rest of group members.

4.2. Delete managed source configurations details

The asynchronous_connection_failover_delete_managed() UDF can be used to delete managed source configuration details:

SELECT asynchronous_connection_failover_delete_managed(channel, managed_name);

Example for the managed_type 'GroupReplication':
SELECT asynchronous_connection_failover_delete_managed(ch1', 'UUID');
+------------------------------------------------------------------------------+
| asynchronous_connection_failover_delete_managed(ch1', 'UUID')                |
+------------------------------------------------------------------------------+
| Source managed configuration details successfully deleted.                   |
+------------------------------------------------------------------------------+
1 row in set (0,01 sec)

Arguments:

The UDF's arguments used in above section 4.1 and 4.2 has following meaning:

  • channel The replication channel name that connects source and replica.

  • Managed_type The type of Managed service which needs to be provided. Currently, the only value it would accept and work upon is 'GroupReplication', through which group member connection details can be synced.

  • Managed_name The identifier to determine senders which needs to be managed. When the the managed_type is 'GroupReplication', the Managed_name would be group_replication_group_name system variable's value.

  • host The source hostname that the replica will attempt to switch over the replication connection to in case of a failure.

  • port The source port that the replica will attempt to switch over the replication connection to in case of a failure.

  • network_namespace The source network namespace that the replica will attempt to switch over the replication connection to in case of a failure. If its value is empty, connections use the default (global) namespace.

  • primary_weight When the the managed_type is 'GroupReplication', the primary_weight would be assigned to failover weight of the primary of the the group.

  • secondary_weight When the the managed_type is 'GroupReplication', the secondary_weight would be assigned to failover weight of all the secondaries of the the group.

Return value

A string containing the result of the operation, for example whether it was successful or not.

4.3. UDF Errors

4.3.1. Add managed source configuration details validation checks

The UDF asynchronous_connection_failover_add_managed() would return ER_UDF_ERROR error on following failures:

  1. Check that the parameter count is greater than 8. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify all mandatory arguments.

  2. Check that the channel name is provided. The channel_name cannot be empty. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify channel name.

  3. Check that the managed_name is provided. The managed_name cannot be empty. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify managed_name.

  4. Check that the managed_type is provided. The managed_type cannot be empty. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify managed type.

  5. Check host parameters value is provided and not empty. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify hostname.

  6. Check port parameters value is provided. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify value for port.

  7. Check primary_weight and secondary_weight parameters values are provided. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong arguments: You need to specify value for both primary_weight and secondary_weight.

  8. Check the primary_weight parameter value is between 1-100. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong argument: The primary_weight argument value must be between 1-100.

  9. Check the secondary_weight parameter value is between 1-100. ERROR 1123 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong argument: The secondary_weight argument value must be between 1-100.

  10. Check if the source is already added. ERROR 3200 (HY000): asynchronous_connection_failover_add_managed UDF failed; Wrong argument: Error inserting row to the table.

4.3.2. Delete primary configuration details validation checks

The UDF asynchronous_connection_failover_delete_managed() would return ER_UDF_ERROR error on following failures:

  1. Check that the parameter count is greater than 2. ERROR 3200 (HY000): asynchronous_connection_failover_delete_managed UDF failed; Wrong arguments: You must specify all arguments.

  2. Check that the channel name is provided. The channel_name can be empty. ERROR 3200 (HY000): asynchronous_connection_failover_delete_managed UDF failed; Wrong arguments: You must specify channel name.

  3. Check that the managed_name is provided. The managed_name cannot be empty. ERROR 3200 (HY000): asynchronous_connection_failover_delete_managed UDF failed; Wrong arguments: You must specify managed name.

4.3.3. Errors while registering/unregistering asynchronous connection failover UDFs

When mysql server initializes, it will also register asynchronous_connection_failover_add_managed() and asynchronous_connection_failover_delete_manaaged() UDFs, so that User/DBA can use them to add/delete managed source configuration details. And when mysql stops it does unregisters them.

The following errors can occur while registering/unregistering these UDFs: 1. Error acquire registry service. If it fails to acquire registry service which is used for UDF registration, then it fails with ER_UDF_REGISTER_SERVICE_ERROR error.

  1. Error registering asynchronous connection failover UDFs. After acquiring register service UDFs are registered, and if it fails to register these UDFs, then it fails with ER_UDF_REGISTER_ERROR error.

  2. Error unregistering asynchronous connection failover UDFs. And when mysql server stop it also unregisters asynchronous_connection_failover_add_managed() and asynchronous_connection_failover_delete_managed() UDFs, and if there is error in unregistering it fails with ER_UDF_UNREGISTER_ERROR.

5. SECURITY CONTEXT

The replication user on the source server must have

GRANT SELECT ON performance_schema.* TO 'USER';

in order to select the performance schema tables:

  • information_schema.plugins

  • performance_schema.replication_group_members

  • performance_schema.global_status

  • performance_schema.global_variables

6. CROSS-VERSION REPLICATION

There is no impact on cross-version replication.

7. UPGRADE/DOWNGRADE

The worklog will introduce a new column Managed_name in the replication_asynchronous_connection_failover table, but that is not expected to cause problems.

8. PROTOCOL

There are no changes in any protocol.

9. FAILURE MODEL SPECIFICATION

There will be no modifications to failure model specification.

10. OBSERVABILITY

A new Monitor IO thread would be added with this worklog and would be added to performance_schema.threads with following details:

NAME    thread/sql/replica_monitor
TYPE:   FOREGROUND
PROCESSLIST_STATE: [stage_connecting_to_sources/
                   stage_fetching_source_member_details/
                   stage_upating_source_member_details/
                   stage_wait_before_next_fetch]

Processlist State details:
PSI_stage_info stage_connecting_to_sources= { 0, "Connecting to sources", 0, PSI_DOCUMENT_ME};
PSI_stage_info stage_fetching_source_member_details= { 0, "Fetching source member details from connected source", 0, PSI_DOCUMENT_ME};
PSI_stage_info stage_upating_source_member_details= { 0, "Updating fetched source member details on receiver", 0, PSI_DOCUMENT_ME};
PSI_stage_info stage_wait_before_next_fetch= { 0, "Wait before trying to fetch next membership changes from source", 0, PSI_DOCUMENT_ME};

Interface Specification

Events that have severity higher than information/note, shall be written to the error log as well. The following errors shall be written:

I.1. ER_RPL_ASYNC_SENDER_REMOVED [WARNING]

eng "The source (host:%s port:%u network_namespace:%s) for channel '%s' has left the group (group_name: %s), and so removed its entry from replication_asynchronous_connection_failover table. Check for errors in its error log and if found add its connection details again after correction using asynchronous_connection_failover_add_source() UDF."

I.2. ER_RPL_ASYNC_SENDER_ADDED [WARNING]

eng "The source (host:%s port:%u network_namespace:%s) for channel '%s' has joined the group (group_name: %s), and so added its entry into replication_asynchronous_connection_failover table."

I.3. ER_RPL_ASYNC_CHANNEL_STOPPED_QUORUM_LOST [ERROR]

eng "The source (host:%s port:%u network_namespace:%s) has lost quorum, and so channel '%s' connected to it will be stopped. Check for errors in its error log and on correction add its connection details again using asynchronous_connection_failover_add_source() UDF."

I.4. ER_RPL_ASYNC_CHANNEL_CANT_CONNECT_NO_QUORUM [ERROR]

eng "The source (host:%s port:%u network_namespace:%s) does not have quorum, and so connection will be killed for channel '%s'. Check for errors in its error log."

I.5. ER_RPL_ASYNC_CHANNEL_CANT_CONNECT [ERROR]

eng "Failed to connect to the source (host:%s port:%u network_namespace:%s) for channel '%s'. Check for errors in its error log."

I.6. ER_RPL_ASYNC_EXECUTING_QUERY [ERROR]

eng "Failed executing %s query on the source (host:%s port:%u network_namespace:%s) for channel '%s'. Check for other errors in its error log."

11. SUMMARY OF CHANGES

11.1 Monitor IO Thread

1 The replication asynchronous connection failover feature is enabled for the channel by executing below command:

CHANGE MASTER TO SOURCE_CONNECTION_AUTO_FAILOVER = 1 [FOR CHANNEL channel]

2 The Replica or Replica IO thread is started using any of below sql command:

START REPLICA [FOR CHANNEL channel] START REPLICA IO_THREAD [FOR CHANNEL channel]

3 The step 1 and 2 executed in any order will start Monitor IO Thread. The Monitor IO thread does not have explicit command to start, but require both replication asynchronous connection failover feature and IO thread to be enabled.

While starting Monitor IO thread it is also checked if there is already one enabled. As there will be only Single Monitor IO thread for all the channels, so it is verified if one is not already running before starting a new one.

4 For each replica channels for which replica IO thread is started and also replication asynchronous connection failover feature is enabled, the following task are execute:

4.i. Get stored sender details for the channel from replication_asynchronous_connection_failover table.

4.ii. If not already connected, then connect to all senders gathered in step 4.i. and store their connection object.

5 For each connection stored in step 4.ii. is picked and processed as below.

5.i. This all below points does apply to all below tasks (5.ii. - 5.iv.) where sender membership and quorum details are gathered and updated:

A. But if once data is updated for any sender using connection connection picked in this step, then any further connection to its group members will be ignored as we already found stable group member which has majority and ONLINE. The new connection to another sender is picked in step 5, so if we have already gathered membership data in step 5.i.A. and updated in 5.iv.A.a, 5.iv.A.b and 5.iv.A.c, then membership data doesn't need to be updated again from another sender.

B. While the Monitor IO thread reads sender's details from replication_asynchronous_connection_failover table and till it updates changes to replication_asynchronous_connection_failover table, if there is any insert or delete using asynchronous_connection_failover_add_source() or asynchronous_connection_failover_delete_source() UDFs respectively, then Monitor IO will ignore its gathered updated data and will restart gathering new sender details again.

C. Before every new query and update to replication_asynchronous_connection_failover table it is checked if Monitor IO thread has been terminated or stopped, and if it is then this monitoring process is stopped and returned.

D. If the replication asynchronous connection failover feature is disabled for the channel, of which current sender (connection is to a particular sender) is part of i.e. sender's channel in replication_asynchronous_connection_failover table is equal to current channel and its MANAGED = 0, then jump to step 6.

5.ii. We will use following terms for the QUORUM status further in this algorithm:

1 -> MANAGED_GR_HAS_QUORUM
2 -> MANAGED_GR_HAS_ERROR
3 -> NOT_MANAGED

For each connection to sender, we would check whether its managed i.e. managed_name column for the sender is not empty in mysql.replication_asynchronous_connection_failover table. And if its not empty, its QUORUM status is determined.

It can be summarized as below:

if (managed_name):
    // server is always handled like a group member
    // then we can check
    if (not GR installed)
       2 -> MANAGED_GR_HAS_ERROR
    if (quorum)
       1 -> MANAGED_GR_HAS_QUORUM
    else
       2 -> MANAGED_GR_HAS_ERROR
else:
    3 -> NOT_MANAGED

To determine QUORUM status for the sender following query would be used:

SELECT * FROM (
(SELECT CASE
 WHEN ((SELECT COUNT(*) FROM
       mysql.replication_asynchronous_connection_failover_managed RAFM,
       mysql.replication_asynchronous_connection_failover RAF
       WHERE RAF.Managed_name=RAFM.Managed_name
       AND   RAF.channel_name=RAFM.channel_name
       AND   RAF.Host=@@global.hostname
       AND   RAF.Port=@@global.port) = 1)
 THEN (
  SELECT CASE
    WHEN ((SELECT count(*) from information_schema.plugins WHERE
           PLUGIN_NAME LIKE 'group_replication') <> 1)
    THEN (SELECT 2) /* MEMBER HAS GR NOT INSTALLED */

    WHEN ((SELECT IF(MEMBER_STATE='ONLINE' AND
          ((SELECT COUNT(*) FROM
            performance_schema.replication_group_members
            WHERE MEMBER_STATE != 'ONLINE' AND MEMBER_STATE != 'RECOVERY')
            >= ((SELECT COUNT(*) FROM
                 performance_schema.replication_group_members)/2)=0),1,0)
       FROM performance_schema.replication_group_members
       JOIN performance_schema.replication_group_member_stats
       USING(member_id)
       WHERE member_id=@@global.server_uuid) = 1)
    THEN (SELECT 1) /* MEMBER HAS QUORUM ITS GROUP MEMBERS NEEDS TO BE
                             ADDED TO SENDER LIST */

    ELSE (SELECT 2) /* MEMBER HAS NO QUORUM */
  END AS QUORUM)

 ELSE (select 3) END AS MANAGED) /* NOT A GROUP MEMBER */
) Q;

5.iii. If QUORUM status is NOT_MANAGED then ignore this sender.

5.iv. If QUORUM status is MANAGED_GR_HAS_QUORUM:

5.iv.A. Get membership details from the connected sender by executing below query:

SELECT @@global.group_replication_group_name, PRGM.MEMBER_HOST,
       PRGM.MEMBER_PORT, PRGM.MEMBER_STATE, PRGM.MEMBER_ROLE
FROM performance_schema.replication_group_members PRGM;

Save group name details so new connection to another sender of the same group can be ignored in step 5.

For each member from the above table below tasks are executed.

5.iv.A.a. If member state is either of ONLINE, RECOVERING or UNREACHABLE, then insert this row in replication_asynchronous_connection_failover table with secondary_weight or primary_weight for new members based on member role. This step will accomplish FR3 requirement.

5.iv.A.b. If member is not ONLINE or group_name has changed i.e. member has changed group and new Managed_name differs from old Managed_name in replication_asynchronous_connection_failover table. If this member is connected through asynchronous channel i.e. host and port of member match, with channel host and port, then attempt reconnect this replication channel to another sender. The Monitor IO thread disconnects the existing replication channel connection and then IO thread then attempts to reconnect this replication channel to another sender from the Sender List. The network_namespace is not checked as group_replication doesn't use network_namespace yet.

5.iv.A.c. If a member is OFFLINE or in ERROR state, that is, it is not included on the gathered membership, then remove this member details from the replication_asynchronous_connection_failover table. If this member is connected through asynchronous channel i.e. host and port of member match, with channel host and port, then attempt reconnect this replication channel to another sender. The Monitor IO thread disconnects the existing replication channel connection and then IO thread then attempts to reconnect this replication channel to another sender from the Sender List. The network_namespace is not checked as group_replication doesn't use network_namespace yet.

5.iv.C. If QUORUM status is MANAGED_GR_HAS_ERROR:

If this member is connected through asynchronous channel i.e. host and port of member match, with channel host and port, then attempt reconnect this replication channel to another sender. The Monitor IO thread disconnects the existing replication channel connection and then IO thread then attempts to reconnect this replication channel to another sender from the Sender List. The network_namespace is not checked as group_replication doesn't use network_namespace yet.

6 The Monitor IO thread will sleep for 5 seconds and after awake it jumps to step 4 to restart whole monitoring updates to sender process.

11.2. IO Thread

The only changes to IO thread in this worklog will be checks for its member state and quorum status after it gets connected to sender.

It would check if connected sender has GROUP REPLICATION enabled, and if enabled: - has QUORUM, and, - is ONLINE, that is, not RECOVERING member.

If the connected sender doesn't have QUORUM or is not ONLINE, then IO thread will look for new sender, who if is a member of a group satisfy above criteria.

If we do not add these checks in IO thread then receiver can connect again to same sender which was earlier disconnected by Monitor IO thread because of lost QUORUM. Also we are not allowing IO thread to connect to RECOVERING member, because recovery can take time and would unecessary delay receiver in getting binary logs.

11.3. On START REPLICA:

When replication asynchronous connection failover is already enabled and then replica or replica IO thread is started using any of below sql command:

START REPLICA [FOR CHANNEL channel]

START REPLICA IO_THREAD [FOR CHANNEL channel]

  • This will also start Monitor IO thread. The Monitor IO thread does not have explicit command to start, but enabling replication asynchronous connection failover feature and then starting IO thread would also start Monitor IO thread.

  • When replica IO thread connects a source and if replication asynchronous connection failover feature is also enabled on receiver, then it checks for quorum on the source (when group replication is enabled). Otherwise it keeps trying to connect another source (sender) from the Sender List according to the highest priority. It keeps on retrying in an endless loop till either the IO thread is stopped using STOP REPLICA (IO_THREAD) or it finds a sender which has quorum.

11.4. On Replication Asynchronous Connection Failover enabled:

When replication asynchronous is running and then replication asynchronous connection failover is enabled using any of below sql command:

CHANGE MASTER TO SOURCE_CONNECTION_AUTO_FAILOVER = 1 [FOR CHANNEL channel]

  • This will also start Monitor IO thread. The Monitor IO thread does not have explicit command to start, but if IO thread is running and then enabling replication asynchronous connection failover feature would start Monitor IO thread.

11.5. On STOP REPLICA:

When replica or replica IO thread is stopped using any of below sql command:

STOP REPLICA [FOR CHANNEL channel]

STOP REPLICA IO_THREAD [FOR CHANNEL channel]

  • This will also stop Monitor IO thread.

11.6. On Replication Asynchronous Connection Failover disabled:

When replication asynchronous connection failover is disabled using any of below sql command:

CHANGE MASTER TO SOURCE_CONNECTION_AUTO_FAILOVER = 0 [FOR CHANNEL channel]

  • This will also stop Monitor IO thread.

11.7. On Monitor IO thread start:

When Monitor IO thread starts on receiver it connects all the senders present in the Sender List and monitors for group (group replication) membership changes.

The Sender List gets updated for following changes to group (group replication) membership:

11.7.1. On sender joining group:

When a new member joins the group, whose other members are already present in the Sender List, then connection details (host, port, network_namespace, Managed_name) of this new member is added to the Sender List on the receiver i.e. add sender connection details to replication_asynchronous_connection_failover table.

11.7.2. On sender leaving group:

When a sender leaves a group, then its entry on receiver in the Sender List is removed i.e. deleted sender connection details in replication_asynchronous_connection_failover table.

11.7.3. On sender losing quorum:

If sender is connected through asynchronous replication channel for which replication asynchronous connection failover is enabled, then attempt reconnect this replication channel to another sender.

11.7.4. On sender role change:

When a sender group replication role changes, it's failover weight on receiver must be assigned with primary or secondary weight provided in group's configuration value of the mysql.replication_asynchronous_connection_failover_managed for the channel. The primary_weight would be assigned to failover weight of primary of the group, and secondary weight to rest of group members.

11.7.5. On Managed_name different from sender group name:

If group replication is enabled on a sender, and its Managed_name column is non-empty and non-matching to group name on sender, then sender entry is deleted from replication_asynchronous_connection_failover table on receiver.

11.7.6. On Sender failover weight change:

When a sender with higher failover weight is found, compared to sender to which asynchronous replication connection is currently connected to, then the asynchronous replication will reconnect to sender having higher failover weight for the channel.