WL#13188: Add support for InnoDB ReplicaSet in Router

Affects: Server-8.0   —   Status: Complete

Async Replica Sets are a new solution for creation and management of asynchronous MySQL replication setups. They provide ease-of-use and out-of-box usability similar to InnoDB Clusters for Group Replication setups.

Like InnoDB Clusters, they require support across several MySQL components: MySQL Server, MySQL Router, and MySQL Shell.

This WL# describes the necessary parts for supporting Async Replica Sets in the MySQL Router.


When the MySQL Shell's AdminAPI is used to create an Async Replica Set, it creates the metadata schema that holds the topology information, as it does for InnoDB Clusters.

The Shell performs the steps necessary to set up asynchronous MySQL replication and manages it during operation.

The MySQL Router needs to be extended to understand the concept of Async Replica Sets, to load and cache the topology information from the metadata schema, and to regularly update the status information of the setup so that it can perform the correct routing operations.

Version Compatibility

The InnoDB cluster metadata schema will be changed and its version increased.

However, for backwards compatibility, older versions of the MySQL Router should still be able to route using the newer version of the metadata schema, albeit with some limitations (bootstrap should fail).

Similarly, newer versions of MySQL Router should be able to function with the older version of the metadata schema, as well as with the new one.

This cross-compatibility requirement is a one-time requirement and should not be necessary in future metadata upgrades, as a stable public interface for the metadata schema will be provided.

Bootstrapping

  • B-FR1 - A target metadata schema version of 2.0.x must be accepted for bootstrapping. Bootstrapping against metadata version 1.y.z should not be supported (the Router is expected to refuse to bootstrap).

  • B-FR2 - Bootstrap must recognize whether the target instance belongs to an InnoDB cluster or an Async replica set.

  • B-FR3 - If the target instance belongs to an InnoDB cluster, the regular InnoDB cluster bootstrap process must be executed, but using the new v2_* public metadata views.

  • B-FR4 - If the target instance belongs to an Async replica set, a similar process for replica sets must be executed, where:

    • 4.1 - v2_ar_members is queried for members of the same replica set and their addresses cached locally, including the indication of the PRIMARY member
    • 4.2 - if the PRIMARY member is not the one the router is bootstrapping from, it must automatically reconnect to that PRIMARY and restart the bootstrap process
    • 4.3 - the bootstrap process must be aborted if it is not possible to connect to the PRIMARY member
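
The bootstrap rules above can be illustrated with a minimal sketch. It is not the Router's implementation: the names `BootstrapError`, `Member`, `check_metadata_version`, `resolve_bootstrap_target` and the `can_connect` callback are hypothetical, and two behaviors are assumptions that the requirements do not state: any version other than 2.0.x is refused, and a member list without a PRIMARY aborts the bootstrap.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class BootstrapError(Exception):
    """Raised when the bootstrap must be aborted."""


@dataclass(frozen=True)
class Member:
    address: str      # host:port as it would be read from v2_ar_members
    is_primary: bool  # indication of the PRIMARY member (B-FR4, 4.1)


def check_metadata_version(version: Tuple[int, int, int]) -> None:
    """B-FR1: accept 2.0.x, refuse 1.y.z.

    Assumption: every version other than 2.0.x is refused.
    """
    major, minor, patch = version
    if (major, minor) != (2, 0):
        raise BootstrapError(
            f"unsupported metadata version {major}.{minor}.{patch}")


def resolve_bootstrap_target(current: str,
                             members: List[Member],
                             can_connect: Callable[[str], bool]) -> str:
    """B-FR4, 4.2 and 4.3: return the address to bootstrap from."""
    primary = next((m for m in members if m.is_primary), None)
    if primary is None:
        # Assumption: the requirements do not cover this case.
        raise BootstrapError("no PRIMARY member found")
    if primary.address == current:
        return current  # already bootstrapping from the PRIMARY
    if not can_connect(primary.address):
        raise BootstrapError(f"cannot connect to PRIMARY {primary.address}")
    return primary.address  # caller reconnects here and restarts bootstrap
```

A caller would run `check_metadata_version` first and, if `resolve_bootstrap_target` returns an address different from the current one, reconnect to it and restart the bootstrap from the beginning.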

Metadata Cache: General

  • MD-FR1 - If the metadata version is 1.0, the existing behavior should be preserved

  • MD-FR2 - If the metadata version is 2.0, only the public views with the v2_* prefix may be queried; no other tables must be accessed

  • MD-FR3 - Metadata version upgrades between metadata refreshes must be supported (that is, Router must not require a restart after a metadata schema upgrade - WL#13417)

Metadata Cache: Async Replica Sets

  • R-FR1 - Each time the Router performs a metadata refresh, it should start with the last known set of members (either from the state file or from the last successful refresh round).

  • R-FR2 - In each refresh round, the Router should attempt to connect to each of the members from the set described in [R-FR1].

    • R-FR2.1 If the Router was able to connect to the member, it should first query the cluster view_id this member has.
      • R-FR2.1.1 If the view_id is greater than every other view_id seen during this refresh round, the Router should query the metadata from that member and replace the new cluster members list with the result. Whether the queried member is PRIMARY or SECONDARY should not matter; the deciding factor is the view_id.
        • R-FR2.1.1.1 An additional check when querying the view and the metadata is the cluster_id. The Router should only accept the metadata if the cluster_id matches the one saved in the state file during bootstrap. If the cluster_id does not match, the member should be treated as if the Router could not connect to it.
      • R-FR2.1.2 If the view_id is equal to or smaller than the highest view_id seen during this round, the Router should not run any further queries and should proceed to the next member in the list.
    • R-FR2.2 Failing to connect, or a failure during any query, should have the same result as in [R-FR2.1.2]: proceeding to the next member in the list.
    • R-FR2.3 The latest view_id should be cached between metadata refresh rounds. The Router should only consider metadata with a view_id equal to or greater than the cached one.
  • R-FR3 If the new members list is empty after the whole round has finished (all members from the list were queried), for example because the Router could not connect to any member or failed to fetch the metadata, the routing data should be cleared. That means that all existing connections should be dropped and no new connections should be accepted until the Router manages to get the current metadata in a following attempt.

    • R-FR3.1 However, an empty new members list should not replace the last non-empty list of members used for the following refresh attempts, as that would leave the Router in a state from which it could not recover. The next metadata refresh round should always use the last non-empty list the Router managed to get.
    • R-FR3.2 Also, an empty new members list should never be written to the state file.
  • R-FR4 If the new members list is not empty, the Router should compare it with the last known members list (the one it just used for the refresh).

    • R-FR4.1 If it has not changed, no further action is needed.
    • R-FR4.2 If it has changed, the Router should store the new members list in the state file.
    • R-FR4.3 If it has changed, new connections should be routed to the members from the new list. The new members list becomes the current list and should also be used for the next round of the metadata refresh.
    • R-FR4.4 If it has changed, all user connections to any member that was on the previous list and is not on the new members list should be forcefully closed.
    • R-FR4.5 If a PRIMARY change has been detected, all connections to the previous PRIMARY should be forcefully closed.
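
The refresh requirements R-FR1 to R-FR4 can be read as a two-step procedure: collect the newest metadata from the known members, then apply the result. The sketch below is one possible reading, not the Router's code. All names (`Member`, `RouterState`, `refresh_round`, `apply_round`, and the `probe` and `fetch` callbacks standing in for the view_id and metadata queries) are hypothetical, and the state file is represented only by a write counter. It also assumes that a member whose metadata query fails does not raise the highest view_id seen in the round.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Member:
    address: str
    role: str  # "PRIMARY" or "SECONDARY"


Probe = Callable[[str], Optional[Tuple[str, int]]]  # -> (cluster_id, view_id)
Fetch = Callable[[str], Optional[List[Member]]]     # -> members, None on failure


def refresh_round(known: List[Member], cached_view_id: int, cluster_id: str,
                  probe: Probe, fetch: Fetch) -> Tuple[List[Member], int]:
    """R-FR2: visit every known member, keep metadata with the highest view_id."""
    best_view: Optional[int] = None
    new_members: List[Member] = []
    for member in known:
        seen = probe(member.address)
        if seen is None:                 # R-FR2.2: connect or query failed
            continue
        seen_cluster, view_id = seen
        if seen_cluster != cluster_id:   # R-FR2.1.1.1: treat as unreachable
            continue
        if view_id < cached_view_id:     # R-FR2.3: older than the cached view
            continue
        if best_view is not None and view_id <= best_view:  # R-FR2.1.2
            continue
        fetched = fetch(member.address)
        if not fetched:                  # R-FR2.2 (assumption: best_view unchanged)
            continue
        best_view, new_members = view_id, list(fetched)     # R-FR2.1.1
    return new_members, cached_view_id if best_view is None else best_view


@dataclass
class RouterState:
    members: List[Member]        # last non-empty list, used for the next round
    view_id: int = 0
    routing_enabled: bool = True
    state_file_writes: int = 0   # stand-in for persisting the state file


def _primary(members: List[Member]) -> Optional[str]:
    return next((m.address for m in members if m.role == "PRIMARY"), None)


def apply_round(state: RouterState, new_members: List[Member],
                new_view_id: int) -> Set[str]:
    """R-FR3 and R-FR4: return addresses whose connections must be closed."""
    old = state.members
    if not new_members:
        # R-FR3: clear routing, but keep the old list (R-FR3.1) and do not
        # touch the state file (R-FR3.2).
        state.routing_enabled = False
        return {m.address for m in old}
    state.routing_enabled = True
    state.view_id = new_view_id
    if set(new_members) == set(old):     # R-FR4.1
        return set()
    state.members = list(new_members)    # R-FR4.3
    state.state_file_writes += 1         # R-FR4.2
    to_close = {m.address for m in old} - {m.address for m in new_members}
    old_primary = _primary(old)
    if old_primary is not None and old_primary != _primary(new_members):
        to_close.add(old_primary)        # R-FR4.5
    return to_close                      # includes removed members (R-FR4.4)
```

For example, if the known list is a (PRIMARY), b and c, and b reports a higher view_id in which b is PRIMARY and c is gone, `apply_round` returns the addresses of a and c: c because it left the list, a because it is no longer the PRIMARY.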

Remote Monitoring

  • M-FR1 - Same monitoring endpoints as available for clusters must be available for replica sets

  • M-FR2 - It must be possible to query the type of a routing target

Since the Router can now bootstrap against both a GR-based Cluster and an Async Replica Set, a new option, "cluster_type", is added to the [metadata_cache] section of the config file on bootstrap. For example:


[metadata_cache:mycluster]
cluster_type=ar
router_id=1
user=mysql_router1_ritc56yrjz42
metadata_cluster=mycluster
ttl=0.5

Like most other [metadata_cache] options, it should generally be considered read-only and not be modified manually by the user. The allowed values are "ar" and "gr"; the Router fails to start if any other value is set. If the configured type does not match the cluster type discovered at runtime, the Router logs an error and fails to update the metadata.
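
The two checks described above, validation at startup and comparison at runtime, can be sketched as follows. This only illustrates the rules and is not how the Router parses its configuration: the functions `read_cluster_type` and `type_matches` are hypothetical, and Python's `configparser` stands in for the Router's own config reader. The handling of a missing "cluster_type" option is not specified here, so the sketch simply lets the parser's error propagate.

```python
import configparser

ALLOWED_CLUSTER_TYPES = {"ar", "gr"}


def read_cluster_type(config_text: str, section: str) -> str:
    """Startup check: return cluster_type or raise if the value is not allowed."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    value = parser.get(section, "cluster_type")
    if value not in ALLOWED_CLUSTER_TYPES:
        # Corresponds to the Router failing to start.
        raise ValueError(f"invalid cluster_type '{value}', expected 'ar' or 'gr'")
    return value


def type_matches(configured: str, discovered: str) -> bool:
    """Runtime check: False means log an error and skip the metadata update."""
    return configured == discovered
```

With the example section shown earlier, `read_cluster_type(text, "metadata_cache:mycluster")` would return "ar"; a later refresh that discovers a "gr" cluster would make `type_matches("ar", "gr")` return False.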