WL#12089: Support for dynamic changes between single and multi master GR mode

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

WL#10378 "Group Replication: group single/multi primary mode change and primary election" enables dynamic changes of the single/multi master mode in GR. Some changes are needed in MySQL Router to support that.

In the current implementation, when the Router is bootstrapped it queries the current metadata state and creates a static configuration based on the results of those queries.

If it discovers single master mode, it creates 2 pairs of input ports (one for classic and one for X protocol). In each pair, the first port is for r/w operations (role=PRIMARY) and the second is for r/o operations (role=SECONDARY).

If it discovers multi master mode, it creates only a single pair of input ports for r/w operations (role=PRIMARY).

That leads to 2 problems when the mode changes dynamically with such a default static configuration:

  1. In case of a multi-master -> single-master change, there is no port for non-PRIMARY nodes, which means they become unreachable.

  2. In case of a single-master -> multi-master change, there is a port for role=SECONDARY which becomes unusable, as there are no longer any SECONDARY nodes.

Functional Requirements

FR1
Bootstrap shall generate a configuration file, which will have both RW and RO Routing sections for all selected/implied transports (TCP, named-socket; classic and X protocols), regardless of how many PRIMARY or SECONDARY nodes the cluster has.
FR2
Configuration generated according to FR1 should be able to service RW connections when 1 or all nodes are PRIMARY, and should be able to service RO connections for any number of (PRIMARY nodes + SECONDARY nodes) > 0.
FR3
To fulfill FR2, RO connections should be routed to RW nodes when there are no available RO nodes.
FR4
FR2 and FR3 must remain fulfilled even if the number of PRIMARIES and SECONDARIES changes during Router operation, without having to restart the Router.
FR5
When a particular node changes state from RW to RO, all current connections that are routed to it via the RW port shall be closed (RO routings will remain open, because the user never expected to make RW requests through them in the first place).

Please note that FR1 and FR2 are the genuinely new features of this WL and require implementation. FR3-FR5 are already implemented, but must be verified to also work for this new use case (dynamic switching between single-master (SM) and multi-master (MM) modes).
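The rule in FR5 can be sketched as a small selection function. This is only an illustration of the requirement, not the Router's implementation: the Connection type, the "rw"/"ro" tags and the role dictionaries are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical record of one routed client connection."""
    conn_id: int
    node: str  # destination node the connection was routed to
    via: str   # "rw" or "ro": the routing section that accepted it


def connections_to_close(connections, old_roles, new_roles):
    """Return the connections FR5 requires to be closed.

    old_roles / new_roles map node name -> "PRIMARY" or "SECONDARY",
    as seen before and after a (hypothetical) metadata refresh.
    """
    # Nodes that changed state from RW (PRIMARY) to RO (SECONDARY).
    demoted = {
        node
        for node, role in old_roles.items()
        if role == "PRIMARY" and new_roles.get(node) == "SECONDARY"
    }
    # Only connections made through the RW port are closed;
    # RO routings to the same node stay open.
    return [c for c in connections if c.via == "rw" and c.node in demoted]
```

For example, after an MM -> SM switch in which only node A stays PRIMARY, RW connections to B and C would be selected for closing, while RO connections to them would be left alone.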

PROPOSED CHANGES



1. Bootstrap-generated configuration

To implement FR1 and FR2, all we really need is a new "universal" configuration file, which will handle both SM and MM modes well. It should be generated by bootstrap. Except for the Routing plugin sections (shown below), all else would remain unchanged. Example configuration after bootstrapping:

[routing:mycluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://mycluster/default?role=PRIMARY
routing_strategy=first-available
protocol=classic

[routing:mycluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://mycluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=classic

[routing:mycluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://mycluster/default?role=PRIMARY
routing_strategy=first-available
protocol=x

[routing:mycluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://mycluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=x

Aside from the fact that from now on we will always generate both RW and RO configuration sections, the only other change vs what we have now is the routing_strategy appearing in those sections:

routing    now            WL12089
RW         round-robin    first-available
RO         round-robin    round-robin-with-fallback
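As a rough sketch of FR1, the four Routing sections could be produced by a generator like the one below. It uses Python's standard configparser purely for illustration; the function names are hypothetical, and the ports, section names and strategies are copied from the example configuration rather than taken from the actual bootstrap code.

```python
import configparser
import io

# Values copied from the example configuration in this WL.
STRATEGY = {"rw": "first-available", "ro": "round-robin-with-fallback"}
ROLE = {"rw": "PRIMARY", "ro": "SECONDARY"}
PORTS = {
    ("classic", "rw"): 6446,
    ("classic", "ro"): 6447,
    ("x", "rw"): 64460,
    ("x", "ro"): 64470,
}


def build_routing_config(cluster, replicaset="default", bind_address="0.0.0.0"):
    """Build RW and RO sections for both protocols, regardless of cluster mode."""
    config = configparser.ConfigParser()
    for protocol in ("classic", "x"):
        for mode in ("rw", "ro"):
            infix = "_x" if protocol == "x" else ""
            section = f"routing:{cluster}_{replicaset}{infix}_{mode}"
            config[section] = {
                "bind_address": bind_address,
                "bind_port": str(PORTS[(protocol, mode)]),
                "destinations": (
                    f"metadata-cache://{cluster}/{replicaset}?role={ROLE[mode]}"
                ),
                "routing_strategy": STRATEGY[mode],
                "protocol": protocol,
            }
    return config


def render(config):
    """Serialise the configuration as key=value lines."""
    out = io.StringIO()
    config.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(render(build_routing_config("mycluster")))
```

The point of the sketch is that the output does not depend on how many PRIMARY or SECONDARY nodes exist at bootstrap time, which is what makes the configuration "universal".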

This configuration change alters the routing behaviour as summarised in the table below, where w denotes a PRIMARY (r/w) node, r a SECONDARY (r/o) node and _ an unavailable node (a 3-node cluster is shown, but the idea can be extrapolated to any N-node cluster):

nodes    loop patterns (now)                     loop patterns (WL12089)
A B C    RW: round-robin     RO: round-robin     RW: first-available     RO: round-robin-with-fallback
         (on new conn)       (on new conn)       (on conn error)         (on new conn)
w w w    A B C               fail                A B C                   A B C
w w r    A B                 C                   A B                     C
w w _    A B                 fail                A B                     A B
w r r    A                   B C                 A                       B C
w r _    A                   B                   A                       B
w _ _    A                   fail                A                       A
r r r    fail                A B C               fail                    A B C
r r _    fail                A B                 fail                    A B
r _ _    fail                A                   fail                    A
_ _ _    fail                fail                fail                    fail

Please note that configurations with multiple 'w' above are not currently supported by the InnoDB cluster; they may, however, appear in the future.
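The loop patterns in the table can be reproduced with a small model of destination selection. This is a simplified sketch, not the Router's code: the function names are hypothetical, and it only computes which nodes a port may cycle through. It does not model when the strategy advances (on each new connection for round-robin, on connection error for first-available).

```python
def candidates(states, role, strategy):
    """Nodes a routing section can use for one cluster state.

    states:   dict of node name -> "w" (PRIMARY), "r" (SECONDARY)
              or "_" (unavailable), in cluster order
    role:     "PRIMARY" for the RW port, "SECONDARY" for the RO port
    strategy: routing_strategy configured for that port
    """
    primaries = [n for n, s in states.items() if s == "w"]
    secondaries = [n for n, s in states.items() if s == "r"]
    wanted = primaries if role == "PRIMARY" else secondaries
    # FR3: with the fallback strategy, an RO port with no SECONDARY
    # nodes left is served by the PRIMARY nodes instead.
    if not wanted and strategy == "round-robin-with-fallback":
        wanted = primaries
    return wanted


def show(nodes):
    """Render a candidate list the way the table does."""
    return " ".join(nodes) or "fail"


if __name__ == "__main__":
    for row in ("www", "wwr", "ww_", "wrr", "wr_", "w__", "rrr", "rr_", "r__", "___"):
        states = dict(zip("ABC", row))
        print(
            " ".join(row),
            "| now:",
            show(candidates(states, "PRIMARY", "round-robin")),
            "/",
            show(candidates(states, "SECONDARY", "round-robin")),
            "| WL12089:",
            show(candidates(states, "PRIMARY", "first-available")),
            "/",
            show(candidates(states, "SECONDARY", "round-robin-with-fallback")),
        )
```

The only rows where the two RO columns differ are those with no 'r' node but at least one 'w' node, which is exactly the multi-master case this WL targets.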

2. Default routing strategies

Right now, if the routing_strategy option is omitted from a Routing plugin configuration section, it defaults to round-robin. This default will remain unchanged.
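The defaulting rule can be illustrated with a short lookup. This is a sketch using Python's configparser, not the Router's own configuration parser; effective_strategy and DEFAULT_ROUTING_STRATEGY are hypothetical names, and the section used in the example is made up.

```python
import configparser

# Default named in this section of the WL.
DEFAULT_ROUTING_STRATEGY = "round-robin"


def effective_strategy(config_text, section):
    """Return the configured routing_strategy, or the default if omitted."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get(
        section, "routing_strategy", fallback=DEFAULT_ROUTING_STRATEGY
    )


if __name__ == "__main__":
    # Hypothetical hand-written section without routing_strategy.
    manual = "[routing:manual]\nbind_port=7001\nprotocol=classic\n"
    print(effective_strategy(manual, "routing:manual"))
```

Bootstrap-generated sections are unaffected by the default, since they always set routing_strategy explicitly.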