WL#12089: Support for dynamic changes between single and multi master GR mode
WL#10378 "Group Replication: group single/multi primary mode change and primary election" enables dynamic changes of the single/multi master mode in GR. Some changes are needed in MySQL Router to support that.
In the current implementation, when the Router is bootstrapped it queries the current metadata state and creates a static configuration based on the result of those queries.
If it discovers single master mode, it creates 2 pairs of input ports (one pair for the classic protocol and one for the X protocol). In each pair, the first port is for r/w operations (role=PRIMARY) and the second is for r/o operations (role=SECONDARY).
If it discovers multi master mode, it creates only a single pair of input ports (one for the classic protocol and one for the X protocol), both for r/w operations (role=PRIMARY).
With such a default static configuration, this leads to 2 problems when the mode changes dynamically:
- In case of a multi-master -> single-master change, there is no port for non-PRIMARY nodes, which means they become unreachable.
- In case of a single-master -> multi-master change, there is a port for role=SECONDARY which becomes unusable, as there are no longer any SECONDARY nodes.
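The two failure cases can be made concrete with a small sketch. This is illustrative Python, not Router code: the mode names and the helper functions are hypothetical, and the only facts taken from the text are which roles get a port at bootstrap and which roles exist in each mode.

```python
# Roles that receive listening ports under the current (static) bootstrap,
# keyed by the mode discovered at bootstrap time. Mode names are hypothetical.
PORT_ROLES_AT_BOOTSTRAP = {
    "single-master": {"PRIMARY", "SECONDARY"},
    "multi-master": {"PRIMARY"},
}

# Roles that nodes can actually have while the cluster runs in a given mode.
NODE_ROLES_IN_MODE = {
    "single-master": {"PRIMARY", "SECONDARY"},
    "multi-master": {"PRIMARY"},
}


def static_config_mismatch(mode_at_bootstrap, mode_now):
    """Compare the ports created at bootstrap with the roles present now."""
    ports = PORT_ROLES_AT_BOOTSTRAP[mode_at_bootstrap]
    roles = NODE_ROLES_IN_MODE[mode_now]
    return {
        # Node roles that no port routes to: those nodes are unreachable.
        "unreachable_roles": sorted(roles - ports),
        # Ports whose role no node has any more: those ports are unusable.
        "unusable_ports": sorted(ports - roles),
    }


if __name__ == "__main__":
    print(static_config_mismatch("multi-master", "single-master"))
    print(static_config_mismatch("single-master", "multi-master"))
```

When the mode at bootstrap and the current mode match, both lists come back empty, which is why the problem only appears once the mode can change at runtime.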
Functional Requirements
- FR1
- Bootstrap shall generate a configuration file, which will have both RW and RO Routing sections for all selected/implied transports (TCP, named-socket; classic and X protocols), regardless of how many PRIMARY or SECONDARY nodes the cluster has.
- FR2
- Configuration generated according to FR1 should be able to service RW connections when 1 or all nodes are PRIMARY, and should be able to service RO connections for any number of (PRIMARY nodes + SECONDARY nodes) > 0.
- FR3
- To fulfill FR2, RO connections should be routed to RW nodes when there are no available RO nodes.
- FR4
- FR2 and FR3 must remain fulfilled even if number of PRIMARIES and SECONDARIES changes during Router operation, without having to restart Router.
- FR5
- When a particular node changes state from RW to RO, all current connections that are routed to it via RW port shall be closed (RO routings will remain open, because the user never expected to make RW requests through them in the first place).
Please note that FR1 and FR2 are the true new features of this WL and require implementation. FR3-FR5 are already implemented, but it must be verified that they also work for this new use case (dynamic switching between SM and MM modes).
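As an aid for verifying FR5, the rule can be sketched as a pure function over a before/after snapshot of node roles. This is a minimal sketch under assumptions: the `Connection` record and the role dictionaries are hypothetical and do not reflect the Router's internal data structures.

```python
from typing import Dict, List, NamedTuple


class Connection(NamedTuple):
    """Hypothetical record of one routed client connection."""
    conn_id: int
    routing: str  # "RW" or "RO": the kind of port the client connected through
    node: str     # name of the destination node


def connections_to_close(connections: List[Connection],
                         old_roles: Dict[str, str],
                         new_roles: Dict[str, str]) -> List[int]:
    """Return ids of connections that FR5 says must be closed."""
    # Nodes that changed state from RW (PRIMARY) to RO (SECONDARY).
    demoted = {
        node for node, role in old_roles.items()
        if role == "PRIMARY" and new_roles.get(node) == "SECONDARY"
    }
    # Only connections made through an RW port are closed; RO routings to
    # the same node stay open.
    return [c.conn_id for c in connections
            if c.routing == "RW" and c.node in demoted]


if __name__ == "__main__":
    conns = [Connection(1, "RW", "B"), Connection(2, "RO", "B"),
             Connection(3, "RW", "A")]
    before = {"A": "PRIMARY", "B": "PRIMARY", "C": "PRIMARY"}
    after = {"A": "PRIMARY", "B": "SECONDARY", "C": "SECONDARY"}
    print(connections_to_close(conns, before, after))
```

In the usage example, a multi-master to single-master switch demotes B and C; only the RW-routed connection to B is selected for closing.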
PROPOSED CHANGES
1. Bootstrap-generated configuration
To implement FR1 and FR2, all we really need is a new "universal" configuration file, which will handle both SM and MM modes well. It should be generated by bootstrap. Except for the Routing plugin sections (shown below), all else would remain unchanged. Example configuration after bootstrapping:
[routing:mycluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://mycluster/default?role=PRIMARY
routing_strategy=first-available
protocol=classic
[routing:mycluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://mycluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=classic
[routing:mycluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://mycluster/default?role=PRIMARY
routing_strategy=first-available
protocol=x
[routing:mycluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://mycluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=x
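For illustration, the four sections above can be produced by one small generator that never looks at the cluster's current mode, which is the point of FR1. This is a sketch in Python, not the bootstrap implementation: the function name is hypothetical, only TCP sections are emitted (named-socket options are left out), and the ports are the ones shown in the example.

```python
def routing_sections(cluster: str, replicaset: str = "default") -> str:
    """Render RW and RO routing sections for the classic and X protocols."""
    specs = [
        # (section suffix, protocol, bind_port, role, routing_strategy)
        ("rw", "classic", 6446, "PRIMARY", "first-available"),
        ("ro", "classic", 6447, "SECONDARY", "round-robin-with-fallback"),
        ("x_rw", "x", 64460, "PRIMARY", "first-available"),
        ("x_ro", "x", 64470, "SECONDARY", "round-robin-with-fallback"),
    ]
    blocks = []
    for suffix, protocol, port, role, strategy in specs:
        blocks.append("\n".join([
            f"[routing:{cluster}_{replicaset}_{suffix}]",
            "bind_address=0.0.0.0",
            f"bind_port={port}",
            f"destinations=metadata-cache://{cluster}/{replicaset}?role={role}",
            f"routing_strategy={strategy}",
            f"protocol={protocol}",
        ]))
    # Sections are separated by a blank line for readability.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(routing_sections("mycluster"))
```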
Aside from the fact that from now on we will always generate both RW and RO configuration sections, the only other change vs what we have now is the routing_strategy appearing in those sections:
routing | now | WL12089 |
---|---|---|
RW | round-robin | first-available |
RO | round-robin | round-robin-with-fallback |
This configuration change alters the routing behaviour as summarised in the table below (a 3-node cluster is shown, but the idea can be extrapolated to any N-node cluster):
node A | node B | node C | now, RW: round-robin (on new conn) | now, RO: round-robin (on new conn) | WL12089, RW: first-available (on conn error) | WL12089, RO: round-robin-with-fallback (on new conn) |
---|---|---|---|---|---|---|
w | w | w | A B C | fail | A B C | A B C |
w | w | r | A B | C | A B | C |
w | w | _ | A B | fail | A B | A B |
w | r | r | A | B C | A | B C |
w | r | _ | A | B | A | B |
w | _ | _ | A | fail | A | A |
r | r | r | fail | A B C | fail | A B C |
r | r | _ | fail | A B | fail | A B |
r | _ | _ | fail | A | fail | A |
_ | _ | _ | fail | fail | fail | fail |
Please note that configurations with multiple 'w' above are not currently supported by the InnoDB cluster; they may, however, appear in the future.
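The loop patterns in the table can be reproduced with a few lines of Python, which may help when extrapolating to an N-node cluster. The sketch reads `w` as an available PRIMARY node, `r` as an available SECONDARY node and `_` as an unavailable node; that reading is an interpretation of the table, and the function and key names are hypothetical. It only computes the set of candidate nodes per column; it does not model when the Router advances to the next node (on each new connection or on connection error).

```python
import string


def loop_patterns(states: str) -> dict:
    """Map a node-state string such as "wr_" to the four loop patterns."""
    names = string.ascii_uppercase  # nodes are named A, B, C, ...
    rw_nodes = [names[i] for i, s in enumerate(states) if s == "w"]
    ro_nodes = [names[i] for i, s in enumerate(states) if s == "r"]

    def fmt(nodes):
        # An empty candidate list means the connection attempt fails.
        return " ".join(nodes) if nodes else "fail"

    return {
        "rw_now": fmt(rw_nodes),              # round-robin over PRIMARY nodes
        "ro_now": fmt(ro_nodes),              # round-robin over SECONDARY nodes
        "rw_wl12089": fmt(rw_nodes),          # first-available over PRIMARY nodes
        "ro_wl12089": fmt(ro_nodes or rw_nodes),  # fall back to PRIMARY nodes
    }


if __name__ == "__main__":
    for states in ["www", "wwr", "ww_", "wrr", "wr_", "w__",
                   "rrr", "rr_", "r__", "___"]:
        print(states, loop_patterns(states))
```

The only column where the WL12089 result differs from the current one is RO when no SECONDARY node is available but at least one PRIMARY node is (rows `w w w`, `w w _` and `w _ _`).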
2. Default routing strategies
Right now, if we omit the routing_strategy option in a Routing plugin configuration section, it defaults to round-robin. This default will remain unchanged.
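The effect of that default on a hand-written section can be sketched with Python's `configparser`. This only illustrates the rule stated above; it is not how the Router parses its configuration, and the section names and the `effective_strategy` helper are hypothetical.

```python
import configparser

# Default stated in this section; unchanged by this WL.
DEFAULT_ROUTING_STRATEGY = "round-robin"


def effective_strategy(config_text: str, section: str) -> str:
    """Return the routing strategy a section ends up with."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Use the explicit option when present, else the documented default.
    return parser.get(section, "routing_strategy",
                      fallback=DEFAULT_ROUTING_STRATEGY)


if __name__ == "__main__":
    sample = (
        "[routing:handwritten]\n"
        "bind_port=7001\n"
        "destinations=metadata-cache://mycluster/default?role=PRIMARY\n"
        "\n"
        "[routing:explicit]\n"
        "routing_strategy=first-available\n"
    )
    print(effective_strategy(sample, "routing:handwritten"))
    print(effective_strategy(sample, "routing:explicit"))
```

Bootstrap-generated sections always set routing_strategy explicitly, so the default only matters for configurations written or edited by hand.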