WL#8443: Group Replication: Plugin version handshake on member join
Affects: Server-5.7
—
Status: Complete
This WL addresses the problem of a Group Replication group whose members run different versions. It presents a Version Handshake algorithm that determines which versions can belong to the same group and whether upgrade and downgrade paths are allowed.
The outcome of this WL shall be:
- A version handshake algorithm implementation;
- A foundation for declaring the compatibility between versions.
FR1: A member shall use the installing server version to discover its compatibility.
FR2: When a new member joins an existing group, it should be aware of the other members' versions.
FR3: A new joining member shall be able to automatically act upon the fact that it is not compatible with the rest of the group.
FR4: If a member has a major version equal to the group, it shall join the group.
FR5: If a member with a lower major version than the group joins, it shall leave the group.
FR6: If a member has a higher major version than the group, it shall join the group but can't write to it.
FR7: There should be a way for a developer to introduce exceptions to the rule defined in FR4.
FR8: The exception mechanism defined in FR7 shall not be changeable by an End User.
FR9: The rule FR5 can be disabled by a user-defined option.
With the continuous development of Group Replication, there is a need to manage the different versions that can coexist in the same environment. This addresses the classic DBA problem of interoperability, in which one states which versions can operate with one another.
For this to happen, this WL has two major tasks:
- Define HOW one should state version interoperability.
- Define WHEN this verification should happen and WHAT behavior the system should have.
Regarding how this should be defined, there should be a clear and fixed rule stating the paths in which one can operate. A mechanism should also be in place for a developer to explicitly declare version interoperability exceptions. The main requirement for both verification paths is that they cannot be changed by an End-User/DBA.
This verification should occur every time a new member joins a group. This means that the joining member shall be responsible for checking its interoperability status with the rest of the group and acting upon it, if needed. This avoids implementing eviction policies in the Group Communication framework.
In terms of software architecture, a new component shall be created to store:
- The Compatibility algorithm, with its rules and exceptions;
- All operations that can be done upon it.
This section shall detail the architecture and implementation
topics discussed in the HLS: how to define compatibility and
how to enforce it in the execution flow.
1. Compatibility Definition
One can state that versions are compatible if they are able to
exchange messages that both sides understand. This means that we must define
what constitutes a breach in compatibility. Interoperability shall be deemed
impossible when the messages exchanged between members become
incompatible. This can happen in the following scenarios:
- Message format changed: The messages that are exchanged between
members and their encoding changed in an incompatible way.
- Event format changed: The events that are exchanged between members
changed.
- Message Protocol changed: The messages that are exchanged
changed in an incompatible way, either in their order or through the
addition of new messages. This means that a member needs to send and
receive different messages when it belongs to a group.
We should have two ways to deduce this incompatibility:
- Via a generic rule (Compatibility Rule) for all members;
- Via static rules.
The Compatibility Rule is supported by the fact that all versions inside the
same major version shall be compatible, unless something catastrophic happens.
This rule can be seen as:
- A member with the same major version as the group can enter and work in it;
- A member with a higher major version than the ones in the group can enter
but can only listen to the group in Read Only mode (WL#TBD). This
shall be an enabler for an Upgrade process (WL#TDB);
- A member with a lower major version than the ones in the group shall not
enter that group. The possibility to enter that group shall be detailed in
a downgrade worklog.
These restrictions do not apply to minor and patch versions, i.e.,
if two versions differ only in their minor/patch version, they are
always compatible.
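The Compatibility Rule above can be sketched in C++ as follows. This is a minimal illustration, not the plugin's implementation: the names Member_version, Compatibility and apply_compatibility_rule are hypothetical and do not come from this worklog.

```cpp
#include <cstdint>

// Hypothetical types for illustration; the worklog does not name them.
struct Member_version {
  std::uint32_t major_version;
  std::uint32_t minor_version;
  std::uint32_t patch_version;
};

enum class Compatibility {
  COMPATIBLE,   // may enter and work in the group
  READ_ONLY,    // may enter, but only listens to the group
  INCOMPATIBLE  // shall not enter the group
};

// Applies the generic Compatibility Rule from the joiner's point of view.
// Only major versions are compared, so a difference in the minor or patch
// version alone never restricts the joiner.
Compatibility apply_compatibility_rule(const Member_version &joiner,
                                       const Member_version &group_member) {
  if (joiner.major_version > group_member.major_version)
    return Compatibility::READ_ONLY;
  if (joiner.major_version < group_member.major_version)
    return Compatibility::INCOMPATIBLE;
  return Compatibility::COMPATIBLE;
}
```

With this sketch, a joiner at 2.0.1 is COMPATIBLE with a member at 2.3.4, a joiner at 3.0.0 is READ_ONLY, and a joiner at 1.9.9 is INCOMPATIBLE.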
Although this incompatibility can be deduced automatically with the previous
algorithm, that might not always be enough: one could need to declare an
explicit incompatibility with a version that would otherwise be approved
using only the Compatibility Rule. For instance, suppose that from version
2.3.3 to version 2.3.4 a message field was deleted, rendering even Read-Only
operations useless.
For that, one must maintain a list in which a developer can explicitly
state that a given version is incompatible. Entries are always declared
relative to the current version, so the structure should only contain
versions that are incompatible with the local member version. This static
check must happen before the Compatibility Rule.
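A minimal sketch of that ordering is shown below, assuming the 2.3.3 / 2.3.4 case used as an example in this section. The list contents, the local version, and every name other than is_compatible_with are illustrative assumptions. The result is also simplified to a yes/no answer, so a joiner that would enter in Read Only mode is reported here as able to enter.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

struct Group_Replication_Version {
  std::uint32_t major_version;
  std::uint32_t minor_version;
  std::uint32_t patch_version;
};

bool operator==(const Group_Replication_Version &a,
                const Group_Replication_Version &b) {
  return a.major_version == b.major_version &&
         a.minor_version == b.minor_version &&
         a.patch_version == b.patch_version;
}

// Illustrative values only: the local member is assumed to be 2.3.4 and the
// developer has declared 2.3.3 incompatible with it. The list is compiled
// in, so an End User cannot change it.
static const Group_Replication_Version local_version = {2, 3, 4};
static const Group_Replication_Version incompatible_versions[] = {{2, 3, 3}};

bool is_explicitly_incompatible(const Group_Replication_Version &v) {
  return std::find(std::begin(incompatible_versions),
                   std::end(incompatible_versions),
                   v) != std::end(incompatible_versions);
}

// Returns true when the local member may enter a group containing v.
bool is_compatible_with(const Group_Replication_Version &v) {
  // The static check runs first, so it can veto a version that the
  // generic rule would otherwise approve.
  if (is_explicitly_incompatible(v)) return false;
  // Generic rule: a lower local major version shall not enter.
  return local_version.major_version >= v.major_version;
}
```

Here is_compatible_with returns false for 2.3.3 even though it shares the major version, while 2.3.2 is still accepted by the generic rule.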
2. Compatibility Algorithm
As described in the HLS, it is easier to implement this on the joiner
side. A rough joining algorithm would be:
- A new member joins the group.
- At a low level, the State Exchange occurs.
  - If the State Exchange fails, the member is deemed incompatible at
    that level and the join procedure must fail. This can be caused
    by a similar mechanism implemented in the GCS Layer.
- The new member receives the new View and, consequently, all Cluster
  Member Infos, which must now include Version Information.
- The joiner checks if it is compatible with all members in the group.
  - First it checks the table of explicit exceptions;
  - Then it checks the generic rule.
- If it deems itself fully incompatible, it voluntarily leaves the group.
- If it deems itself partially incompatible, it voluntarily enters
  read-only mode.
- The previous two steps must happen before starting the Recovery
  algorithm.
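The joiner-side decision can be sketched as a single function that runs after the View is received and before Recovery starts. This is only a sketch: Join_action and decide_join_action are hypothetical names, and letting the most restrictive per-member result decide the outcome is an assumption, since the text only says the joiner checks itself against all members.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Member_version {
  std::uint32_t major_version;
  std::uint32_t minor_version;
  std::uint32_t patch_version;
};

bool operator==(const Member_version &a, const Member_version &b) {
  return a.major_version == b.major_version &&
         a.minor_version == b.minor_version &&
         a.patch_version == b.patch_version;
}

enum class Join_action { PROCEED_TO_RECOVERY, ENTER_READ_ONLY, LEAVE_GROUP };

// Decides what the joiner does once it holds the version of every member
// delivered with the View. The most restrictive result wins (assumption).
Join_action decide_join_action(
    const Member_version &local,
    const std::vector<Member_version> &group_members,
    const std::vector<Member_version> &explicit_exceptions) {
  Join_action action = Join_action::PROCEED_TO_RECOVERY;
  for (const Member_version &member : group_members) {
    // First: the table of explicit exceptions.
    if (std::find(explicit_exceptions.begin(), explicit_exceptions.end(),
                  member) != explicit_exceptions.end())
      return Join_action::LEAVE_GROUP;
    // Then: the generic rule on major versions.
    if (local.major_version < member.major_version)
      return Join_action::LEAVE_GROUP;  // fully incompatible
    if (local.major_version > member.major_version)
      action = Join_action::ENTER_READ_ONLY;  // partially incompatible
  }
  return action;
}
```

Returning LEAVE_GROUP as soon as one member is fully incompatible keeps the joiner from starting Recovery against a group it cannot stay in.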
From the point of view of the existing group, when a new member evicts
itself, the other members will see a new View being delivered and will
install it; shortly afterwards, another View will arrive reflecting the
member's departure.
This is the simplest option for now. A possible improvement is to have
the already existing members run the algorithm and decide whether to
proceed with the View installation, but this can be addressed later
as an algorithm improvement.
3. Forced entry into the group
Even if a member is declared incompatible with the group due to the general rule
that states that lower major versions are incompatible, the user can still force
its entry.
The plugin shall provide a user option that allows a lower version to join
the group.
While dangerous, it is possible that the versions are not in fact
incompatible, or are incompatible only in some corner case, so the choice is
given to the DBA.
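A sketch of how such an option could interact with the two checks is given below. The flag name allow_lower_version_join and the function check_with_override are hypothetical. Two further assumptions are made: the option only bypasses the lower-major-version rule and never the developer-declared exceptions, and a forced member joins as a regular member.

```cpp
#include <cstdint>

enum class Compatibility { COMPATIBLE, READ_ONLY, INCOMPATIBLE };

// explicitly_incompatible: outcome of the developer-maintained static check.
// allow_lower_version_join: hypothetical name for the user option.
Compatibility check_with_override(std::uint32_t local_major,
                                  std::uint32_t group_major,
                                  bool explicitly_incompatible,
                                  bool allow_lower_version_join) {
  // Assumption: the user option cannot override explicit exceptions.
  if (explicitly_incompatible) return Compatibility::INCOMPATIBLE;
  if (local_major > group_major) return Compatibility::READ_ONLY;
  if (local_major < group_major) {
    // Only the lower-major-version rule can be disabled by the DBA.
    return allow_lower_version_join ? Compatibility::COMPATIBLE
                                    : Compatibility::INCOMPATIBLE;
  }
  return Compatibility::COMPATIBLE;
}
```

Keeping the explicit exceptions outside the reach of the option preserves the requirement that the exception mechanism is not changeable by an End User.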
4. Code Improvements
In terms of code, one needs to create new modules and augment existing
ones.
Regarding new features, the plugin now needs to know:
- The server version
- How to broadcast its version
- How to inform other members and the end user about its version.
In its development process, the Group Replication plugin is associated
with a server version. This is the version used for compatibility purposes.
However, one also needs to broadcast and receive information about all
members' versions. For that, Cluster Member Info shall hold an extra field
stating each member's version. That field can be broadcast each time a new
member joins, along with the existing information.
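One way the extra version field could be carried is as a single packed integer. The sketch below is an assumption for illustration only: the worklog does not specify the wire format, the names encode_version and decode_version are hypothetical, and each version component is assumed to fit in one byte.

```cpp
#include <cstdint>

struct Member_version {
  std::uint32_t major_version;
  std::uint32_t minor_version;
  std::uint32_t patch_version;
};

// Packs a version into one 32-bit field laid out as 0x00MMmmpp.
// Assumes every component is below 256; higher bits are discarded.
std::uint32_t encode_version(const Member_version &v) {
  return ((v.major_version & 0xFF) << 16) |
         ((v.minor_version & 0xFF) << 8) |
         (v.patch_version & 0xFF);
}

// Recovers the three components from the packed field.
Member_version decode_version(std::uint32_t encoded) {
  Member_version v;
  v.major_version = (encoded >> 16) & 0xFF;
  v.minor_version = (encoded >> 8) & 0xFF;
  v.patch_version = encoded & 0xFF;
  return v;
}
```

A fixed-width field keeps the member information simple to extend, since receivers can decode the version without knowing anything else about the sender.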
A new module (Group_Replication_Versioning) must be created to hold:
- Compatibility Matrix and support structures
- Compatibility Algorithm
- Method(s) that allow one to check version compatibility,
e.g.: bool is_compatible_with(Group_Replication_Version v);
One should also consider augmenting the P_S interface in order to state
the version of each member in the Group_Replication_Members table.