WL#16432: GR: Up-to-date Aware Primary Election on Failover
Affects: Server-9.x
—
Status: Complete
Problem Statement
=================
Currently the DBA can only influence the outcome of the primary election on
failover in single-primary mode by assigning priorities to members.
However, that priority does not consider operational status, such as how
up-to-date the prioritized member is.
Proposed Solutions
==================
This worklog enables the DBA to influence which member becomes primary on
failover in single-primary mode by also taking into account how up-to-date
each member is.
From 9.3.0, this feature is available in Enterprise Edition.
From 9.7.0, this feature is available in Community and all other editions.
User Stories
============
- As a MySQL admin,
I want to configure MySQL Group Replication, when in single-primary mode,
to elect the most up-to-date secondary on failover.
- FR03: The component shall have a service
`update_primary_election_status` to update the status when the most
up-to-date primary election method was used.
- FR04: When electing a primary on failover using "most-up-to-date", if
multiple members are tied on a criterion, the selection shall move to the
next criterion, in the order:
1. most up to date
2. member weight
3. uuid lexical order
- FR05: The component will add a new status variable holding the
difference in transactions between the new primary and the second most
up-to-date member, when the most up-to-date primary election method was used:
`Gr_latest_primary_election_by_most_uptodate_members_trx_delta`.
- FR06: The component will add a new status variable holding the
timestamp of the last time a primary was elected using the most up-to-date
election method:
`Gr_latest_primary_election_by_most_uptodate_member_timestamp`.
- FR07: A primary election on failover that uses the most up-to-date method
shall log a message that announces the new primary and the number of
transactions that need to be applied from the backlog:
> ER_GRP_PRIMARY_ELECTION_METHOD_MOST_UPDATE
> 2024-10-08T16:07:48.100736Z 0 [Note] [MY-XXXXXX] [Server] Plugin
> group_replication reported: 'Group Replication Primary Election:
> Member with uuid 8a94f357-aab4-11df-86ab-c80aa9420000 was elected
> primary since it was the most up-to-date member with 100 transactions
> more than second most up-to-date member
> 8a94f468-aab4-11df-86ab-c80aa9420000. In case of a tie member weight
> and then uuid lexical order was used over the most updated members.'
- FR08: A primary election that uses the member weight order method shall log
a message that announces the new primary:
> ER_GRP_PRIMARY_ELECTION_METHOD_MEMBER_WEIGHT
> 2024-10-08T16:07:48.100736Z 0 [Note] [MY-XXXXXX] [Server] Plugin
> group_replication reported: 'Group Replication Primary Election:
> Member with uuid 8a94f357-aab4-11df-86ab-c80aa9420000 was elected
> primary since it was highest weight member with value 70. In case
> of a tie uuid lexical order was used.'
- FR09: This election method does not interfere with changing the group mode
from multi-primary to single-primary.
- FR10: This election method does not interfere with switchover, that is, when
the DBA selects the next primary.
- FR11: The stats shall be reset on INSTALL/UNINSTALL of the component.
- FR12: The stats shall have member scope since they reflect what the local
member observes.
- FR13: The stats shall be reset on group bootstrap.
- FR14: The stats shall be reset on member join.
- FR15: The stats shall be reset on member automatic rejoin.
- FR16: The stats shall be reset on server restart.
- FR17: Primary election on failover will only use the most up-to-date method
when all members have the component
`group_replication_elect_prefers_most_updated` installed
and the `group_replication_elect_prefers_most_updated.enabled`
option is enabled.
- FR18: Members running previous versions, or without the component installed,
are treated as having the option disabled; the election is then ruled by
member weight, falling back to lexical order if weights are equal.
Non-Functional Requirements
===========================
Summary of the approach
=======================
Implement a component with a new primary election method on failover, called
most up-to-date.
The new component will implement a service that looks into the GTID executed
set of all members and selects the most up-to-date member.
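The selection order described above (most up-to-date, then member weight, then uuid lexical order) can be sketched as follows. This is only an illustration, not the component's implementation: the `Member` class, `count_gtids` and `elect_primary` are hypothetical names, and measuring "most up-to-date" by counting the transactions in each member's GTID executed set is a simplifying assumption.

```python
from dataclasses import dataclass


def count_gtids(gtid_set: str) -> int:
    """Count transactions in a GTID set text such as 'uuid:1-5:8,uuid2:3'."""
    total = 0
    for part in gtid_set.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # The first field is the source UUID; the remaining ones are intervals.
        _uuid, *fields = part.split(":")
        for field in fields:
            if not field or not field[0].isdigit():
                continue  # assumption: a non-numeric field is a GTID tag
            start, _, end = field.partition("-")
            total += int(end or start) - int(start) + 1
    return total


@dataclass(frozen=True)
class Member:
    """Hypothetical view of what each member knows about a candidate."""
    uuid: str
    weight: int
    gtid_executed: str


def elect_primary(members):
    """Pick by most transactions, then highest weight, then lowest uuid."""
    return min(
        members,
        key=lambda m: (-count_gtids(m.gtid_executed), -m.weight, m.uuid),
    )
```

Because every member evaluates the same deterministic key over the same exchanged data, all members would reach the same choice without further coordination.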
Security context
================
`SELECT * FROM performance_schema.global_status` statements do not require any
privilege.
`group_replication_elect_prefers_most_updated` is a component.
INSTALL COMPONENT requires the INSERT privilege for the mysql.component system
table because it adds a row to that table to register the component.
Observability
=============
The component will have a variable
`group_replication_elect_prefers_most_updated.enabled` that
enables/disables the most up-to-date primary election method on failover.
The following error messages will be added:
1. ER_GRP_PRIMARY_ELECTION_METHOD_MOST_UPDATE
2024-10-08T16:07:48.100736Z 0 [Note] [MY-XXXXXX] [Server] Plugin
group_replication reported: 'Group Replication Primary Election:
Member with uuid 8a94f357-aab4-11df-86ab-c80aa9420000 was elected
primary since it was the most up-to-date member with 100 transactions
more than second most up-to-date member
8a94f468-aab4-11df-86ab-c80aa9420000. In case of a tie member weight and
then uuid lexical order was used over the most updated members.'
2. ER_GRP_PRIMARY_ELECTION_METHOD_MEMBER_WEIGHT
2024-10-08T16:07:48.100736Z 0 [Note] [MY-XXXXXX] [Server] Plugin
group_replication reported: 'Group Replication Primary Election:
Member with uuid 8a94f357-aab4-11df-86ab-c80aa9420000 was elected
primary since it was highest weight member with value 70. In case
of a tie uuid lexical order was used.'
When the members of the group differ on whether to use the most up-to-date
election method, a warning is logged.
The warning is raised in two scenarios, on a failover and on member join:
* ER_GRP_PREFER_MOST_UPDATED_CONFIG_DIFFER_ON_FAILOVER
2024-10-08T16:12:48.100736Z 0 [Warning] [MY-XXXXXX] [Server] Plugin
group_replication reported: 'Members have different configurations (option
group_replication_elect_prefers_most_updated.enabled) on usage of
most up-to-date method on primary failover, method will not be used.'
The following metrics, which can be read through global status variables in the
`performance_schema.global_status` table, will be added:
Gr_latest_primary_election_by_most_uptodate_members_trx_delta
-------------------------------------------------------------
Difference in the number of transactions between the elected member and the
second most up-to-date member.
This metric should allow a DBA to reason about whether the method used is the
best one.
Knowing the difference in transactions to apply between the two most up-to-date
secondaries, the DBA can later compare it with the weights and make decisions
like:
1. the difference in transactions to apply is small, my system can afford
that, so I instead use the weights to ensure that the primary is in the
location I want.
2. the difference in transactions to apply is big; if the member with more
transactions to apply had been selected due to its weight, my system would
struggle, hence I continue to use
`group_replication_elect_prefers_most_updated.enabled=ON`.
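The reasoning above can be sketched as a small helper. This is only an illustration under stated assumptions: `trx_delta`, `suggest_election_method` and the `tolerable_delta` threshold are hypothetical and not part of the component; the threshold stands for whatever backlog the DBA considers affordable.

```python
def trx_delta(executed_counts):
    """Difference in transactions between the two most up-to-date members."""
    top_two = sorted(executed_counts, reverse=True)[:2]
    if len(top_two) < 2:
        return 0  # a single candidate has nothing to be compared against
    return top_two[0] - top_two[1]


def suggest_election_method(delta, tolerable_delta):
    """Map the observed delta to one of the two decisions listed above.

    `tolerable_delta` is a hypothetical, DBA-chosen number of transactions
    the system can afford to apply before the new primary accepts writes.
    """
    if delta <= tolerable_delta:
        return "member weight"
    return "most up-to-date"
```

For example, with executed transaction counts of 1000, 990 and 700 the delta is 10; with a tolerable delta of 100 the helper would point to relying on member weights.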
Gr_latest_primary_election_by_most_uptodate_member_timestamp
------------------------------------------------------------
Timestamp of the last time the most up-to-date primary election method was used
on failover.
Upgrade/downgrade and cross-version replication
===============================================
Members can have different values for the option, have the component
uninstalled, or be from previous versions that do not have the component.
The election will only use this method when all members have the component
`group_replication_elect_prefers_most_updated` installed and the
`group_replication_elect_prefers_most_updated.enabled` option is
enabled.
Members running previous versions, or without the component installed, are
treated as having the option disabled; the election is then ruled by member
weight, falling back to lexical order if weights are equal.
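A minimal sketch of this all-or-nothing rule, assuming each member's state is represented as `True` (option enabled), `False` (option disabled) or `None` (component not installed or a previous version). The function name and this representation are hypothetical.

```python
from typing import Optional, Sequence


def group_uses_most_uptodate(member_flags: Sequence[Optional[bool]]) -> bool:
    """Return True only when every member reports the option as enabled.

    A None entry models a member without the component (or from a previous
    version), which is treated the same as the option being disabled.
    """
    return len(member_flags) > 0 and all(flag is True for flag in member_flags)
```

A single member that is disabled or unaware of the option is enough for the whole group to fall back to member weight and lexical order.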
User interface
==============
To use the component, install it with the following statement:
```
INSTALL COMPONENT 'file://component_group_replication_elect_prefers_most_updated';
```
If the component is no longer needed, remove it with:
```
UNINSTALL COMPONENT 'file://component_group_replication_elect_prefers_most_updated';
```
Components can only be installed/uninstalled when a server is writable, which
means that we cannot install/uninstall the feature on a secondary. To overcome
that and allow the feature to be enabled/disabled dynamically, we are
introducing the option:
- NAME: group_replication_elect_prefers_most_updated.enabled
- VALUES: bool [ON|OFF]
- DEFAULT: ON
- SCOPE: global
- DYNAMIC: Can be changed while Group Replication is running. Value can be
different on members.
- REPLICATED (written to the binary log): no
- PERSIST: PERSIST, PERSIST_ONLY
- PRIVILEGES REQUIRED: SYSTEM_VARIABLES_ADMIN
- DESCRIPTION: Enables, on this member, the use of the most up-to-date primary
election method on failover. The group will only use this
method if all members have it enabled.
To access the component variable:
```
SELECT @@GLOBAL.group_replication_elect_prefers_most_updated.enabled;
```
The metrics can be read through global status variables on the
`performance_schema.global_status` table:
```
mysql> SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME LIKE '%most_uptodate%';
+---------------------------------------------------------------+---------------------+
| VARIABLE_NAME                                                 | VARIABLE_VALUE      |
+---------------------------------------------------------------+---------------------+
| Gr_latest_primary_election_by_most_uptodate_members_trx_delta | 10                  |
| Gr_latest_primary_election_by_most_uptodate_member_timestamp  | 2024-07-01 12:50:56 |
+---------------------------------------------------------------+---------------------+
```
The metrics can also be read using the `SHOW` command:
```
mysql> SHOW GLOBAL STATUS LIKE 'Gr\_latest\_primary\_election\_by\_most\_uptodate%';
+---------------------------------------------------------------+---------------------+
| Variable_name                                                 | Value               |
+---------------------------------------------------------------+---------------------+
| Gr_latest_primary_election_by_most_uptodate_members_trx_delta | 10                  |
| Gr_latest_primary_election_by_most_uptodate_member_timestamp  | 2024-07-01 12:50:56 |
+---------------------------------------------------------------+---------------------+
```
The new election method does not interfere with changing the group mode from
multi-primary to single-primary: the election follows member weight and
lexical order if no primary is specified.
The most up-to-date election method requires information that is only sent
on a view change, which prevents its usage when switching from multi-primary to
single-primary mode.
The new election method does not interfere with switchover: when the DBA
specifies the new primary, that member is selected regardless of the election
method in place.
Deployment and installation
===========================
To use the component, install it with the following statement:
```
INSTALL COMPONENT 'file://component_group_replication_elect_prefers_most_updated';
```
If the component is no longer needed, remove it with:
```
UNINSTALL COMPONENT 'file://component_group_replication_elect_prefers_most_updated';
```
Protocol
========
For Group Replication to use the new primary election method on failover, all
members need to have the component installed and `most_uptodate` set to true.
On a view change the `most_uptodate` information is passed to all members,
so that all of them have the same information when selecting the new primary.
If one of the members does not know the component variable, the most up-to-date
election method will not be used.
Failure Model Specification
===========================
A member can fail to retrieve its GTID executed information. If that happens,
the most up-to-date method will not be able to determine the most up-to-date
member. The primary will then be elected by member weight and lexical order.
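The fallback can be illustrated with a short sketch. It assumes each candidate is a `(uuid, weight, executed_count)` tuple in which `executed_count` is `None` when the GTID executed information could not be retrieved; the tuple layout and the `choose_primary` name are hypothetical.

```python
def choose_primary(members):
    """Return the uuid of the elected member.

    members: list of (uuid, weight, executed_count) tuples, where
    executed_count is None if GTID executed could not be retrieved.
    """
    if any(count is None for _uuid, _weight, count in members):
        # Incomplete data: ignore transaction counts for every member and
        # elect by highest weight, then lowest uuid.
        return min(members, key=lambda m: (-m[1], m[0]))[0]
    # Complete data: most transactions, then highest weight, then lowest uuid.
    return min(members, key=lambda m: (-m[2], -m[1], m[0]))[0]
```

Dropping the transaction counts for all members, rather than only for the failing one, keeps the comparison uniform across candidates.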
Summary of changes
==================
Group Replication will change the primary election on failover, adding a check
of whether all members have the most up-to-date method enabled and, if so,
calling the service.
A component will be created with one service, one component variable and two
global status variables.
The service:
```
BEGIN_SERVICE_DEFINITION(group_replication_primary_election)
DECLARE_BOOL_METHOD(update_primary_election_status,
                    (char *timestamp, uint64_t transactions_delta))
END_SERVICE_DEFINITION(group_replication_primary_election)
```
The component variable:
```
bool group_replication_elect_prefers_most_updated.enabled;
```
And register two global status variables:
* `Gr_latest_primary_election_by_most_uptodate_members_trx_delta`
* `Gr_latest_primary_election_by_most_uptodate_member_timestamp`
Copyright (c) 2000, 2026, Oracle Corporation and/or its affiliates. All rights reserved.