WL#12895: Group Replication: Add OFFLINE_MODE to group_replication_exit_state_action
Affects: Server-8.0
—
Status: Complete
EXECUTIVE SUMMARY
=================
The option group_replication_exit_state_action configures how Group
Replication behaves when a server leaves the group unintentionally:
1. after encountering an applier error;
2. after encountering an recovery error;
3. in the case of a loss of majority;
4. when another member of the group expels it due to a suspicion
timing out;
5. after an error on coordinated group changes[1];
6. after a primary election error;
7. after a clone provisioning error;
8. when automatic rejoin is enable, after its attempts are
exhausted unsuccessfully.
This behavior can be:
* READ_ONLY disable writes on the server, by enabling
server super_read_only[2] option;
* ABORT_SERVER shutdown the server.
After this worklog is implemented, a new behavior is added:
* OFFLINE_MODE close all connections and disallow new ones from
users who do not have the CONNECTION_ADMIN or SUPER
privilege, by enabling server offline_mode[3] option.
OFFLINE_MODE includes the behaviour of READ_ONLY.
This automatic connection closure will avoid user sessions to:
1) read stale data and 2) write data to orphans servers, without the
need to shutdown the server.
Proxy tools, like MySQL Router, will immediately detect when a
connection is closed and update its pool of connections avoiding
directing users request to failed servers.
[1]
https://dev.mysql.com/doc/refman/8.0/en/group-replication-configuring-online-group.html
[2]
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_super_read_only
[3]
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_offline_mode
USER STORIES
============
- As a system developer using MySQL I want to always read data from
a MySQL server that is connected through replication so I minimize
my chances of reading stale data.
- As a MySQL DBA I want my servers to automatically close users
connections if they (the servers) involuntarily drop out of
replication, so that other components in my system do not engage
stale servers.
- As a system builder, I want my system to react (e.g., close
connections, remove server from the pool of "good" servers, etc)
whenever a server goes involuntarily offline w.r.t. replication, so
that I avoid to pro-actively polling the system to figure that out.
- As a proxy tool routing connections to a MySQL server, I want to
get my connections to stale servers automatically closed, so that
these connections are evicted from my routing cache automatically.
Functional requirements
=======================
FR1: In the case of applier error, the member will change to ERROR
state and leave the group, consequently if
group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR2: In the case of recovery error, the member will change to ERROR
state and leave the group, consequently if
group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR3: In the case of member expel from the group, the member will
change to ERROR state, consequently if
group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR4: In the case of member is unable to contact a majority of the
group members, after the timeout specified on
group_replication_unreachable_majority_timeout[2] option, if
group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR5: In the case of error during a coordinated group change, like
switch from single-primary to multi-primary mode, the member
will change to ERROR state and leave the group, consequently
if group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR6: In the case of error during a primary election, the member
will change to ERROR state and leave the group, consequently
if group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR7: In the case of error during cloning provisioning, the member
will change to ERROR state and leave the group, consequently
if group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR8: When automatic rejoin is enable[3], after its attempts are
exhausted unsuccessfully, the member will change to ERROR
state and leave the group, consequently if
group_replication_exit_state_action= OFFLINE_MODE then the
plugin will set server offline_mode= ON [1].
FR9: group_replication_exit_state_action= OFFLINE_MODE includes the
behaviour of group_replication_exit_state_action= READ_ONLY.
[1] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_offline_mode
[2] https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_unreachable_majority_timeout
[3] https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_autorejoin_tries
Non-functional requirements
===========================
None.
DEFINITIONS
===========
super_read_only: server super_read_only option
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_super_read_only
offline_mode: server offline_mode option
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_offline_mode
READ_ONLY: value of group_replication_exit_state_action option
https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_exit_state_action
OFFLINE_MODE: value of group_replication_exit_state_action option
https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_exit_state_action
SUMMARY OF THE APPROACH
=======================
The DBA can configure how Group Replication behaves when a server
leaves the group unintentionally.
A server leaves the group unintentionally:
1. after encountering an applier error;
2. after encountering an recovery error;
3. in the case of a loss of majority;
4. when another member of the group expels it due to a suspicion
timing out;
5. after an error on coordinated group changes[1];
6. after a primary election error;
7. after a clone provisioning error;
8. when automatic rejoin is enable, after its attempts are
exhausted unsuccessfully.
The behavior taken depends on the value of the option
group_replication_exit_state_action:
- READ_ONLY
The default value, switches the server to super read only mode.
RW (read-write) transactions are disallowed, RO (read-only) are
allowed.
- ABORT_SERVER
(WL#11568) Shutdowns the server, thence closing all user
connections.
- OFFLINE_MODE
Behavior added on this worklog, closes all connections and
disallows new ones from users who do not have the
CONNECTION_ADMIN or SUPER privilege.
This is done by setting server offline_mode option to ON.
OFFLINE_MODE includes the behaviour of READ_ONLY, that is,
it also disallows RW transactions, even from users with
CONNECTION_ADMIN or SUPER privilege.
The full description of offline_mode can be read at
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_offline_mode
[1]
https://dev.mysql.com/doc/refman/8.0/en/group-replication-configuring-online-group.html
USER INTERFACE
==============
The DBA can specify the behaviour by setting the option:
- name: group_replication_exit_state_action
- values: { READ_ONLY, ABORT_SERVER, OFFLINE_MODE }
- default: READ_ONLY
- scope: global
- dynamic: yes
- replicated: no
- persistable: PERSIST, PERSIST_ONLY
- credentials: SYSTEM_VARIABLES_ADMIN
- description: The action that is taken when the server
leaves the group unintentionally.
Possible values are READ_ONLY,
ABORT_SERVER and OFFLINE_MODE.
SECURITY CONTEXT
================
SYSTEM_VARIABLES_ADMIN privilege is required to change the option
group_replication_exit_state_action.
The mysql.session user used internally to access the SQL API
requires SUPER or CONNECTION_ADMIN privilege to avoid having its
connections closed when server offline_mode= ON.
Currently the mysql.session has the SUPER privilege, this worklog
will add the CONNECTION_ADMIN one.
UPGRADE/DOWNGRADE AND CROSS-VERSION REPLICATION
===============================================
There are no repercussions on upgrade scenarios.
OBSERVABILITY
=============
The member state after leaving is ERROR (no change on this worklog),
which can be check on performance_schema.replication_group_members
table:
SELECT * FROM performance_schema.replication_group_members;
The offline_mode state can be check with:
SELECT @@GLOBAL.offline_mode;
Existing connections which users do not have CONNECTION_ADMIN or
SUPER privilege will be close and unable to establish new
connections.
DEPLOYMENT AND INSTALLATION
===========================
There are no repercussions.
PROTOCOL
========
There are no repercussions.
FAILURE MODEL SPECIFICATION
===========================
The failure model remains unchanged, this worklog introduces a new
behaviour on how to handle the involuntary leaves.
SUMMARY OF CHANGES ================== Server core changes ------------------- - Add CONNECTION_ADMIN privilege to mysql.session user. Group Replication changes ------------------------- - Extend the option group_replication_exit_state_action with one more value: OFFLINE_MODE. - When a server leaves involuntary and group_replication_exit_state_action=OFFLINE_MODE: - set server super_read_only[1] to ON through the SQL API; - set server offline_mode[2] to ON through the SQL API; [1] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_super_read_only [2] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_offline_mode
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.