WL#13855: GR: Member actions
Affects: Server-8.0
—
Status: Complete
EXECUTIVE SUMMARY
=================

This worklog provides a framework that allows DBAs/operators to configure a
group to always stay in read-only mode. The DBA/operator shall be able to
trigger actions when a member role changes to primary.

BACKGROUND
==========

In certain cases the DBA/operator may want all members to be read only, for
example when this group (B) is a replica of another group.

                   Group A
            |---|   |---|   |---|
        /-- | P |   | S |   | S |
        |   |---|   |---|   |---|
        |     |       |       |
        |     -----------------
        |
        |
        |                   Group B
        | inbound        |---|   |---|   |---|
        \--------------> | P |   | S |   | S |
          replication    |---|   |---|   |---|
                           |       |       |
                           -----------------

USER STORIES
============

- As a MySQL DBA, I want all members in group B, which is a replica of group A
  and should never be written to until group A is down, to be read only, so
  that group B protects itself against direct/stray writes and therefore
  prevents split-brain situations between group A and group B under normal
  operations.
FUNCTIONAL REQUIREMENTS
=======================

FR-01: The member actions shall only be triggered in single-primary mode.

FR-02: The member actions configuration can only be changed on:
       a) a server that is part of the group majority in single-primary mode
          and is the primary;
       b) a server that is not part of a group.
       In both cases the server must be writable, that is,
       @@GLOBAL.read_only=OFF [1].

FR-03: When a member role changes to PRIMARY, an action shall be triggered on
       that member. This action is assigned to the event
       AFTER_PRIMARY_ELECTION.

FR-04: The following action types are allowed:
       INTERNAL: actions provided by Group Replication.

FR-05: Each action is assigned a priority value, an integer between 1 and
       100, that specifies the order in which the actions will be run: the
       lower the value, the higher the priority.

FR-06: Each action shall specify the type of error handling, either IGNORE or
       CRITICAL.
       IGNORE: errors will be ignored;
       CRITICAL: the member will move into ERROR state and the
                 --group_replication_exit_state_action option[2] will be
                 followed.

FR-07: The actions must be configured through the UDFs:
         group_replication_enable_member_action;
         group_replication_disable_member_action;
         group_replication_reset_member_actions.
       These UDFs only exist when the Group Replication plugin is installed.
       Enabling an already enabled action or disabling an already disabled
       action is allowed.

FR-08: The UDFs require the SUPER or GROUP_REPLICATION_ADMIN privilege.

FR-09: The UDFs must be executed when:
       a) the server is part of the group majority in single-primary mode and
          is the primary;
       b) the server is not part of a group.
       In both cases the server must be writable.
       Attempting to execute the UDFs on a secondary will throw
       ER_GRP_RPL_UDF_ERROR.

FR-10: A given action can only be used once per event, that is, no two
       actions with the same name can exist on the same event.

FR-11: Internal action names must be prefixed with "mysql_".

FR-12: An action is configured with:
         name: name
         enabled: boolean
         type: INTERNAL
         event: AFTER_PRIMARY_ELECTION
         priority: integer between 1 and 100
         error_handling: IGNORE, CRITICAL

FR-13: Only enabled actions shall be triggered.

FR-14: Group Replication provides the following INTERNAL primary election
       actions:
         action: mysql_disable_super_read_only_if_primary
       which disables @@GLOBAL.super_read_only[3].

FR-15: The actions configuration shall be equal on all group members. As
       such, Group Replication will ensure that:
       a) The configuration done on the primary is propagated to all group
          members. This propagation is done through group messages and not
          through the binary log.
       b) When a member joins the group, it will override its configuration
          with the one from the group.
       c) When a member joins the group, if all members are from a version
          that does not support member actions, then the joining member shall
          reset its actions configuration to the default one (described in
          FR-18).
       d) When a server bootstraps a group, that server's configuration
          becomes the group configuration.
       e) After a group mode change from multi-primary to single-primary, the
          primary shall propagate the actions configuration to all group
          members.

FR-16: An error while storing the configuration during the UDFs
         group_replication_enable_member_action;
         group_replication_disable_member_action;
         group_replication_reset_member_actions.
       will throw ER_GRP_RPL_UDF_ERROR.

FR-17: An error while receiving or storing the configuration on a group
       member during configuration propagation will move that member into
       ERROR state and follow the --group_replication_exit_state_action
       option[2].

FR-18: The default actions configuration is composed of a single action:
         name: mysql_disable_super_read_only_if_primary
         enabled: 1
         type: INTERNAL
         event: AFTER_PRIMARY_ELECTION
         priority: 1
         error_handling: IGNORE
       which only takes place on the primary, in order to keep the current
       read only behaviour.
       The DBA can disable this action, meaning that after the primary is
       elected it will remain read only.

FR-19: Group Replication will keep the current behaviour of enabling
       super_read_only on join, before primary elections and on errors.
       This behaviour is not configurable.

FR-20: The actions configuration can be queried on the
       `performance_schema.replication_group_member_actions` table.
       This table is only selectable by design.
       This table only exists when the Group Replication plugin is installed.

FR-21: The actions configuration version can be queried on the
       `performance_schema.replication_group_configuration_version` table.
       This table is only selectable by design.
       This table only exists when the Group Replication plugin is installed.

FR-22: An error while receiving or storing the configuration on a member
       during member join will make the join error out.

FR-23: The member actions configuration can only be reset to the default
       configuration (described in FR-18) on a server that is not part of a
       group, using the UDF:
         group_replication_reset_member_actions
       The server must be writable, that is, @@GLOBAL.read_only=OFF [1].

NON-FUNCTIONAL REQUIREMENTS
===========================

None

[1] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_read_only
[2] https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_exit_state_action
[3] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_super_read_only
SUMMARY OF THE APPROACH
=======================

The DBA will be able to configure actions that shall be triggered when a
primary election takes place on a group in single-primary mode.

Approach:
- Internally, GR will intercept the primary election event and then run an
  action named `mysql_disable_super_read_only_if_primary`. The action makes
  the server read-writable if it has become the primary.
- This action can be enabled/disabled by a privileged user through UDFs.
- GR will conditionally call the action, depending on the configuration.

Disabling:
```
mysql> SELECT group_replication_disable_member_action("mysql_disable_super_read_only_if_primary", "AFTER_PRIMARY_ELECTION");
```

Enabling:
```
mysql> SELECT group_replication_enable_member_action("mysql_disable_super_read_only_if_primary", "AFTER_PRIMARY_ELECTION");
```

USER INTERFACE
==============

An action is configured with:
  name: name
  enabled: boolean
  type: INTERNAL
  event: AFTER_PRIMARY_ELECTION
  priority: integer between 1 and 100
  error_handling: IGNORE, CRITICAL

`name` is:
  the name of the action.

`enabled` is:
  whether the action is enabled. 0 means disabled, 1 means enabled.

`type` is:
  INTERNAL: actions provided by Group Replication.

`event` is:
  AFTER_PRIMARY_ELECTION: the action is triggered after the member role
  changes to PRIMARY.

`priority` is:
  an integer between 1 and 100 that specifies the order in which the action
  will be run, lower values first.

`error_handling` is one of:
  IGNORE: errors will be ignored;
  CRITICAL: errors will be handled according to the
            --group_replication_exit_state_action option[1].
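
The field rules above, together with the "mysql_" name prefix required for INTERNAL actions by the functional requirements, can be illustrated with a small validation sketch. This is only a model written for this document: the `MemberAction` class and the `validate` function are hypothetical names, not part of the Group Replication plugin.

```python
from dataclasses import dataclass

# Allowed values taken from the field descriptions in this worklog.
ALLOWED_TYPES = {"INTERNAL"}
ALLOWED_EVENTS = {"AFTER_PRIMARY_ELECTION"}
ALLOWED_ERROR_HANDLING = {"IGNORE", "CRITICAL"}


@dataclass(frozen=True)
class MemberAction:
    """Hypothetical in-memory mirror of one member action."""
    name: str
    event: str
    enabled: bool
    type: str
    priority: int
    error_handling: str


def validate(action: MemberAction) -> list:
    """Return the list of rule violations; an empty list means valid."""
    problems = []
    if action.type not in ALLOWED_TYPES:
        problems.append("unknown type: " + action.type)
    if action.type == "INTERNAL" and not action.name.startswith("mysql_"):
        problems.append("INTERNAL action names must start with 'mysql_'")
    if action.event not in ALLOWED_EVENTS:
        problems.append("unknown event: " + action.event)
    if not 1 <= action.priority <= 100:
        problems.append("priority must be between 1 and 100")
    if action.error_handling not in ALLOWED_ERROR_HANDLING:
        problems.append("unknown error_handling: " + action.error_handling)
    return problems


# The default action described in the requirements.
DEFAULT_ACTION = MemberAction(
    name="mysql_disable_super_read_only_if_primary",
    event="AFTER_PRIMARY_ELECTION",
    enabled=True,
    type="INTERNAL",
    priority=1,
    error_handling="IGNORE",
)
```

Calling `validate(DEFAULT_ACTION)` yields an empty list, while an action with a name lacking the prefix or a priority outside the range yields one entry per violated rule.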

Enable an action
----------------

The DBA can enable actions through the UDF
```
Name: group_replication_enable_member_action
Arguments:
  - name: string
  - event: string
Return:
  - string
Throws:
  - ER_UDF_ERROR
```

Example:
```
mysql> SELECT group_replication_enable_member_action("mysql_disable_super_read_only_if_primary", "AFTER_PRIMARY_ELECTION");
```

The following log message will be logged:
```
Name: ER_GRP_RPL_MEMBER_ACTION_ENABLED
Input message:
  Member action enabled: "%s", type: "%s", event: "%s", priority: "%d", error_handling: "%s".
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [System] [MY-013742] [Repl] Plugin group_replication reported: 'Member action enabled: "mysql_disable_super_read_only_if_primary", type: "INTERNAL", event: "AFTER_PRIMARY_ELECTION", priority: "1", error_handling: "IGNORE".'
```

Disable an action
-----------------

The DBA can disable actions through the UDF
```
Name: group_replication_disable_member_action
Arguments:
  - name: string
  - event: string
Return:
  - string
Throws:
  - ER_UDF_ERROR
```

Example:
```
mysql> SELECT group_replication_disable_member_action("mysql_disable_super_read_only_if_primary", "AFTER_PRIMARY_ELECTION");
```

The following log message will be logged:
```
Name: ER_GRP_RPL_MEMBER_ACTION_DISABLED
Input message:
  Member action disabled: "%s", type: "%s", event: "%s", priority: "%d", error_handling: "%s".
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [System] [MY-013743] [Repl] Plugin group_replication reported: 'Member action disabled: "mysql_disable_super_read_only_if_primary", type: "INTERNAL", event: "AFTER_PRIMARY_ELECTION", priority: "1", error_handling: "IGNORE".'
```

Reset actions configuration
---------------------------

The DBA can reset the actions configuration through the UDF
```
Name: group_replication_reset_member_actions
Arguments:
  - None
Return:
  - string
Throws:
  - ER_UDF_ERROR
```

It will reset the configuration to the default one, described in FR-18, and
set the version to 1.

Example:
```
mysql> SELECT group_replication_reset_member_actions();
```

The following log message will be logged:
```
Name: ER_GRP_RPL_MEMBER_ACTIONS_RESET
Input message:
  Member actions configuration was reset.
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [System] [MY-013744] [Repl] Plugin group_replication reported: 'Member actions configuration was reset.'
```

The DBA must have the SUPER or GROUP_REPLICATION_ADMIN privilege to call these
UDFs.

CONFIGURATION
=============

The member actions configuration is stored in a system table. In order to
change it, the DBA must use the UDFs presented in the USER INTERFACE section.
The UDFs will ensure:
  a) that the configuration is updated on the primary (or on a single server);
  b) the correctness of their arguments;
  c) the propagation of the configuration to all group members;
  d) the persistence of the configuration.

When the configuration is changed on a single server, it is not propagated
outside that server.

The configuration propagation is done through group messages and not through
the binary log, that is, it will not be written into the binary log, hence no
GTIDs will be consumed, nor will this configuration reach servers outside the
group. All group members will have the same configuration; the change
propagation guarantees that.

Since the configuration change will not generate GTIDs and the propagation
through the group is eventually consistent, there will be a dedicated table to
keep track of the version of the configuration:
`mysql.replication_group_configuration_version`.
The table will have two columns:
  1) name: the configuration name;
  2) version: the version of the configuration.
Every configuration change will increase that version.

Example:
```
|----------------------------------+---------|
| NAME                             | VERSION |
|----------------------------------+---------|
| replication_group_member_actions |       1 |
|----------------------------------+---------|
```

The version default value is 1. The version is propagated together with the
configuration.

Every time the UDFs
  group_replication_enable_member_action
  group_replication_disable_member_action
are run, the version for the row `replication_group_member_actions` on the
table `mysql.replication_group_configuration_version` is incremented.

Every time the UDF
  group_replication_reset_member_actions
is run, the version for the row `replication_group_member_actions` on the
table `mysql.replication_group_configuration_version` is set to 1.

That version can be queried on the table
`performance_schema.replication_group_configuration_version`.

The configuration will first be stored locally and then propagated, more
specifically:
  1) open the configuration and version tables;
  2) update both tables;
  3) commit;
  4) propagate changes.

In the case of an uncontrolled full group shutdown, the DBA can query the
table `performance_schema.replication_group_configuration_version` on each
server to identify the latest version of the actions configuration.

When a member joins the group, it will override its configuration with the
one from the group.

The default primary election actions configuration is composed of a single
action:
  name: mysql_disable_super_read_only_if_primary
  enabled: 1
  type: INTERNAL
  event: AFTER_PRIMARY_ELECTION
  priority: 1
  error_handling: IGNORE
in order to keep the current read only behaviour, that is, once the primary
is elected, it becomes writable. The default configuration version is 1.

The DBA can disable this action, meaning that after the primary is elected it
will remain read only.

SECURITY CONTEXT
================

The DBA must have the SUPER or GROUP_REPLICATION_ADMIN privilege to call the
UDFs presented in the USER INTERFACE section.

ACTIONS
=======

INTERNAL
--------

Actions of this type are provided by Group Replication.
The following INTERNAL actions exist:
  mysql_disable_super_read_only_if_primary
    Disables @@GLOBAL.super_read_only[2] on the primary.
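
The version bookkeeping described in the CONFIGURATION section can be modelled with a short sketch. It is a simplification written for this document, not the plugin's code: `MemberActionsConfig` and its methods are hypothetical names, and the two system tables are represented by plain Python attributes.

```python
import copy

DEFAULT_ACTIONS = {
    ("mysql_disable_super_read_only_if_primary", "AFTER_PRIMARY_ELECTION"): {
        "enabled": True,
        "type": "INTERNAL",
        "priority": 1,
        "error_handling": "IGNORE",
    },
}
DEFAULT_VERSION = 1


class MemberActionsConfig:
    """Hypothetical model of the configuration and version tables."""

    def __init__(self):
        self.reset()

    def _set_enabled(self, name, event, enabled):
        key = (name, event)
        if key not in self.actions:
            raise KeyError("unknown action %r for event %r" % (name, event))
        self.actions[key]["enabled"] = enabled
        # Every run of enable/disable increments the version, even when
        # the action was already in the requested state.
        self.version += 1

    def enable(self, name, event):
        self._set_enabled(name, event, True)

    def disable(self, name, event):
        self._set_enabled(name, event, False)

    def reset(self):
        # Reset restores the default action and sets the version to 1.
        self.actions = copy.deepcopy(DEFAULT_ACTIONS)
        self.version = DEFAULT_VERSION
```

With this model, disabling and then enabling the default action moves the version from 1 to 3, and a reset brings it back to 1 with the action enabled again.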
UPGRADE/DOWNGRADE AND CROSS-VERSION REPLICATION
===============================================

There are no repercussions on upgrade scenarios, since the default
configuration (described in FR-18) provides the previous behaviour.
Group Replication will keep the current behaviour of enabling super_read_only
on join, primary elections and errors. This behaviour is not configurable.

Consider the following scenario:
  1) A group has 3 members of a version that does not support actions.
  2) A server (S4) that supports actions is configured with actions while
     outside of the group.
  3) S4 joins the group. Since it does not receive a group actions
     configuration, it would join with actions that possibly differ from
     those of the group.
To prevent it, when a joining member does not receive the group actions
configuration during the join, the joining member will reset its actions
configuration to the default one (described in FR-18).

OBSERVABILITY
=============

A performance schema table will be added to list the configured actions.
This table is only selectable by design.
```
performance_schema.replication_group_member_actions (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The action name.',
  event CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action event.',
  enabled BOOLEAN NOT NULL COMMENT 'Whether the action is enabled.',
  type CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action type.',
  priority TINYINT UNSIGNED NOT NULL COMMENT 'The order on which the action will be run, value between 1 and 100, lower values first.',
  error_handling CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'On errors during the action will be handled: IGNORE, CRITICAL.');
```

A performance schema table will be added to list the configuration versions.
This table is only selectable by design.
```
performance_schema.replication_group_configuration_version (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The configuration name.',
  version BIGINT UNSIGNED NOT NULL COMMENT 'The version of configuration.');
```

A log message will be logged before each action is triggered.
```
Name: ER_GRP_RPL_MEMBER_ACTION_TRIGGERED
Input message:
  The member action "%s" for event "%s" with priority "%d" will be run.
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [System] [MY-013731] [Repl] Plugin group_replication reported: 'The member action "mysql_disable_super_read_only_if_primary" for event "AFTER_PRIMARY_ELECTION" with priority "1" will be run.'
```

DEPLOYMENT AND INSTALLATION
===========================

There are no repercussions: the new tables and the default configuration will
be automatically added by the upgrade step.

PROTOCOL
========

This is not a protocol change, but the propagation of the member actions
configuration will rely on the services `gr_message_service_send` and
`gr_message_service_recv` introduced by WL#12896: "Group Replication: delivery
message service". These services use a generic message type
`Group_service_message` composed of a tag and a raw payload.

When the member actions configuration is changed on the primary, the complete
configuration is encoded with Protocol Buffers[3] and placed in the raw
payload with the tag `mysql_replication_group_member_actions`. This message is
delivered to all members, replacing their configuration.

In order to update the configuration on a new member, the state exchanged
during the join will be extended with the complete primary election actions
configuration from the primary. The member actions configuration included in
the state exchange will also be encoded with Protocol Buffers.
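
The propagation and join rules described above can be sketched as follows. This is an illustrative model only: JSON is used as a stand-in for the Protocol Buffers encoding so that the example needs no third-party library, and the function names (`encode_configuration`, `apply_message`, `configuration_after_join`) are hypothetical. Only the tag value comes from this worklog.

```python
import copy
import json

# Tag named in this worklog for member actions configuration messages.
TAG = "mysql_replication_group_member_actions"

DEFAULT_CONFIGURATION = {
    "version": 1,
    "action": [{
        "name": "mysql_disable_super_read_only_if_primary",
        "event": "AFTER_PRIMARY_ELECTION",
        "enabled": True,
        "type": "INTERNAL",
        "priority": 1,
        "error_handling": "IGNORE",
    }],
}


def encode_configuration(configuration):
    """Build a (tag, raw payload) pair carrying the complete configuration.

    JSON stands in for Protocol Buffers in this sketch.
    """
    payload = json.dumps(configuration, sort_keys=True).encode("utf-8")
    return TAG, payload


def apply_message(local, tag, payload):
    """Return the configuration a member holds after receiving a message."""
    if tag != TAG:
        # The message belongs to another user of the message service.
        return local
    # The received configuration replaces the local one as a whole.
    return json.loads(payload.decode("utf-8"))


def configuration_after_join(local, group_configuration):
    """Return the configuration of a member once it has joined.

    The local configuration is intentionally discarded. When the group
    sent no configuration (group_configuration is None), the joining
    member falls back to the default one.
    """
    if group_configuration is None:
        return copy.deepcopy(DEFAULT_CONFIGURATION)
    return copy.deepcopy(group_configuration)
```

Sending the complete configuration rather than a delta keeps the receiving side simple: a member never has to merge, it only replaces.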
FAILURE MODEL SPECIFICATION
===========================

The UDFs presented in USER INTERFACE can throw several errors
=============================================================

Member role
-----------

The UDFs can only be executed on the primary, which must belong to the group
majority, or when the server does not belong to a group.
ER_UDF_ERROR will be thrown when those conditions are not met.

The `group_replication_reset_member_actions` UDF can only be executed when
the server does not belong to a group.

Invalid parameters
------------------

All parameters are mandatory and will be checked according to the functional
requirements.
ER_UDF_ERROR will be thrown when those conditions are not met.

Persistence error on a single server
------------------------------------

If there is an error while storing the configuration during the UDFs
  group_replication_enable_member_action;
  group_replication_disable_member_action;
  group_replication_reset_member_actions.
ER_GRP_RPL_UDF_ERROR will be thrown.

Persistence or receiving error on a group
-----------------------------------------

If there is an error while receiving the configuration update, either during
message decoding or while persisting the configuration locally, this member
will move into ERROR state and follow the
--group_replication_exit_state_action option[1].
The following error message will be logged:
```
ER_GRP_RPL_MESSAGE_SERVICE_FATAL_ERROR:
  "A message sent through the Group Replication message deliver service was not delivered successfully. The server will now leave the group. Try to add the server back to the group and check if the problem persists, or check previous messages in the log for hints of what could be the problem."
```

Actions errors
==============

Each member action specifies how an error during the action is handled:

IGNORE: errors will be ignored.
An error message will be logged:
```
Name: ER_GRP_RPL_MEMBER_ACTION_FAILURE_IGNORE
Input message:
  The member action "%s" for event "%s" with priority "%d" failed, this error is ignored as instructed. Please check previous messages in the error log for hints about what could have caused this failure.
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [Error] [MY-013732] [Repl] Plugin group_replication reported: 'The member action "mysql_disable_super_read_only_if_primary" for event "AFTER_PRIMARY_ELECTION" with priority "1" failed, this error is ignored as instructed. Please check previous messages in the error log for hints about what could have caused this failure.'
```

CRITICAL: the member will move into ERROR state and the
--group_replication_exit_state_action option[1] will be followed.
An error message will be logged:
```
Name: ER_GRP_RPL_MEMBER_ACTION_FAILURE
Input message:
  The member action "%s" for event "%s" with priority "%d" failed. Please check previous messages in the error log for hints about what could have caused this failure.
Materialized message:
  2020-10-01T11:29:31.296927Z 0 [Error] [MY-013733] [Repl] Plugin group_replication reported: 'The member action "mysql_disable_super_read_only_if_primary" for event "AFTER_PRIMARY_ELECTION" with priority "1" failed. Please check previous messages in the error log for hints about what could have caused this failure.'
```

[1] https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html#sysvar_group_replication_exit_state_action
[2] https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_super_read_only
[3] https://developers.google.com/protocol-buffers
SUMMARY OF CHANGES
==================

- Add the system table `mysql.replication_group_member_actions` to persist
  the member actions configuration.
```
CREATE TABLE mysql.replication_group_member_actions (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The action name.',
  event CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action event.',
  enabled BOOLEAN NOT NULL COMMENT 'Whether the action is enabled.',
  type CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action type.',
  priority TINYINT UNSIGNED NOT NULL COMMENT 'The order on which the action will be run, value between 1 and 100, lower values first.',
  error_handling CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'On errors during the action will be handled: IGNORE, CRITICAL.',
  PRIMARY KEY(name, event))
  DEFAULT CHARSET=utf8mb4 STATS_PERSISTENT=0
  COMMENT 'The member actions configuration.';
```

- Add the default content to the `mysql.replication_group_member_actions`
  system table.
```
name: mysql_disable_super_read_only_if_primary
enabled: 1
type: INTERNAL
event: AFTER_PRIMARY_ELECTION
priority: 1
error_handling: IGNORE
```

- Add the system table `mysql.replication_group_configuration_version` to
  persist the member actions configuration version.
```
CREATE TABLE mysql.replication_group_configuration_version (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The configuration name.',
  version BIGINT UNSIGNED NOT NULL COMMENT 'The version of configuration.',
  PRIMARY KEY(name))
  DEFAULT CHARSET=utf8mb4 STATS_PERSISTENT=0
  COMMENT 'The group configuration version.';
```

- Add the performance schema table.
```
CREATE TABLE performance_schema.replication_group_member_actions (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The action name.',
  event CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action event.',
  enabled BOOLEAN NOT NULL COMMENT 'Whether the action is enabled.',
  type CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'The action type.',
  priority TINYINT UNSIGNED NOT NULL COMMENT 'The order on which the action will be run, value between 1 and 100, lower values first.',
  error_handling CHAR(64) CHARACTER SET ASCII NOT NULL COMMENT 'On errors during the action will be handled: IGNORE, CRITICAL.');
```

- Add the performance schema table.
```
CREATE TABLE performance_schema.replication_group_configuration_version (
  name CHAR(255) CHARACTER SET ASCII NOT NULL COMMENT 'The configuration name.',
  version BIGINT UNSIGNED NOT NULL COMMENT 'The version of configuration.');
```

- Add the UDFs
    group_replication_enable_member_action;
    group_replication_disable_member_action;
    group_replication_reset_member_actions.

- Introduce the `Member_actions_handler` to:
  1) handle the actions configuration persistence on the
     `mysql.replication_group_member_actions` and
     `mysql.replication_group_configuration_version` tables;
  2) handle the actions encoding/decoding with Protocol Buffers for group
     propagation;
  3) handle the actions triggering during primary election.

- Actions configuration Protocol Buffers specification:
```
syntax = "proto2";
option optimize_for = LITE_RUNTIME;

package protobuf_replication_group_member_actions;

message Action {
  required string name = 1;
  required string event = 2;
  required bool enabled = 3;
  required string type = 4;
  required uint32 priority = 5;
  required string error_handling = 6;
}

message ActionList {
  required string origin = 1;
  required uint64 version = 2;
  required bool force_update = 3 [default = false];
  repeated Action action = 4;
}
```

- Refactor the state exchange during a member join to include the member
  actions configuration.
- Refactor the primary election algorithms to engage `Member_actions_handler`
  after an election takes place.
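
The triggering step performed after an election, combined with the priority and error handling rules of this worklog, can be sketched as below. It is a behavioural model under stated assumptions, not the `Member_actions_handler` implementation: `run_actions`, the `handlers` mapping and the `MemberInErrorState` exception are hypothetical, the exception stands in for the member moving into ERROR state, and the log strings are shortened versions of the messages specified above.

```python
class MemberInErrorState(Exception):
    """Hypothetical stand-in for the member moving into ERROR state."""


def run_actions(actions, event, handlers, log):
    """Run the enabled actions of an event, lowest priority value first.

    actions:  list of dicts with name, event, enabled, priority and
              error_handling keys.
    handlers: maps an action name to a callable (assumption of this sketch).
    log:      list that collects the log lines.
    """
    runnable = [a for a in actions if a["enabled"] and a["event"] == event]
    for action in sorted(runnable, key=lambda a: a["priority"]):
        name = action["name"]
        priority = action["priority"]
        log.append('The member action "%s" for event "%s" with priority '
                   '"%d" will be run.' % (name, event, priority))
        try:
            handlers[name]()
        except Exception:
            if action["error_handling"] == "CRITICAL":
                log.append('The member action "%s" failed.' % name)
                # The exit state action would be followed at this point.
                raise MemberInErrorState(name)
            # IGNORE: log the failure and continue with the next action.
            log.append('The member action "%s" failed, this error is '
                       'ignored as instructed.' % name)
```

A failure of an IGNORE action leaves the remaining actions running, while a failure of a CRITICAL action stops the run immediately.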
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.