WL#15294: Extending GTID with tags to identify group of transactions
Affects: Server-8.x
—
Status: Complete
EXECUTIVE SUMMARY
===============================================================================
This worklog is aimed at easier identification of groups of
transactions that were executed for different purposes. Right now,
the user is able to set GTID for the next
transaction by setting GTID_NEXT to UUID:NUMBER. The user is also
allowed to set GTID_NEXT to AUTOMATIC, which means that the server
will generate GTIDs for transactions executing within the current
session scope. The goal of this worklog is to allow the user to
specify a name for a group of transactions, so that
the user can easily distinguish this group by simply looking at
transaction GTIDs.
This worklog extends the GTID definition. The current GTID consists
of a unique source UUID and a transaction sequence number.
This worklog introduces a GTID tag, which the user may assign to a single
transaction or to a group of transactions.
Therefore, the definition of a tagged transaction GTID is:
UUID:TAG:NUMBER. When the transaction tag is unspecified, the transaction
keeps the UUID:NUMBER definition.
The user is able to assign a tag to a transaction GTID by executing the
SET GTID_NEXT command. The user can assign a tag to a single transaction
with a specified UUID and GNO, or to all transactions within the current
session scope whose GTIDs will be generated automatically.
TAG definition accepted in the system is the following:
^[a-zA-Z_][a-zA-Z0-9_]{0,31}$
which means that:
- tag consists of up to 32 characters (<=32)
- tag accepts letters with ASCII codes between 'a'-'z' and 'A'-'Z',
numbers (0-9), and the underscore character; tag must start with a letter
or underscore
- tag definition is case insensitive; after the user provides a tag,
the tag is normalized to contain only lower-case letters
Please note that the user may choose to skip tag definition. This way, tag
will be empty.
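
The tag rules above can be sketched as a small validation and normalization
helper. This is only an illustration in Python, not the server code; the
function name normalize_tag is hypothetical, and the pattern is the one quoted
in the text.

```python
import re

# Pattern quoted from the worklog text (ASCII letters, digits, underscore).
TAG_PATTERN = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,31}$")


def normalize_tag(raw: str) -> str:
    """Return the lower-case form of a tag, or "" when no tag is given.

    Hypothetical helper: raises ValueError for input that does not match
    the tag definition.
    """
    if raw == "":
        return ""  # the user chose to skip the tag definition
    # fullmatch rejects a trailing newline that "$" alone would tolerate.
    if TAG_PATTERN.fullmatch(raw) is None:
        raise ValueError(f"invalid GTID tag: {raw!r}")
    return raw.lower()
```

For example, normalize_tag("Admin_1") yields "admin_1", while "1abc", "a-b"
and any 33-character string are rejected.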
SET gtid_next='AUTOMATIC:TAG' is only allowed when gtid_mode is ON or
ON_PERMISSIVE. If gtid_mode is OFF or OFF_PERMISSIVE,
SET gtid_next='AUTOMATIC:TAG' gives an error. If the current mode
is ON or ON_PERMISSIVE and any ongoing session has gtid_next set
to 'AUTOMATIC:TAG', the user is not able to change the mode to any
of the incompatible GTID modes (OFF / OFF_PERMISSIVE).
Setting gtid_next to 'UUID:TAG:GNO' is allowed when the
current GTID mode is ON, ON_PERMISSIVE or OFF_PERMISSIVE.
Otherwise, setting a tag produces an error. If the current mode is ON,
ON_PERMISSIVE or OFF_PERMISSIVE and a GTID that includes a TAG is specified
for the next transaction, the user is not able to change the mode
to any of the incompatible GTID modes (OFF).
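
The mode rules above amount to a small compatibility table. The sketch below
encodes them as a lookup; the kind names "automatic_tagged" and
"assigned_tagged" and both function names are assumptions made for the
illustration, not server identifiers.

```python
# gtid_mode values under which each tagged GTID_NEXT kind is allowed,
# as stated in the text above.
ALLOWED_MODES = {
    "automatic_tagged": {"ON", "ON_PERMISSIVE"},                   # AUTOMATIC:TAG
    "assigned_tagged": {"ON", "ON_PERMISSIVE", "OFF_PERMISSIVE"},  # UUID:TAG:GNO
}


def can_set_gtid_next(kind: str, gtid_mode: str) -> bool:
    """True when SET gtid_next with a tagged value of this kind is allowed."""
    return gtid_mode in ALLOWED_MODES[kind]


def can_change_gtid_mode(new_mode: str, active_kinds) -> bool:
    """True when no session holds a tagged gtid_next that the new mode rejects.

    active_kinds lists the tagged GTID_NEXT kinds currently set in sessions.
    """
    return all(new_mode in ALLOWED_MODES[kind] for kind in active_kinds)
```

With a session holding an AUTOMATIC:TAG value, a switch to OFF_PERMISSIVE is
refused; with only a UUID:TAG:GNO value pending, OFF_PERMISSIVE is accepted
and OFF is refused.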
USER/DEV STORIES
===============================================================================
US1. As a MySQL user, I want to assign a specific name tag to a group of
transaction GTIDs
* so that I can distinguish transactions that are from
domain 1 (e.g. normal data) from domain 2 (e.g. admin operations) just by
looking at transaction GTID.
US2. As a MySQL server administrator, I want to restrict the use of SET
GTID_NEXT functionality to a given set of MySQL users (or roles)
* so that I ensure that only those users (related to a given data domain)
can commit new transactions with assigned name tags.
SCOPE
===============================================================================
After this worklog, the user is able to provide a custom
transaction TAG to be applied at commit time for transactions that
originate in the same session (or at certification time when
running the Group Replication plugin). The user is able to set a TAG for the
upcoming transaction or transactions by executing the SET GTID_NEXT command.
New allowed values for GTID_NEXT are:
- UUID:TAG:GNO (ASSIGNED_GTID)
- AUTOMATIC:TAG (AUTOMATIC_GTID)
Setting GTID_NEXT to:
- UUID:GNO
- AUTOMATIC
results in an empty transaction tag.
The scope of this worklog is changing the implementation of the GTID
representation in the system. It introduces GTID tags,
which may be assigned by a user holding specific privileges.
Changes include: adjusting the implementation of the GTID (adding the Tag
component), changes in GTID generation at commit time in asynchronous
replication, changes in GTID generation at certification time in GR, GTID
reading and writing, and encoding and decoding of GTID related events. This
worklog also requires changes in all of the places that propagate or present
information about the transaction GTID, which include, amongst others: GTID
related events, GTID related system messages, the mysqlbinlog client, the
InnoDB redo log, and commands presenting information about the current system
state that include information about transaction GTIDs.
This WL introduces a new level of privileges (TRANSACTION_GTID_TAG)
under which the user (or replication applier) may set a tag for the next
transaction.
LIMITATIONS
===============================================================================
L1. When using a fully specified GTID, being UUID:TAG:GNO,
the user is responsible for providing a GTID that is unique in a given
replication topology.
Note: inherited from UUID:GNO - ASSIGNED_GTID
REFERENCES
===============================================================================
None.
FUNCTIONAL REQUIREMENTS:
===============================================================================
FR1. The user shall be allowed to provide a tag for a fully specified GTID.
FR2. The user shall be allowed to provide the TAG for an automatically
generated GTID.
FR3. GTID definition shall be TSID:GNO, being:
* TSID - Transaction Source Id, which is a pair of:
- UUID - UUID of the source from which transaction originates
- Tag - Transaction Tag, name tag assigned by the user, may be empty
* GNO - transaction sequence number.
FR4. GTID tags will be persisted alongside the other GTID components in
all the storage media where the GTID is currently persisted (binlog, redo log,
gtid_executed table).
FR5. TAG part of the transaction GTID, provided by the user by execution
of the SET GTID_NEXT command, shall be applicable for
transactions executed in the current session scope committing after
execution of the SET GTID_NEXT command.
FR6. For automatic GTID generation, the source shall automatically generate
a transaction sequence number that is unique
for a pair of UUID and a tag.
FR7. For automatic GTID generation, the source shall not produce gaps
in generation of a GTID for any TSID.
FR8. Server shall provide functionality of skipping tagged GTIDs in the
COM_BINLOG_DUMP_GTID.
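
FR6 and FR7 can be illustrated with a minimal sketch that keeps one set of
used sequence numbers per TSID (UUID and tag pair) and always hands out the
lowest free number. The class GnoGenerator and its methods are hypothetical;
the server keeps this state in its own GTID structures, and the linear scan
is chosen here only for readability.

```python
from collections import defaultdict


class GnoGenerator:
    """Toy model of automatic GNO assignment, tracked separately per TSID."""

    def __init__(self, server_uuid: str):
        self.server_uuid = server_uuid
        # (uuid, tag) -> set of GNOs that are already committed or owned
        self.used = defaultdict(set)

    def mark_used(self, uuid: str, tag: str, gno: int) -> None:
        """Record a GNO taken by an explicitly assigned GTID."""
        self.used[(uuid, tag)].add(gno)

    def next_gtid(self, tag: str = ""):
        """Return (uuid, tag, gno) using the lowest free GNO for this TSID."""
        tsid = (self.server_uuid, tag)
        gno = 1
        while gno in self.used[tsid]:
            gno += 1
        self.used[tsid].add(gno)
        return (self.server_uuid, tag, gno)
```

Because each TSID has its own counter, the untagged sequence and the "admin"
sequence both start at 1 and neither leaves gaps of its own making.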
INTERFACE:
===============================================================================
NFR1. User shall be able to provide a tag of transaction GTID for the current
session by execution of the SET GTID_NEXT command.
NFR2. Server shall accept a transaction tag in one of the following formats:
a) UUID:TAG:GNO
b) AUTOMATIC:TAG
NFR3. System shall accept TAGS in the format in line with the following
regular expression: ^[a-zA-Z_][a-zA-Z0-9_]{0,31}$ (case insensitive)
NFR4. Text representation of the GTID shall be one of the following:
* UUID:GNO (empty tag)
* UUID:Tag:GNO (with a tag assigned by the user)
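
The two text representations in NFR4 can be parsed and produced with a short
helper. The sketch below is an assumption-laden illustration: the names Gtid,
parse_gtid and format_gtid are hypothetical, and the UUID is assumed to use
the usual 8-4-4-4-12 hexadecimal layout.

```python
import re
from typing import NamedTuple


class Gtid(NamedTuple):
    uuid: str
    tag: str  # "" when the GTID is untagged
    gno: int


_GTID_RE = re.compile(
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
    r"(?::(?P<tag>[a-zA-Z_][a-zA-Z0-9_]{0,31}))?"  # optional tag component
    r":(?P<gno>[0-9]+)"
)


def parse_gtid(text: str) -> Gtid:
    """Parse UUID:GNO or UUID:Tag:GNO; the tag is normalized to lower case."""
    match = _GTID_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"malformed GTID: {text!r}")
    gno = int(match.group("gno"))
    if gno < 1:
        raise ValueError("GNO must be at least 1")
    return Gtid(match.group("uuid").lower(), (match.group("tag") or "").lower(), gno)


def format_gtid(gtid: Gtid) -> str:
    """Emit UUID:Tag:GNO when a tag is present, UUID:GNO otherwise."""
    if gtid.tag:
        return f"{gtid.uuid}:{gtid.tag}:{gtid.gno}"
    return f"{gtid.uuid}:{gtid.gno}"
```

A tag can never be confused with a GNO because a tag must start with a letter
or an underscore.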
SECURITY:
===============================================================================
NFR5. Change of GTID for the next transaction shall be allowed only if the user
executing the transaction has the TRANSACTION_GTID_TAG and at least one of the:
SYSTEM_VARIABLES_ADMIN, SESSION_VARIABLES_ADMIN or REPLICATION_APPLIER
privileges.
NFR6. Change of GTID for the next transaction in the replication applier thread
shall be allowed only if the account for a replication channel has
the following privileges:
TRANSACTION_GTID_TAG and REPLICATION_APPLIER.
NFR7. When upgrading, every user with the BINLOG_ADMIN privilege shall be
granted the TRANSACTION_GTID_TAG privilege.
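
The privilege rules in NFR5 to NFR7 can be written down as set checks. This is
a sketch over plain sets of privilege names; the function names are
hypothetical and do not correspond to the server's access-control code.

```python
# Privileges of which at least one is required in addition to
# TRANSACTION_GTID_TAG (NFR5).
ADMIN_PRIVILEGES = {
    "SYSTEM_VARIABLES_ADMIN",
    "SESSION_VARIABLES_ADMIN",
    "REPLICATION_APPLIER",
}


def may_set_tagged_gtid_next(privileges: set) -> bool:
    """NFR5: TRANSACTION_GTID_TAG plus at least one admin privilege."""
    return "TRANSACTION_GTID_TAG" in privileges and bool(ADMIN_PRIVILEGES & privileges)


def applier_may_set_tagged_gtid_next(privileges: set) -> bool:
    """NFR6: the channel account needs both privileges."""
    return {"TRANSACTION_GTID_TAG", "REPLICATION_APPLIER"} <= privileges


def privileges_after_upgrade(privileges: set) -> set:
    """NFR7: holders of BINLOG_ADMIN also receive TRANSACTION_GTID_TAG."""
    if "BINLOG_ADMIN" in privileges:
        return privileges | {"TRANSACTION_GTID_TAG"}
    return set(privileges)
```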
OBSERVABILITY:
===============================================================================
NFR8. The user shall be able to observe an assigned transaction tag in all
observable GTID representations, be it tables, commands, state variables
or output produced by tools such as mysqlbinlog
===============================================================================
ASSUMPTIONS OF USE:
===============================================================================
AOU1. When tagging a fully specified GTID, the user shall provide a GTID
that is unique in a given replication topology.
SUMMARY OF THE APPROACH
===============================================================================

## Purpose

This worklog is aimed at extending the current identification of the
transactions within the system. It gives the user the possibility of tagging
groups of transactions, that is, giving a group of transactions a specific
name. This way, the user will be able to easily distinguish between groups of
transactions by looking at the transaction GTID and searching for a specific
transaction tag. The server already provides functionality to assign a unique
or automatically generated GTID for the next transaction or transactions
executed within the current session scope. Instead of using the source UUID as
the first part of the transaction identifier, we introduce the Transaction
Source Identifier (TSID), composed of the source UUID and the transaction tag
(Tag). TSID will replace SID as the first component of the transaction GTID.

## Scope

The user provided tag will be preserved for the current session.

Generation:
There is no change foreseen in the GTID generation function used at commit
time. The algorithm searches for the next possible and free (not committed,
not owned) number for the next transaction. The function will need to ingest
TSID instead of UUID (functionality of the GTID_NEXT=AUTOMATIC).

This way, the user will be able to distinguish between certain types of
transactions by looking at the transaction tag, e.g.:

  11111111-1111-1111-1111-111111111111:1-4
  11111111-1111-1111-1111-111111111111:admin:1-3
  11111111-1111-1111-1111-111111111112:admin2:1-10
  11111111-1111-1111-1111-111111111112:admin:1-10

Please note that in case of multi-source topologies, the user is allowed to
include transactions of different origin in the same group, identified by a
specific tag.

SET gtid_next='AUTOMATIC:TAG' is only allowed when gtid_mode is ON or
ON_PERMISSIVE. If gtid_mode is OFF or OFF_PERMISSIVE,
SET gtid_next='AUTOMATIC:TAG' gives an error.
If the current mode is ON or ON_PERMISSIVE and any ongoing session has
gtid_next set to 'AUTOMATIC:TAG', the user is not able to change the mode to
any of the incompatible GTID modes (OFF / OFF_PERMISSIVE).

Setting gtid_next to 'UUID:TAG:GNO' is allowed when the current GTID mode is
ON, ON_PERMISSIVE or OFF_PERMISSIVE. Otherwise, setting a tag produces an
error. If the current mode is ON, ON_PERMISSIVE or OFF_PERMISSIVE and a GTID
that includes a TAG is specified for the next transaction, the user is not
able to change the mode to any of the incompatible GTID modes (OFF).

The scope of this WL assumes:

* changes in GTID definition by defining TSID that consists of source UUID
  and transaction tag. Every TSID will be mapped into one SIDNO.
* algorithms will be changed to ingest and manipulate TSID instead of UUID
* changes in GTID read, write, encode and decode functions
* modification of GTID events which propagate information regarding the
  transaction GTID (see PROTOCOL CHANGES section below)
* changes in group replication:
  Group replication does not use automatic assignment invoked in the binlog
  commit flush stage. Instead, the certifier is responsible for assigning
  automatic GTIDs for the upcoming transactions. Defined values for the
  GTID_NEXT are sent from the primary, and GTIDs are generated and assigned in
  all group members. The GTID specification is transmitted using the
  Transaction_context_event. Since this WL does not introduce any new
  transaction GTID specification types, Transaction_context_event will
  possibly carry tagged GTIDs, but its structure will remain unchanged.
  In the current implementation, the certifier keeps track of free GTIDs for
  the group UUID (one SIDNO) in a separate intervals set. This worklog will
  extend existing structures to track free GTIDs for separate TSIDs (SIDNOs).
  Moreover, the certifier holds the currently assigned interval for each GR
  member.
  This implementation will be extended to keep track of the currently assigned
  interval of free GTIDs for each TSID (SIDNO) used by each GR member.

  Note: Currently in GR enabled replication, a separate error is issued in
  case the given GTID has already been used: ER_GRP_RPL_GTID_ALREADY_USED.
  After this WL, if the specified GTID is already taken, whether it is a
  tagged or an untagged GTID, an ER_GRP_RPL_GTID_ALREADY_USED error is raised.

  Currently, the certifier tracks a set of GTID assignment blocks (controlled
  by group_replication_gtid_assignment_block_size) for the group UUID only.
  In this feature, we will track a set of GTID assignment blocks for each TSID
  where the UUID component is equal to the group UUID. This includes both the
  TSID with the group UUID and no tag, and any TSIDs with the group UUID and a
  tag.
* new feature in the COM_BINLOG_DUMP_GTID:
  The server will also provide the possibility of connecting a new, tag-aware
  replica to an old, tag-unaware source. When the replica connects to the
  source and detects that the source's version is lower than the version that
  introduced tags, the replica will exclude tagged GTIDs in the
  COM_BINLOG_DUMP_GTID.
* extension of existing server functionalities:
  Functionalities of the server are intended to be extended to take the tag
  into account, therefore this WL will also extend:
  - tracking of the next free GTID, which will be done for each AUTOMATIC
    GTID, being tagged or untagged
  - checking replica GTIDs against the source at connection time: if the
    replica has more GTIDs with the source's UUID, with TSID being tagged or
    untagged, the connection will be refused

## Limitations

L1. When using a fully specified GTID, being UUID:TAG:GNO, the user is
responsible for providing a GTID that is unique in a given replication
topology.

USER INTERFACE
===============================================================================

VARIABLE CHANGES:

GTID_NEXT (new values)

  VALUES:
    UUID:GNO       (already supported value)
    AUTOMATIC      (already supported value)
    ANONYMOUS      (already supported value)
    AUTOMATIC:TAG  (new value)
    UUID:TAG:GNO   (new value)

  not changed:
    DEFAULT: AUTOMATIC
    SCOPE: SESSION
    REPLICATED: No
    DYNAMIC: Yes
    PERSIST: NOT PERSISTABLE
    COMMAND LINE: NOT changed

  Tagged GTID: UUID:TAG:GNO or AUTOMATIC:TAG
    PRIVILEGES: TRANSACTION_GTID_TAG and at least one of the:
      - SYSTEM_VARIABLES_ADMIN
      - SESSION_VARIABLES_ADMIN
      - REPLICATION_APPLIER

  Untagged GTID: UUID:GNO, AUTOMATIC, ANONYMOUS
    PRIVILEGES: SUPER, SYSTEM_VARIABLES_ADMIN, SESSION_VARIABLES_ADMIN or
    REPLICATION_APPLIER (no changes in privileges)

  DESCRIPTION (new values only):
    AUTOMATIC:TAG
      The user provides a tag for the next transaction GTID. This GTID will
      be generated automatically. Server will generate source UUID, attach a
      TAG for the transaction and generate a transaction sequence number,
      unique for the pair UUID:TAG.
    UUID:TAG:GNO
      The user provides a tag for the assigned, fully specified GTID. Server
      will use this GTID while committing the next transaction.

group_replication_gtid_assignment_block_size

  This variable now affects not only the generation of GTIDs for the group
  UUID but also the generation for defined UUIDs when the user specifies a
  GTID_NEXT=AUTOMATIC:TAG GTID.

PERSISTENT STATE
===============================================================================

GTID tags are persisted everywhere GTID is persisted, which means:

- InnoDB redo log
  Before this worklog, the textual representation of a Gtid used in the
  InnoDB redo log contained up to 56 characters. Gtid_info used in the GTID
  descriptor reserved 64 bytes to hold GTID information. This WL changes the
  Gtid UUID encoding from text to binary. Reserved space remains the same:
  GTID uses up to 64 bytes to represent the GTID UUID, GTID Tag length, GTID
  Tag and GNO number. This format will be given a version number 2.
  Reading functions are designed to be backward compatible with version 1.
- The gtid_executed table
  Current gtid_executed table definition is as follows:

    source_uuid CHAR(36) NOT NULL
      COMMENT 'uuid of the source where the transaction was originally executed.',
    interval_start BIGINT NOT NULL COMMENT 'First number of interval.',
    interval_end BIGINT NOT NULL COMMENT 'Last number of interval.',
    PRIMARY KEY(source_uuid, interval_start)

  New gtid_executed definition:

    source_uuid CHAR(36) NOT NULL
      COMMENT 'uuid of the source where the transaction was originally executed.',
    interval_start BIGINT NOT NULL COMMENT 'First number of interval.',
    interval_end BIGINT NOT NULL COMMENT 'Last number of interval.',
    gtid_tag CHAR(32) NOT NULL COMMENT 'GTID Tag.',
    PRIMARY KEY(source_uuid, gtid_tag, interval_start)

- Binary log
  The binary log contains new versions of the serialized Gtid_log_event and an
  updated format of events that contain GTIDs (details are contained in the
  Protocol section of the HLS).

OBSERVABILITY
===============================================================================

No changes w.r.t. existing functionality - GTID tags are presented to the
user whenever the GTID of the transaction is presented, meaning:

- events present in the binary log (SHOW BINLOG/RELAYLOG EVENTS)
- SQL queries related to presenting the state of the gtid_executed table.
  Since a new column is added, SELECT queries that include the 'gtid_tag'
  column will present new information about the assigned GTID tag.
- GTID related data in performance_schema:
  - replication_connection_status:
    - LAST_QUEUED_TRANSACTION 57->90 bytes
    - QUEUEING_TRANSACTION 57->90 bytes
    - RECEIVED_TRANSACTION_SET no changes in column definition
  - replication_applier_status_by_coordinator:
    - LAST_PROCESSED_TRANSACTION 57->90 bytes
    - PROCESSING_TRANSACTION 57->90 bytes
  - replication_applier_status_by_worker:
    - APPLYING_TRANSACTION 57->90 bytes
    - LAST_APPLIED_TRANSACTION 57->90 bytes
  - replication_group_member_stats:
    - LAST_CONFLICT_FREE_TRANSACTION no changes in column definition
  - binary_log_transaction_compression_stats:
    - LAST_TRANSACTION_ID no changes in column definition
  - clone_status:
    - GTID_EXECUTED no changes in column definition
  - events_transactions_current:
    - GTID 64->90 bytes
  - events_transactions_history:
    - GTID 64->90 bytes
  - events_transactions_history_long:
    - GTID 64->90 bytes
  - status, session, global variables presented in performance schema
    (no changes)
- the following GTID functions accept GTID sets containing tags:
  GTID_SUBSET(), GTID_SUBTRACT(), WAIT_FOR_EXECUTED_GTID_SET()
- SHOW statements: SHOW REPLICA/MASTER STATUS
- output of mysqlbinlog; --include-gtids and --exclude-gtids now accept
  tagged GTID sets
- Error messages that contain GTIDs and GTID sets may now contain tags. None
  of the existing messages specify the maximum number of characters to
  display a GTID, therefore no format change is needed.
- Statements accepting GTID sets, namely
  START REPLICA {UNTIL_AFTER_GTIDS|UNTIL_BEFORE_GTIDS}. They now accept GTID
  sets containing tags.
- System variables other than @@gtid_next containing GTIDs or GTID sets,
  namely @@gtid_purged, @@gtid_executed, and @@gtid_owned. They may now
  contain GTIDs with tags, and SET @@gtid_purged accepts GTID sets containing
  tags.

USER PROCEDURE
===============================================================================

The user is able to set GTID_NEXT for the upcoming transaction by using the
SET GTID_NEXT command.
To be able to use this functionality, the user needs to make sure that the
source server is run in GTID mode. Automatic GTIDs are assigned during the
flush stage of the binlog group commit or at certification time (GR case).
The default value for GTID_NEXT is AUTOMATIC, meaning that the server
automatically assigns a GTID based on the UUID of the source: the source UUID
becomes the first component of the GTID. To attach a tag to the transaction
GTID, the user needs to execute the command:

  SET GTID_NEXT=UUID:TAG:GNO

or

  SET GTID_NEXT=AUTOMATIC:TAG

keeping in mind that the change will be observed with the next committed
transaction.

Valid GTID_NEXT specifications:

  UUID:TAG:GNO
  AUTOMATIC:TAG

Invalid GTID_NEXT specifications (empty tag):

  UUID::GNO
  AUTOMATIC:

SECURITY CONTEXT
===============================================================================

This WL introduces a new level of privileges (TRANSACTION_GTID_TAG) under
which the user may set a tag for the next transaction.

Required privileges to set GTID_NEXT with the tag are:
- TRANSACTION_GTID_TAG and at least one of the:
  - SYSTEM_VARIABLES_ADMIN
  - SESSION_VARIABLES_ADMIN
  - REPLICATION_APPLIER

The PRIVILEGE_CHECKS_USER account for the replication channel must have the
following privileges to set GTID_NEXT with a tag: TRANSACTION_GTID_TAG and
REPLICATION_APPLIER (checked during start of the replication applier thread).

During the upgrade, every user with the BINLOG_ADMIN privilege will be
granted the TRANSACTION_GTID_TAG privilege.

SYSTEM RESOURCES
===============================================================================

No changes w.r.t. existing functionality

DEPLOYMENT and INSTALLATION
===============================================================================

No changes w.r.t. existing deployment/installation procedures.

PROTOCOL
===============================================================================

EXTERNAL
--------

No changes w.r.t.
existing functionality

INTERNAL
--------

### Serialization framework

#### High-level description

This worklog introduces changes in Binary Log Events. At this point, the Gtid
event is not further extensible, and a new version needs to be introduced to
meet the requirements of the new representation of a GTID. In order to assure
backward and forward compatibility of newly defined events, a serialization
framework is introduced. The serialization framework provides methods for
automatic serialization and deserialization of event fields defined by the API
user.

The serialization framework is designed to expose a simple API to a developer
that facilitates event definition, serialization and deserialization. Instead
of implementing serializing and deserializing functions, the user specifies
definitions of the fields included in the packet. A field definition consists
of:

- A number of bytes used to represent the field
- A "field missing functor", specifying what the decoder should do in case the
  field is not provided in the packet. By default, the decoder won't take any
  action upon a missing field.
- A "field encode predicate", specifying whether or not the serializer should
  include the field in the packet. The default behavior of the encoder is to
  always include the field in the packet.
- An "unknown field policy": the action that should be taken by the decoder in
  case the field is encoded in the packet but its definition is unknown to the
  decoder (this concerns decoders of a version older than the version of the
  software which introduced the field into the packet). We define two
  policies: "ignore" and "error". The field definition chooses one policy. The
  framework encodes the chosen policy in the packet. An old server decoding
  the packet reads the policy from the packet: if the policy is "ignore", it
  just skips the field. If the policy is "error", it reports an error and
  rejects the packet.

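The "unknown field policy" behaviour can be sketched on an already-parsed
packet. This is only an illustration: the in-memory triples
(field_id, policy, value), the enum UnknownFieldPolicy and the function
decode_fields are assumptions, and how the policy is laid out on the wire is
not modeled here.

```python
from enum import Enum


class UnknownFieldPolicy(Enum):
    IGNORE = "ignore"
    ERROR = "error"


def decode_fields(fields, known_field_ids):
    """Keep known fields, skip ignorable unknown ones, reject the rest.

    fields: iterable of (field_id, policy, value) triples, a hypothetical
    in-memory view of a packet written by a possibly newer encoder.
    known_field_ids: the field ids this (possibly older) decoder understands.
    """
    decoded = {}
    for field_id, policy, value in fields:
        if field_id in known_field_ids:
            decoded[field_id] = value
        elif policy is UnknownFieldPolicy.IGNORE:
            continue  # unknown but ignorable: skip the field
        else:
            raise ValueError(f"unknown non-ignorable field id {field_id}")
    return decoded
```

An old decoder that knows fields 0 and 1 therefore accepts a packet carrying
an ignorable field 2, and rejects a packet in which field 2 uses the "error"
policy.
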
The idea of the serialization framework is not to introduce many new types
that can be ingested by encoding and decoding functions, but to reuse common
STL types. Types supported by the serialization framework are:

- signed/unsigned integers (simple type)
- floating point numbers (simple type)
- sets
- maps
- vectors
- arrays (simple type) of fixed size, which cannot evolve over time
- enumerations (simple type)
- strings
- custom types: nested messages, called "serializable fields"

Important design decisions are listed below:

- as message definitions evolve over time, the framework allows inserting new
  fields
- fields get automatic type codes, assigned by the serialization framework and
  used to identify fields
- fields can be effectively removed from the packet by defining a field encode
  predicate which always returns a false value. This ensures that removed
  fields don't occupy space in the output packet, while maintaining the
  auto-generated type codes and field sizes for subsequent fields
- for simple types, field sizes are not included in the binary representation
  of a serialized type

#### Message format specification

Serialization framework types are encoded using the following formats:

- signed/unsigned integers -> fl_integer_format / vl_integer_format
- floating point numbers -> sp_fp_number_format / dp_fp_number_format
- sets -> container_format
- maps -> map_container_format
- vectors -> container_format
- arrays (simple type) of fixed size -> fixed_container_format
- enumerations -> vl_integer_format
- strings -> string_format
- custom types (nested messages, called "serializable fields") -> encoded
  according to the "message" format

Message formatting is the following:

<top_level_message> ::= <serialization_version_number> <message>

top_level_message consists of:
- serialization_version_number - serialization framework version number >=0,
  1-9 bytes. Currently, the serializer writes 0 to this field. The
  deserializer stops with an error if this field has a value other than 0.
- message

<message> ::= <total_field_size> <last_non_ignorable_field_id> { <field> }
<field> ::= <field_id> <field_data>

message consists of:
- total_field_size - size of the serializable field payload (1-9 bytes). Size
  information is used to calculate serializable type boundaries within a
  packet and also to skip unknown fields in the packet in case it was encoded
  by an encoder of a version newer than the version of the decoder
- last_non_ignorable_field_id - (1-9 bytes). This is the last encoded,
  non-ignorable field id. In case this field id is unknown to the current
  version of the decoder, the decoder will generate an error. In case all
  fields in the packet are ignorable, the last non-ignorable field id will be
  equal to 0.
- field: each "field" contains the metadata and the value for one field in the
  packet.

field consists of:
- auto-generated field_id >=0, a sequence number which identifies fields
- field contents formatted according to the field_data format

<field_data> ::= <vl_integer_format> | <fl_integer_format> |
                 <fp_number_format> | <string_format> | <container_format> |
                 <fixed_container_format> | <map_format> | <message>
<fp_number_format> ::= <sp_fp_number_format> | <dp_fp_number_format>

field_data can be one of:
- vl_integer_format - format used to encode signed/unsigned integers using
  1-9 bytes, depending on the integer value. The format is described in detail
  in the [Variable-length integers](#variable-length-integers) section.
- fl_integer_format - fixed-length integer, encoded using a fixed number of
  bytes, signed or unsigned
- fp_number_format - floating point number field:
  - sp_fp_number_format - single precision floating point number, using
    4 bytes
  - dp_fp_number_format - double precision floating point number, using
    8 bytes
- string_format - string field, see below
- container_format - unlimited-size container field, see below
- fixed_container_format - fixed-length container format, see below
- map_format - unlimited size container with key-value pairs, see below
- message - nested message, see below

Consecutive bytes of any number (integer, variable-length integer, floating
point number) are stored using the little-endian convention.

<map_format> ::= <number_elements> { <field_data> <field_data> }

Map format consists of:
- number_elements - number of elements in the container >=0, encoded in
  unsigned vl_integer_format
- key-value pairs formatted according to the field_data format

<container_format> ::= <number_elements> { <field_data> }

Container format consists of:
- number_elements - number of elements in the container >=0, encoded in
  unsigned vl_integer_format
- field_data - encoded values, according to the field_data format

<fixed_container_format> ::= { <field_data> }+

Fixed-size array format consists of:
- encoded values, formatted according to the field_data format

The number of elements is defined at compile-time and cannot be changed when
the message format evolves.

<string_format> ::= <string_length> { <character> }

String format consists of:
- string_length - the number of 1 byte elements in the string, encoded in
  unsigned vl_integer_format
- string characters, 1 byte unsigned integers

#### Variable-length integers

Variable-length integers are encoded using 1-9 bytes, depending on the value
of a particular field. Bytes are always stored in LE byte order. This format
allows the decoder to determine the total number of bytes to read after
reading only the first byte. Therefore, the decoder will perform a maximum of
2 reads to read a variable-length integer.

The lowest bits of the first byte encode how many consecutive bytes represent
the number: the byte count is equal to one plus the number of trailing 1-bits
before the first 0-bit. Therefore, it may range from 1 to 9, inclusive. When
the number itself uses at most 56 bits, the trailing ones are followed by a
bit equal to "0"; otherwise they are followed by the full number. The special
case for 57..64 bits allows us to use only 9 bytes to store numbers where the
63rd bit is "1". (An encoding without such a special case would require 10
bytes.)

For readability, in the text below, we display bytes in big-endian format.
Within each byte, we display the most significant bit first. Encoding is
explained reading bits from right to left.

- unsigned integers:
  The encoded length of the integer is followed by the encoded value.
  Examples:

  "00000111 11111111 11111011" - BE byte order, bit layout: most significant
  bit first
  65535 is represented using 3 bytes. The rightmost byte contains two trailing
  ones followed by 0 (3 least significant bits - 3 bytes used to store the
  number). The remaining bits are used to store the value.

- signed integers:
  Signed integers are encoded such that both positive and negative numbers use
  fewer bits the smaller their magnitude is. The least significant bit is the
  sign and the remaining bits represent the number. If x is non-negative, then
  we encode (x<<1), cast to unsigned. If x is negative, then we encode
  ((-(x + 1) << 1) | 1), cast to unsigned. That is, we first add 1 to shift
  the range from -2^63..-1 to -(2^63-1)..0; then we negate the result to get a
  nonnegative number in the range 0..2^63-1; then we shift the result left by
  1 to make place for the sign bit; then we "or" with the sign bit. The
  resulting number is reinterpreted as unsigned and serialized accordingly.

  "00001111 11111111 11110011" - BE byte order, bit layout: starting from the
  most significant bit
  65535 is represented using 3 bytes. The rightmost byte contains two trailing
  ones followed by 0 (3 least significant bits - 3 bytes used to store the
  number). After 0, we encode one sign bit (equal to 0). The remaining bits
  are used to store the value.

  "00001111 11111111 11101011" - BE byte order, bit layout: starting from the
  most significant bit
  -65535 is represented using 3 bytes. The rightmost byte contains two
  trailing ones followed by 0 (3 least significant bits - 3 bytes used to
  store the number). After 0, we encode one sign bit (equal to 1). The
  remaining bits are used to store the negated value, minus 1.

  "00001111 11111111 11111011" - BE byte order, bit layout: starting from the
  most significant bit
  -65536 is represented using 3 bytes.
The rightmost byte contains two trailing ones followed by 0 (3 least significant bits - 3 bytes used to store the number). After 0, we encode one sign bit (equal to 1). The latter bits are used to store the negated value, minus 1. ### Changes in Binary Log Events #### Modification of the Gtid Event At this point, Gtid event contains optional commit group ticket field and it's extension is not further possible. The layout of the event is as follows: +------------+ | 1 byte | Flags +------------+ | 16 bytes| Encoded SID +------------+ | 8 bytes| Encoded GNO +------------+ | 1 byte | lt_type +------------+ | 8 bytes| last_committed +------------+ | 8 bytes| sequence_number +------------+ | 7/14 bytes| timestamps* +------------+ |1 to 9 bytes| transaction_length (see net_length_size()) +------------+ | 4/8 bytes| original/immediate_server_version (see timestamps*) +------------+ | 8 bytes| Commit group ticket +------------+ Tagged version of the Gtid_event will receive a new event typecode, GTID_TAGGED_LOG_EVENT (42). Gtid_event will be serialized / deserialized using a framework that enables further extension of the event. Each of the existing fields will receive it's own type code assigned automatically by serialization framework and layout will be following: +------------+ | 1-9 bytes | Serialization framework version (automatic) +------------+ | 1-9 bytes | Size of below payload (automatic) +------------+ | 1-9 bytes | Last non-ignorable field id (automatic), in case server does not know this field id, it will return an error. This field is equal to the commit group ticket typecode. 
+------------+ | 1-9 bytes | Flags typecode +------------+ | 1-9 bytes | Flags +------------+ | 1-9 bytes | Encoded SID typecode +------------+ | 16 bytes| Encoded SID +------------+ | 1-9 bytes | Encoded GNO typecode +------------+ | 1-9 bytes | Encoded GNO +------------+ | 1-9 bytes | GTID tag typecode +------------+ | 1-9 bytes | GTID tag length +------------+ | 1-32 bytes| GTID tag +------------+ | 1-9 bytes | last_committed typecode +------------+ | 1-9 bytes | last_committed +------------+ | 1-9 bytes | sequence_number typecode +------------+ | 1-9 bytes | sequence_number +------------+ | 1-9 bytes | immediate_commit_timestamp typecode +------------+ | 1-9 bytes | immediate_commit_timestamp +------------+ | 1-9 bytes | original_commit_timestamp typecode +------------+ | 1-9 bytes | original_commit_timestamp +------------+ | 1-9 bytes | transaction_length typecode +------------+ | 1-9 bytes | transaction_length +------------+ | 1-9 bytes | immediate_server_version typecode +------------+ | 1-9 bytes | immediate_server_version +------------+ | 1-9 bytes | original_server_version typecode +------------+ | 1-9 bytes | original_server_version +------------+ | 1-9 bytes | Commit group ticket typecode +------------+ | 1-9 bytes | Commit group ticket +------------+ First three fields is automatic "inner packet" overhead used to ensure forward/backward compatibility with future versions of the event. Packet integer fields are saved as variable-length integers, which length will be 1-9 bytes depending on the value of the field, as explained in the "Serialization framework" subsection. Since this is the first decoder version that uses serialization framework, decoder knows all of the fields ids - all packet fields are defined as ignorable, information won't be used. Field typecodes are also represented using the variable-length integers. Since we have a dozen fields in the packet, all typecodes will be encoded using 1 byte. 
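The variable-length integer examples from the "Serialization framework" subsection can be reproduced with a short sketch. The helper names (encode_unsigned, zigzag, encode_signed) are hypothetical and are not the framework's API. The sketch returns bytes in the display order used by the examples (most significant byte first, length marker in the rightmost byte), which is an assumption about presentation, not about the on-disk byte order, and it covers only the 1-8 byte forms.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not the framework's real API.
// Encodes `value` with (n - 1) trailing one bits followed by a zero bit as the
// length marker, then the value bits. Bytes are returned most significant
// first, matching the way the examples in this document are printed.
std::vector<std::uint8_t> encode_unsigned(std::uint64_t value) {
  // With n bytes (1..8), n bits go to the length marker and 7 * n to the value.
  int n = 1;
  while (n <= 8 && (value >> (7 * n)) != 0) ++n;
  if (n > 8) throw std::length_error("9-byte form is not covered by this sketch");
  const std::uint64_t length_marker = (std::uint64_t{1} << (n - 1)) - 1;
  const std::uint64_t packed = (value << n) | length_marker;
  std::vector<std::uint8_t> out(n);
  for (int i = 0; i < n; ++i)
    out[i] = static_cast<std::uint8_t>(packed >> (8 * (n - 1 - i)));
  return out;
}

// Sign bit in the least significant position, magnitude in the other bits.
std::uint64_t zigzag(std::int64_t x) {
  if (x >= 0) return static_cast<std::uint64_t>(x) << 1;
  return (static_cast<std::uint64_t>(-(x + 1)) << 1) | 1;
}

std::vector<std::uint8_t> encode_signed(std::int64_t x) {
  return encode_unsigned(zigzag(x));
}
```

Under these assumptions, encode_unsigned(65535) yields the bytes 0x07 0xFF 0xFB and encode_signed(-65536) yields 0x0F 0xFF 0xFB, matching the bit patterns listed in the examples.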
#### New GTID set encoding, binary format:
There are two formats of GTID set encoding:
- untagged GTIDs format, used when GTID set contains only untagged GTIDs
- tagged GTIDs format, used when GTID set contains tagged GTIDs
Type code 0 (8 bit integer):
- value 1: tagged or untagged GTIDs format
- value 0: untagged GTIDs format
Number of TSIDs (56 bit integer for untagged GTIDs format, 48 bit integer for
tagged GTIDs format, little endian)
Type code 1 (8 bit integer):
- value 0: untagged GTIDs format,
- value 1: tagged GTIDs format
for each TSID:
  UUID (16 bytes)
  length of the tag (8 bits)
  tag (as many bytes as given by the length of the tag)
  number of intervals (64 bits)
  for each interval:
    interval start (64 bits)
    interval end (64 bits)
GTID sets are encoded in the following places:
- Previous_gtids_log_event
- Transaction_context_log_event
- View_change_log_event
- The packet broadcast in a GR group which contains gtid_executed
#### New GTID encoding, text format:
Text format of a GTID is the following:
  UUID:GNO
Tagged format of a GTID is used whenever GTID contains a tag:
  UUID[:TAG]:GNO
TAG being a text that matches the following regular expression:
  [a-z_][a-z0-9_]{0,31}
#### New GTID set encoding, text format:
gtid_set:
  uuid_set [, uuid_set] ...
  | ''
  Within a gtid_set, uuids appear in alphabetical order
uuid_set:
  uuid[:interval_list][:tag_intervals][:tag_intervals]...
  Within a uuid_set, tags appear in alphabetical order
interval_list:
  interval[:interval]...
  Within an interval_list, intervals appear in increasing order and there is
  always a gap between two adjacent intervals
tag_intervals:
  tag:interval_list
uuid:
  hhhhhhhh-hhhh-hhhh-hhhh-hhhhhhhhhhhh
h:
  [0-9a-f]
tag:
  [a-z_][a-z0-9_]{0,31}
interval:
  n[-n]
  (n >= 1)
The presented definition of the gtid_set ensures that the text representation
of a gtid_set is unique; therefore, sets are equal only if their string
representations are equal.
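The canonical interval_list form (increasing order, always a gap between adjacent intervals) can be illustrated with a minimal sketch that normalizes an arbitrary list of intervals. The names Interval, normalize_intervals and to_text are hypothetical; intervals are assumed to be inclusive on both ends, as in the n[-n] text notation.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: an inclusive interval [first, second] of GNOs.
using Interval = std::pair<std::uint64_t, std::uint64_t>;

// Sorts the intervals and merges those that overlap or touch, so that the
// result is in increasing order with a gap between adjacent intervals.
std::vector<Interval> normalize_intervals(std::vector<Interval> in) {
  std::sort(in.begin(), in.end());
  std::vector<Interval> out;
  for (const Interval &iv : in) {
    if (!out.empty() && iv.first <= out.back().second + 1) {
      out.back().second = std::max(out.back().second, iv.second);
    } else {
      out.push_back(iv);
    }
  }
  return out;
}

// Renders intervals as interval[:interval]..., with each interval as n[-n].
std::string to_text(const std::vector<Interval> &intervals) {
  std::string text;
  for (const Interval &iv : intervals) {
    if (!text.empty()) text += ':';
    text += std::to_string(iv.first);
    if (iv.second != iv.first) text += '-' + std::to_string(iv.second);
  }
  return text;
}
```

With this sketch, the intervals 5-7, 1-3, 4, 10 and 2-6 normalize to the text "1-7:10". Because every set has exactly one normalized form, comparing the resulting strings is enough to compare the sets.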
When the system is ingesting a gtid_set in text format prepared by the user:
- Within a gtid_set, uuids may be repeated, may appear in any order, and are
case-insensitive.
- Within a uuid_set, tags may be repeated and may appear in any order. Also,
tags are case-insensitive.
- Within an interval_list, intervals may overlap, repeat and appear in any
order. Also, interval_list may be empty.
### COM_BINLOG_DUMP_GTID
COM_BINLOG_DUMP_GTID uses the new GTID set encoding described in the
"New GTID set encoding, binary format" section above. In case the version of
the source is lower than the version that introduces tags, the replica will
exclude tagged GTIDs from the packet sent to the source.
### Changes in GR communication
#### Modification of the Transaction_prepared_message
The current layout of this message is the following:
+------------+
| 1 byte     | GNO typecode (PIT_TRANSACTION_PREPARED_GNO)
+------------+
| 8 bytes    | GNO
+------------+
| 1 byte     | SID typecode (PIT_TRANSACTION_PREPARED_SID) optional
+------------+
| 16 bytes   | SID (encoded) optional
+------------+
New format of Transaction_prepared_message:
+------------+
| 1 byte     | GNO typecode (PIT_TRANSACTION_PREPARED_GNO)
+------------+
| 8 bytes    | GNO
+------------+
| 1 byte     | SID typecode (PIT_TRANSACTION_PREPARED_SID) optional
+------------+
| 16 bytes   | SID (encoded) optional
+------------+
| 1 byte     | Tag typecode (PIT_TRANSACTION_PREPARED_TAG) optional
+------------+
| 1 byte     | Tag length optional (value 1-32, encoded using 1 byte)
+------------+
| tag length | Tag optional
+------------+
FAILURE MODEL SPECIFICATION
===============================================================================
Failure model specification remains unchanged. New error messages are
introduced and listed in the section below.
## NEW ERRORS
1. In case the GTID MODE is set to OFF or OFF_PERMISSIVE, the server shall
disallow setting the GTID_NEXT to AUTOMATIC:TAG and issue the
ER_CANT_SET_GTID_NEXT_TO_AUTOMATIC_TAGGED_WHEN_GTID_MODE_IS_OFF:
eng "@@SESSION.GTID_NEXT cannot be set to AUTOMATIC:<TAG> when @@GLOBAL.GTID_MODE = OFF or OFF_PERMISSIVE"
2. Serialization framework uses internal error codes. Currently, in case
serialization or deserialization fails, the software will generate
a context-dependent error. The internal Binlog_read_error is extended to
handle event formats that come from future server versions. The internal
message appended to the context-dependent error will be the following:
"Unrecognized event format. The event appears to originate from a future server version"
3. In case GR fails to decode a tag in the Transaction_prepared_message,
Group Replication will report the ER_GRP_RPL_MSG_DECODING_FAILED:
eng "Failed to decode Group Replication message: Transaction_prepared_message. Reason : Failed to decode a tag, wrong format"
and will send an error packet to the applier with
ER_GRP_RPL_APPLIER_ERROR_PACKET_RECEIVED:
eng "The Applier process of Group Replication found an error and was requested to stop: Failure when processing a received Transaction package from the communication layer"
4. In case the user does not have sufficient privileges to execute / apply
a SET GTID_NEXT statement with a tag, the server will generate:
ER_SPECIFIC_ACCESS_DENIED
eng "Access denied; you need %-.256s privilege(s) for this operation"
with the following error message:
eng "Access denied; you need the TRANSACTION_GTID_TAG and at least one of the: SYSTEM_VARIABLES_ADMIN, SESSION_VARIABLES_ADMIN or REPLICATION_APPLIER privilege(s) for this operation"
UPGRADE/DOWNGRADE and CROSS-VERSION REPLICATION
===============================================================================
In async replication, it is supported for a new replica to connect to an old
source. If a new replica has committed tagged transactions and needs to
connect to an old source that is tag-unaware, the COM_BINLOG_DUMP_GTID packet
that the new replica sends to the old source has to be compatible with the
old source. Therefore, the new replica has to check the server version of the
source; if it is an old version that is tag-unaware, the replica should
exclude tagged GTIDs from the GTID set.
In async replication, having a source server that is newer than the replica
is unsupported. This is not enforced by any fence; it is possible for an old
replica to connect to a new source. In this case, the source will attempt to
replicate to the old replica. If the new source commits tagged transactions,
an old, tag-unaware replica will observe that as an unknown and non-ignorable
log event, the new Gtid_log_event. The replica then stops with an error. The
user can recover from this situation by upgrading the replica and starting
replication again.
When running a replication topology with Group Replication enabled, GTID tags
may be used only when all servers in the replication topology run a version
equal to or greater than the version in which GTID tags were introduced.
During the upgrade, every user with the BINLOG_ADMIN privilege will be
granted the TRANSACTION_GTID_TAG privilege.
During the upgrade, the internal gtid_executed table will be extended by an
additional gtid_tag column, which will be the second component of the table
primary key.
BEHAVIOR CHANGE
===============================================================================
No changes w.r.t. existing functionality
SUMMARY OF THE APPROACH
===============================================================================
Note for the reader:
Although all SW components are using the same GTID interface consistently in
the code, the GTID generation algorithms are different for GR and asynchronous
replication. Therefore, algorithms in this worklog are separately defined for
"GR case" and "asynchronous replication case".
This description contains the summary of the approach. More details can be
found in the sections of this LLD below.
The scope of this worklog is changing the implementation of the GTID
representation in the system. Introduced are optional GTID tags,
which may be assigned by users with specific privileges.
Changes include: adjusting the implementation of the GTID (adding the Tag
component), changes in GTID generation at commit time in asynchronous
replication, changes in GTID generation at certification time in GR, GTID
reading and writing, and encoding and decoding of GTID related events. This
worklog also requires changes in all of the places that propagate or present
information about transaction GTIDs, which include, amongst others: GTID
related events, GTID related system messages, the mysqlbinlog client, the
InnoDB redo log and commands presenting information about the current system
state that includes transaction GTIDs.
Steps of this worklog are the following:
Step 1. Forward and backward compatible serialization framework
Step 2. Changing representation of the GTID
Step 3. Changes in GTID handling/generation in the Binlog Group Commit
Step 4. Changes in GTID handling/generation in the GR plugin
Step 5. Protect execution of GTID_NEXT 'tagged' with privileges
Step 6. Allow GTID_NEXT with a tag to run under certain GTID modes. Allow
changing GTID mode to compatible mode in case sessions with tag
assigned are running.
Step 7. Rename remaining '*sid*' related type names, function names and variables into '*tsid*'
Step 8. Keep track of the next_free_gno for AUTOMATIC tagged GTIDs
Step 9. Extend checking replica GTIDs during connection to the source
## Step 1. Forward and backward compatible serialization framework
This step is focused on providing a simple user API for defining
a serializable class that can change in the future. The motivation for this
step is the Gtid_event, which cannot be extended at this point.
The Gtid_event class will receive an additional
optional field related to the GTID tag, but this time serialization and
deserialization of the event will be forward and backward compatible.
The serialization framework will provide a 'Serializable' interface that needs
to be implemented by the API user. The API user needs to define two methods:
a non-const 'define_fields' object method used during deserialization
and a const 'define_fields' object method used during serialization. These
methods return the definition of fields with references to internal class
fields. After this step, the API user may choose a defined serializer type
and call the 'serialize' or 'deserialize' method to convert the event to or
from its byte representation, without the need to specify how specific fields
are saved into memory.
After extending an existing serializable type, the API user may decide what
happens in case an older version of the software receives unknown fields.
The default behavior is "ignore"; the API user may change this behavior to
"generate an error". In that case, the older server will generate an error and
behave in a crash-stop manner.
## Step 2. Changing representation of the GTID
This step includes all of the changes needed in the internal representation of
the GTID and in the processing of the new, tagged GTID type. In this step, the
tag is assumed to remain empty at GTID generation time.
### Introduction
GTID functionality is spread across several classes
placed in the server libraries, the binlog events library and GR plugin
related libraries. They already contain functions dedicated to:
- GTID parsing ("from string"/"to string")
- Accessing the GTID data. The UUID is represented either by a string or by
a unique number associated with the UUID string, generated by the current
mysqld process, "rpl_sidno". The GTID number, GNO, is usually represented by
the "rpl_gno" integer.
- GTID encoding / decoding functions (saving/loading GTID related data in
binary format).
GTID generation functions are defined separately, as external functionality
related to the GTID. However, there is no dedicated class for GTID generation.
### Planned changes to GTID interface
#### GTID definition
The definition of the UUID is extended to also hold information about the
defined tag (which may be empty). A TSID (UUID together with the tag) is
mapped into a SIDNO; therefore, transactions with the same source UUID but
different tags are mapped to separate SIDNOs.
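The TSID-to-SIDNO mapping can be pictured with a minimal sketch. Tsid and Tsid_map_sketch are hypothetical names used only for illustration; the real SID map in the server has a different interface and also handles locking.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical TSID representation: (uuid, tag). The tag may be empty.
using Tsid = std::pair<std::string, std::string>;

// Assigns one process-local SIDNO per distinct TSID.
class Tsid_map_sketch {
 public:
  // Returns the SIDNO of the TSID, allocating a new one on first use.
  int add_tsid(const std::string &uuid, const std::string &tag) {
    auto [it, inserted] =
        sidno_by_tsid_.try_emplace(Tsid{uuid, tag}, next_sidno_);
    if (inserted) ++next_sidno_;
    return it->second;
  }

 private:
  std::map<Tsid, int> sidno_by_tsid_;
  int next_sidno_ = 1;  // assumption of this sketch: SIDNOs start at 1
};
```

In this sketch, the same UUID with an empty tag and with the tag "backup" receives two different SIDNOs, while repeating a TSID returns the SIDNO assigned earlier.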
Before this WL, parsing of the GTID was implemented in one function,
parsing both the UUID and the GNO. Now, parsing is split into three functions
related to:
* parsing of the UUID
* parsing of the transaction tag
* parsing of the GNO
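A minimal sketch of the three-way split follows. The function names (parse_uuid, parse_tag, parse_gno, parse_gtid) and the Parsed_gtid struct are hypothetical and do not mirror the server functions; the tag rule is the one given in this worklog, and the upper GNO bound is deliberately not enforced here.

```cpp
#include <cctype>
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

struct Parsed_gtid {  // hypothetical result type
  std::string uuid;
  std::string tag;  // empty for an untagged GTID
  std::uint64_t gno;
};

static std::string to_lower(std::string s) {
  for (char &c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

std::optional<std::string> parse_uuid(const std::string &s) {
  static const std::regex re(
      "[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}");
  if (!std::regex_match(s, re)) return std::nullopt;
  return to_lower(s);
}

std::optional<std::string> parse_tag(const std::string &s) {
  static const std::regex re("[a-zA-Z_][a-zA-Z0-9_]{0,31}");
  if (!std::regex_match(s, re)) return std::nullopt;
  return to_lower(s);  // tags are normalized to lower case
}

std::optional<std::uint64_t> parse_gno(const std::string &s) {
  if (s.empty() || s.size() > 19) return std::nullopt;  // avoids overflow
  std::uint64_t gno = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return std::nullopt;
    gno = gno * 10 + static_cast<std::uint64_t>(c - '0');
  }
  if (gno == 0) return std::nullopt;  // GNO must be >= 1
  return gno;
}

// Accepts UUID:GNO and UUID:TAG:GNO.
std::optional<Parsed_gtid> parse_gtid(const std::string &text) {
  const std::size_t first = text.find(':');
  const std::size_t last = text.rfind(':');
  if (first == std::string::npos) return std::nullopt;
  auto uuid = parse_uuid(text.substr(0, first));
  auto gno = parse_gno(text.substr(last + 1));
  if (!uuid || !gno) return std::nullopt;
  std::string tag;
  if (first != last) {
    auto parsed_tag = parse_tag(text.substr(first + 1, last - first - 1));
    if (!parsed_tag) return std::nullopt;
    tag = *parsed_tag;
  }
  return Parsed_gtid{*uuid, tag, *gno};
}
```

Keeping the three parsers separate lets the caller reuse the tag parser on its own, for example for an AUTOMATIC specification that carries only a tag.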
#### GTID specification changes and GTID type changes
GTID specification parsing functions need to account for the new functionality,
meaning they need to allow the user to pass a defined tag to the system, to be
processed and converted into a TSID.
### Integration with server code
All the places that utilize information about the GTID UUID will now be changed
to utilize information about the GTID TSID.
### Gtid persistent state
Changes in GTID persistent state concern:
- changes in GTID executed table
- changes in InnoDB Redo log
- changes in binlog events
## Step 3. Changes in GTID handling/generation in the Binlog Group Commit
GTID is assigned automatically during the Binlog Group Commit flush stage. The
function that does this acquires the lock associated with the global SID map
object defined in the current mysqld process and the UUID related SIDNO lock.
The UUID related SIDNO lock needs to be held while calling GTID generation
code. In case the user specifies the full GTID for the next transaction, the
lock related to the specified SIDNO is taken and released in the GTID
generation code.
After this WL, GTID generation code needs to account for the fact that
the UUID related lock might not be the only SIDNO lock that is held for
a "longer time". The extended locking mechanism exists in order to
eliminate constant locking and unlocking of a SIDNO lock in case multiple
transactions use the same SIDNO lock (the typical case for
AUTOMATIC_GTID, but also for the newly introduced tagged AUTOMATIC_GTID).
Until now, the "extended locking" mechanism was used
only for the UUID related SIDNO lock. Now, we may have several SIDNO
locks, but we want to keep the same optimization to reduce the
number of calls to lock/unlock functions.
Therefore, we introduce an additional class, called Locked_sidno_set, that
will be used to lock not-already-owned SIDNO locks and unlock them
automatically after the binlog flush stage is over.
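The idea behind Locked_sidno_set can be sketched as a small RAII class. This is an illustration only: the class name Locked_sidno_set_sketch, its method names, and the use of std::mutex for SIDNO locks are assumptions, and the sketch ignores lock ordering and the global SID lock.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical RAII holder of per-SIDNO locks for one flush stage.
class Locked_sidno_set_sketch {
 public:
  explicit Locked_sidno_set_sketch(std::vector<std::mutex> &sidno_locks)
      : sidno_locks_(sidno_locks) {}
  Locked_sidno_set_sketch(const Locked_sidno_set_sketch &) = delete;
  Locked_sidno_set_sketch &operator=(const Locked_sidno_set_sketch &) = delete;

  // Releases every lock taken through this object.
  ~Locked_sidno_set_sketch() {
    for (std::size_t sidno : held_) sidno_locks_[sidno].unlock();
  }

  // Takes the SIDNO lock unless this object already owns it.
  // Returns true when the lock was newly acquired.
  bool add_lock_for_sidno(std::size_t sidno) {
    if (held_.count(sidno) != 0) return false;
    sidno_locks_.at(sidno).lock();
    held_.insert(sidno);
    return true;
  }

  std::size_t size() const { return held_.size(); }

 private:
  std::vector<std::mutex> &sidno_locks_;
  std::set<std::size_t> held_;
};
```

A flush stage would create one such object, call add_lock_for_sidno for each transaction's SIDNO, and rely on the destructor to release everything when the stage ends, so a SIDNO shared by many transactions is locked and unlocked only once.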
## Step 4. Changes in GTID handling/generation in the GR plugin
In GR, a transaction is transmitted before it executes the generation of the
GTID in the Binlog Group Commit flush stage. Therefore, GR contains its own
code for handling generation of the GTID. Until now, GR certified either
transactions with automatically generated GTIDs using the group UUID (group
SIDNO) or transactions to which a given GTID was assigned. Existing certifier
functions need to be changed to account for GTIDs generated with a tag
specified by the user, which means handling various transaction SIDNOs
(that correspond to different TSIDs).
Changes in GR concern:
### Changes in Certifier functions:
Before this WL, the Certifier kept track of:
- GTID free intervals for the group UUID
- the currently "allocated" GTID interval used for each member.
The Certifier allocated GTID intervals in blocks, with a specified block size.
After this WL, the functionality is extended to handle
not only the currently used group SIDNO, but all TSIDs,
which correspond to different SIDNOs.
In case the GR view changes, the certifier updates the list of available
GTID intervals for the VCLE SIDNO. Allocated blocks are cleared out and
allocated again on demand (by the first transaction that uses a UUID defined
by the user). In case the current GTID interval is exhausted, a new block is
allocated for the given TSID (SIDNO) and the member from which the transaction
originates.
All of the used structures are changed according to the planned design,
including functions that call functions on objects of the associated types.
More information can be found in the "Detailed implementation" section.
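Block allocation per TSID (SIDNO) and member can be sketched as follows. This is a simplified illustration under stated assumptions: Gno_block_allocator_sketch and its methods are hypothetical, the free GNO space per SIDNO is modelled as a single growing counter, and a real certifier would instead take blocks from the free intervals computed from the executed GTID set.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical allocator: hands out GNOs from per-(SIDNO, member) blocks.
class Gno_block_allocator_sketch {
 public:
  explicit Gno_block_allocator_sketch(std::int64_t block_size)
      : block_size_(block_size) {}

  // Returns the next GNO for the member within the given SIDNO, allocating
  // a new block on first use or when the current block is exhausted.
  std::int64_t next_gno(int sidno, const std::string &member) {
    Block &block = blocks_[std::make_pair(sidno, member)];
    if (block.next >= block.end) {
      std::int64_t &free_start =
          next_free_start_.try_emplace(sidno, 1).first->second;
      block.next = free_start;
      block.end = free_start + block_size_;
      free_start = block.end;
    }
    return block.next++;
  }

  // On a view change, allocated blocks are dropped and reallocated on demand.
  // Simplification: unused GNOs of dropped blocks are not returned here.
  void on_view_change() { blocks_.clear(); }

 private:
  struct Block {
    std::int64_t next = 0;  // next GNO to hand out
    std::int64_t end = 0;   // one past the last GNO of the block
  };
  std::int64_t block_size_;
  std::map<std::pair<int, std::string>, Block> blocks_;
  std::map<int, std::int64_t> next_free_start_;
};
```

The point of the sketch is the keying: before this WL only the group SIDNO had blocks, while here every SIDNO (and therefore every TSID) has its own independent GNO space and its own per-member blocks.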
### Changes in certification handler:
The certification handler needs to utilize additional information about the
GTID to be generated. Until now, the certification handler used the
"is_specified" value of the Transaction context log event. After this
worklog, the certification handler will also need to check whether the
Gtid_log_event contains information about a defined tag and propagate it to
the 'certify' function.
## Step 5. Protect execution of GTID_NEXT 'tagged' with privileges
This step implements a new privilege: TRANSACTION_GTID_TAG.
SET GTID_NEXT=UUID:TAG:NUMBER and SET GTID_NEXT=AUTOMATIC:TAG
are able to run under the privileges specified in the "USER INTERFACE" section
of the HLS.
The PRIVILEGE_CHECKS_USER account for the replication channel must have the
following privileges to set GTID_NEXT with a tag:
TRANSACTION_GTID_TAG and REPLICATION_APPLIER (checked during start of the
replication applier thread).
## Step 6. Allow GTID_NEXT with a tag to run under certain GTID modes. Allow changing GTID mode to compatible mode in case sessions with tag assigned are running.
This step implements:
- Allowing GTID_NEXT with a tag to be set under the allowed GTID modes
- Allowing the GTID mode to be changed to a compatible mode in case there are sessions with GTID_NEXT set to AUTOMATIC:TAG
## Step 7. Rename remaining '*sid*' related type names, function names and variables into '*tsid*'
This step performs renaming of type names, function names and variables
containing "sid" and using TSID definitions into "tsid".
## Step 8. Keep track of the next_free_gno for AUTOMATIC tagged GTIDs
This step extends the GTID assignment optimization used in case the UUID of
the transaction is the server UUID. The 'next_free_gno' variable is
extended to an unordered map, which tracks the next free transaction
numbers for multiple SIDNOs.
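A minimal sketch of the per-SIDNO 'next_free_gno' idea is shown below. The class Gno_generator_sketch and its members are hypothetical; the executed GTIDs are modelled as a plain set of GNOs per SIDNO rather than the server's interval-based GTID set.

```cpp
#include <cstdint>
#include <set>
#include <unordered_map>

// Hypothetical generator keeping one "next free GNO" hint per SIDNO.
class Gno_generator_sketch {
 public:
  // Records a GNO that was assigned explicitly (e.g. via GTID_NEXT).
  void mark_executed(int sidno, std::int64_t gno) {
    executed_[sidno].insert(gno);
  }

  // Returns the smallest unused GNO at or after the hint for this SIDNO.
  std::int64_t generate(int sidno) {
    std::int64_t &candidate =
        next_free_gno_.try_emplace(sidno, 1).first->second;
    std::set<std::int64_t> &used = executed_[sidno];
    while (used.count(candidate) != 0) ++candidate;
    used.insert(candidate);
    return candidate++;
  }

 private:
  std::unordered_map<int, std::int64_t> next_free_gno_;
  std::unordered_map<int, std::set<std::int64_t>> executed_;
};
```

Each SIDNO keeps an independent hint, so automatic tagged GTIDs for one tag do not have to rescan from 1 after GNOs were generated for another tag.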
## Step 9. Extend checking replica GTIDs during connection to the source
At connection time, the source checks that the replica
doesn't have GTIDs with the source's UUID which the
source does not have. This step extends the check to
also cover tagged GTIDs with the source's UUID.
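The extended check can be sketched as a set comparison restricted to the source's UUID, regardless of tag. The Gtid alias and the function name replica_has_unknown_source_gtids are hypothetical, and GTID sets are modelled as plain sets of individual GTIDs rather than interval sets.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical GTID representation: (uuid, tag, gno). Tag may be empty.
using Gtid = std::tuple<std::string, std::string, std::int64_t>;

// Returns true when the replica holds a GTID that carries the source's UUID
// (tagged or untagged) and is missing from the source's own GTID set.
bool replica_has_unknown_source_gtids(const std::string &source_uuid,
                                      const std::set<Gtid> &source_gtids,
                                      const std::set<Gtid> &replica_gtids) {
  for (const Gtid &gtid : replica_gtids) {
    if (std::get<0>(gtid) == source_uuid && source_gtids.count(gtid) == 0)
      return true;
  }
  return false;
}
```

Before this step only untagged GTIDs with the source's UUID would be compared; matching on the UUID alone, as above, is what brings tagged GTIDs into the check.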
DETAILED IMPLEMENTATION
===============================================================================
## Step 1. Forward and backward compatible serialization framework
### Serialization framework entities
Serialization framework defines the following entities:
- Serializable - base class for data types that are capable of being
  automatically encoded / decoded using the serialization framework. To define
  a message format, the user needs to derive a custom structure from the
  Serializable class and implement the "define_fields" function. The following
  helpers are available:
  - define_field - enabled for definition of integer and floating point
    fields, enumerations and container types
  - define_field_with_size - enabled for integer fields, allows for definition
    of a fixed number of bytes used to represent a particular integer field
  - define_compound_field - for definition of nested messages
- Serializer - base class for serializers, defines what information is
  serialized (value, typecode, size...). The Serializer does not specify the
  byte layout of the encoded basic types, but specifies the format of
  supported container types:
  - map / unordered map
  - set / unordered set
  - vector
  - array
  and enumeration types.
  The Serializer aggregates an archive and calls its functions to format
  simple types.
- Archive - base class for archives, defines how information is serialized -
  defines the format and final byte layout of the basic types:
  - variable-length integers
  - fixed-length integers
  - floating point numbers
  - string
  An archive decides how specific basic types are encoded, meaning the format
  (string / binary), and where specific encoded information is saved to or
  loaded from, e.g. a vector of bytes, a string stream, externally allocated
  memory, a file...
### Implementation specifics
Serialization framework is capable of ingesting the following types:
- simple types:
  - (u)int8_t,
  - (u)int16_t,
  - (u)int32_t,
  - (u)int64_t,
  - std::string,
  - float,
  - double
- STL containers:
  - vector
  - map / unordered map
  - set / unordered set
  - array / constant C array (fixed size, which cannot evolve over versions)
- strongly typed enumerations
- types that implement "Serializable" base class interface,
  called "serializable fields"
By default, integer fields are encoded using the variable-length integer
format. The user may change the format of an integer by using the
"define_field_with_size" helper, which ingests a fixed number of bytes. The
same helper may be used to specify the maximum number of bytes for a string
field. Supported format definitions are described in the "Message format
specification" section of the HLS.
As mentioned before, the user may call the following helpers in the definition
of the "define_fields" method of the concrete Serializable class:
- define_field - this function ingests:
  - Field reference
  - Definition of the encode predicate. Optional argument, by default it
    is equal to a function which always returns true. The concept of the field
    encode predicate is explained in the "Serialization framework. High level
    description" section of the HLS.
  - Definition of the missing field functor. Optional argument, by default it
    is equal to an empty function. The concept of the field
    missing functor is explained in the "Serialization framework. High level
    description" section of the HLS.
  - Definition of the "unknown field policy". Optional argument, by default it
    is equal to "ignore". The concept of the unknown field policy
    is explained in the "Serialization framework. High level
    description" section of the HLS.
- define_field_with_size - this function ingests the fixed size of an integer
  type or may be used to define the maximum size of a string. It is enabled
  only for string / integer types. Other function arguments are the same
  as the arguments of the define_field helper. Defining a field with size
  equal to "0" means that the default size will be used to encode the field
- define_compound_field - this function ingests only the reference to
  the serializable field.
To define a serializable field, the user needs to derive their own type from
the "Serializable" base class and define the following functions:
- define_fields, non-const, used during decoding
- define_fields, const, used during encoding
The user is able to calculate the size of the output packet by calling
the "get_size" function of the concrete serializer type with
serializable field passed as the function argument. Output of this function
depends on field values, therefore this function should be called only when
all of the packet fields have values assigned.
The user may also calculate the maximum size of the packet at compile time,
by calling the "get_max_size" function of the concrete serializer type.
Function call will cause compilation to fail if any of the fields has
unlimited size (vector/unbounded string/map/set).
### Extensibility of the serialization framework
The majority of the serialization framework is focused on compile-time
type deduction. The serialization framework may encode / decode data by itself
or, with further extension, use Protocol Buffers as a backend - an additional
'Serializer' specialization that would convert simple types and STL containers
enclosed in MySQL data structures into fields of classes generated by
Protocol Buffers.
The serialization framework includes only one Serializer specialization,
Serializer_default. This serializer saves the type id and field value for
each simple field. For each field of a type that implements the "Serializable"
interface, it saves the type id and the serializable size,
and calls the encoding function. For each variable-length field, string field
or field of one of the supported container types,
it saves the number of elements. As mentioned before, other types
of serializers may be developed to serve different purposes.
The serialization framework contains three implementations of an archive:
Read_archive_binary, used in case we only deserialize data; Archive_binary,
which stores encoded bytes internally in a vector; and Archive_text,
used for debugging purposes (it contains a text representation of the data
serialized by the Serializer).
The framework can be further extended to support various types of serializers
and archives. The user API is not intended to change; the API user is
always responsible for providing field definitions (see "User API example"
below).
#### Message evolution - backward and forward compatibility of encoded messages
Considering message evolution over time, the user defines whether a field is
mandatory or optional for the current version of the encoder by specifying the
field encode predicate, and for older versions of the decoder by specifying the
behavior in case the included field definition is unknown to an older version
of the decoder.
Let's consider the following message definition in the server version_x:
  integer field_a
which changes in the server version_y to the following definition:
  integer field_a,
  integer field_b.
In case field_b is required for processing in the current version of the
software, the user should specify a field_b encode predicate that
tells the encoder to always include field_b in the packet. The user needs to
consider what happens in the following cases:
- server with version_y receives a message from a server with version_x
  Field_b is missing in the packet. The user specifies the actions that
  need to be taken by the server (current version) in the field missing
  functor. If this version does not know how to proceed without field_b and
  does not have a way of providing backward compatibility, which means that
  the user decides to break backward compatibility on purpose (unlikely case),
  the user will implement a field missing functor that errors out
  the server
- server with version_x receives a message from a server with version_y
  In this case, the server with version_x receives field_b, whose definition
  is unknown to it. The user specifies in version_y the action that should be
  taken by a server with version_x in case it receives field_b. The action is
  limited to:
  - generate error
  - ignore field and proceed with message decoding
- server with version_y receives a message from a server with version_z,
  which is higher than version_y, and field_b is missing:
  In that case, the server with version_y will run the field missing functor.
  The encode predicate changed in version_z and the encoder with version_z
  chose to skip field_b while encoding the packet.
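The three cases above can be played through with a toy decoder. Everything in this sketch is hypothetical (Encoded_message_sketch, Known_field_sketch, decode): messages are modelled as a map from field id to an integer value, field ids are assumed to start at 1, and a "last non-ignorable field id" of 0 is assumed to mean that every field may be ignored. It is not the framework's decoder.

```cpp
#include <cstdint>
#include <functional>
#include <map>

// Hypothetical wire model: field id -> value, plus the highest field id that
// a decoder must understand (0: all fields are ignorable).
struct Encoded_message_sketch {
  std::uint64_t last_non_ignorable_field_id;
  std::map<std::uint64_t, std::int64_t> fields;
};

// Hypothetical description of a field known to the decoder.
struct Known_field_sketch {
  std::int64_t *target;              // where the decoded value is stored
  std::function<void()> on_missing;  // field missing functor, may be empty
};

// Returns false when an unknown, non-ignorable field is encountered.
// Note: fields decoded before the error are left assigned.
bool decode(const Encoded_message_sketch &message,
            const std::map<std::uint64_t, Known_field_sketch> &known) {
  for (const auto &[id, value] : message.fields) {
    auto it = known.find(id);
    if (it != known.end()) {
      *it->second.target = value;
    } else if (id <= message.last_non_ignorable_field_id) {
      return false;  // unknown field with policy "generate error"
    }  // otherwise: unknown field with policy "ignore"
  }
  // Run the field missing functor for every known field that was not sent.
  for (const auto &[id, field] : known) {
    if (message.fields.count(id) == 0 && field.on_missing) field.on_missing();
  }
  return true;
}
```

In this model, a version_x decoder that knows only field_a accepts a version_y message when field_b is ignorable and rejects it when field_b is marked non-ignorable, while a version_y decoder receiving a message without field_b falls back to whatever its field missing functor sets.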
### User API example
An example of the field definition method used during deserialization
(optional fields specify what happens in case the field is not provided; in
this example this is done with the Field_missing_functor):
decltype(auto) define_fields() {
return std::make_tuple(
define_field(gtid_flags),
define_field(Uuid_parent_struct.bytes),
define_field(gtid_info_struct.rpl_gtid_gno),
define_field(last_committed),
define_field(sequence_number),
define_field(immediate_commit_timestamp),
define_field(original_commit_timestamp,
Field_missing_functor([this]() -> auto {
this->original_commit_timestamp =
this->immediate_commit_timestamp;
})),
define_field(transaction_length),
define_field(immediate_server_version),
define_field(original_server_version,
Field_missing_functor([this]() -> auto {
this->original_server_version = this->immediate_server_version;
})),
define_field(commit_group_ticket,
Field_missing_functor([this]() -> auto {
this->commit_group_ticket = Gtid_event::kGroupTicketUnset;
}), Unknown_field_policy::error)
);
}
Deserialization of defined fields inside of Gtid_event decoding method:
Serializer_default serializer;
serializer >> *this;
Serialization of defined fields inside of the Gtid_log_event encoding method:
serializer << *this;
### Generic functionality related to serialization/deserialization of strongly typed enumerations
Additional helper functions are introduced to simplify explicit conversion
between strongly typed enumerations and their underlying types.
Since C++23 defines "to_underlying", this function is temporarily
implemented in MySQL in the std namespace in case the associated macro
defined by the STL is not defined. After switching to C++23, the STL
implementation of the function
will be used automatically. Conversion in the opposite direction
is not that straightforward. The code needs to handle the case when the given
value is not a valid state of the enumeration type. In that case,
a special, "invalid" constant of the enumeration type is returned and
handling of this special situation needs to be done by the caller.
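Both conversion directions can be sketched as follows. The enumeration Gtid_format_sketch, its values, and the function names are hypothetical and chosen only for illustration; the sketch uses its own to_underlying_sketch instead of placing anything in the std namespace.

```cpp
#include <cstdint>
#include <type_traits>

// Enumeration -> underlying type. Equivalent in spirit to C++23
// std::to_underlying, written so that it also compiles before C++23.
template <typename Enum>
constexpr std::underlying_type_t<Enum> to_underlying_sketch(
    Enum value) noexcept {
  return static_cast<std::underlying_type_t<Enum>>(value);
}

// Hypothetical enumeration with a dedicated "invalid" constant.
enum class Gtid_format_sketch : std::uint8_t {
  untagged = 0,
  tagged = 1,
  invalid = 255
};

// Underlying type -> enumeration. Values that do not name a valid state map
// to the "invalid" constant; the caller is responsible for handling it.
constexpr Gtid_format_sketch to_gtid_format(std::uint8_t raw) noexcept {
  switch (raw) {
    case 0:
      return Gtid_format_sketch::untagged;
    case 1:
      return Gtid_format_sketch::tagged;
    default:
      return Gtid_format_sketch::invalid;
  }
}
```

A decoder would call to_gtid_format on the byte it read and treat Gtid_format_sketch::invalid as a decoding error, instead of casting the raw value directly and ending up with an enumeration object that holds no named state.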
## Step 2. Changing representation of the GTID
Newly introduced types:
- Trx_tag (implementation of the transaction tag)
- Trx_source_id (implementation of the TSID)
After this WL, the Gtid class will include information about the TSID instead
of the UUID. All of the code that uses the UUID will be changed to process
information about the GTID TSID.
The user will assign a transaction GTID by executing SET GTID_NEXT.
Information about the GTID tag will be propagated until the GTID is assigned
to the transaction and the transaction receives a tagged GTID, that is, either
in the Binlog Group Commit, or when it is sent in the Gtid_event and applied
at certification time.
In this step, information about the GTID tag will
be ignored in the system (integration will take place in Steps 3 and 4).
Newly defined constants (static constant expressions):
- gtid_separator (defined to the currently used separator ':')
- gtid_end (GTID definition in the string type ends with the end of the string,
thus it is defined to the '\0')
Gtid parsing functions:
- the currently implemented "parse" function uses the newly defined
parse_gno_str, parse_sid_str and parse_tag_str to separately parse the GNO,
SID and Tag of the GTID. It adds the generated TSID into the SID map in case
the map is passed as a function parameter. Please note that one TSID is mapped
into one corresponding SIDNO. Since parsing functionality is duplicated in the
current implementation (to report a GTID parsing error or not, in case we only
want to find out whether a GTID is correctly defined), the "parse" function
takes an additional boolean parameter. This parameter decides whether a GTID
parsing error will be reported or not. The duplicated implementation
(contained in the "is_valid" function) is removed.
Note about GTID generation algorithm:
No changes here, already specified TSID (SIDNO) is passed to the 'generate_gtid'
function and only the GNO is generated.
### Gtid persistent state
Changes in GTID persistent state are implemented according to description
in "Persistent State" section of the HLS.
### Changes in Gtid related events include:
- changing of the Gtid_event and Gtid_log_event
The new GTID event type will receive a new type code: GTID_TAGGED_LOG_EVENT (42).
The 'Gtid_event' class will keep the old code to encode / decode GTID_LOG_EVENT and
define new code to encode and decode the extensible GTID_TAGGED_LOG_EVENT
using the serialization framework (Step 1)
- changing of the Gtid set encoding / decoding
GTID set encoding/decoding will be changed to
include information about GTID tag, but will be backward compatible. In case
tag is not provided, representation will be the same as it was before
introduction of this WL.
- changing GR messages that propagate information about transaction GTID
(introduction of the optional GTID tag fields)
### COM_BINLOG_DUMP_GTID
COM_BINLOG_DUMP_GTID encodes GTID set according to the format described in
"Protocol" section of the HLS.
## Step 3. Changes in GTID handling/generation in the Binlog Group Commit
GTID is assigned automatically during the Binlog Group Commit flush stage,
in the 'assign_automatic_gtids_to_flush_group' function. If any of the
transactions has its GTID type specified as "AUTOMATIC", the global SID lock
protecting the global SID map is taken. This functionality remains unchanged;
however, the existing lines of code are now executed for both tagged and
untagged automatic GTIDs.
As an optimization of the SID locking/unlocking mechanism, the function
previously kept track of the currently locked SIDNO (to lock the server UUID
SIDNO for all transactions executing in the current flush stage).
After this WL, there may be a larger number of SID locks, not only the one
associated with the source server UUID. Therefore, a newly introduced class
called Locked_sidno_set is used. This is a simple RAII class that ingests
the current transaction UUID and takes the lock if it is not already taken.
In its destructor, the class releases all SID locks held. This way,
generate_automatic_gtid is greatly simplified. The branches that handled the
"locked_sidno" argument of the "generate_gtid" function, a pointer that may
or may not be null, are removed. Instead, the "generate_gtid" function
ingests the locked_sidno_set passed by the caller and uses the
acquire_unowned_lock function of the "locked_sidno_set" object to acquire locks
for incoming SIDNOs that are not already taken. Neither "generate_gtid" nor
"assign_automatic_gtids_to_flush_group" needs to take care of unlocking,
because it is handled automatically by the Locked_sidno_set destructor
during stack unwinding.
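The RAII pattern described for Locked_sidno_set can be sketched as follows. This is a reduced illustration under stated assumptions: the `Sidno_locks` class is a hypothetical stand-in for the per-SIDNO locks owned by the server (its call counters exist only to make the behaviour observable), and only the names Locked_sidno_set and acquire_unowned_lock come from this worklog.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <set>

using rpl_sidno = int;

// Hypothetical stand-in for the per-SIDNO locks held by the GTID state.
class Sidno_locks {
 public:
  void lock(rpl_sidno sidno) {
    m_mutexes[sidno].lock();
    ++lock_calls;
  }
  void unlock(rpl_sidno sidno) {
    assert(unlock_calls < lock_calls);
    m_mutexes[sidno].unlock();
    ++unlock_calls;
  }
  int lock_calls = 0;    // bookkeeping for the example only
  int unlock_calls = 0;  // bookkeeping for the example only

 private:
  std::map<rpl_sidno, std::mutex> m_mutexes;
};

// RAII set of held SIDNO locks: each SIDNO is locked at most once and
// everything held is released when the object is destroyed.
class Locked_sidno_set {
 public:
  explicit Locked_sidno_set(Sidno_locks &locks) : m_locks(locks) {}
  Locked_sidno_set(const Locked_sidno_set &) = delete;
  Locked_sidno_set &operator=(const Locked_sidno_set &) = delete;

  ~Locked_sidno_set() {
    for (const rpl_sidno sidno : m_held) m_locks.unlock(sidno);
  }

  // Takes the lock for sidno unless this object already holds it.
  void acquire_unowned_lock(rpl_sidno sidno) {
    if (m_held.count(sidno) != 0) return;
    m_locks.lock(sidno);
    m_held.insert(sidno);
  }

 private:
  Sidno_locks &m_locks;
  std::set<rpl_sidno> m_held;
};
```

A caller in the flush stage would create one such object, pass it to every GTID generation call, and rely on the destructor for unlocking, so no code path needs an explicit unlock.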
## Step 4. Changes in GTID handling/generation in the GR plugin
Before this WL, GTID generation / handling functions were placed in the
Certifier class. This functionality is now placed in the Gtid_generator class,
which is used by the Certifier.
### Changes in certifier functions:
- Gtid_generator class will ingest a transaction sidno and will keep a map of
the following key-value pairs:
sidno and the associated Gtid_generator_for_sidno class object. Internal
Gtid_generator_for_sidno members are:
- m_sidno - Sidno for which generator will produce a transaction number
- compute_group_available_gtid_intervals - calculates intervals
available in the group separately for the defined sidno
- reserve_gtid_block - reserves the GTID block for the defined
sidno
- m_available_intervals (previously group_available_gtid_intervals). This
member now holds free intervals for the defined sidno
- m_assigned_intervals (previously member_gtids). Maps group member UUID to
currently assigned interval. Map is defined for the defined sidno.
Gtid_generator class members are:
- m_gtid_assignment_block_size - GTID blocks generated will have this size
- m_gtid_generator_for_sidno - holds generator object for each defined sidno
- get_next_available_gtid - generates GTID for the specific member_uuid and
sidno
- recompute - recalculates each sidno generator state, reassigns intervals
- initialize - initialization function
- changes in the 'certify' function :
* decisions that certifier may take are implemented in the newly introduced
Certification_result enumeration type. Possible states:
a) positive - transaction is certified positively
b) negative - transaction is certified negatively, but there was no error,
GR may proceed
c) error - transaction was certified negatively, an error occurred
- changes in the GNO generation functions
Return value change:
Functions that are responsible for generation of the GNO have a hidden
control flow encoded in the possible values of the generated GNO. Before
this WL, such a function might return:
* value greater than 0 - generated GNO
* value 0 (some of the functions only) - GNO was not generated
* value -1 - GNO numbering is exhausted for the given UUID (group UUID), which
means that generation of GTIDs is no longer possible - fatal error, stop
* value -2 (some of the functions only) - free GNO values are exhausted in the
currently used interval. This state indicates that a new interval
should be assigned to the current member of the group.
This control flow with GNO values is replaced with newly introduced
enumeration type, Gno_generation_result with possible states:
* ok - successful generation
* gno_exhausted - no free GNO number, error
* gno_overflow - generated GNO is out of the scope of the current interval
Changes in function parameters:
GNO generation functions additionally ingest the rpl_sidno, which represents
the specified UUID (may be the group UUID). The available GTID intervals are
allocated / deallocated with the same scheme as for AUTOMATIC_GTID using
the group UUID or rather the server representation of the UUID. If the generated
GNO is out of the scope of the defined interval, the function assigns a new
free GTID GNO interval for the given UUID. When the GR view changes,
free intervals are re-calculated from the GTID executed set.
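To make the replacement of magic return values concrete, here is a reduced C++ sketch of a per-sidno generator that reports its outcome through Gno_generation_result. The class and enumeration names follow this worklog. The signatures, the `Gno_result` alias, the `gno_max` parameter, the private `get_next_from_block` helper and the simple "next unreserved GNO" bookkeeping are assumptions. View changes, `recompute` and `compute_group_available_gtid_intervals` are not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <utility>

using rpl_sidno = int;
using rpl_gno = std::int64_t;

enum class Certification_result { positive, negative, error };
enum class Gno_generation_result { ok, gno_exhausted, gno_overflow };
using Gno_result = std::pair<Gno_generation_result, rpl_gno>;

class Gtid_generator_for_sidno {
 public:
  Gtid_generator_for_sidno(rpl_gno block_size, rpl_gno gno_max)
      : m_block_size(block_size), m_gno_max(gno_max) {
    assert(block_size > 0 && gno_max < std::numeric_limits<rpl_gno>::max());
  }

  // On overflow of the member's block, reserves a new block and retries.
  Gno_result get_next_available_gtid(const std::string &member_uuid) {
    Gno_result result = get_next_from_block(member_uuid);
    if (result.first == Gno_generation_result::gno_overflow) {
      if (!reserve_gtid_block(member_uuid))
        return {Gno_generation_result::gno_exhausted, 0};
      result = get_next_from_block(member_uuid);
    }
    return result;
  }

 private:
  struct Interval {
    rpl_gno next;  // next GNO to hand out
    rpl_gno last;  // last GNO of the block (inclusive)
  };

  Gno_result get_next_from_block(const std::string &member_uuid) {
    const auto it = m_assigned_intervals.find(member_uuid);
    if (it == m_assigned_intervals.end() || it->second.next > it->second.last)
      return {Gno_generation_result::gno_overflow, 0};
    return {Gno_generation_result::ok, it->second.next++};
  }

  // Returns false when no GNO is left for this sidno.
  bool reserve_gtid_block(const std::string &member_uuid) {
    if (m_next_unreserved > m_gno_max) return false;
    const rpl_gno last = (m_gno_max - m_next_unreserved < m_block_size - 1)
                             ? m_gno_max
                             : m_next_unreserved + m_block_size - 1;
    m_assigned_intervals[member_uuid] = Interval{m_next_unreserved, last};
    m_next_unreserved = last + 1;
    return true;
  }

  rpl_gno m_block_size;
  rpl_gno m_gno_max;
  rpl_gno m_next_unreserved = 1;
  std::map<std::string, Interval> m_assigned_intervals;
};

class Gtid_generator {
 public:
  Gtid_generator(rpl_gno block_size, rpl_gno gno_max)
      : m_gtid_assignment_block_size(block_size), m_gno_max(gno_max) {}

  Gno_result get_next_available_gtid(const std::string &member_uuid,
                                     rpl_sidno sidno) {
    const auto it = m_gtid_generator_for_sidno
                        .try_emplace(sidno, m_gtid_assignment_block_size,
                                     m_gno_max)
                        .first;
    return it->second.get_next_available_gtid(member_uuid);
  }

 private:
  rpl_gno m_gtid_assignment_block_size;
  rpl_gno m_gno_max;
  std::map<rpl_sidno, Gtid_generator_for_sidno> m_gtid_generator_for_sidno;
};
```

With the enumeration, a caller distinguishes "retry with a new block" from "stop, numbering is exhausted" by name instead of comparing the GNO against -1 or -2.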
### Additional refactorings:
Functionality such as creation of the Transaction Termination Context
is delegated to separate Certifier helper functions.
Certification code is changed to use the RAII MUTEX_LOCK macro defined in
MySQL.
## Step 5. Protect execution of GTID_NEXT 'tagged' with privileges
This step implements a new privilege: TRANSACTION_GTID_TAG.
SET GTID_NEXT=UUID:TAG:NUMBER and SET GTID_NEXT=AUTOMATIC:TAG
can run under the privileges specified in the "USER INTERFACE" section of
the HLS.
## Step 6. Allow GTID_NEXT with a tag to run under certain GTID modes. Allow changing GTID mode to compatible mode in case sessions with tag assigned are running.
This step implements:
### Allowing GTID_NEXT with a tag to be set under the allowed GTID modes
When the set_gtid_next function is executed, the current GTID mode is checked
and, if it is not compatible with the value of GTID_NEXT, an appropriate error
is reported. Errors are specified in the "FAILURE MODEL SPECIFICATION" section
of the HLS.
### Allowing changing GTID mode to compatible mode in case there are sessions with GTID_NEXT set to AUTOMATIC:TAG
When the 'global_update' function is executed and the new mode is OFF_PERMISSIVE,
the number of running sessions with GTID_NEXT set to 'AUTOMATIC:TAG' is
checked. If the number is zero, setting the mode to OFF_PERMISSIVE is allowed.
Otherwise, the command fails with an error.
The counter of sessions running with GTID_NEXT set to 'AUTOMATIC:TAG' is
implemented inside the 'Gtid_state' class. Sessions increment or decrement
this counter in the 'set_gtid_next' function when GTID_NEXT changes
from 'untagged' to 'tagged' or from 'tagged' to 'untagged'. The counter is also
decremented when a session is destroyed while its GTID_NEXT is set to AUTOMATIC
with a tag.
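The counter logic above can be sketched in C++ as follows. This is a simplified model, not the server code: the `Session` class, the counter accessor names, the boolean argument of `set_gtid_next` and the boolean return value of `global_update` are assumptions. Only the OFF_PERMISSIVE check is modelled, and synchronization between the check and concurrent sessions is ignored.

```cpp
#include <atomic>
#include <cassert>

enum class Gtid_mode { off, off_permissive, on_permissive, on };

// Reduced model of Gtid_state holding only the session counter.
class Gtid_state {
 public:
  void increase_automatic_tagged_sessions() { ++m_automatic_tagged_sessions; }
  void decrease_automatic_tagged_sessions() {
    const int previous = m_automatic_tagged_sessions.fetch_sub(1);
    assert(previous > 0);
    (void)previous;  // silences the unused warning when asserts are disabled
  }
  int automatic_tagged_sessions() const {
    return m_automatic_tagged_sessions.load();
  }

 private:
  std::atomic<int> m_automatic_tagged_sessions{0};
};

// Hypothetical session object that tracks whether GTID_NEXT is currently
// AUTOMATIC with a tag.
class Session {
 public:
  explicit Session(Gtid_state &state) : m_state(state) {}
  Session(const Session &) = delete;
  Session &operator=(const Session &) = delete;

  // A destroyed session must not leave its contribution in the counter.
  ~Session() {
    if (m_automatic_tagged) m_state.decrease_automatic_tagged_sessions();
  }

  // Only a change between 'tagged' and 'untagged' touches the counter.
  void set_gtid_next(bool automatic_with_tag) {
    if (automatic_with_tag == m_automatic_tagged) return;
    if (automatic_with_tag)
      m_state.increase_automatic_tagged_sessions();
    else
      m_state.decrease_automatic_tagged_sessions();
    m_automatic_tagged = automatic_with_tag;
  }

 private:
  Gtid_state &m_state;
  bool m_automatic_tagged = false;
};

// Returns true when the mode change is allowed in this model.
inline bool global_update(const Gtid_state &state, Gtid_mode new_mode) {
  if (new_mode == Gtid_mode::off_permissive &&
      state.automatic_tagged_sessions() > 0)
    return false;
  return true;
}
```

Setting the same tagged value twice in one session leaves the counter at one, which matches the rule that only transitions between 'tagged' and 'untagged' are counted.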
## Step 7. Rename remaining '*sid*' related type names, function names and variables into '*tsid*'
This step renames type names, function names and variables that contain
"sid" and use TSID definitions so that they contain "tsid" instead, e.g.
Sid_map -> Tsid_map
## Step 8. Keep track of the next_free_gno for AUTOMATIC tagged GTIDs
This step changes next_free_gno from an integer variable into an
unordered map which tracks next_free_gno for each registered sidno.
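A minimal C++ sketch of this change is shown below. The wrapper class name and its accessors are hypothetical, and returning 1 for a sidno that has not been registered yet is an assumption of the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using rpl_sidno = int;
using rpl_gno = std::int64_t;

// Hypothetical wrapper: one next_free_gno value per sidno instead of a
// single integer shared by all automatic GTIDs.
class Next_free_gno_tracker {
 public:
  // Assumption: a sidno that was never registered starts at GNO 1.
  rpl_gno get(rpl_sidno sidno) const {
    const auto it = m_next_free_gno.find(sidno);
    return it == m_next_free_gno.end() ? 1 : it->second;
  }

  void set(rpl_sidno sidno, rpl_gno gno) {
    assert(gno > 0);
    m_next_free_gno[sidno] = gno;
  }

 private:
  std::unordered_map<rpl_sidno, rpl_gno> m_next_free_gno;
};
```

Because each tagged TSID has its own sidno, advancing the value for one tag leaves the numbering of the untagged server UUID and of other tags untouched.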
## Step 9. Extend checking replica GTIDs during connection to the source
This step implements the is_subset_for_sid function in Gtid_set.
This function checks whether this GTID set, restricted to the given subset
sidno, is a subset of the given GTID set restricted to the given superset sidno.
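The check can be illustrated with the following C++ sketch. It is a simplified model: the interval storage (sorted, disjoint, non-adjacent closed intervals per sidno), the `add_interval` helper and the parameter order are assumptions for the example and do not reflect the server's Gtid_set internals. Two sidno arguments are taken because, in this model, each set may number the same TSID differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using rpl_sidno = int;
using rpl_gno = std::int64_t;
using Interval = std::pair<rpl_gno, rpl_gno>;  // closed range [first, second]

class Gtid_set {
 public:
  // Hypothetical helper. Intervals must be added in increasing order and
  // must be neither overlapping nor adjacent; this sketch does not merge.
  void add_interval(rpl_sidno sidno, rpl_gno first, rpl_gno last) {
    assert(first <= last);
    std::vector<Interval> &intervals = m_intervals[sidno];
    assert(intervals.empty() || intervals.back().second + 1 < first);
    intervals.emplace_back(first, last);
  }

  // True when every GNO stored here under subset_sidno is also present in
  // super under superset_sidno.
  bool is_subset_for_sid(const Gtid_set &super, rpl_sidno superset_sidno,
                         rpl_sidno subset_sidno) const {
    const auto sub_it = m_intervals.find(subset_sidno);
    if (sub_it == m_intervals.end()) return true;  // nothing to cover
    const auto super_it = super.m_intervals.find(superset_sidno);
    if (super_it == super.m_intervals.end()) return sub_it->second.empty();
    const std::vector<Interval> &sup = super_it->second;
    std::size_t j = 0;
    for (const Interval &iv : sub_it->second) {
      // Skip superset intervals that end before this interval starts.
      while (j < sup.size() && sup[j].second < iv.first) ++j;
      // Since superset intervals are not adjacent, one of them must
      // contain the whole subset interval.
      if (j == sup.size() || sup[j].first > iv.first ||
          sup[j].second < iv.second)
        return false;
    }
    return true;
  }

 private:
  std::map<rpl_sidno, std::vector<Interval>> m_intervals;
};
```

For example, a replica set holding GNOs 2-5 and 25-30 is a subset of a source set holding 1-10 and 20-30, while a set holding 9-21 is not, because GNOs 11-19 are missing on the source.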
Copyright (c) 2000, 2026, Oracle Corporation and/or its affiliates. All rights reserved.