WL#6972: Collect GTIDs to include in the protocol's OK packet
Affects: Server-5.7
—
Status: Complete
EXECUTIVE SUMMARY
=================

This worklog implements a mechanism to collect the set of GTIDs that
must be sent over the wire in the response packet.

DETAILS
=======

This worklog is a stepping stone towards implementing session
consistency throughout a MySQL-based replicated system. It is built on
top of the GTID infrastructure.

There are three levels:

L1. SESSION_CONSISTENCY_BEST_EFFORT

    No session consistency guarantees. This is the current situation.
    No changes required.

L2. SESSION_CONSISTENCY_READ_OWN_WRITES

    If an application, A, issues T1 on S1 (master) and then issues a
    read-only or read-write transaction T2 on S2 (slave), then T2 is
    executed only after S2 has replayed T1 (i.e., through
    replication). As such, the application always reads its own
    writes, regardless of the server on which the reads are issued.

L3. SESSION_CONSISTENCY_READ_ALL_WRITES

    Assume an application, A, issues the read-write transaction T1 on
    S1 (master) and commits it. Later, A issues another read-write
    transaction T2 on S1 and commits it as well. Later still, through
    a different connection, A issues a read-only transaction T3. If A
    then goes to S2 (slave) and issues a read-only transaction, T4,
    the connector ensures that T4 is executed only after T1 and T2
    have been replicated and applied on S2.

The decision on when, and for which transactions, to wait is made by
leveraging the GTID information that this worklog exposes. The
connector decides whether to wait or not. The server provides the
connector with sufficient knowledge (the GTID set) for it to make that
decision.
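The connector-side decision described above comes down to a subset test: the read on S2 may proceed once every GTID reported by S1 is contained in the set S2 has executed. The following is a minimal sketch of that test only. The `GtidSet` alias and the functions `is_subset` and `must_wait_before_read` are hypothetical names introduced for illustration; they are not part of the server or of any connector API, and real GTID sets store intervals rather than individual transaction numbers.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified GTID set: server UUID -> transaction numbers.
// Real GTID sets store intervals; plain numbers keep the sketch short.
using GtidSet = std::map<std::string, std::set<std::uint64_t>>;

// Returns true if every GTID in `required` is also present in `executed`.
bool is_subset(const GtidSet &required, const GtidSet &executed) {
  for (const auto &entry : required) {
    const auto it = executed.find(entry.first);
    if (it == executed.end()) {
      if (!entry.second.empty()) return false;  // UUID unknown to replica
      continue;
    }
    for (const std::uint64_t gno : entry.second) {
      if (it->second.count(gno) == 0) return false;  // GTID not applied yet
    }
  }
  return true;
}

// Hypothetical connector-side check: `tracked` is the GTID set received
// in OK packets from S1, `replica_executed` is what S2 has applied.
bool must_wait_before_read(const GtidSet &tracked,
                           const GtidSet &replica_executed) {
  return !is_subset(tracked, replica_executed);
}
```

With SESSION_CONSISTENCY_BEST_EFFORT the tracked set stays empty, so a check of this shape never asks the connector to wait.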
CONSISTENCY LEVELS CONFIGURATION
--------------------------------

Three levels of consistency:

- SESSION_CONSISTENCY_BEST_EFFORT
- SESSION_CONSISTENCY_READ_OWN_WRITES
- SESSION_CONSISTENCY_READ_ALL_WRITES

These require the server to provide different information about the
set of GTIDs that have been seen and/or introduced by the most recently
executed statement.

The server exports an interface to control which GTIDs to track. This
interface builds on the session GTID tracker and is detailed further in
the Low-Level Description section.

The new GTID tracker is controlled by the dynamic variable
@@SESSION_TRACK_GTIDS, which can be set to one of the following values:

- OFF
- OWN_GTID
- ALL_GTIDS

The mapping to the consistency levels is the following:

  SESSION_CONSISTENCY_BEST_EFFORT     -> SESSION_TRACK_GTIDS= OFF
  SESSION_CONSISTENCY_READ_OWN_WRITES -> SESSION_TRACK_GTIDS= OWN_GTID
  SESSION_CONSISTENCY_READ_ALL_WRITES -> SESSION_TRACK_GTIDS= ALL_GTIDS

Finally, the specs mandate that the following information is returned
in the OK packet:

|---------------------+-------+---------------+---------------------|
| session_track_gtids | OFF   | OWN_GTID      | ALL_GTIDS           |
| vs                  |       |               |                     |
| scenario            |       |               |                     |
|---------------------+-------+---------------+---------------------|
| Single RW trx       | Empty | Its own GTID  | All GTIDs or delta* |
| Single RO trx       | Empty | Empty         | All GTIDs or delta* |
| Multiple RW trx     | Empty | Its own GTIDs | All GTIDs or delta* |
| Multiple RO trx     | Empty | Empty         | All GTIDs or delta* |
| RW and RO trx       | Empty | Its own GTIDs | All GTIDs or delta* |
|---------------------+-------+---------------+---------------------|

(* We go with "all gtids" to begin with. Later we can optimize.)
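The table above can be read as a function of two inputs: the value of SESSION_TRACK_GTIDS and whether the scenario contains at least one committed RW transaction. The sketch below encodes only that reading. The enum and function names (`TrackMode`, `Payload`, `expected_payload`) are hypothetical and exist only for this illustration, and the ALL_GTIDS column is modelled as "all GTIDs", not as the possible later "delta" optimization.

```cpp
// Hypothetical mirror of the three SESSION_TRACK_GTIDS values.
enum class TrackMode { OFF, OWN_GTID, ALL_GTIDS };

// Hypothetical label for what the OK packet carries.
enum class Payload { EMPTY, OWN_GTIDS, ALL_GTIDS };

// Encodes the table: `committed_rw_trx` is true when the scenario
// contains at least one committed read-write transaction.
Payload expected_payload(TrackMode mode, bool committed_rw_trx) {
  switch (mode) {
    case TrackMode::OFF:
      return Payload::EMPTY;  // nothing is ever tracked
    case TrackMode::OWN_GTID:
      // Only RW transactions generate a GTID of their own.
      return committed_rw_trx ? Payload::OWN_GTIDS : Payload::EMPTY;
    case TrackMode::ALL_GTIDS:
      return Payload::ALL_GTIDS;  // RO and RW alike
  }
  return Payload::EMPTY;  // unreachable for valid enum values
}
```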
FR1. The set of GTIDs selected by SESSION_TRACK_GTIDS MUST be saved
     before the reply packet is sent to the client after a transaction
     finishes. They SHALL be discarded after being included in the OK
     packet.

FR2. The user MUST have means to dynamically tell the server which
     GTIDs to gather. This facility captures the GTIDs specified in the
     HLD table and is controlled through a new dynamic variable. This
     variable is introduced in the High Level Specification section.
     Its name: SESSION_TRACK_GTIDS.

FR3. The functionality designed in this worklog SHALL only be available
     if the server is operating with GTID_MODE=ON.

FR4. GTIDs for implicitly terminated transactions SHALL be collected
     and included in the response packet of the statement that
     terminated the ongoing transaction. E.g.:

       BEGIN
       INSERT
       BEGIN   <-- returns the GTID for the INSERT.

FR5. No GTIDs SHALL be collected on ROLLBACK, for either OWN_GTID or
     ALL_GTIDS.

FR6. Collecting GTIDs for prepared statements follows the same rules as
     for non-prepared statements. As such, the requirements listed
     above apply to them as well.

FR7. The new system variable, session_track_gtids, SHALL NOT be
     settable inside a transactional context.
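FR4 and FR5 are easiest to see on a statement-by-statement trace. The toy model below is a sketch under several assumptions that are not in the worklog: the session runs with OWN_GTID, statements are bare keywords, GTIDs are drawn from a simple counter, and an INSERT outside an explicit transaction autocommits. The class name `SessionSketch` and its members are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy session in OWN_GTID mode (assumption). GTIDs are plain counters.
class SessionSketch {
 public:
  // Runs one statement and returns the GTIDs its OK packet would carry.
  std::vector<std::uint64_t> execute(const std::string &stmt) {
    std::vector<std::uint64_t> reply;
    if (stmt == "BEGIN") {
      commit_open_trx(reply);  // FR4: implicit commit reports the GTID here
      in_trx_ = true;
    } else if (stmt == "COMMIT") {
      commit_open_trx(reply);
    } else if (stmt == "ROLLBACK") {
      in_trx_ = false;  // FR5: nothing is collected on ROLLBACK
      has_writes_ = false;
    } else if (stmt == "INSERT") {
      if (in_trx_) {
        has_writes_ = true;  // GTID is reported when the trx ends
      } else {
        reply.push_back(next_gno_++);  // assumed autocommit
      }
    }
    return reply;
  }

 private:
  // Commits the open transaction, if any; RO transactions get no GTID.
  void commit_open_trx(std::vector<std::uint64_t> &reply) {
    if (in_trx_ && has_writes_) reply.push_back(next_gno_++);
    in_trx_ = false;
    has_writes_ = false;
  }

  bool in_trx_ = false;
  bool has_writes_ = false;
  std::uint64_t next_gno_ = 1;
};
```

Feeding this model the sequence BEGIN, INSERT, BEGIN yields an empty reply for the first two statements and the INSERT's GTID on the second BEGIN, which matches the FR4 example.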
The changes to the protocol will happen in WL#4797. The GTIDs will be
appended to the OK packet in WL#6128. So there is nothing to write
about protocol changes in this WL.

The user-visible interface is reduced to a new option:

SESSION_TRACK_GTIDS
-------------------

- Type: Server System Variable
- Settable: Yes
- Scope: GLOBAL, SESSION
- Default: OFF
- Valid Values: OFF, OWN_GTID, ALL_GTIDS
- Description: Controls the GTID information that is appended to the
  mysql protocol OK packet. The values are:

  - OFF - No GTIDs are included in the OK packet.

  - OWN_GTID - Collect GTIDs generated by successfully committed RW
    transactions. Therefore, once a RW transaction is committed, its
    GTID is included in the OK packet for the last statement in the
    transaction. RO transactions do not collect GTIDs, so no GTIDs are
    included in the OK packet for these transactions. A RW transaction
    GTID is included only once, at the time the RW transaction commits.

  - ALL_GTIDS - The GTID_EXECUTED at the time the current transaction
    commits, regardless of whether it is RW or RO.
TASKS
-----

The work needed to implement this worklog may be split into the
following subtasks:

A. Create a replication context object for tracking session
   consistency related data.

     class Rpl_consistency_ctx { ... }

B. Add the following properties to the context:

     /* To store the maps between sidno and sid (UUID). */
     Sid_map m_sid_map;

     /**
       Set holding the transaction identifiers of the gtids
       to reply back on the response packet.

       Lifecycle: Emptied after the reply is sent back to the
       application. Remains empty until:
       - a RW transaction commits and a GTID is written to the
         binary log.
       - a RO transaction is issued, the @@SESSION_TRACK_GTIDS is
         set to ALL_GTIDS and the transaction is committed.
     */
     Gtid_set m_gtid_set;

C. Create accessors for m_gtid_set in Rpl_consistency_ctx.

     const Gtid_set& get_gtids() { return m_gtid_set; }

D. Create member functions (of Rpl_consistency_ctx) to save the
   relevant GTIDs and to release them once the reply has been sent.
   These will be called at certain points of the transaction execution
   flow. The goal is to encapsulate the logic of which GTIDs to store
   inside these functions. Therefore, calls to these functions shall
   be placed at the relevant points of the execution, and the
   functions will then save either @@GLOBAL.GTID_EXECUTED or
   thd->variables.gtid_next.gtid.

   Note that, alternatively, we could register this class as a
   Trans_observer and Binlog_storage_observer. But the hooks that
   trigger notifications for these observers are too closely tied to
   the binary log at this point in time. Nonetheless, if these get
   refactored later, adapting the current logic will be fairly easy,
   since the behavior is already encapsulated in these new member
   functions.

     /**
       This function MUST be called when a GTID is propagated
       throughout the replication protocol. This could mean, for
       instance, that it has been written to the binary log, thus
       slaves will get it.

       This function SHALL store the gtid if
       thd->variables.session_track_gtids is set to OWN_GTID.

       @param thd The thread context.
       @return true on error, false otherwise.
     */
     bool notify_after_transaction_replicated(THD *thd);

     /**
       This function MUST be called after a transaction is committed
       in the server. It should be called regardless of whether it is
       a RO or RW transaction. Also, DDLs, DDS are considered
       transactions for this purpose.

       This function SHALL store GTID_EXECUTED if
       thd->variables.session_track_gtids is set to ALL_GTIDS.

       @param thd The thread context.
       @return true on error, false otherwise.
     */
     bool notify_after_transaction_commit(THD *thd);

     /**
       This function MUST be called after the response packet is sent
       to the connected client. The implementation may act on the
       collected gtid state, for instance to do garbage collection.

       @param thd The thread context.
       @return true on error, false otherwise.
     */
     bool notify_after_response_packet(THD *thd);

E. Place calls to notify_after_transaction_commit in
   trans_commit_stmt, trans_commit and trans_commit_implicit.

F. Place calls to notify_after_transaction_replicated in
   Gtid_state::update_on_flush.

G. In sql_parse.cc, dispatch_command: add this after responding to the
   client connection.

     thd->rpl_session_ctx.notify_after_response_packet(thd);

H. In sys_vars.cc add the new option:

     /* Sys_var_enum expects a NULL-terminated list of names. */
     static const char *session_track_gtids_names[]=
       { "OFF", "OWN_GTID", "ALL_GTIDS", 0 };

     static Sys_var_enum Sys_session_track_gtids(
            "session_track_gtids",
            "Controls the amount of global transaction ids to be "
            "included in the response packet sent by the server. "
            "(Default: OFF).",
            SESSION_VAR(session_track_gtids),
            CMD_LINE(REQUIRED_ARG),
            session_track_gtids_names,
            DEFAULT(OFF),
            NO_MUTEX_GUARD, NOT_IN_BINLOG,
            ON_CHECK(NULL), ON_UPDATE(NULL));
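The lifecycle described in tasks A through G (collect on replication or commit, expose through an accessor, clear after the reply) can be exercised with a self-contained model. This is a sketch with stand-in types, not server code: `ThdSketch`, `ConsistencyCtxSketch`, `Gtid`, `GtidSet` and the `owned_gtid` and `gtid_executed` fields are hypothetical replacements for THD, Rpl_consistency_ctx, Gtid_set, thd->variables.gtid_next.gtid and @@GLOBAL.GTID_EXECUTED. It also assumes that the three notification functions return true on error.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>

enum class TrackMode { OFF, OWN_GTID, ALL_GTIDS };

// Stand-in for a GTID: (server UUID, transaction number).
using Gtid = std::pair<std::string, std::uint64_t>;
using GtidSet = std::set<Gtid>;

// Stand-in for the few THD fields the context needs (hypothetical).
struct ThdSketch {
  TrackMode session_track_gtids = TrackMode::OFF;
  Gtid owned_gtid;                         // GTID of the trx just logged
  const GtidSet *gtid_executed = nullptr;  // models GLOBAL.GTID_EXECUTED
};

// Model of the context object; each function returns true on error.
class ConsistencyCtxSketch {
 public:
  const GtidSet &get_gtids() const { return m_gtid_set; }

  // Called once the transaction's GTID reached the binary log.
  bool notify_after_transaction_replicated(const ThdSketch &thd) {
    if (thd.session_track_gtids == TrackMode::OWN_GTID)
      m_gtid_set.insert(thd.owned_gtid);
    return false;
  }

  // Called after every commit, RO or RW.
  bool notify_after_transaction_commit(const ThdSketch &thd) {
    if (thd.session_track_gtids == TrackMode::ALL_GTIDS) {
      if (thd.gtid_executed == nullptr) return true;  // nothing to copy
      m_gtid_set = *thd.gtid_executed;
    }
    return false;
  }

  // Called after the reply is sent: discard what was reported.
  bool notify_after_response_packet(const ThdSketch &) {
    m_gtid_set.clear();
    return false;
  }

 private:
  GtidSet m_gtid_set;
};
```

Keeping the mode check inside the notification functions means the call sites in the commit path stay unconditional, which is the encapsulation argument made in task D.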
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.