WL#7334: Group Replication: Read-set free Certification Module - Part 2
Affects: Server-5.7 — Status: Complete
EXECUTIVE SUMMARY
=================

This worklog is a continuation of WL#6833. Under the global task to provide MySQL with clustered, fault-tolerant, multi-master update-everywhere replication, this worklog deals with the tasks that were postponed for the first working prototype of the GCS certifier, namely:

- Stable transaction computation
- Garbage Collection
- Persistence of Next sequence number

and some other functionalities, such as providing the certification Db for use by other modules, like recovery.

The Certifier module is responsible for four major tasks:

1. Identifying conflicts using the Certification Database.
2. Updating the Stable set using periodic message passing between all nodes in the GCS cluster.
3. Garbage Collection on the Certification Database using a periodic purging mechanism based on stable set computation.
4. Persisting the Next Sequence Number at a Node for later use. When a node rejoins a group, the next_seqno value is initialized from the higher of two values: the GNO of LAST_GTID_EXECUTED at the local server and the highest GNO in GTID_Retrieved from the donor.

The first task is handled in WL#6833. The remaining tasks, 2 to 4, are handled in this worklog.
For Certifier-related Functional and Non-functional Requirements, please refer to WL#6833. For the tasks handled here:

Functional Requirements:

F1. The next sequence number is to be persisted. Whenever a node that was previously part of the GCS cluster rejoins the same group, this number is re-used or updated, depending on the group's present sequence number.

Non-Functional Requirements:

NF1. Stable Transactions are to be mapped in the CertificationDb until garbage collection.
NF2. Transactions marked stable are to be persisted in the stable list until garbage collection.
NF3. On recovery, the CertificationDb and the last stable transaction information are to be exported and must be ready to use as soon as the node status changes from Recovering to Online in the group.
NF4. The CertificationDb and stable transaction information are not persisted for a node which leaves the group.
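The rule behind F1 (a rejoining node starts from the higher of its own executed GNO and the highest GNO retrieved from the donor) can be sketched as follows. This is only an illustration: the function name initial_next_seqno and the plain integer GNO type are hypothetical and not part of the worklog, and whether the chosen value is incremented before first use is not stated in the worklog, so the sketch returns it unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the Certifier interface.
// local_executed_gno:  GNO of the last GTID executed on the rejoining server.
// donor_retrieved_gno: highest GNO in the GTID set retrieved from the donor.
// Returns the value from which next_seqno is initialized on rejoin.
inline std::int64_t initial_next_seqno(std::int64_t local_executed_gno,
                                       std::int64_t donor_retrieved_gno) {
  // Assumption: 0 stands for "no transactions yet"; negative GNOs are invalid.
  assert(local_executed_gno >= 0 && donor_retrieved_gno >= 0);
  return std::max(local_executed_gno, donor_retrieved_gno);
}
```

Taking the maximum means the rejoining node never starts from a sequence number lower than one it, or the group, has already seen.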
The basic interface is set by the public methods in the certifier class. The ones made available under this worklog are as follows:

1. This shall provide the certification database to the recovery module.

     hash_map get_CertificationDb(void);

2. The latest stable transaction information is also shared across components.

     list get_stable(void);

3. Garbage Collection shall be invoked for various events and at regular intervals of time.

     bool garbage_collect(void);

These are the basic methods which allow other modules in GCS to communicate with the Certifier. For further details, refer to WL#6833.
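As an illustration of item 1, the sketch below hands the recovery module a copy of the certification database taken under a lock, so the caller can read it while certification continues. The class name CertifierSketch, the method add_item, and the string-to-sequence-number map are assumptions made for this example; the worklog only says that a certified item is hash-mapped to a transaction sequence number.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>

// Assumed shape of the certification database:
// certified item -> sequence number of the transaction that last wrote it.
using CertDb = std::unordered_map<std::string, long long>;

class CertifierSketch {
 public:
  // Hypothetical helper that records a certified item.
  void add_item(const std::string &item, long long seqno) {
    assert(seqno > 0);  // assumption: sequence numbers start at 1
    std::lock_guard<std::mutex> guard(lock_);
    db_[item] = seqno;
  }

  // Returns a snapshot by value; the copy is made while the lock is held,
  // so the caller never observes a half-updated map.
  CertDb get_certification_db() const {
    std::lock_guard<std::mutex> guard(lock_);
    return db_;
  }

 private:
  mutable std::mutex lock_;  // guards db_
  CertDb db_;
};
```

Returning by value costs a full copy of the map, but later changes in the Certifier, including garbage collection, do not affect the snapshot already handed out.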
/**
 * This member function shall add transactions to the stable set.
 *
 * @param gtid GTIDs of the transactions to be added to the stable set.
 *             These transactions have committed successfully and are
 *             not part of the snapshot of subsequent optimistic
 *             transactions.
 *
 * @return false if the transactions are successfully set to stable,
 *         true otherwise.
 */
bool set_stable(Gtid_set gtid);

/**
 * This member function shall return the set of GTIDs of the stable
 * transactions.
 * This information is further used for garbage collection.
 */
Gtid_set get_stable(void);

/**
 * This member function shall garbage collect all transactions that are
 * stable and remove them from the Certification Database, i.e. the
 * Certified item hash-mapped to the transaction sequence number.
 *
 * @return false if garbage collection is successful, true otherwise.
 */
bool garbage_collect(void);

/**
 * This member function shall return the certified hash-map.
 */
cert_db get_CertificationDb(void);

Here, we do the stable set calculation by broadcasting informational messages, carrying GTID_executed, between nodes at regular intervals. This message is sent from a new thread that is spawned to time out at equal intervals. The timeout is reset at all nodes every time there is a view change. The message is handled in the handle_message_delivery event handler. For such informational messages:

- the message is handled by the certifier;
- the GTID_executed information is kept until the broadcast from all nodes has been received.

Garbage collection is invoked after every fixed number of rounds of stable set calculation. The Certification Db is locked while records are deleted.

Note:

- The informational message can either reuse MSG_STATE_EXCHANGE from WL#7332 or be a new message type.
- When providing the certification DB, a copy of the map is made, with guarded access.
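The round-based flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Certifier implementation: a std::set of sequence numbers stands in for Gtid_set, the stable set is assumed to be the intersection of every member's GTID_executed (the worklog implies this but does not spell it out), and the names StableSetSketch, on_gtid_executed_message, add_certified_item and rounds_per_gc are hypothetical. View changes and the timer thread are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <set>
#include <string>
#include <unordered_map>

using Seqno = long long;
using SeqnoSet = std::set<Seqno>;  // simplified stand-in for Gtid_set

class StableSetSketch {
 public:
  StableSetSketch(std::size_t group_size, std::size_t rounds_per_gc)
      : group_size_(group_size), rounds_per_gc_(rounds_per_gc) {
    assert(group_size_ > 0 && rounds_per_gc_ > 0);
  }

  void add_certified_item(const std::string &item, Seqno seqno) {
    std::lock_guard<std::mutex> guard(lock_);
    cert_db_[item] = seqno;
  }

  // Called for each informational message. The member's executed set is
  // kept until every member has reported; returns true when this message
  // completed a round and the stable set was recomputed.
  bool on_gtid_executed_message(const std::string &member,
                                const SeqnoSet &executed) {
    std::lock_guard<std::mutex> guard(lock_);
    pending_[member] = executed;
    if (pending_.size() < group_size_) return false;

    // Assumption: stable == executed on every member (set intersection).
    SeqnoSet stable = pending_.begin()->second;
    for (const auto &entry : pending_) {
      SeqnoSet kept;
      for (Seqno s : stable)
        if (entry.second.count(s) > 0) kept.insert(s);
      stable.swap(kept);
    }
    stable_.swap(stable);
    pending_.clear();

    // Garbage collect after every fixed number of completed rounds.
    if (++completed_rounds_ % rounds_per_gc_ == 0) purge_stable_locked();
    return true;
  }

  // Follows the worklog convention: false on success, true otherwise.
  bool garbage_collect() {
    std::lock_guard<std::mutex> guard(lock_);
    purge_stable_locked();
    return false;
  }

  SeqnoSet get_stable() const {
    std::lock_guard<std::mutex> guard(lock_);
    return stable_;
  }

  std::size_t cert_db_size() const {
    std::lock_guard<std::mutex> guard(lock_);
    return cert_db_.size();
  }

 private:
  // Caller must hold lock_. Removes every item whose sequence number is stable.
  void purge_stable_locked() {
    for (auto it = cert_db_.begin(); it != cert_db_.end();) {
      if (stable_.count(it->second) > 0)
        it = cert_db_.erase(it);
      else
        ++it;
    }
  }

  mutable std::mutex lock_;  // guards every member below
  std::size_t group_size_;
  std::size_t rounds_per_gc_;
  std::size_t completed_rounds_ = 0;
  std::map<std::string, SeqnoSet> pending_;  // reports for the current round
  SeqnoSet stable_;
  std::unordered_map<std::string, Seqno> cert_db_;  // item -> sequence number
};
```

With a group of two nodes and rounds_per_gc set to 1, a report of {1, 2, 3} from one node followed by {1, 2} from the other yields the stable set {1, 2}, and the items certified with sequence numbers 1 and 2 are purged while the item with sequence number 3 remains.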