WL#6833: Group Replication: Read-set free Certification Module (DBSM Snapshot Isolation)

Affects: Server-5.7   —   Status: Complete

This task implements the conflict detection component:

1. The snapshot version is the full GTID_EXECUTED extracted when the transaction
   starts the commit command. Example:
     GTID_EXECUTED: 8a94f357-aab4-11df-86ab-c80aa9429489:1-4,
                    aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-2
2. Identifying stable transactions.
   Proposed mechanism: every member periodically (with time period Ts) sends
   its GTID_EXECUTED set to all other members in the group. The intersection
   of the sets from all members gives the set of transactions that have been
   committed on all members.
3. Garbage collection.
   Certification info is purged periodically, based on the stable transaction
   set computed on the member: entries for transactions in the stable set are
   removed from the certification info.
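
As a rough illustration of item 2, the stable set can be computed by
intersecting the executed sets reported by the members. This is only a sketch,
not the plugin's code: it assumes a single UUID and stands in for Gtid_set with
a plain std::set of GNOs. The names GnoSet and stable_set are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Simplified stand-in for Gtid_set: the GNOs of one UUID only (assumption).
using GnoSet = std::set<long long>;

// Intersects the executed sets of all members. The result holds the
// transactions that every member has already committed.
GnoSet stable_set(const std::vector<GnoSet> &executed_per_member) {
  if (executed_per_member.empty()) return {};
  GnoSet stable = executed_per_member.front();
  for (std::size_t i = 1; i < executed_per_member.size(); ++i) {
    const GnoSet &other = executed_per_member[i];
    GnoSet next;
    std::set_intersection(stable.begin(), stable.end(),
                          other.begin(), other.end(),
                          std::inserter(next, next.begin()));
    stable.swap(next);
  }
  return stable;
}
```

With members reporting 1-4, 1-3 and 1-3, the sketch yields 1-3 as the stable
set.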

For a transaction T1, the certifier takes T1's snapshot version and T1's
write set. For each item in T1's write set, it checks whether the item
exists in the certification info. If it does, the certifier compares T1's
snapshot version with the snapshot version stored for that item, which
belongs to the previously certified transaction that modified it.
If T1's snapshot version is a subset (equal sets included) of the stored
snapshot version, then T1 was executed on outdated data: T1 has not seen
the already certified transaction that modified the same row, so T1 must
not pass certification and is dropped.
Otherwise, T1 is approved and its write set and snapshot version are
added to the certification info.
Therefore, the 'first committer wins' rule is enforced.
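
The check described above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the actual Certifier code: GTID sets
are reduced to a std::set of GNOs for a single UUID, write-set items are plain
strings, and GnoSet, CertInfo, is_subset and certify_sketch are hypothetical
names.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for Gtid_set: the GNOs of one UUID only (assumption).
using GnoSet = std::set<long long>;
// Write-set item hash -> snapshot version of the last certified writer.
using CertInfo = std::map<std::string, GnoSet>;

// True when every element of a is also in b (equal sets included).
bool is_subset(const GnoSet &a, const GnoSet &b) {
  return std::includes(b.begin(), b.end(), a.begin(), a.end());
}

// Returns true and records the transaction when it passes certification.
// Returns false and leaves cert_info untouched when it conflicts.
bool certify_sketch(CertInfo &cert_info, const GnoSet &snapshot,
                    const std::vector<std::string> &write_set) {
  for (const std::string &item : write_set) {
    auto it = cert_info.find(item);
    if (it != cert_info.end() && is_subset(snapshot, it->second))
      return false;  // The transaction did not see the earlier writer.
  }
  for (const std::string &item : write_set)
    cert_info[item] = snapshot;  // This transaction is now the last writer.
  return true;
}
```

The conflict check runs over the whole write set before anything is recorded,
so a rejected transaction leaves no trace in the certification info.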


GTID
----
A transaction can be committed with a specified GTID or with an
automatic GTID.
If a GTID is specified, it will be used to identify the transaction.
Otherwise, a GTID with group UUID will be generated to identify the
transaction.

Functional Requirements:
F1. Certify Transactions.
    For any two transactions that change the same item at two different
    sites concurrently, this component shall identify the conflict.
F2. In case of a conflict, the first transaction to arrive at certification
    proceeds to commit.
F3. Users shall be able to monitor the transactions that are certified
    successfully and unsuccessfully.
F4. Hash-map for items must be kept collision free.
F5. For optimistic transactions arriving at certification, since the
    certification info is consistent on all members, a transaction that is
    discarded on one member is also discarded on every other member.
F6. A transaction that is certified unsuccessfully at the originating site shall
    be rolled back.
F7. A transaction that is certified unsuccessfully at a remote site shall be
    discarded.
F8. Transactions added to Certification info (Certified hash-map) must commit.

Non-Functional Requirements:
NF1. Stable transactions are to be mapped in the certification info until
     garbage collection.
NF2. On recovery, Certification info and last group GNO are to be exported and
     must be ready to use as soon as member status changes from Recovering to
     Online in the group.
NF3. Certification info and stable transaction information are not persisted
     for a member which leaves the group.

This component tries to solve the following issues:
1. Conflict detection in a group, with write-everywhere architecture.
2. Ordering of incoming transactions, based on first come first to commit.

It takes in the Transaction Context Log Event from the pipeline and processes
it. On successful certification it outputs a sequence number; otherwise it
returns zero, which marks the complete transaction as discarded.


Assuming that we have a group of three members:

Let present next GNO value at certifier be 4.

Present status of certification info:

Hash of item in Writeset     snapshot version (Gtid_set)
#1                           
#2                           UUID1:1
#3                           UUID1:1-2

Here, #1, #2, #3 are the items in write set, which originally are the PKEs of
modified rows. Snapshot version is the GTID_EXECUTED of the last transaction
which was successfully certified to modify the mapped hash of write set.

Let's say a new transaction Ti comes on Member 1.
The snapshot version is UUID1:1-3.

And say that the transaction arrives at certifier in the form of an event which
looks like this:
<#2, UUID1:1-3> Here, #2 is the hash of the item in write set, and UUID1:1-3 is
snapshot version.

Now, at certifier, this transaction is certified successfully, since the snapshot
version against #2 is UUID1:1, and UUID1:1-3 is not a subset of UUID1:1.
This means that Ti was executed after having seen UUID1:1.

Hence, this transaction is successfully certified.
If no GTID is assigned to Ti, the next GNO, with value 4, will be assigned
to the transaction, and the next GNO is then incremented to 5.
If Ti has a GTID assigned, no change is made to the next GNO or to the
transaction identifier.
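
The GNO handling for a positively certified transaction can be sketched as
below. This is an illustration only: assign_gno and its parameters are
hypothetical, and a specified GTID is reduced to its GNO for brevity.

```cpp
// Hypothetical helper: picks the GNO for a transaction that has already
// passed certification. next_gno is the certifier's counter. It advances
// only when a group GTID has to be generated.
long long assign_gno(long long &next_gno, bool generate_group_id,
                     long long specified_gno) {
  if (!generate_group_id)
    return specified_gno;  // Keep the GTID the transaction came with.
  return next_gno++;       // Hand out the current value, then advance.
}
```

Starting from a next GNO of 4, a transaction without a GTID receives 4 and the
counter moves to 5. A transaction that carries its own GTID leaves the counter
unchanged.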

Certification info is updated as follows:

Hash of item in Writeset     snapshot version (Gtid_set)
#1                           
#2                           UUID1:1-3
#3                           UUID1:1-2

Now, meanwhile, we have the stable transaction-set computation invoked by our
periodic algorithm.
Let's assume the following GTID_EXECUTED sets at the 3 members:

Member1: UUID1:1-4
Member2: UUID1:1-3
Member3: UUID1:1-3


We take the intersection of the three GTID_EXECUTED sets and
hence arrive at UUID1:1-3 as the stable set.
After the certification garbage collection procedure, in which we remove
the entries for transactions that are part of the stable set, we
have the following certification info:

Hash of item in Writeset     snapshot version (Gtid_set)
#2                           UUID1:1-3
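
The purge step in this example can be sketched as follows. The text does not
spell out the exact removal rule, so this sketch assumes that an entry is
dropped when its snapshot version is a proper subset of the stable set. That
assumption is inferred from the example, where the entry for #2 (equal to the
stable set) survives. GnoSet, CertInfo and garbage_collect_sketch are
hypothetical names, and GTID sets are reduced to the GNOs of a single UUID.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>

// Simplified stand-in for Gtid_set: the GNOs of one UUID only (assumption).
using GnoSet = std::set<long long>;
// Write-set item hash -> snapshot version of the last certified writer.
using CertInfo = std::map<std::string, GnoSet>;

// Drops every entry whose snapshot version is a proper subset of stable.
// The proper-subset rule is an assumption taken from the worked example.
void garbage_collect_sketch(CertInfo &cert_info, const GnoSet &stable) {
  for (auto it = cert_info.begin(); it != cert_info.end();) {
    const GnoSet &version = it->second;
    bool proper_subset =
        version.size() < stable.size() &&
        std::includes(stable.begin(), stable.end(),
                      version.begin(), version.end());
    if (proper_subset)
      it = cert_info.erase(it);  // erase returns the next valid iterator.
    else
      ++it;
  }
}
```

Applied to the three entries above with a stable set of 1-3, only the entry
for #2 remains.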


Now, say another transaction Tj arrives at Member 2.

Again, say, the snapshot version is UUID1:1-3.

And say that the transaction arrives at certifier in the form of an event which
looks like this:
<#2, UUID1:1-3> Here, #2 is the hash of the item in write set, and UUID1:1-3 is
snapshot version.

Now, at certifier, this transaction is *not* certified successfully, since the
Tj snapshot version is a subset of the snapshot version stored for #2 in the
certification info. Please note that subset also holds for equal sets, so
UUID1:1-3 is a subset of UUID1:1-3.

Hence, this transaction is rejected.



The basic interface is set by the public methods in the Certifier class.

1. It provides certification for an incoming transaction.
   rpl_gno certify(const Gtid_set *snapshot_version,
                   std::list *write_set,
                   bool generate_group_id);

2. This shall provide certification database to the recovery module.
   void get_certification_info(std::map *cert_info,
                               rpl_gno *seq_number);

3. The latest stable transaction information is also shared across components.
   Gtid_set* get_group_stable_transactions_set();

4. Garbage Collection shall be invoked for various events and at regular
   intervals of time.
   void garbage_collect();

These are the basic methods which allow other modules in GCS to communicate with
Certifier.

A BRIEF DESIGN FOR CERTIFIER MODULE
===================================

/**
 * This class is a core component of the database state machine
 * replication protocol. It implements conflict detection based
 * on a certification procedure.
 */

typedef std::map Certification_info;

class Certifier
{
private:
  /**
    * Map of transactions write set to snapshot version.
    */
  Certification_info certification_info;

  /**
    * Set of transactions that are committed on all group members.
    */
  Gtid_set *stable_gtid_set;
  
  /**
    * Next Sequence Number to be assigned to incoming transaction that 
    * certifies.
    */
  rpl_gno next_seqno;

public:
  /**
    This member function SHALL certify the set of items against transactions
    that have already passed the certification test.

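    @param snapshot_version   The incoming transaction snapshot version.
    @param write_set          The incoming transaction write set.
    @param generate_group_id  Whether a GTID with the group UUID is to be
                              generated for the incoming transaction.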
    @returns >0               sequence number (positively certified);
              0               negatively certified;
             -1               error.
   */
  rpl_gno certify(const Gtid_set *snapshot_version,
                  std::list *write_set,
                  bool generate_group_id);
  
  /**
    This member function shall add transactions to the stable set.

    @param executed_gtid_set  The GTID set of the transactions to be added
                              to the stable set.
    @returns                  False if added successfully,
                              True otherwise.
   */
  bool set_group_stable_transactions_set(Gtid_set* executed_gtid_set);

  /**
    Returns the transactions in stable set, that is, the set of transactions
    already applied on all group members.

    @returns                 transactions in stable set
   */
  Gtid_set* get_group_stable_transactions_set();
  
  /**
    Removes the intersection of the received transactions stable
    sets from certification database.
   */
  void garbage_collect();

  /**
    Retrieves the current certification info and sequence number.

     @note if concurrent access is introduced to these variables,
     locking is needed in this method

     @param[out] cert_info     a pointer to retrieve the certification database
     @param[out] seq_number  a pointer to retrieve the sequence number
  */
  virtual void get_certification_info(std::map *cert_info,
                                      rpl_gno *seq_number);

  /**
    Sets the certification info and sequence number according to the given values.

    @note if concurrent access is introduced to these variables,
    locking is needed in this method

    @param cert_info        the certification info to be set
    @param sequence_number  the sequence number to be set
  */
  virtual void set_certification_info(std::map *cert_info,
                                      rpl_gno sequence_number);