WL#6559: Optimize GTIDs for passive slave - store GTIDs in table

Affects: Server-5.7   —   Status: Complete

EXECUTIVE SUMMARY
=================
If gtid_mode is enabled, store GTIDs in table always as below:
  - If binlog is disabled, store transaction owned GTID in table within each
transaction.
  - If binlog is enabled and log_slave_updates is enabled, store set of GTIDs of
the last binlog in table on each binlog rotation and on server shutdown.
  - If binlog is enabled and log_slave_updates is disabled, do below:
      - slave SQL thread or slave worker stores transaction owned GTID in table
within each transaction.
      - store set of GTIDs of the last binlog in table on each binlog rotation
and on server shutdown.


PROBLEM STATEMENT
=================
The main purpose of this worklog is to make the server execute faster and use
much less disk space, if the server is a replication slave that uses Global
Transaction Identifiers. This is achieved by allowing GTIDs to be used with
binary log turned off and instead store GTIDs in a system table.

It is crucial for correct operation that GTIDs are durable. Currently, GTIDs can 
only be stored in the binary log, and therefore GTIDs cannot be used without the 
binary log. A master must always have the binary log on. A slave that is a 
candidate to be promoted to master during a failover also must always have the 
binary log on. However, a slave that is only used for read scale-out and is
never going to become a master may not have any use for the transactions in the
binary log, but it may have a use for the GTID feature (to fail over to a new
master). On such a slave, storing GTIDs in a table instead of in the binary log
can be a crucial optimization.

The table will require much less disk space than the binary log - in all
practical situations it can be implemented so that it only requires small, fixed
size. Moreover, the binary log requires a disk sync of its own, whereas a table
can re-use the same disk sync that the storage engine uses to make the
transaction durable.

FUNCTIONAL REQUIREMENTS
=======================
F-1: Make GTIDs work fine without the binary log.
F-2: Introduce a user variable:
       NAME: executed_gtids_compression_period
       SCOPE: global
       TYPE: numeric
       RANGE: [0, 4294967295]
       DEFAULT: 1000
       DESCRIPTION: the variable is used to control how often a thread should be
woken up to compress the gtid_executed table regularly. When binlog is disabled,
to the user variable 'executed_gtids_compression_period', if value is 0, the
thread never wakes up, if value is 1, wakes up every transaction. If value is
1000, wakes up every 1000 transactions.
F_3: Introduce a system table named 'gtid_executed', user can query executed
GTIDs from the table by 'SELECT * FROM mysql.gtid_executed;'.


NON-FUNCTIONAL REQUIREMENTS
===========================
NF-1: When binlog is disabled, storing GTID into a table within each
transaction. When binlog is enabled, store set of GTIDs of the last binlog into
the table on each binlog rotation and on server shutdown.

NF-2: Storing gtid set of last binlog into table on binlog rotation and server
shutdown did not show negative impact on the overall performance when binlog is
enabled and gtid_mode is enabled on master.

NF-3: Storing gtid into table within each transaction did not show negative
impact on the overall performance with single slave SQL thread when binlog is
disabled and gtid_mode is enabled on passive slave.

NF-4: Implicit requirements: Follow the SQL standard, work on all platforms, do
not impact other internal MySQL components (MySQL internal tools can get correct
GTID state by querying GTID_EXECUTED directly as before).
Interface Specification
=======================

I-1: Introduce a system table, a row in the table can contain an interval of
several GTIDs, all having the same source_uuid and an interval of consecutive
numbers. So the table definition should be:

  CREATE TABLE gtid_executed(
    source_uuid CHAR(36) NOT NULL,            -- uuid of the source where the
transaction was originally executed
    interval_start BIGINT NOT NULL,           -- First number of interval
    interval_end BIGINT NOT NULL,             -- Last number of interval
    PRIMARY KEY(source_uuid, interval_start)  -- PK on source_uuid and interval
start
  );


High-Level Specification
========================
1. The passive slave never generates any new gtid when binlog is disabled and
gtid_mode is enabled. What's the impact?
   We don't generate GTIDs because they won't exist anywhere (on replication
stream). So there is no impact.

2. Should we disallow a server that has GTIDs stored in gtid_excuted table to be
promoted master?
   No. I agree to allow a server that has GTIDs stored in gtid_excuted table to
be promoted master base on that we can get purged/missed GTIDs back from
dump/restore.

3. We must keep binlogs history consistent.
1. Create a MySQL internal system table named 'gtid_executed' if it does not
exist, a row in the table can contain an interval of several GTIDs, all having
the same source_uuid and an interval of consecutive numbers as following:

  CREATE TABLE gtid_executed(
    source_uuid CHAR(36) NOT NULL,            -- uuid of the source where the
transaction was originally executed
    interval_start BIGINT NOT NULL,           -- First number of interval
    interval_end BIGINT NOT NULL,             -- Last number of interval
    PRIMARY KEY(source_uuid, interval_start)  -- PK on source_uuid and interval
start
  );

  Note: The gtid_executed table always exists after creation at any time as it
is mysql internal system table.


2. If gtid_mode is enabled, store GTIDs into the gtid_executed table always as
below.
  - If binlog is disabled, store transaction owned GTID into the table before
transaction prepare as following:
      INSERT INTO gtid_executed VALUES (SID, GNO, GNO);

  - If binlog is enabled and log_slave_updates is enabled, store set of GTIDs of
the last binlog into the table on each binlog rotation and on server shutdown as
following:
      INSERT INTO gtid_executed VALUES (SID1, GNO_start, GNO_end);
      INSERT INTO gtid_executed VALUES (SID2, GNO_start, GNO_end);
      ......
  - If binlog is enabled and log_slave_updates is disabled, we do the following
two things:
      - slave sql thread or slave worker stores transaction owned gtid into
table before transaction prepare.
      - store set of GTIDs of the last binlog into the table on each binlog
rotation and on server shutdown.
    

3. If gtid_mode is disabled, no GTID is inserted into the gtid_executed table.


4. The gtid_executed table would be filled with singleton interval as we cannot
use UPDATE instead of INSERT, because any concurrently executing transactions
would conflict on the update. So we propose to create a FOREGROUND thread (we
can not create a BACKGROUND thread as we have to create THD object to open
gtid_executed table) and introduce a user variable
'executed_gtids_compression_period' to control how often the thread should be
woken up to compress the gtid_executed table regularly when binlog is disabled.
(When binlog is enabled, store set of GTIDs of the last binlog in table and
notify compression thread to compress the table on each binlog rotation,
regardless of the value of executed_gtids_compression_period)
  - To the user variable 'executed_gtids_compression_period', if value is 0,
the thread never wakes up, if value is 1, wakes up every transaction. If value
is 1000, wakes up every 1000 transactions. It is a global variable only and can
be set dynamically. It has UINT type and accepts numbers from 0 to 4294967295.
Default value is 1000.
  - We also added the thread's state info (suspending, compressing gtid_executed
table) into performance_schema.threads table, so that user can monitor it and
tune the user variable 'executed_gtids_compression_period' (The thread is in
suspended mode most of the time, wakes up only when it needs).
  - The thread runs if the gtid_mode is enabled, user cannot kill it as we
didn't add it into PROCESSLIST.
  - The thread opens a transactional context on an non-autocommit mode to
compress the gtid table completely in one or more transactions as following:
    - The thread reads each row by the PK(source_uuid, interval_start) in
increasing order, and compresses the first consecutive range of GTIDs in a
single transaction.
    - End the compression if the last transaction reached the end of the table,
otherwise continue to compress the table by employing another transaction until
the gtid table is compressed completely.


5. If gtid_mode is enabled and binlog is enabled, report @@GLOBAL.GTID_EXECUTED
and @@GLOBAL.GTID_PURGED from both binlog and gtid_executed table as following:
  - GLOBAL.GTID_EXECUTED = all gtids of gtid_executed table and binary logs;
  - GLOBAL.GTID_PURGED = GLOBAL.GTID_EXECUTED - (GTID_EXECUTED_BINLOG -
GTID_PURGED_BINLOG);
  (GTID_EXECUTED_BINLOG: gtid_set of gtids is ever logged in binary logs.
   GTID_PURGED_BINLOG  : gtid_set of gtids is purged from binary logs.)

  - Store set of GTIDs of the last binlog into table on each binlog rotation and
on server shutdown.
  - The set of GTIDs of the last binlog is not saved into the gtid table if
server crashes, so we add it into gtid table and GLOBAL.GTID_EXECUTED during
recovery from the crash.
  - Introduce 'previous_gtids_logged' Gtid_set, which contains the previous
stored groups in the last binlog file.
  - GLOBAL.GTID_EXECUTED is initialized with all GTIDs of gtid_executed table
during server restarting. Every transaction's GTID is added into the
GLOBAL.GTID_EXECUTED if gtid_mode is enabled after server restarts.
  - GLOBAL.GTID_PURGED is initialized with GTIDs from gtid_executed table and
binlogs and precomputed as above during server restarting. After server
restarts, added purged GTIDs into GLOBAL.GTID_PURGED when purging logs.
  - Introduce 'GTID_ONLY_IN_TABLE' Gtid set. If binlog is enabled,
GTID_ONLY_IN_TABLE is initialized with different GTIDs from GLOBAL.GTID_EXECUTED
and GTID_EXECUTED_BINLOG during server restarting, GTID_ONLY_IN_TABLE will never
change since starting server, a previous log event with GTID_EXECUTED_BINLOG
(GTID_EXECUTED - GTID_ONLY_IN_TABLE) set is created when rotating binlog. If
binlog is enabled and log_slave_updates is disabled, slave SQL thread or slave
worker thread adds transaction owned gtid into global gtids_only_in_table.

  - If 'gtid_executed' table is empty and GTID_EXECUTED_BINLOG is not empty,
save all GTIDs in GTID_EXECUTED_BINLOG into 'gtid_executed' table during server
restarting (handle the following three cases:
            1. the upgrade case.
            2. the case that a slave is provisioned from a backup of
               the master and the slave is cleaned by RESET MASTER
               and RESET SLAVE before this.
            3. the case that no binlog rotation happened from the
               last RESET MASTER on the server before it crashes.).
  - The GLOBAL.GTID_EXECUTED is initialized with gtid_purged when user is
initializing it through SET GLOBAL gtid_purged, and the set of these GTIDs is
saved into 'gtid_executed' table on next binlog rotation. (When binlog is
enabled, it is possible to update the value of gtid_purged, but only by adding
GTIDs to those already listed, and only when gtid_executed is unset—that is, on
a new server.)

6. If gtid_mode is enabled and binlog is disabled, initialize
GLOBAL.GTID_EXECUTED from gtid table during server startup and save gtid into
executed_gtids for every transaction, report @@GLOBAL.GTID_PURGED from
executed_gtids, since @@GLOBAL.GTID_PURGED and @@GLOBAL.GTID_EXECUTED are always
same, so we did not save gtid into lost_gtids for every transaction for
improving performance.

7. If gtid_mode is disabled, No GTID is inserted into gtid state and the
'binlog_gtids' table.


8. The gtid_executed table is reset when resetting master.


9. Gtid_table_persistor class
   It manages all operations on the gtid table.

   - m_count
     Count the append rows of the table when binlog is disabled.

   - save(Gtid *gtid)
     Insert the GTID into table.

   - save(Gtid_set *gtid_set)
     Store the set of GTIDS into the table.

   - compress()
     Compress intervals into consecutive GNOs in the table completely.

   - reset()
     Delete all rows from the table.

   - fetch_gtids(Gtid_set *gtid_set)
     Fetch gtids from the table and store them into gtid_set.