EXECUTIVE SUMMARY ================= Store GTIDs in table for passive slave. Make GTIDs work fine without the binary log. PROBLEM STATEMENT ================= The purpose of this worklog is to make the server execute faster and use much less disk space, if the server is a replication slave that uses Global Transaction Identifiers. This is achieved by allowing GTIDs to be used with binary log turned off and instead store GTIDs in a system table. It is crucial for correct operation that GTIDs are durable. Currently, GTIDs can only be stored in the binary log, and therefore GTIDs cannot be used without the binary log. A master must always have the binary log on. A slave that is a candidate to be promoted to master during a failover also must always have the binary log on. However, a slave that is only used for read scale-out and is never going to become a master may not have any use for the transactions in the binary log, but it may have a use for the GTID feature (to fail over to a new master). On such a slave, storing GTIDs in a table instead of in the binary log can be a crucial optimization. The table will require much less disk space than the binary log - in all practical situations it can be implemented so that it only requires small, fixed size. Moreover, the binary log requires a disk sync of its own, whereas a table can re-use the same disk sync that the storage engine uses to make the transaction durable. REFERENCES ========== rb4176: http://rb.no.oracle.com/rb/r/4176/ BUG#14730192: MYSQL 5.6, GTID AND BINLOGS ON SLAVES (Feature Request) WL#7521 Extend gtid_executed table definition to final version for saving gtids in an index table.
FUNCTIONAL REQUIREMENTS ======================= F-1: Make GTIDs work fine without the binary log. F-2: Introduce a user variable 'executed_gtids_compression_period' to control how often a thread should be woken up to compress the gtid_executed table regularly. To the user variable 'executed_gtids_compression_period', if value is 0, the thread never wakes up, if value is 1, wakes up every transaction. If value is 1000, wakes up every 1000 transactions. NON-FUNCTIONAL REQUIREMENTS =========================== NF-1: Storing gtid into table showed some negative impact on the overall performance. SysQA Performance Evaluation for WL#6559: https://mediawiki.us.oracle.com/mysql/index.php/SysQA:Eventum47261#Test_Results_-_Feb_6 For instance, performance test base on a relay log with 50,000 sort transactions on slave server. The performance decreases a little when storing gtids into system table, but which has almost the same performance impact with storing gtids into binlog. See below. # description time column/CONTROL CONTROL 245740 1 TABLE 260086 1.058378774 BINLOG 260138 1.05859038 BINLOG+TABLE 269643 1.097269472 The above results can be repeated by running rpl_gtid_in_table_slave_perf.test on the following prototype patch: - bzr+ssh://myrepo.no.oracle.com/bzrroot/server/mysql-trunk-wl6559-prototype NF-2: Storing GTIDs in a system table makes the gtid commit together with the transaction. NF-3: Implicit requirements: Follow the SQL standard, work on all platforms, do not impact other internal MySQL components (MySQL internal tools can get correct GTID state by querying GTID_EXECUTED directly as before).
Interface Specification ======================= I-1: Introduce a system table, a row in the table can contain an interval of several GTIDs, all having the same SID and consecutive GNOs. So the table definition should be: CREATE TABLE gtid_executed( sid CHAR(36) NOT NULL, -- Source ID gno_start BIGINT NOT NULL, -- First GNO of interval gno_end BIGINT NOT NULL, -- Last GNO of interval PRIMARY KEY(sid, gno_start) -- PK on interval start ) ENGINE = InnoDB; NOTES ===== 1. The passive slave never generates any new gtid when binlog is disabled and gtid_mode is enabled. What's the impact? We don't generate ids because they won't exist anywhere (on replication stream). So there is no impact. 2. Should we disallow a server that has GTIDs stored in gtid_excuted table to be promoted master? No. I agree to allow a server that has GTIDs stored in gtid_excuted table to be promoted master base on that we can get purged/missed gtids back from dump/restore. 3. We must keep binlogs history consistent.
1. Create a mysql system table named gtid_executed if it does not exist, a row in the table can contain an interval of several GTIDs, all having the same SID and consecutive GNOs as following: CREATE TABLE IF NOT EXISTS gtid_executed( sid CHAR(36) NOT NULL, -- Source ID gno_start BIGINT NOT NULL, -- First GNO of interval gno_end BIGINT NOT NULL, -- Last GNO of interval PRIMARY KEY(sid, gno_start) -- PK on interval start ) ENGINE = InnoDB; Note: The gtid_executed table always exists after creation at any time as it is mysql internal system table. 2. Store gtids into the gtid_executed table always if gtid_mode is enabled as following: INSERT INTO gtid_executed VALUES (SID, GNO, GNO) - If binlog is disabled and gtid_mode is enabled, store gtids from master's transactions, relayed BINLOG transactions and 'SET @@SESSION.GTID_NEXT' statement into gtid_executed table right before transaction prepare. - If binlog is enabled and gtid_mode is enabled, store GTIDs into both binlog and gtid_executed table. Note: No gtid is inserted into the gtid_executed table if gtid_mode is disabled. 3. The gtid_executed table would be filled with singleton interval as we cannot use UPDATE instead of INSERT, because any concurrently executing transactions would conflict on the update. So we propose to create a FOREGROUND thread (we can not create a BACKGROUND thread as we have to create THD object to open gtid_executed table) and introduce a user variable 'executed_gtids_compression_period' to control how often the thread should be woken up to compress the gtid_executed table regularly. - To the user variable 'executed_gtids_compression_period', if value is 0, the thread never wakes up, if value is 1, wakes up every transaction. If value is 1000, wakes up every 1000 transactions. - We also added the thread's state info (suspending, compressing gtid_executed table) into performance_schema.threads table, so that user can monitor it and tune the user variable 'executed_gtids_compression_period' (The thread is in suspended mode most of the time, wakes up only when it needs). - THe thread runs always, user cannot kill it as we didn't add it into PROCESSLIST. - The thread opens a transactional context to execute the following within a single transaction on an non-autocommit mode: - Read each row by the PK in increasing order, delete consecutive rows from the gtid_executed table and fetch these deleted gtids at the same time. - Store compressed intervals from these deleted gtids into the gtid_executed table. 4. If gtid_mode is enabled and binlog is enabled, report @@GLOBAL.GTID_EXECUTED and @@GLOBAL.GTID_PURGED from both binlog and gtid_executed table as following: - GLOBAL.GTID_EXECUTED = all gtids of gtid_executed table; - GLOBAL.GTID_PURGED = GLOBAL.GTID_EXECUTED - (GTID_EXECUTED_BINLOG - GTID_PURGED_BINLOG); (GTID_EXECUTED_BINLOG: gtid_set of gtids is ever logged in binary logs. GTID_PURGED_BINLOG : gtid_set of gtids is purged from binary logs.) - GLOBAL.GTID_EXECUTED is initialized with all gtids of gtid_executed table during server restarting. Every transaction's gtid is added into the GLOBAL.GTID_EXECUTED and gtid_executed table if gtid_mode is enabled after server restarts. - GLOBAL.GTID_PURGED is initialized with gtids from gtid_executed table and binlogs and precomputed as above during server restarting. After server restarts, added purged gtids into GLOBAL.GTID_PURGED when purging logs. - A previous log event with GTID_EXECUTED_BINLOG (GTID_EXECUTED - GTID_ONLY_IN_TABLE) set is created when rotating binlog, so GTID_ONLY_IN_TABLE variable is maintained when binlog is enabled. GTID_ONLY_IN_TABLE is initialized with difference gtids from gtid_executed table and the latest binlog during server restarting. GTID_ONLY_IN_TABLE will never change since starting server. (we don't need maintain GTID_ONLY_IN_TABLE when binlog is disabled) - If gtid_executed_table is empty, add all gtids in GTID_EXECUTED_BINLOG into gtid_executed table during server restarting (Handle the upgrade case, and the case that a slave is provisioned from a backup of the master and the slave is cleaned by RESET MASTER and RESET SLAVE before this.). - The gtid_executed table is also initialized with gtid_purged when user is initializing it through SET GLOBAL gtid_purged. (It is possible to update the value of gtid_purged, but only by adding GTIDs to those already listed, and only when gtid_executed is unset—that is, on a new server.) 5. If gtid_mode is enabled and binlog is disabled, initialize GLOBAL.GTID_EXECUTED from gtid table during server startup and save gtid into executed_gtids for every transaction, report @@GLOBAL.GTID_PURGED from executed_gtids, since @@GLOBAL.GTID_PURGED and @@GLOBAL.GTID_EXECUTED are always same, so we did not save gtid into lost_gtids for every transaction for improving performance. 6. If gtid_mode is disabled, No gtid is inserted into gtid state and the 'binlog_gtids' table. 7. The gtid_executed table is reset when resetting master. 8. Gtid_table_persistor class It manages all operations on the gtid table. - m_count Count the append rows of the table. - save(Gtid *gtid) Insert the gtid into table. - save(Gtid_set *gtid_set) Store gtid set into the table. - compress() Compress intervals into consecutive GNOs in the table. - need_compress() Check if we need compress intervals into consecutive GNOs. - reset() Delete all rows from the table. - fetch_gtids_from_table(Gtid_set *gtid_set) Fetch gtids from the table and store them into gtid set.