WL#7592: GTIDs: generate Gtid_log_event and Previous_gtids_log_event always

Affects: Server-5.7   —   Status: Complete   —   Priority: Medium

When GTID_MODE = ON, the binary log contains two events that do not currently exist when GTID_MODE = OFF. In this worklog we will make it so that similar events are generated also when GTID_MODE = OFF:

  1. When GTID_MODE = ON, every transaction is preceded by an event with type code GTID_LOG_EVENT. No such event exists when GTID_MODE = OFF.

    However, after GTID_LOG_EVENT was introduced, it has gradually turned into an event with a more generic purpose, as fields not related to GTIDs were added to the event. These fields include logical timestamps used for applying transactions in parallel on the slave, as well as physical timestamps used for monitoring. It is also likely that more per-transaction fields will be needed in the future.

    Therefore, we need to generate a per-transaction event also when GTID_MODE = OFF; this is needed e.g. for WL#7083 and WL#7165.

    The event type that we need exists in the code already, although it is not used: it has the type code ANONYMOUS_GTID_LOG_EVENT. This type code is different from GTID_LOG_EVENT, but internally, the two event types share the same class, called Gtid_log_event.

  2. When GTID_MODE = ON, every binary log has a Previous_gtids_log_event in the header (just after the Format_description_log_event). When GTID_MODE = OFF, there is no such per-binlog event.

    This has the problem that if gtid_mode is changed from ON to OFF and then back to ON again, the values of GTID_EXECUTED and GTID_PURGED may get lost.

    Therefore, this worklog ensures that Previous_gtids_log_event is written to every binary log, regardless of gtid_mode.

    This has the additional benefit that it will allow the recovery of GTID_PURGED and GTID_EXECUTED to be optimized.

An additional benefit from generating both Gtid_log_event and Previous_gtids_log_event unconditionally is that binary logs will have similar structures when GTID_MODE = ON and GTID_MODE = OFF. This allows us to simplify both code and test cases, as otherwise we would need special code to handle the two different cases.

This work was previously part of WL#7083 but will be more practical to fix separately.


The bug can later be fixed using the refactoring in this worklog.

Functional requirements:

FR1. When GTID_MODE = OFF, every transaction shall be preceded by an Anonymous_gtid_log_event in the binary log.
FR2. When GTID_MODE = OFF, every binary log shall start with a Previous_gtids_log_event.
FR3. Cross-version replication OLD->NEW shall not be harmed by this worklog.
FR4. Cross-version replication NEW->OLD shall work as follows:
FR4.1. NEW->5.6.21 shall work if gtid_mode=ON. It will not work if gtid_mode=OFF.
FR4.2. NEW->5.6.22 shall work regardless of gtid_mode.
Footnote: the reason that 5.6.21 and older cannot work with gtid_mode=OFF is BUG#74683. Once the bug is fixed, cross-version replication shall work.

Non-functional requirements:

NFR1. The code shall be structured such that it does not assume all Gtid_log_events and Anonymous_gtid_log_events have the same size.
      This requirement makes the code more maintainable.
NFR2. This worklog shall not cause more than 3% performance degradation.
NFR3. The restrictions imposed by GTID_MODE=ON (e.g., enforce-gtid-consistency, and disallowing sql_slave_skip_counter) shall not apply when GTID_MODE=OFF, even when we generate events with type code ANONYMOUS_GTID_LOG_EVENT.

Note. Analysis of storage requirements

  • A Gtid_log_event takes 57 bytes. If Gtid_log_event is not included in the binary log, Query_log_event grows by 8 bytes, since the logical timestamp used for MTS is then stored in the Query_log_event instead of in the Gtid_log_event. So the real overhead is 57-8=49 bytes.
  • A Query_log_event takes 79 bytes or more, plus the length of the query, plus the length of the database name.
  • A Xid_log_event takes 31 bytes.
  • A table_map event takes 45 bytes or more.
  • A row event takes 40 bytes or more.

So a *minimal* DML transaction without Gtid_log_event takes:

   79 + 1 + 5 (query_log_event(use 'x'; BEGIN))
 + 79 + 1 + 21 (query_log_event(use 'x'; INSERT INTO t SET a=1))
 + 31 (Xid_log_event)
 = 217 bytes.

Thus, the overhead of the Gtid_log_event is at most 49/217, or about 23%. But this is a theoretical worst case. A real transaction would have significantly larger queries, and often a larger number of queries, which would make the relative overhead of the Gtid_log_event smaller.

Moreover, the query_log_event(BEGIN) is redundant and the XID of the Xid_log_event could be stored in the Gtid_log_event. This worklog will pave the way for the future work of removing both these events, and then it will unconditionally be an improvement even compared to a binary log without Gtid_log_event: it will save 79+1+5+31-49=67 bytes per transaction.
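
The arithmetic in this note can be re-checked with a short script. This is only a sketch: the byte counts are the ones quoted in the note, and the helper name query_event_size is made up for illustration.

```python
# Event sizes in bytes, as quoted in the storage analysis above.
GTID_EVENT = 57
QUERY_EVENT_GROWTH_WITHOUT_GTID = 8  # logical timestamp kept in Query_log_event
QUERY_EVENT_BASE = 79
XID_EVENT = 31


def query_event_size(db, query):
    """Hypothetical helper: base size + database name length + query length."""
    return QUERY_EVENT_BASE + len(db) + len(query)


# Net cost of adding a Gtid_log_event to a transaction.
gtid_overhead = GTID_EVENT - QUERY_EVENT_GROWTH_WITHOUT_GTID

begin_size = query_event_size("x", "BEGIN")
insert_size = query_event_size("x", "INSERT INTO t SET a=1")
minimal_txn = begin_size + insert_size + XID_EVENT

# Worst-case relative overhead, and the saving if BEGIN and Xid were
# folded into the Gtid_log_event in future work.
relative_overhead = gtid_overhead / minimal_txn
future_saving = begin_size + XID_EVENT - gtid_overhead

print(gtid_overhead, minimal_txn, round(relative_overhead, 3), future_saving)
```

Changing the database name or query text shows how quickly the relative overhead drops for more realistic transactions.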



  1. When GTID_MODE = OFF, an Anonymous_gtid_log_event shall be generated and every transaction or DDL statement in the binary log shall be preceded by it.
  2. When GTID_MODE = OFF, a Previous_gtids_log_event shall be generated at the beginning of every binary log and every relay log.
  3. When GTID_MODE = ON, there is no user-visible change.

Cross-version compatibility

OLD->NEW: This will always work. The server can handle old events and there is no problem.

NEW->OLD[5.5]: This will never work. Even if ANONYMOUS_GTID_LOG_EVENT has LOG_EVENT_IGNORABLE_F set, this flag was only introduced in 5.6, so the 5.5 slave does not know about the flag and will fail.

NEW->OLD[5.6 or 5.7]:

  • If GTID_MODE=ON, there is no problem, since nothing changed.
  • If GTID_MODE=OFF, there are two cases:
    • If slave is 5.6.21 or older, or 5.7.5 or older, replication will stop with an error, due to BUG#74683.
    • If we fix BUG#74683 in 5.6.22, and slave is 5.6.22 or later, then replication will work fine, since PREVIOUS_GTIDS_LOG_EVENT is skipped by the slave receiver thread and ANONYMOUS_GTID_LOG_EVENT has LOG_EVENT_IGNORABLE_F set.
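
The NEW->OLD rules above can be summarized as a small decision function. This is a sketch for illustration only. The function name is hypothetical, versions are modelled as (major, minor, patch) tuples, and the 5.7.6 threshold is an assumption inferred from the phrase "5.7.5 or older"; the text itself only promises the fix for 5.6.22.

```python
def new_to_old_replication_works(slave_version, master_gtid_mode_on):
    """Return True if a NEW master can replicate to a slave of the given version.

    slave_version is a (major, minor, patch) tuple, e.g. (5, 6, 21).
    """
    if slave_version[:2] < (5, 6):
        # 5.5 does not know LOG_EVENT_IGNORABLE_F, so it always fails.
        return False
    if master_gtid_mode_on:
        # Nothing changed for GTID_MODE=ON.
        return True
    if slave_version[:2] == (5, 6):
        # Needs the BUG#74683 fix, planned for 5.6.22.
        return slave_version >= (5, 6, 22)
    # Assumption: 5.7 versions after 5.7.5 contain the fix.
    return slave_version >= (5, 7, 6)


for version in [(5, 5, 40), (5, 6, 21), (5, 6, 22), (5, 7, 5)]:
    print(version, new_to_old_replication_works(version, False))
```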


The behavior has changed for DROP DATABASE statements that fail after deleting some tables.

This error can happen e.g. because of extra files in the database directory. When this happens, the server generates an error, but it leaves the tables deleted. To log this, the server generates a DROP TABLE statement listing all the tables. If there are many tables, the DROP TABLE statement becomes very long. In this case, the server generates multiple DROP TABLE statements. This is very strange but it is how the server works before this patch.

If the error happens on a master, there is no problem from GTID perspective: it will generate one Anonymous_gtid_log_event per DROP if GTID_MODE=OFF, and one Gtid_log_event per DROP if GTID_MODE=ON.

If the error happens on a slave, and GTID_MODE=OFF, there is also no big problem: it generates an Anonymous_gtid_log_event per DROP.

If the error happens on a slave, and GTID_MODE=ON, we are in a worse situation. The GTID must be preserved, but the statement needs to be logged as multiple transactions, each having its own GTID. The error cannot be detected until after the statement has completed and it cannot be rolled back. Our solution is to generate an error and not log anything at all. We introduce a new error message for this:

"DROP DATABASE failed on slave; some tables may have been dropped but the database directory remains. The GTID has not been added to GTID_EXECUTED and the statement was not written to the binary log. Fix this as follows: (1) remove all files from the database directory %-.192s; (2) SET GTID_NEXT='%-.192s'; (3) DROP DATABASE `%-.192s`."


CREATE...SELECT is allowed when GTID_MODE=OFF. Prior to this worklog, this was executed as one transaction. In statement format it was logged as a single statement. In row format, it was logged as:

 ...row events...

This worklog changes the logging of CREATE...SELECT in row format, so now it is:

 ...row events...

This also means that there is a storage engine commit between the CREATE and the insertion of rows.

Stricter checks for missing Gtid_log_event in slave applier thread

When slave has GTID_MODE=ON, it must not accept any transactions that are missing a Gtid_log_event. If the slave applier thread sees an event that is part of a transaction that does not have a Gtid_log_event, then the slave stops with an error.

Prior to this worklog, the check was omitted for BEGIN and COMMIT events. Now, the check is stricter and is done also for BEGIN and COMMIT. This should not make any difference to users, but required modifying a few test cases that did strange seeking in the binary log in order to simulate errors.


There was a bug in how GTIDs were displayed in PERFORMANCE_SCHEMA. This is not directly related to the scope of this worklog, but had to be fixed to make existing tests pass.

Background: The GTID is changed during the lifetime of a transaction as follows:

  • On a master, the transaction is "AUTOMATIC" while executing. Only when it commits is it assigned a GTID of the form UUID:NUMBER.
  • On a slave, the transaction is assigned a GTID before it starts to execute.
  • When GTID_MODE = OFF, transactions use the special GTID "ANONYMOUS" rather than "UUID:NUMBER".

This was not reflected correctly in the GTID column of these performance_schema tables. Among other things, the history table could contain 'AUTOMATIC'. This should never happen since it makes it impossible to identify the transaction.
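
The life cycle described above can be modelled as a small function. This is an illustrative sketch and not the performance_schema implementation; the function name and parameters are hypothetical.

```python
def transaction_gtid(is_slave, gtid_mode_on, committed, assigned="UUID:NUMBER"):
    """Return the GTID value a transaction has at a given point in its life."""
    # The final identity depends only on GTID_MODE.
    final = assigned if gtid_mode_on else "ANONYMOUS"
    if is_slave:
        # A slave knows the GTID before the transaction starts to execute.
        return final
    # A master only assigns the GTID at commit time.
    return final if committed else "AUTOMATIC"


# A committed transaction never shows 'AUTOMATIC', which is what the
# history table should reflect.
print(transaction_gtid(is_slave=False, gtid_mode_on=True, committed=False))
print(transaction_gtid(is_slave=False, gtid_mode_on=True, committed=True))
print(transaction_gtid(is_slave=True, gtid_mode_on=False, committed=False))
```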


The output from mysqlbinlog changes a little bit:

  • Previous_gtids_log_event and Anonymous_Gtid_log_event are present even if gtid_mode=off.
  • For Query_log_event, WL#7165 introduced a commented-out text that shows the logical timestamps used for MTS. Now, these timestamps do not occur at all for Query_log_event, so they have been removed from the output of mysqlbinlog.
  • For Gtid_log_event, there was a commented-out text saying 'commit=yes'. This once had a meaning in a development tree of the GTID feature, but the part of the GTID feature that this refers to was never pushed to the main trees. So the text is completely meaningless. In this worklog we remove the text.


- To implement FR1, we need to generate an Anonymous_gtid_log_event even when
 GTID_MODE = OFF.
- To implement NFR1, we need to change the life cycle of Gtid_log_event
 and Anonymous_gtid_log_event.
 Currently, Gtid_log_event is generated the first time something is written to
 the cache. This means that the size has to be determined before the
 transaction executes, which violates NFR1.
 To conform to NFR1, we move the Gtid and anonymous events out of the
 transaction and statement caches. Instead of generating the events at the
 beginning of the transaction, we generate them just before flushing the
 transaction to the binary log. I.e., we generate them in (a function called
 from) binlog_cache_data::flush(). (Nothing prevents us from generating the
 GTID and/or Gtid_log_event in any other places, if that is needed in the
 future.)
 In do_write_cache(), we keep the event outside the cache and update
 end_log_pos and compute the checksum separately, before processing the
 actual cache.


We will implement this worklog as multiple patches:

1. Small simplifications.
   While debugging the feature, a few small things had to be fixed, e.g.
   more DBUG output, etc. This patch collects all such simplifications,
   so that they don't distract from the rest of the worklog.
2. Remove dead code related to empty GTID transactions.
   While developing the feature, we found that parts of the GTID code were
   not used at all. Since this is code related to the feature, we fix
   it in this worklog by removing the dead code.
3. Clean up binlog.cc:gtid_empty_group_log_and_cleanup.
   This feature affects binlog.cc:gtid_empty_group_log_and_cleanup.
   This function was unnecessarily complex, and a few things were
   done the same way by all callers of the function. We refactor
   this function to simplify the code and have a clearer view of
   how GTIDs work in the server.
4. Refactor @@session.gtid_executed implementation.
   Before this patch, the implementation of the session variable
   @@session.gtid_executed relies on Gtid_log_event being stored
   in the transaction cache or statement cache. Since we will move
   Gtid_log_event out of the caches, we change the implementation
   of @@session.gtid_executed so that it does not rely on the caches.
5. Refactor owned_gtid.
   In order to make MTS work correctly in the presence of
   Anonymous_gtid_log_events, we must make Relay_log_info::is_in_group
   understand that an anonymous transaction has started after an
   Anonymous_gtid_log_event, before the BEGIN is executed.
   Currently, is_in_group looks at the value of THD::owned_gtid.sidno
   to determine if any GTID is owned. To make this work for
   Anonymous_gtid_log_events, we make SET GTID_NEXT='ANONYMOUS'
   set THD::owned_gtid.sidno = -2, and make any committing statement
   reset THD::owned_gtid even in this case. We introduce the symbolic
   constant OWNED_SIDNO_ANONYMOUS for -2.
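
The ownership rule in patch 5 can be sketched as follows. This is a simplified model, not the server code: the Session class stands in for THD, the real Relay_log_info::is_in_group checks more than this, and the use of 0 for "nothing owned" is an assumption made for the sketch.

```python
OWNED_SIDNO_ANONYMOUS = -2  # symbolic constant introduced by this patch
NOT_OWNED = 0               # assumption: 0 means no GTID is owned


class Session:
    """Hypothetical stand-in for THD, tracking only owned_gtid.sidno."""

    def __init__(self):
        self.owned_sidno = NOT_OWNED

    def set_gtid_next_anonymous(self):
        # SET GTID_NEXT='ANONYMOUS' takes anonymous ownership.
        self.owned_sidno = OWNED_SIDNO_ANONYMOUS

    def set_gtid_next_uuid(self, sidno):
        # SET GTID_NEXT='UUID:NUMBER' owns a real GTID (positive sidno).
        self.owned_sidno = sidno

    def commit(self):
        # Any committing statement resets ownership, also when anonymous.
        self.owned_sidno = NOT_OWNED

    def is_in_group(self):
        # A group has started as soon as anything is owned, i.e. already
        # after the Anonymous_gtid_log_event and before BEGIN.
        return self.owned_sidno != NOT_OWNED


thd = Session()
thd.set_gtid_next_anonymous()
print(thd.is_in_group())  # True: the anonymous transaction has started
thd.commit()
print(thd.is_in_group())  # False: ownership was released at commit
```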
6. Allow writing Log_event header to memory.
   This is a small refactoring to allow writing Gtid_log_event to a
   memory buffer rather than to an IO_CACHE.
   When we write Gtid_log_event always, the Gtid_log_event will live
   outside the transaction cache. Before the event is flushed to the
   binary log, it has to be stored in a memory buffer. However, there
   was no function for writing log events to a memory buffer.
   This patch adds the function Log_event::write_header_to_memory
   that writes the 19 byte Log_event header to a memory buffer, and
   Gtid_log_event::write_to_memory that writes an entire
   Gtid_log_event to a memory buffer.
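
As an illustration of what writing the 19 byte header to memory involves, the sketch below packs the common header fields into a preallocated buffer. It is a Python model and not the server's C++ code. The field order (timestamp, type code, server id, event length, end_log_pos, flags) is the usual binary log common header layout; the numeric values used for ANONYMOUS_GTID_LOG_EVENT and LOG_EVENT_IGNORABLE_F are assumptions for this sketch.

```python
import struct

# Little-endian: 4-byte timestamp, 1-byte type code, 4-byte server id,
# 4-byte event length, 4-byte end_log_pos, 2-byte flags.
LOG_EVENT_HEADER = struct.Struct("<IBIIIH")

ANONYMOUS_GTID_LOG_EVENT = 34  # assumed type code value
LOG_EVENT_IGNORABLE_F = 0x80   # assumed flag value


def write_header_to_memory(buf, offset, timestamp, type_code, server_id,
                           event_len, end_log_pos, flags):
    """Pack the common header into buf at offset; return the next offset."""
    LOG_EVENT_HEADER.pack_into(buf, offset, timestamp, type_code, server_id,
                               event_len, end_log_pos, flags)
    return offset + LOG_EVENT_HEADER.size


# Usage: reserve room for a whole 57-byte event, write the header first.
# end_log_pos is left as 0 here and filled in when the event is flushed.
buf = bytearray(57)
body_offset = write_header_to_memory(
    buf, 0, timestamp=0, type_code=ANONYMOUS_GTID_LOG_EVENT, server_id=1,
    event_len=len(buf), end_log_pos=0, flags=LOG_EVENT_IGNORABLE_F)
print(LOG_EVENT_HEADER.size, body_offset)
```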
7. Refactor MYSQL_BIN_LOG::do_write_cache.
   This patch rewrites do_write_cache so that it becomes more
   maintainable and so that we can write out-of-cache events to the
   binary log. There are two reasons for this.
   Rationale 1:
   Before this patch, do_write_cache mixes two tasks:
    1. It reads from the statement or transaction cache and assembles
       pieces of events that are split over multiple pages in the
       cache.
    2. It processes the event and writes it to the binary log.
   In this worklog, we need to store the Gtid_log_event outside the
   statement and transaction caches. The event needs the processing
   in (2), but not that in (1). Hence, we need to decouple these two
   tasks. Task (2) is implemented by a new class and task (1) remains
   in do_write_cache.
   Rationale 2:
   Before this patch, do_write_cache was very difficult to
   understand, as it kept lots of state in variables that were
   updated throughout the function. The reason was that the logic was
   organized in an impractical manner: the outermost loop iterated
   over pages of the IO_CACHE and tried to keep various pieces of
   state of half-written events in local variables. This state
   information was updated all over the function, which made it
   difficult to understand what the loop invariants were.
   This patch changes the function so that the outermost loop iterates
   over events.  This makes the state information much more
   short-lived. For instance, state related to the beginning of an
   event is only needed at the beginning of the iteration and can be
   stored in a very short-lived variable. So there is less state kept
   between iterations, and this makes the loop invariants much
   easier to understand.
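
The restructuring in patch 7 can be sketched with a small model. Everything here is illustrative: PagedReader, process_event, write_cache and make_event are hypothetical names, the cache is modelled as a list of byte pages, and the checksum handling is simplified to appending a CRC32 and adjusting the length and end_log_pos fields. The point is the shape of the code: the outer loop runs once per event, page boundaries are hidden in the reader (task 1), and the out-of-cache Gtid event goes through the same per-event processing (task 2).

```python
import struct
import zlib

HEADER = struct.Struct("<IBIIIH")  # 19-byte common header
EVENT_LEN_OFFSET = 9
END_LOG_POS_OFFSET = 13
CHECKSUM_LEN = 4


def make_event(type_code, payload):
    """Build an event as it would sit in the cache (no checksum yet)."""
    return HEADER.pack(0, type_code, 1, HEADER.size + len(payload), 0, 0) + payload


class PagedReader:
    """Task 1: hide page boundaries behind a plain read(n)."""

    def __init__(self, pages):
        self.pages, self.page, self.pos = list(pages), 0, 0

    def read(self, n):
        out = bytearray()
        while len(out) < n and self.page < len(self.pages):
            current = self.pages[self.page]
            chunk = current[self.pos:self.pos + n - len(out)]
            out += chunk
            self.pos += len(chunk)
            if self.pos == len(current):
                self.page, self.pos = self.page + 1, 0
        return bytes(out)


def process_event(event, binlog):
    """Task 2: fix length and end_log_pos, append checksum, write out."""
    event = bytearray(event)
    total = len(event) + CHECKSUM_LEN
    struct.pack_into("<I", event, EVENT_LEN_OFFSET, total)
    struct.pack_into("<I", event, END_LOG_POS_OFFSET, len(binlog) + total)
    binlog.extend(event)
    binlog.extend(struct.pack("<I", zlib.crc32(bytes(event))))


def write_cache(gtid_event, pages, binlog):
    process_event(gtid_event, binlog)  # out-of-cache event: task 2 only
    reader = PagedReader(pages)
    while True:                        # one iteration per event
        header = reader.read(HEADER.size)
        if not header:
            break
        event_len = HEADER.unpack(header)[3]
        body = reader.read(event_len - HEADER.size)
        process_event(header + body, binlog)
```

For example, calling write_cache with one Gtid event and a cache whose two events are split across three pages produces three checksummed events in the output, each with an end_log_pos that matches its position.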
8. Generate Gtid_log_event always.
   After all the previous preparation steps, we can now generate
   Gtid_log_event always. This does change server behavior. Moreover,
   it breaks many existing test cases that assume there is no
   Gtid_log_event. However, since we have another big change to make
   (generate Previous_gtids_log_event always), we do not touch tests
   in this patch. Later patches will fix the tests.
   This patch has three primary goals:
    8.1. Remove Gtid_log_event from the statement/transaction cache.
    8.2. Generate Gtid_log_event just before flushing the
         statement/transaction cache. Store the event in a separate buffer.
    8.3. Generate Gtid_log_event even when GTID_MODE=OFF.
   In order to make this work, we have to do the following additional tasks:
    8.4. Previously, commit sequence number was generated when flushing the
         cache. Since Gtid_log_event is not part of the cache now, we generate
         it in Gtid_log_event.
    8.5. Since Anonymous_gtid_log_events now exist, a case must be added to
         MTS code so that it reads the commit_seq_no from the
         ANONYMOUS_GTID_LOG_EVENT just like it reads from the GTID_LOG_EVENT.
   In addition we do the following cleanup tasks:
    8.6. Remove binlog.cc:gtid_before_write_cache. This functionality is now
         in MYSQL_BIN_LOG::write_gtid. In addition, we move the part of this
         function that deals with generating the GTID into the new function
    8.7. Avoid taking a lock in Gtid_log_event constructor.
    8.8. Remove the '[commit=yes/no]' output when mysqlbinlog prints a
         Gtid_log_event. This was only relevant for a pre-GA version of
         the GTID feature.
9. Generate Previous_gtids_log_event always.
   This patch generates Previous_gtids_log_event even if
   gtid_mode=off.  This changes server behavior. It also breaks a lot
   of existing test cases that assume there is no
   Previous_gtids_log_event. We address the test cases in the
   following patch.

10. Fix failing tests.

   This patch fixes all test cases that fail due to the previous two
   patches.  In addition it fixes two code bugs that were
   exposed/introduced due to the previous two patches, and which
   caused some of the test failures.
   - Fix bug in the MASTER_LOG_POS until condition.
     The problem was: some Rotate_log_events in the relay log are
     generated on the slave, not on the master. Thus, their
     end_log_pos field is relative to the slave relay log. Since
     MASTER_LOG_POS is relative to the master binary log, we must not
     evaluate the MASTER_LOG_POS condition for such slave-generated
     Rotate_log_events. But the logic to skip the until check for
     slave-generated events was missing, and this caused tests to
     fail.
     The fix is to avoid evaluating the until condition for
     slave-generated events. This is easy: such events are easily
     distinguishable since their server_id is zero. So we check if
     the server_id==0, and in that case we don't evaluate the until
     condition.
     This did not cause any tests to fail before this worklog,
     because the events appeared so early in the relay log that their
     positions would be smaller than the position specified by
     MASTER_LOG_POS. However, after this patch, the events appear
     after Previous_gtids_log_event, which moves the position forward
     so much that it causes the slave thread to stop before the
     rotate event, which causes the test to fail.
   - Fix bug in sql_slave_skip_counter with GTIDs.
     sql_slave_skip_counter did not compute transaction boundaries
     correctly in the presence of Gtid_log_events. This did not cause
     any problems before this patch since sql_slave_skip_counter is
     not allowed when gtid_mode=on.
     sql_slave_skip_counter is supposed to decrease for every event
     processed, except it should not decrease down to 0 in the middle
     of a group. This ensures that the applier thread does not stop
     in the middle of a transaction. However, the applier thread did
     not consider Gtid_log_event to be part of a group, and therefore
     it could stop after the Gtid_log_event.
     The problem was that Gtid_log_event did not implement a specialized
     do_shall_skip function. This caused it to decrease the counter
     down to zero. The fix is to implement
     Gtid_log_event::do_shall_skip and make it call continue_group.
   - Fix simplified-binlog-recovery.
     Writing Previous_gtids_log_event always broke the logic for
     simplified-binlog-recovery.
     Before this patch, simplified-binlog-recovery would avoid
     iterating over multiple binary logs only in the case that the
     binlog lacks a Previous_gtids_log_event.
     Since we now generate Previous_gtids_log_event always, recovery
     would iterate over all binary logs even when
     simplified-binlog-recovery was enabled.
     Make it so that simplified-binlog-recovery skips the rest of the
     binary logs also in the case that the first binary log contains
     a Previous_gtids_log_event and no Gtid_log_event.
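
The sql_slave_skip_counter fix in this patch can be illustrated with a small model. This is a sketch of the rule as described above, not the server's do_shall_skip/continue_group code; the function name and the gtid_continues_group switch are hypothetical and exist only to compare the behavior before and after the fix.

```python
def skipped_events(events, skip_counter, gtid_continues_group=True):
    """Return the names of events skipped for a given skip counter.

    events is a list of (name, continues_group) pairs. An event that
    continues a group must not bring the counter down to 0.
    """
    skipped = []
    for name, continues_group in events:
        if skip_counter == 0:
            break
        skipped.append(name)
        if name == "Gtid" and not gtid_continues_group:
            continues_group = False  # models the behavior before the fix
        if continues_group and skip_counter == 1:
            continue  # hold the counter at 1 until the group ends
        skip_counter -= 1
    return skipped


txn = [("Gtid", True), ("BEGIN", True), ("INSERT", True), ("COMMIT", False)]

# After the fix: the whole transaction is skipped as one unit.
print(skipped_events(txn, 1))
# Before the fix: skipping stops right after the Gtid event.
print(skipped_events(txn, 1, gtid_continues_group=False))
```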

11. Remove dead code.

   This patch removes some code that became dead after this worklog:
   - Remove code to store logical timestamps in Query_log_event. This
     is now done only in Gtid_log_event.
   - Remove class Group_cache. This is not used now since the
     Gtid_log_event is not written to the transaction or statement
     cache.
   - Remove the G_COMMIT_TS field. This was a single byte that had
     the constant value 1, and was stored in the binary log for
     Gtid_log_events. It did not serve a purpose, so we remove it.
   - Remove IO_CACHE::commit_seq_no and
     IO_CACHE::commit_seq_offset. These were clearly misplaced in the
     first place, and are now not needed any more since the timestamp
     is generated in the Gtid_log_event constructor.
   - Remove enumeration value INVALID_GROUP from
     enum_group_type. This was used in two places. The first one was
     Group_cache, which is now removed.  The second was
     Gtid_specification::get_type. But get_type was only used in
     Gtid_specification::is_valid. We can merge get_type into
     is_valid (this is a simplification), and the result is that we
     don't need INVALID_GROUP.

12. Fix GTIDs in P_S tables.

   In this patch we correct the GTID shown in the GTID columns of
   performance_schema.events_transactions_current and
   The GTID is changed during the lifetime of a transaction as
   follows:
   - On a master, the transaction is "AUTOMATIC" while
     executing. Only when it commits is it assigned a GTID of the
     form UUID:NUMBER.
   - On a slave, the transaction is assigned a GTID before it starts
     to execute.
   - When GTID_MODE = OFF, transactions use the special GTID
     "ANONYMOUS" rather than "UUID:NUMBER".
   This was not reflected correctly in the GTID column of these
   performance_schema tables. Among other things, the history table
   could contain 'AUTOMATIC'. This should never happen since it makes
   it impossible to identify the transaction.
   This issue is not directly related to WL#7592. However, it showed
   up as a test failure in perfschema.transaction after a refactoring
   that was part of WL#7592. Therefore, we fix it in order to make
   tests pass for WL#7592.

13. Fix binlogging of strange SQL statements.

   A few SQL statements have strange semantics that cause trouble for GTIDs.
   This includes DROP TABLE with multiple tables,
   DROP TEMPORARY generated by client disconnect.
    1. Background:
       When DROP TABLE is used with multiple tables, and the tables are of
       different types (transactional/non-transactional or
       temporary/non-temporary), tables of the same type get grouped together
       and each group is logged as a separate statement. For example:
         DROP TABLE temporary, non_temporary
       gets logged as
         DROP TABLE temporary; DROP TABLE non_temporary.
       When GTID_MODE = ON, each such statement is assigned its own GTID.
       In order to generate the GTID, mysql_rm_table_no_locks must call
       mysql_bin_log.commit for each group of tables.
       mysql_bin_log.commit is only called when gtid_mode != OFF. When
       gtid_mode == OFF, all the statements are written to the binary
       log in one operation. So after WL#7592 there is only one
       Anonymous_gtid_log_event, instead of one for each DROP TABLE.
       Call mysql_bin_log.commit unconditionally.
       Note: when we are inside a transaction and only temporary tables are
       dropped, we should not call mysql_bin_log.commit, since the transactional
       context must remain open in this case.
    2. Background:
       Prior to this patch, when binlog_format=row, CREATE...SELECT gets
       written to the binary log as
         row events
       CREATE...SELECT is not allowed when gtid_mode=on (in fact, not when
       Although CREATE without SELECT has an implicit commit, it appears in
       the middle of a transaction on the slave. Thus, after this worklog
       and prior to this patch, it gets logged as:
         row events
       This causes problems on an MTS slave.
       Call mysql_bin_log.commit after writing the CREATE statement.
    3. Background:
       If DROP DATABASE fails after dropping some tables (e.g., if there
       are extra files in the database directory), then it writes a
       DROP TABLE statement that lists all the tables that it dropped.
       If there are many tables, this statement gets long. In this case,
       the server splits the statement into multiple DROP TABLE statements.
       If this happens when GTID_NEXT='UUID:NUMBER', then there is no way
       to log this correctly. So we must generate an error and log nothing.
       Introduce a new error code,
       generate the error if GTID_NEXT='UUID:NUMBER' and DROP DATABASE
       needs to be logged as multiple DROP statements.
    4. Background:
       OPTIMIZE/REPAIR/ANALYZE/CHECKSUM TABLE are written to the binary
       log even if they fail, after having called trans_rollback.
       trans_rollback calls gtid_state::update_on_rollback, which normally
       releases GTID ownership. But we must not release ownership before
       writing to the binary log.
       This was already fixed for the case gtid_mode=on; for that case we
       set a special flag in the THD object which tells
       gtid_state::update_on_rollback to not release ownership. Now we need
       to fix the case gtid_mode=off, so we set the flag in this case too.
    5. Background:
       CREATE TEMPORARY and DROP TEMPORARY behave very strangely. If executed
       outside transactional context, they behave as DDL: they get logged
       without BEGIN...COMMIT and cannot be rolled back. If executed in
       transactional context, they behave as non-transactional DML: they
       get logged inside BEGIN...COMMIT, leave the transactional context
       open, but cannot be rolled back.
       Before this patch, CREATE TEMPORARY and DROP TEMPORARY call
       gtid_end_transaction unconditionally.
       gtid_end_transaction ends the transactional context and releases
       ownership. This was not a problem before WL#7592 since
       gtid_end_transaction could only be called when gtid_mode=on, and
       when gtid_mode=on we disallow CREATE TEMPORARY and DROP TEMPORARY
       inside transactional context. However, after WL#7592, we call
       gtid_end_transaction also when gtid_mode=off, and gtid_end_transaction
       releases anonymous ownership.
       Do not call gtid_end_transaction for CREATE TEMPORARY and
       DROP TEMPORARY inside transaction context.
    6. Background:
       When a client that has open temporary tables disconnects, the
       temporary tables are dropped and DROP TEMPORARY is written to the
       binary log.
       After WL#7592 and before this patch, if a client disconnects when
       GTID_NEXT='ANONYMOUS', the client would not hold anonymous ownership
       when writing to the binary log, which would trigger an assertion in
       There was no problem when GTID_NEXT='UUID:NUMBER', since this case
       was taken care of already before WL#7592. In this case, we set
       GTID_NEXT='AUTOMATIC' before dropping any tables.
       Set GTID_NEXT='AUTOMATIC' regardless of GTID_MODE.
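
The table grouping described in case 1 of this patch can be sketched as below. This is only a model of the logging rule: the function name is hypothetical, tables are described by illustrative (name, is_temporary, is_transactional) tuples, and the order in which the groups are emitted is an assumption. Each returned statement stands for one group that is committed separately, so each gets its own Gtid_log_event or Anonymous_gtid_log_event.

```python
def split_drop_table(tables):
    """Group tables by type; return one DROP TABLE statement per group.

    tables is a list of (name, is_temporary, is_transactional) tuples.
    """
    groups = {}
    for name, is_temporary, is_transactional in tables:
        groups.setdefault((is_temporary, is_transactional), []).append(name)
    # One statement per group, in order of first appearance (assumption).
    return ["DROP TABLE " + ", ".join(names) for names in groups.values()]


statements = split_drop_table([
    ("temporary", True, True),
    ("non_temporary", False, True),
])
for statement in statements:
    print(statement)
```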