WL#4162: Add start/end of trx/stm to binlog

Affects: Server-5.7   —   Status: In-Design   —   Priority: Medium

Executive Summary
=================
This worklog adds information about transaction boundaries to each
event written to the binary log, by setting markers in the logged
event.

These markers use three bits of the Log_event Common-Header flags.

Because the markers reuse existing flag bits, the feature needs no
additional space in the binary log.

With such markers, reading a single Log_event header from an event
stream will be enough to tell whether the event:

a) is from a server with the feature enabled;
b) has no transaction boundary information;
c) is a self contained event;
d) is starting a transaction;
e) is inside a transaction (that is, in the middle of one);
f) is ending a transaction;
g) should not be considered in the transaction boundary parsing.

Before this worklog, a set of events in an event stream had to be
parsed to get information about transaction boundaries. The old
method will still be used when dealing with events without boundary
markers.

After this worklog, the feature will always be enabled. There will
be no option to disable the tagging of transaction boundary markers
in the MySQL server.

References
==========
Bug#17943188
BUG#26395, WL#3735

Functional requirements
=======================

F1) Markers should be written to the binary log.

F2) Markers should be visible as comments in mysqlbinlog output.

F3) The inner events of BINLOG statements are left untouched with
    respect to transaction boundary markers.


Non-Functional requirements
===========================

NF1) The feature must be supported regardless of the binary log
     format in use.

NF2) The feature must be supported regardless of whether GTIDs are
     used.

Concepts and Definitions
========================

C1) Transaction Stream/Transaction Event Group/Commit Group: a sequence
    of events related to a single transaction.

D1) TBM: Transaction Boundaries Markers. The bits in the binary log
    event common-header in which a transaction boundary type will be
    written.

D2) Transaction Boundary Type: the type of transaction boundary of a
    single event in a server with the transaction boundaries feature
    enabled.

    Of the eight possible types (three bits), this worklog uses six,
    leaving the remaining two types available for future use.

    The transaction boundary types defined in this worklog are:

  D2.0) Not Defined:
        The event was logged by a server without the feature.

  D2.1) Ignore Boundary:
        The event can appear in the middle of a transaction and should
        be ignored by the transaction boundary parser. This is the case
        of fake Rotate events sent by the dump_thread when a slave
        connects and requests data.
        Examples: Format_description, Rotate.

  D2.2) Self Contained:
        Any event that is self-contained (does not depend on previous
        or following events).
        Examples:
        D2.2a) Any Query event with an auto-commit statement (a DDL,
               for example) if GTIDs are disabled and the event has no
               related Intvar, Rand or User_var events;
        D2.2b) Start, Stop, Rotate, Previous_gtids, etc.

  D2.3) Start Transaction:
        Any event starting a Transaction Stream.

  D2.4) Inside Transaction:
        Any event that is part of a Transaction Stream between the
        event that starts and the event that ends the stream.

  D2.5) End Transaction:
        Any event ending a Transaction Stream.
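
As a rough sketch of how these six types could be packed into the
event flags: the numeric values below follow the D2.0 to D2.5
numbering, and the flags field is the two-byte field of the common
header. The bit position (TBM_SHIFT) and the helper names
set_boundary_type() and get_boundary_type() are assumptions made for
illustration only; this worklog does not state which three bits are
used.

```cpp
#include <cassert>
#include <cstdint>

// Boundary types, numbered as in D2.0 to D2.5. Values 6 and 7 stay unused.
enum Boundary_type : std::uint16_t {
  BOUNDARY_TYPE_NOT_DEFINED = 0,
  BOUNDARY_TYPE_IGNORE_BOUNDARY = 1,
  BOUNDARY_TYPE_SELF_CONTAINED = 2,
  BOUNDARY_TYPE_START_TRANSACTION = 3,
  BOUNDARY_TYPE_INSIDE_TRANSACTION = 4,
  BOUNDARY_TYPE_END_TRANSACTION = 5
};

// ASSUMPTION: the markers live in the three highest bits of the flags.
constexpr unsigned TBM_SHIFT = 13;
constexpr std::uint16_t TBM_MASK = 0x7u << TBM_SHIFT;

// Store a boundary type in the flags without touching the other bits.
inline std::uint16_t set_boundary_type(std::uint16_t flags,
                                       Boundary_type type) {
  assert(type <= BOUNDARY_TYPE_END_TRANSACTION);
  return static_cast<std::uint16_t>((flags & ~TBM_MASK) |
                                    (type << TBM_SHIFT));
}

// Read the boundary type back from the flags.
inline Boundary_type get_boundary_type(std::uint16_t flags) {
  return static_cast<Boundary_type>((flags & TBM_MASK) >> TBM_SHIFT);
}
```

With this layout, an event written by a server without the feature
leaves the marker bits at zero and therefore reads back as
BOUNDARY_TYPE_NOT_DEFINED, which matches D2.0.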



High-Level Specification
========================

Boundary Type
-------------

Represents a transaction boundary type.

Every Log_event object has a boundary type.

The boundary type for an event will be determined by the event type,
but it can be changed before writing to binary log depending on:

 a) Use of GTIDs;

 b) Dependency on Intvar, Rand or User_var events;

A slave I/O thread reading events from the master can use the event's
boundary type to verify whether the event is in the middle of a
Transaction Stream. This will be important to WL#7349 and WL#7355.

Transaction Boundary Parser
---------------------------

Responsible for keeping track of transactions boundaries in an event
stream and also for assuring that the event stream is valid.  

A server with this feature enabled will expect that a valid stream of
events follows the rules defined below:

BOUNDARY_TYPE_NOT_DEFINED
(
 (BOUNDARY_TYPE_SELF_CONTAINED|BOUNDARY_TYPE_IGNORE_BOUNDARY)*
 BOUNDARY_TYPE_START_TRANSACTION
 (BOUNDARY_TYPE_INSIDE_TRANSACTION|BOUNDARY_TYPE_IGNORE_BOUNDARY)*
 BOUNDARY_TYPE_END_TRANSACTION
)*

Where (<boundary type>)* means 0 or more repetitions of the types. 

The Transaction Boundary Parser implements these rules to support the
verification of an event stream, either by a slave I/O thread reading
events from the master or by the mysqlbinlog client program reading
events from a binlog file.

            Transaction Boundary Parser Transition Table 
 ====================================================================
 |From boundary type ->|   0    |   2    |   3    |   4    |   5    |
 |                     |  Not   |Self Con| Start  | Inside |  End   |
 |To boundary type     |Defined | tained | Trans  | Trans  | Trans  |
 |=====================|========|========|========|========|========|
 |0) Not Defined       |      This is the parser starting type      |*1
 |---------------------|--------|--------|--------|--------|--------|
 |1) Ignore Boundary   |   The parser is never set to this type     |*2
 |---------------------|--------|--------|--------|--------|--------|
 |2) Self Contained    |   Ok   |   Ok   |  Err   |  Err   |   Ok   |
 |---------------------|--------|--------|--------|--------|--------|
 |3) Start Transaction |   Ok   |   Ok   |  Err   |  Err   |   Ok   |
 |---------------------|--------|--------|--------|--------|--------|
 |4) Inside Transaction|  Err   |  Err   |   Ok   |   Ok   |  Err   |
 |---------------------|--------|--------|--------|--------|--------|
 |5) End Transaction   |  Err   |  Err   |   Ok   |   Ok   |  Err   |
 ====================================================================
*1 - When the transaction boundary parser is fed an event with
     BOUNDARY_TYPE_NOT_DEFINED in its markers, the parser will
     determine (but will not mark) the event boundary type based on
     the same assumptions the master makes when marking.
     So, a server without this feature can replicate to a slave with
     the feature and, instead of only reading the markers (the easy
     way), the slave will "guess" the boundary type of the events to
     benefit from the feature.
*2 - When feeding the transaction boundary parser with an event with
     BOUNDARY_TYPE_IGNORE_BOUNDARY in its markers, the transaction
     boundary parser will keep its current type.

A transaction boundary parser must be "reset()" every time the server
moves from the current event stream to another one (example: when the
slave server connects to a new master). This reset procedure will move
the boundary type of the parser to BOUNDARY_TYPE_NOT_DEFINED.
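
The transition table and the reset rule can be sketched as a small
state machine. This is only an illustration under stated assumptions:
the class layout, the TRANSITION_OK table name and the in_error() flag
are hypothetical, and the "guessing" of note *1 is left out, so an
event that still carries BOUNDARY_TYPE_NOT_DEFINED is simply rejected
here.

```cpp
#include <cassert>

enum Boundary_type {
  BOUNDARY_TYPE_NOT_DEFINED = 0,
  BOUNDARY_TYPE_IGNORE_BOUNDARY = 1,
  BOUNDARY_TYPE_SELF_CONTAINED = 2,
  BOUNDARY_TYPE_START_TRANSACTION = 3,
  BOUNDARY_TYPE_INSIDE_TRANSACTION = 4,
  BOUNDARY_TYPE_END_TRANSACTION = 5
};

// TRANSITION_OK[to][from], transcribed from the transition table.
// Column 1 is never used: the parser is never set to Ignore Boundary.
static const bool TRANSITION_OK[6][6] = {
    /* to Not Defined */ {false, false, false, false, false, false},
    /* to Ignore      */ {false, false, false, false, false, false},
    /* to Self Cont.  */ {true, false, true, false, false, true},
    /* to Start Trans */ {true, false, true, false, false, true},
    /* to Inside Trans*/ {false, false, false, true, true, false},
    /* to End Trans   */ {false, false, false, true, true, false}};

class Transaction_boundary_parser {
 public:
  // Move back to the starting type, e.g. when switching event streams.
  void reset() {
    current_ = BOUNDARY_TYPE_NOT_DEFINED;
    error_ = false;
  }

  // Feed the boundary type of the next event.
  // Returns false when the transition breaks the rules.
  bool feed(Boundary_type next) {
    assert(next <= BOUNDARY_TYPE_END_TRANSACTION);
    // Note *2: Ignore Boundary events keep the current type.
    if (next == BOUNDARY_TYPE_IGNORE_BOUNDARY) return true;
    if (!TRANSITION_OK[next][current_]) {
      error_ = true;  // hypothetical error flag; the caller should reset()
      return false;
    }
    current_ = next;
    return true;
  }

  Boundary_type current() const { return current_; }
  bool in_error() const { return error_; }

 private:
  Boundary_type current_ = BOUNDARY_TYPE_NOT_DEFINED;
  bool error_ = false;
};
```

A table lookup keeps the code in one-to-one correspondence with the
transition table above, which makes the two easy to compare during
review.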

When reading events from a master server (or from a binlog file), the
Master_info boundary parser (or the one in the mysqlbinlog client
program) will be fed with the buffer of each event (this contains the
event header and, of course, the transaction boundary markers).

If the stream is fed with an event that results in an incorrect
transition in the state machine (examples: Self Contained -> End
Transaction or End Transaction -> Inside Transaction), the
warning/error message I2/I3 described in the Interface Specification
will be logged.

What this message means to the user: if a slave connects to a master
and requests events from a binlog position in the middle of a
transaction, the error will be thrown (probably stating that it was
"Unable to change boundary parser from Not Defined to Inside
Transaction").

How the user can avoid getting the message again: in the example
above, the user can check where the transaction starts (using the
mysqlbinlog client program) and position the slave connection
correctly using CHANGE MASTER TO.

Interface Specification
=======================

I1) mysqlbinlog client program dump will show the boundary type for
    the events.

I2) A new warning message can be logged by the slave server if the I/O
    thread fails to feed an event into the transaction boundary
    parser. This could happen if the event boundary type breaks the
    rules of the boundary parser. It will also cause an assertion to
    fail in debug builds.

I3) A new warning message can be logged by the mysqlbinlog client
    program if an event fed from an event stream breaks the
    transaction boundary parser rules.

Low-Level Design Specification
==============================

Transaction boundary types for log event types
==============================================

Log_event type                   |State | Set by
---------------------------------+------+------------------------------
Log_event                        |SELF  | Log_event constructor
  Query_log_event                |SELF  | Log_event constructor      *1
           BEGIN                 |START | Query_log_event constructor
           COMMIT/ROLLBACK       |END   | Query_log_event constructor
           not immediate logging |INSIDE| Query_log_event constructor
           BEGIN                 |INSIDE| binlog write_event         *2
           immediate logging     |END   | binlog write_event         *2
    Execute_load_query_log_event |      | Query_log_event constructor*3
  Load_log_event                 |SELF  | Log_event constructor      *1
    Create_file_log_event        |SELF  | Log_event constructor      *1
  Start_log_event_v3             |SELF  | Log_event constructor      *1
    Format_description_log_event |IGNORE| Log_event constructor
  Intvar_log_event               |INSIDE| Intvar constructor
  Intvar_log_event               |START | binlog write_event         *4
  Rand_log_event                 |INSIDE| Rand_log_event constructor
  Rand_log_event                 |START | binlog write_event         *4
  User_var_log_event             |INSIDE| User_var constructor
  User_var_log_event             |START | binlog write_event         *4
  Xid_log_event                  |END   | Xid_log_event constructor
  Stop_log_event                 |SELF  | Log_event constructor      *1
  Rotate_log_event               |SELF  | Log_event constructor      *1
  Rotate_log_event               |IGNORE| Log_event constructor      *5
  Append_block_log_event         |INSIDE| Append_block... constructor
    Begin_load_query_log_event   |INSIDE| Append_block... constructor
  Delete_file_log_event          |SELF  | Log_event constructor      *1
  Execute_load_log_event         |SELF  | Log_event constructor      *1
  Unknown_log_event              |SELF  | Log_event constructor      *1
  Table_map_log_event            |INSIDE| Table_map constructor
  Rows_log_event                 |INSIDE| Rows_log_event constructor
    Write_rows_log_event         |INSIDE| Rows_log_event constructor
    Update_rows_log_event        |INSIDE| Rows_log_event constructor
    Delete_rows_log_event        |INSIDE| Rows_log_event constructor
  Incident_log_event             |SELF  | Log_event constructor      *1
  Ignorable_log_event            |IGNORE| Ignorable_log... constructor
    Rows_query_log_event         |INSIDE| Rows_query_log... constructor
  Heartbeat_log_event            |SELF  | Log_event constructor      *1
  Gtid_log_event                 |START | Gtid_log_event constructor
  Previous_gtid_log_event        |SELF  | Log_event constructor      *1
=======================================================================
*1 - No special code was added to these events.
     The boundary type is "inherited" from Log_event. 
*2 - See binlog.cc modification M1.
*3 - The state will be set by Query_log_event constructor.
*4 - See binlog.cc modification M2.
*5 - See rpl_binlog_sender.cc modification M1.

Note: the boundary type for Incident_log_event is still undecided.

Modifications in source files
=============================

rpl_trx_boundary_parser.h/rpl_trx_boundary_parser.cc (new files)
----------------------------------------------------------------

M1) Defined the boundary markers macros for dealing with event flags
    bitwise operations, the possible boundary types and their names.

M2) Defined the Transaction_boundary_parser class.


log_event.h
-----------

M1) Added an include to rpl_trx_boundary_parser.h

M2) Log_event class definition:

    M2a) Added a public boundary type named boundary_type to the
         Log_event class definition;

    M2b) Created a new protected method called
         update_boundary_type_from_flags() to update the boundary type
         of the event with the values of the Log_event flags.

M3) Query_log_event class definition:

    M3a) Created a new method is_begin_stmt();

    M3b) Created a new method is_commit_stmt();

    M3c) Created a new method is_rollback_stmt();

    M3d) Changed starts_group() method to use is_begin_stmt();

    M3e) Changed ends_group() method to use is_commit_stmt() and
         is_rollback_stmt().

M4) Intvar_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M5) Rand_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M6) Xid_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_END_TRANSACTION.

M7) User_var_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M8) Rows_query_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.
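
As an illustration of the predicates introduced in M3, the sketch
below shows one way starts_group() and ends_group() could be expressed
through them. Query_event_sketch is a hypothetical stand-in that holds
only the query text, and the exact string matching, including the
exclusion of ROLLBACK TO <savepoint>, is an assumption rather than
the server's actual logic.

```cpp
#include <string>

// Hypothetical stand-in holding only the query text of a Query event.
struct Query_event_sketch {
  std::string query;

  bool is_begin_stmt() const { return query == "BEGIN"; }

  bool is_commit_stmt() const { return query == "COMMIT"; }

  // ASSUMPTION: "ROLLBACK TO <savepoint>" does not end the transaction,
  // so it is not treated as a rollback statement here.
  bool is_rollback_stmt() const {
    return query.compare(0, 8, "ROLLBACK") == 0 &&
           query.compare(0, 12, "ROLLBACK TO ") != 0;
  }

  // M3d: starts_group() is expressed through is_begin_stmt().
  bool starts_group() const { return is_begin_stmt(); }

  // M3e: ends_group() is expressed through the commit/rollback checks.
  bool ends_group() const { return is_commit_stmt() || is_rollback_stmt(); }
};
```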


log_event.cc
------------

M1) Log_event constructors:

    Added code to mark the event as BOUNDARY_TYPE_SELF_CONTAINED.

M2) Log_event::write_header:

    Added code to apply the boundary type into the flags of the event
    before writing it.

M3) Log_event::read_log_event:

    Added code to set event's boundary type according to event's flags.

    Obs.: The *_event constructors that have code to set their boundary
          types are those not used to reconstruct a previously logged
          event. When the event is being reconstructed (like using
          read_log_event), the boundary type is set by this
          modification.

    Changed a debug message to show also event's boundary type.

M4) Log_event::print_header:

    Prints the boundary type of the event.

M5) Query_log_event constructor:

    Before the verification of 'ignore_cmd_internals', we adjust the
    event's boundary type to BOUNDARY_TYPE_START_TRANSACTION if it is
    a BEGIN statement or to BOUNDARY_TYPE_END_TRANSACTION if it is a
    COMMIT or ROLLBACK statement.

    Later, if the Query event is not using immediate logging, we set
    the event boundary type to BOUNDARY_TYPE_INSIDE_TRANSACTION.

M6) Append_block_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M7) Rows_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M8) Table_map_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_INSIDE_TRANSACTION.

M9) Gtid_log_event constructor:

    Added code to mark the event as BOUNDARY_TYPE_START_TRANSACTION.


binlog.cc
---------

M1) binlog_cache_data::write_event:

    Every Gtid_log_event is marked as BOUNDARY_TYPE_START_TRANSACTION
    by Gtid_log_event's constructor.

    If a Gtid_log_event has to be logged before the event to be
    written, the boundary type of the event to be written must be
    adjusted as follows:

    M1a) If ev->boundary_type is BOUNDARY_TYPE_START_TRANSACTION, we
         have to mark it as BOUNDARY_TYPE_INSIDE_TRANSACTION;

    M1b) If ev->boundary_type is BOUNDARY_TYPE_SELF_CONTAINED, we have
         to mark it as BOUNDARY_TYPE_END_TRANSACTION.

M2) MYSQL_BIN_LOG::write_event:

    Here an Intvar_log_event, Rand_log_event or User_var_log_event
    may be inserted, if needed, before the event that was requested
    to be written.

    As all Intvar_log_event, Rand_log_event and User_var_log_event are
    marked as BOUNDARY_TYPE_INSIDE_TRANSACTION by their constructors,
    if the event to be written is marked as
    BOUNDARY_TYPE_SELF_CONTAINED (it is a DDL, for example), we change
    the markers of the first inserted event to
    BOUNDARY_TYPE_START_TRANSACTION and the event to be written is
    marked as BOUNDARY_TYPE_END_TRANSACTION.
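
The two adjustments described in M1 and M2 can be sketched as plain
functions over boundary types. The function names
adjust_after_gtid() and adjust_for_inserted_events() are hypothetical,
and the sketch works on bare type values instead of real Log_event
objects.

```cpp
#include <vector>

enum Boundary_type {
  BOUNDARY_TYPE_NOT_DEFINED = 0,
  BOUNDARY_TYPE_IGNORE_BOUNDARY = 1,
  BOUNDARY_TYPE_SELF_CONTAINED = 2,
  BOUNDARY_TYPE_START_TRANSACTION = 3,
  BOUNDARY_TYPE_INSIDE_TRANSACTION = 4,
  BOUNDARY_TYPE_END_TRANSACTION = 5
};

// M1: type of an event written right after a Gtid_log_event, which
// already carries BOUNDARY_TYPE_START_TRANSACTION.
inline Boundary_type adjust_after_gtid(Boundary_type type) {
  switch (type) {
    case BOUNDARY_TYPE_START_TRANSACTION:  // M1a
      return BOUNDARY_TYPE_INSIDE_TRANSACTION;
    case BOUNDARY_TYPE_SELF_CONTAINED:  // M1b
      return BOUNDARY_TYPE_END_TRANSACTION;
    default:
      return type;
  }
}

// M2: 'inserted' holds the types of the Intvar/Rand/User_var events
// placed before the main event (all INSIDE from their constructors).
inline void adjust_for_inserted_events(std::vector<Boundary_type>& inserted,
                                       Boundary_type& main_event) {
  if (inserted.empty() || main_event != BOUNDARY_TYPE_SELF_CONTAINED) return;
  inserted.front() = BOUNDARY_TYPE_START_TRANSACTION;
  main_event = BOUNDARY_TYPE_END_TRANSACTION;
}
```

For example, a DDL preceded by a Gtid event ends up as the pair
(Start Transaction, End Transaction), and a self-contained statement
preceded by two User_var events ends up as (Start Transaction, Inside
Transaction, End Transaction).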


rpl_binlog_sender.cc
--------------------

M1) Binlog_sender::fake_rotate_event:

    This method will send a rotate log_event without instantiating a
    Rotate_log_event object.

    The flags of the fake rotate event to be sent are marked as
    BOUNDARY_TYPE_IGNORE_BOUNDARY.

M2) Binlog_sender::send_heartbeat_event:

    This method will send a heartbeat log_event without instantiating
    a Heartbeat_log_event object.

    The flags of the heartbeat event to be sent are marked as
    BOUNDARY_TYPE_SELF_CONTAINED.


rpl_mi.h
--------

M1) Added an include to rpl_trx_boundary_parser.h

M2) Master_info class definition:

    Added a Transaction_boundary_parser object named event_parser to
    the Master_info class definition.


rpl_slave.cc
------------

M1) Added an include to rpl_trx_boundary_parser.h

M2) handle_slave_io:

    Added event boundary type to a debug "info" message stating:
    "IO thread received event..."

    Added a call to mi->event_parser.reset() right before entering
    the loop that reads the events.

M3) queue_event:

    Added code to feed the Master_info event_parser with the buffer
    of the event being queued.

    If the mi->event_parser reports an invalid transition, throw a
    warning message stating: "Unable to change boundary parser from
    %s to %s." with the names of the previous boundary type in the
    state machine and of the last boundary type fed to the state
    machine.

    After throwing the warning message, as the event_parser should
    now be in the error state, we will reset() the event parser, moving
    it to the "Not Defined" type.

    For debug builds, we will add an assert to verify that the
    event_parser has changed to a type other than error.

    For non debug builds, the I/O thread will throw a warning message
    on each event queued until it reaches a new transaction boundary
    parser consistent state (a self contained event or a start
    transaction).
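
The non-debug behaviour described in M3 (warn, reset, keep warning
until a consistent state is reached) can be sketched as follows. The
function names is_valid_transition() and queue_events(), the
BOUNDARY_TYPE_NAMES array and the idea of collecting warnings in a
vector are all hypothetical; the real code would log through the
server's error reporting, and the "guessing" of Not Defined events is
not modelled.

```cpp
#include <cstdio>
#include <string>
#include <vector>

enum Boundary_type {
  BOUNDARY_TYPE_NOT_DEFINED = 0,
  BOUNDARY_TYPE_IGNORE_BOUNDARY = 1,
  BOUNDARY_TYPE_SELF_CONTAINED = 2,
  BOUNDARY_TYPE_START_TRANSACTION = 3,
  BOUNDARY_TYPE_INSIDE_TRANSACTION = 4,
  BOUNDARY_TYPE_END_TRANSACTION = 5
};

static const char* const BOUNDARY_TYPE_NAMES[] = {
    "Not Defined",       "Ignore Boundary",    "Self Contained",
    "Start Transaction", "Inside Transaction", "End Transaction"};

// Compact form of the transition table: Self Contained and Start are
// valid only outside a transaction, Inside and End only within one.
inline bool is_valid_transition(Boundary_type from, Boundary_type to) {
  const bool in_trx = from == BOUNDARY_TYPE_START_TRANSACTION ||
                      from == BOUNDARY_TYPE_INSIDE_TRANSACTION;
  switch (to) {
    case BOUNDARY_TYPE_SELF_CONTAINED:
    case BOUNDARY_TYPE_START_TRANSACTION:
      return !in_trx;
    case BOUNDARY_TYPE_INSIDE_TRANSACTION:
    case BOUNDARY_TYPE_END_TRANSACTION:
      return in_trx;
    default:
      return false;
  }
}

// Feed a stream of boundary types and return the warnings produced.
inline std::vector<std::string> queue_events(
    const std::vector<Boundary_type>& stream) {
  std::vector<std::string> warnings;
  Boundary_type current = BOUNDARY_TYPE_NOT_DEFINED;
  for (Boundary_type next : stream) {
    if (next == BOUNDARY_TYPE_IGNORE_BOUNDARY) continue;  // keep type
    if (is_valid_transition(current, next)) {
      current = next;
      continue;
    }
    char buf[128];
    std::snprintf(buf, sizeof(buf),
                  "Unable to change boundary parser from %s to %s.",
                  BOUNDARY_TYPE_NAMES[current], BOUNDARY_TYPE_NAMES[next]);
    warnings.push_back(buf);
    current = BOUNDARY_TYPE_NOT_DEFINED;  // equivalent of reset()
  }
  return warnings;
}
```

Feeding the sequence Inside, End, Start, Inside, End (a slave that
connected in the middle of a transaction) yields two warnings, one
for each of the first two events, and none once the Start Transaction
event is reached.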

mysqlbinlog.cc
--------------

M1) Added an include to rpl_trx_boundary_parser.h

M2) Added a Transaction_boundary_parser object named event_parser.

M3) dump_local_log_entries:

    M3a) Added a call to event_parser.reset() to initialize the
         verification of the event stream to be read.

    M3b) Added code to feed the event_parser with the buffer of the
         event read.

         If the event_parser returns that it was an invalid
         transition, do two things:

         M3b1) show a warning message stating: "Unable to change
               boundary parser from %s to %s." and the boundary type
               names of the previous boundary type and the boundary
               type of the event read;

         M3b2) Issue an event_parser.reset() to remove the
               transaction boundary parser from the error state.

M4) Added an include to rpl_trx_boundary_parser.cc at the end of the
    file.