WL#9175: Correct recovery of DDL statements/transactions by binary log.

Affects: Server-8.0   —   Status: Complete

With advent of new data-dictionary it becomes possible to package
metadata changes and writing binary log into single transaction.
We aim to make DDL statements crash-safe by using this feature.

However there is a gotcha here. If crash happens after the moment
when DDL statement is committed in InnoDB but before binary log is
flushed to disk we will get inconsistency between DD and binary log
after restart.

Notice that for normal DML transactions this problem does not occur
because we use 2-phase commit protocol for them and binlog serves
as transaction coordinator log in this case. I.e. if transaction is
absent from binary log it is discovered as prepared but not yet
committed and is rolled back during server restart.

The mechanism implementing recovery of DML transactions relies on
special type of binary log event Xid_log_event which is not used
for DDL statements/transactions.

The goal of this task is to implement support for correct recovery
for DDL transactions based on the binary log data. This means that
we need to use 2-phase commit for such transactions plus add
an equivalent of Xid_log_events for them.

The worklog limits the crash-recovery support to DDL:s that are 2pc capable. At
this point this subclass includes all ACL DDL:s including new CREATE|DROP role.
FR1. The server must be able to recover from crash to commit or rollback
     every 2pc-capable DDL command that was in progress.
     Commit is done to commands that had reached Engine-prepared status,
     and got successfully logged into binary log.
     Otherwise they are rolled back.

FR2. In case (a) GTID is ON, or (b) --relay-log-info-repository=table
     the slave server must be able to recover from a failure 
     including crash to honor FR1 with GTID of the recoverable DDL:s preserved.
     Not recoverable DDL:s (a) release their GTID:s or (b) preserve
     the last executed group position prior the failing query, as it's supposed
to upon rollback.
     Note (a) and (b) are not really disjoint options, any combination of
     the feature values is allowed.

FR3. In case GTID is ON and --skip-log-bin, update to
     the 'mysql.gtid_executed' table is ensured to a be part of the DDL
     "auto-commit" transaction.

FR4. Cross-replication OLD->NEW and NEW->OLD must be seemless.

FR5. Enhanced transactional DDL binary logger does not affect
     event group marking with (a) logical timestamps for LC MTS and (b)
     the accessed databases for DB MTS.
HLS/IS

Recoverable DDL needs 2PC with a transaction coordinator, such as binlog.
The binlog-based recovery is traditional method to tolerate crashes.
The gtid-based recovery serves to the slave server or a backup-restoring server.

The current WL elaborate both frameworks, which we discuss one by one.


*The binlog-based recovery*
---------------------------

Prior the WL a DDL query has been logged as Query_log_event and that
remains.
The query logging is turned to be transactional, very similarly to how takes place
with an autocommit DML query, with a difference that there won't any BEGIN
nor the terminal Xid_log_event (or Commit query-log-event).
The XID info is present in the DDL Query-log-event as a new member of the status
var set.

XID info is needed only on the "master" side where the recovery procedure
is augmented to recongnize such DDL kind of Query_log_event.

Old -> New replication is not effected at all.
New -> Old works thanks to a general property of the Query-log-event status var:s
       that are not sensed by the Old slave when the var:s are "new".


*The gtid-based recovery*
-------------------------

Before this WL the GTID data associated with DDL were never stored into
the gtid_executed table transactionally with the DDL operations.
This is to change now.

When DDL processes to the commit stage, insert into gtid_executed is
made as a part of the DDL transaction before the DDL prepares. The
insert effectively extends the DDL transaction to not actually require
2pc in this case.

Regular DML transaction extension framework was introduced by
'mysql.gtid_executed' table worklog.
The current WL makes sure DDL exploit it.


*The slave --relay-log-info-repository=table based recovery*
-------------------------------------------------------------

When DDL processes to the commit stage and the slave applier is configured
with TABLE type for @@global.relay-log-info-repository, the info table
gets updated as a part of the DDL transaction. When the applier fails
to perform the DDL the info table remains unchanged thanks to transaction
rollback, and the slave applier is to restart off the failing event.
This must be ensure for both the STS and MTS execution mode.
Find in this section low-level details of implementation of functional
requirements.

=== Xid info association with DDL ===

The Xid info is recorded into the DDL Query_log_event as a new status var.
DDL Query_log_event is turned from demanding the immediate logging
to use the transaction cache. In order to avoid caching a terminal
Xid_log_event or Commit MYSQL_BIN_LOG::commit() method is refined
to check a new THD::binlog_ddl_query member.
When its value is not the default that indicates a DDL is being handled
and the terminal event recording is skipped.

The new THD::binlog_ddl_query is initialized with the default value at THD::THD,
changes when the DDL Query gets to binlogging routine to instantiate Query_log_event
with the new status var holding xid info,
and eventually is reset standardly via THD::reset_for_next_command().


=== Binlog-based recovery for DDL ===

MYSQL_BIN_LOG::recover() employs a "custom" replication event group parser.
It gets augmented to recognize a new case when a Query-log-event contains
the xid info.
No other change to any group parser is done.

=== Presenting XID info in mysqlbinlog output ===

It's proposed simply print out a new member along with other attributes
that are displayed currently\footnote{%
a fancier method can be implemented as well to print out Xid info uniformly
like:

  SET GTID_NEXT=val; Query; SET GTID_NEXT=val; BEGIN; Commit /*Xid*/;


=== Augmentation of the slave DDL transaction ===

When --relay-log-info-repository=TABLE the DDL transaction is set to contain
an update to the slave info table. This is implemented via arrangement of a new
statement to execute at proper time. The augmentation is done only to the
2pc-capable DDL subclass. Other DDL types are continued to be handled by the
slave applier as before, that is the info table remains to be updated
after the DDL operation has been completed.