WL#7743: New data dictionary: changes to DDL-related parts of SE API

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

As part of the New Data-Dictionary project, in order to allow InnoDB to get rid
of its internal data-dictionary and support atomic/crash-safe DDL, we need to
extend SQL-layer code and parts of SE API which are related to opening tables
and DDL.

The following needs to be supported:

1) Auxiliary columns and keys (hidden system columns and keys
   which InnoDB adds to the tables implicitly).
2) Access to se_private_* values for DD objects during opening
   tables and updating them during DDL.
3) Atomic/crash-safe DDL. While implementing this item it also makes
   sense to make user-visible DDL behavior more atomic.

InnoDB also needs to be able to store info about auxiliary tables (needed
for FTS) in the DD. However it seems that existing support for this in DD
API is sufficient for now.

F-1)  DROP TABLES statement which fails due to missing table should not
      have any side-effects. DROP TABLES statement which fails due to
      other errors is allowed to have side-effects (but see FutureF-1).

NF-1) Crash-safety of multi-table DROP TABLES should be improved. I.e.
      discrepancy between SEs, the DD and binary log should be limited
      to one table at most.

F-2)  It should be possible to replicate successful DROP TABLES statement
      from the older servers even in GTID mode.

F-3)  DROP DATABASE should be mostly atomic regarding stored routines and
      events. I.e. it should either succeed and drop all of them, or fail
      and drop none of them (except cases when problem is related to
      removal of database directory). Failing DROP DATABASE is allowed to
      have side effects on tables and files (but see FutureF-2).

NF-2) Crash-safety of DROP DATABASE should be improved. I.e. discrepancy
      between SEs, the DD and binary log should be limited to one table at
      most.

F-4)  It should be possible to replicate DROP DATABASE statement from older
      servers even in GTID mode.

F-5)  It should be still possible to use DROP TABLES IF EXISTS to delete
      tables for which there are entries in the data-dictionary, but which
      for some reason are absent from SE. Note that this functionality is
      currently undocumented.

Requirements to be supported once InnoDB implements WL#7016:

FutureF-1) DROP TABLES which involves only tables in SEs supporting atomic
           DDL should be fully atomic from user point of view. I.e. either
           succeed and drop all the tables or fail and do not have any
           side-effects.

FutureF-2) DROP DATABASE on database which contains only tables in SEs
           supporting atomic DDL should be mostly atomic towards database
           objects from user point of view. I.e. either succeed and drop the
           whole database or fail and have no side-effects on database
           objects (except cases when problem is related to removal of
           database directory). Some effects on ignorable objects in database
           directory like .TMD files are allowed in the latter case.

FutureF-3) Other table-related DDL which concerns only tables in SEs
           supporting atomic DDL should be fully atomic from user point of
           view.

FutureN-1) Table-related DDL statements concerning tables in SEs supporting
           atomic DDL should be fully crash-safe. I.e. there should be no
           discrepancy between the SE, the DD and binary log in case of
           crashes.


Support for auxiliary columns and keys
======================================

Auxiliary columns are special columns which are automatically added by
SE to table for its internal purposes and are not visible to users (e.g.
DB_TRX_ID, DB_ROW_ID for InnoDB). Similarly auxiliary keys are hidden,
internal keys which are automatically added to table by SE (e.g. implicit
primary key for InnoDB).

Even though such columns/keys are not visible to users and SQL-layer it
might be still convenient for SE to store information about them in DD.

We support this by allowing SE to adjust the table definition represented
by dd::Table object and add such hidden columns/keys to it during
CREATE/ALTER TABLE operations.

New 

int handler::add_extra_columns_and_keys(const HA_CREATE_INFO *create_info,
                                        const List<Create_field> *create_fields,
                                        const KEY *key_info, uint key_count,
                                        dd::Table *table_def)

method is introduced which is called during dd::create_table() execution.
SE adds hidden columns/keys by adjusting dd::Table object passed as method
in/out parameter and then adjusted table definition is saved to the DD.
Note that SQL-layer won't do any validation for columns/keys added, it
is the responsibility of SE to ensure that they are valid.
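
As an illustration only, the following sketch shows the kind of adjustment
an engine might make inside add_extra_columns_and_keys(). MockColumn and
MockTable are hypothetical stand-ins for the DD classes (the real
dd::Table interface is much richer), and the function name is made up for
this example; only the column names come from the text above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for dd::Column and dd::Table.
struct MockColumn {
  std::string name;
  bool hidden;
};

struct MockTable {
  std::vector<MockColumn> columns;
};

// Appends the engine's system columns, marked as hidden, to the in/out
// table definition. Returns 0 on success and non-zero on error, following
// the int return convention used by the handler API.
int add_extra_columns_sketch(MockTable *table_def) {
  if (table_def == nullptr) return 1;
  for (const char *name : {"DB_TRX_ID", "DB_ROW_ID"}) {
    table_def->columns.push_back(MockColumn{name, true});
  }
  return 0;
}
```

User-defined columns keep their positions; the hidden columns are simply
appended, and validating them remains the job of the engine.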

QQ) Marko says he doesn't need any arguments except dd::Table, should we
    omit them?

QQ) At which point exactly to call this method? Currently it is called
    before handling partitions, because we want to create Partition_index
    objects for hidden indexes too. But this feels wrong as the dd::Table
    object passed to add_extra_columns_and_keys() is half-constructed.

When table is opened SE is passed dd::Table object as argument to
handler::open() method and can get information about these hidden
objects.


Providing access to se_private_* values and methods to update them during DDL
=============================================================================

In order to get rid of its internal DD InnoDB needs to be able to store some
engine-private information and identifiers associated with various objects
in the DD. The DD has provision for this in the form of se_private_data
and se_private_id attributes associated with these objects and methods
for reading/setting these attributes.

We extend SE API methods related to DDL to allow SEs to adjust these
attributes for tables (and their subobjects) and tablespaces.

We also add "const dd::Table *" argument to handler::open() call so SE
will be able to get values of these attributes when table is opened.
Similar arguments are to be added to DDL-related methods which operate
on tables without opening them.

Let us sum up changes described above:

int handlerton::alter_tablespace(handlerton *hton, THD *thd,
                                 st_alter_tablespace *ts_info,
+                                const dd::Tablespace *old_ts_def,
+                                dd::Tablespace *new_ts_def);

New old_ts_def in-argument contains pointer to dd::Tablespace object
describing old version of tablespace being altered (NULL in case
when tablespace is being created).
New new_ts_def in/out-argument contains pointer to dd::Tablespace object
describing new version of tablespace being altered (NULL in case
when tablespace is being dropped). SE is allowed to adjust this
object (e.g. set se_private_data/id attributes for the tablespace).
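
The NULL conventions for old_ts_def and new_ts_def can be summarised with
a small sketch. MockTablespace, TsOperation and classify_ts_operation()
are hypothetical names used only here; a real engine would normally
consult ts_info for the requested command instead of inferring it from
the pointers.

```cpp
// Hypothetical stand-in for dd::Tablespace.
struct MockTablespace {};

enum class TsOperation { CREATE, ALTER, DROP, INVALID };

// Derives the kind of operation from which definitions are present:
// no old definition means creation, no new definition means drop.
TsOperation classify_ts_operation(const MockTablespace *old_ts_def,
                                  const MockTablespace *new_ts_def) {
  if (old_ts_def == nullptr && new_ts_def == nullptr)
    return TsOperation::INVALID;
  if (old_ts_def == nullptr) return TsOperation::CREATE;
  if (new_ts_def == nullptr) return TsOperation::DROP;
  return TsOperation::ALTER;
}
```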

int handler::open(TABLE *table, const char *name, int mode,
                  int test_if_locked,
+                 const dd::Table *table_def);

New table_def in-argument contains pointer to dd::Table object
describing the table being opened.

Note that optimizer uses custom code for creation of its temporary tables,
as a result such tables do not have proper dd::Table objects associated
with them. Therefore handler::open() will get NULL as table_def argument
for them.
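
A minimal sketch of how an engine-side open() might cope with both cases
follows. MockTable is a hypothetical stand-in for dd::Table reduced to its
se_private_data pairs, and the "root_page" key and the "none" fallback are
invented for the example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for dd::Table.
struct MockTable {
  std::map<std::string, std::string> se_private_data;
};

// Reads an engine-private value when a table definition is available.
// Falls back to "none" when table_def is null (the optimizer temporary
// table case) or when the key is absent.
std::string root_page_on_open(const MockTable *table_def) {
  if (table_def == nullptr) return "none";
  auto it = table_def->se_private_data.find("root_page");
  if (it == table_def->se_private_data.end()) return "none";
  return it->second;
}
```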

int handler::truncate(
+                     dd::Table *table_def);

New table_def in/out-argument points to dd::Table object describing
the table being truncated. SE is allowed to adjust this object.

int handler::rename_table(const char *from, const char *to,
+                         const dd::Table *from_table_def,
+                         dd::Table *to_table_def);

New from_table_def in-argument contains pointer to dd::Table object
describing the table prior to rename. to_table_def is in/out-parameter
which points to dd::Table object describing the table after rename.
The latter object can be adjusted by SE.

int handler::delete_table(const char *name,
+                         const dd::Table *table_def);

New table_def in-argument contains pointer to dd::Table object
for table being dropped.

int handler::create(const char *name, TABLE *form, HA_CREATE_INFO *info,
+                   dd::Table *table_def);

New table_def in/out-argument contains pointer to dd::Table object
for table being created. SE is allowed to adjust this object.

bool handler::prepare_inplace_alter_table(TABLE *altered_table,
                                          Alter_inplace_info *ha_alter_info,
+                                         const dd::Table *old_table_def,
+                                         dd::Table *new_table_def);
                                                                                
bool handler::inplace_alter_table(TABLE *altered_table,
                                  Alter_inplace_info *ha_alter_info,
+                                 const dd::Table *old_table_def,
+                                 dd::Table *new_table_def);

bool handler::commit_inplace_alter_table(TABLE *altered_table,
                                         Alter_inplace_info *ha_alter_info,
                                         bool commit,
+                                        const dd::Table *old_table_def,
+                                        dd::Table *new_table_def);

For the above 3 methods the new old_table_def in-argument points to
dd::Table describing old version of table being altered.
The new new_table_def in/out-argument points to dd::Table for the new
version. The latter can be adjusted by SE.

int Partition_handler::truncate_partition_low(
+                                             dd::Table *table_def);

New table_def in/out-argument points to dd::Table object describing
the table whose partition is being truncated. SE is allowed to adjust
this object.

Also we introduce new method to Partition_handler to encapsulate
SE-specific details of partition exchange in SE:

+ int Partition_handler::exchange_partition_low(const char *part_table_path,
+                                               const char *swap_table_path,
+                                               uint part_id,
+                                               dd::Table *part_table_def,
+                                               dd::Table *swap_table_def)

This new method has part_table_def and swap_table_def in/out-parameters
which point to dd::Table objects describing the partitioned table and the
table being swapped with the partition, respectively. SEs are allowed to
adjust these objects.

After calling any of the above methods which allow adjustment of
table definition SQL-layer will save updated definition to the DD.

To avoid problems with the DD and SE information getting out of sync,
long-term we will allow such adjustments only to engines which carry out
the DD update and changes in SE as a single atomic transaction (i.e. to
engines supporting atomic DDL). SQL-layer will enforce this by simply not
storing adjusted objects in the DD for other engines. However, to make it
possible to work on WL#7141 independently of WL#7016 we might allow such
adjustments for all SEs short-term.
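
The enforcement rule can be pictured as follows. This is a sketch under
stated assumptions: MockTable, MockDictionary and
store_adjusted_definition() are hypothetical, and the allow_for_all_engines
switch merely models the short-term relaxation mentioned above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for dd::Table and for the DD storage.
struct MockTable {
  std::string name;
  std::string se_private_data;
};

struct MockDictionary {
  std::vector<MockTable> stored;
};

// Saves the engine-adjusted definition only when the engine supports
// atomic DDL, or when the short-term relaxation is in effect.
// Returns true if the definition was stored.
bool store_adjusted_definition(MockDictionary *dd, const MockTable &adjusted,
                               bool engine_supports_atomic_ddl,
                               bool allow_for_all_engines) {
  if (!engine_supports_atomic_ddl && !allow_for_all_engines) return false;
  dd->stored.push_back(adjusted);
  return true;
}
```

Adjustments made by an engine without atomic DDL support are thus silently
dropped in the long-term mode rather than reported as an error.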


Storing info about auxiliary tables and other similar objects
=============================================================

InnoDB needs to be able to store information about auxiliary tables,
special internal tables created to support FTS, in the DD.
It also might need to store information about other implicitly
created objects, which are not tightly coupled to the main table,
e.g. about tablespace created for the table when innodb-file-per-table
mode is on.

It is possible to do so using existing DD API by:

1) acquiring X MDL on the object using dd::acquire_exclusive_tablespace_mdl(),
2) creating appropriate DD object using dd::create_object<>,
3) filling the object according to SE needs and marking it as hidden
   if necessary,
4) calling dd::get_dd_client()->store() to save it in the DD

during execution of appropriate SE method (like handler::create()).
Then SQL-layer will commit this change along with the adjusted table
definition. Other operations like deletion or updates of auxiliary/
implicit objects can be handled in similar fashion.


Supporting atomic/crash-safe DDL
================================

On high-level we can say that to make DDL atomic/crash-safe we need to
pack its updates to the DD, changes in SE and writes to binary log into
single atomic transaction (i.e. it should either commit and have its
effect properly reflected in DD, SE and binary log or roll back and
have no effect at all).

To implement this we need to ensure that:

1) There are no intermediate commits on SQL-layer during DDL (to be
   addressed by this WL)
2) There are no intermediate commits in SE methods called by DDL.
   Also SEs should register themselves as part of ongoing transaction.
   (Both items to be addressed by WL#7016 in InnoDB.)
3) SE can do redo/rollback of DDL (to be addressed by WL#7016)
   This WL supports this capability by introducing new handlerton
   post_ddl() hook to be called after DDL is committed or rolled
   back to let SE do necessary post-commit/rollback work (see
   examples below).
4) Write to binary log happens as part of the DDL transaction
   (addressed by this WL. WL#9175 is necessary to ensure that
   binlog supports correct crash recovery for DDL statements).

While adding atomicity/crash-safeness to DDL from implementation
point of view, it also makes sense to:

5) Change behavior of some DDL statements (e.g. DROP TABLES) to make
   user-visible behavior more atomic (e.g. try to avoid side-effects
   from failed statements when possible).

We also need to keep in mind that not all SEs will support DDL atomicity.
Such SEs should be accounted for while implementing the above changes.

To differentiate SEs which support atomic DDL from those which don't,
new handlerton flag HTON_SUPPORTS_ATOMIC_DDL is introduced.
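
A minimal sketch of how such a flag could be tested is shown here.
MockHandlerton and the _SKETCH constant are hypothetical; in particular the
bit value is chosen for the example only and says nothing about the value
used by the server.

```cpp
#include <cstdint>

// Bit value chosen for this example only (assumption).
constexpr std::uint32_t HTON_SUPPORTS_ATOMIC_DDL_SKETCH = 1u << 0;

// Hypothetical stand-in for handlerton, reduced to its flags word.
struct MockHandlerton {
  std::uint32_t flags;
};

// True if the engine declares support for atomic DDL.
bool supports_atomic_ddl(const MockHandlerton &hton) {
  return (hton.flags & HTON_SUPPORTS_ATOMIC_DDL_SKETCH) != 0;
}
```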

Let us discuss changes for each of DDL statements in detail.

Note that below we don't discuss ALTER TABLE variants related to
partitioning which are currently implemented through the "fast alter
partitioning" code path, as this code is to be removed soon.

A) CREATE TABLE (including CREATE TABLE LIKE)
---------------------------------------------

Currently process of table creation looks like:

1) Create dd::Table object describing table to be created
2) Store dd::Table object in DD tables and commit this change.
3) "Open" table, construct TABLE_SHARE, TABLE and handler objects for
   the table.
4) Call handler::create(name, TABLE, HA_CREATE_INFO) method to create table
5) Statement is written to binary log

This is to be replaced with:

1)  Create dd::Table object describing table to be created
2)  Use dummy handler object to call new handler::add_extra_columns_and_keys()
    method to add additional hidden columns and keys which will be created
    by SE to the dd::Table object.
3)  Store dd::Table object in DD tables.
4)  Commit this change if engine is not capable of atomic DDL.
    The latter is necessary to ensure that in case of crash we
    won't get "orphan" tables in SE which do not have entries
    in DD.
5)  "open" table, construct TABLE_SHARE, TABLE and handler objects for it.
6)  Call handler::create(name, TABLE, HA_CREATE_INFO, dd::Table*) method
    for the table.
    Note that this method can update se_private_* fields in in-memory DD
    object. It also can create additional objects in DD like dd::Tablespace
    for file-per-table tablespaces or hidden dd::Table for auxiliary tables
    needed for FTS. These additional changes are not to be committed yet.
    Long-term such updates will be allowed only for engines which support
    atomic DDL.
7)  Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Long-term this step will be executed only for engines
    supporting atomic DDL. Short-term, for engines not capable of atomic
    DDL this change will be committed.
8)  Write statement to the binary log (to the cache if SE supports atomic DDL).
9)  Transaction is committed or rolled back.
10) Call new handlerton post_ddl hook to let engines which are capable
    of atomic DDL do necessary post-commit or post-rollback work (e.g.
    we might want to remove files in SE on rollback).

Note that for engines supporting atomic DDL the above steps 1) ... 9)
are going to be part of the single transaction, i.e. will be atomic
even if crash occurs.

This also means that such engines should not commit the transaction
internally during DDL until SQL-layer requests to do so.

For engines which are incapable of atomic DDL we still try to execute
statement in a manner which reduces risk of ugly side-effects in case
of crash - e.g. DD and SE getting out of sync, having "orphan" tables
in SE but not in DD,...
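
To make the difference between the two kinds of engines easier to see, the
steps above can be written down as a plan of actions. This is only a
reading aid: create_table_plan() and the step labels are hypothetical, and
the sketch models the short-term behavior in which step 7) is executed for
all engines.

```cpp
#include <string>
#include <vector>

// Returns the ordered list of actions for CREATE TABLE, depending on
// whether the engine supports atomic DDL.
std::vector<std::string> create_table_plan(bool atomic_ddl) {
  std::vector<std::string> plan;
  plan.push_back("build dd::Table");              // step 1
  plan.push_back("add_extra_columns_and_keys");   // step 2
  plan.push_back("store dd::Table");              // step 3
  if (!atomic_ddl) plan.push_back("commit");      // step 4
  plan.push_back("open table");                   // step 5
  plan.push_back("handler::create");              // step 6
  plan.push_back("store adjusted dd::Table");     // step 7
  if (!atomic_ddl) plan.push_back("commit");      // step 7, short-term
  plan.push_back(atomic_ddl ? "write to binlog cache"
                            : "write to binlog"); // step 8
  plan.push_back("commit or rollback");           // step 9
  if (atomic_ddl) plan.push_back("post_ddl");     // step 10
  return plan;
}
```

For an engine supporting atomic DDL the plan contains no intermediate
commit, which is what allows the whole statement to be rolled back.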

Long-term the plan is to get rid of "name", TABLE and HA_CREATE_INFO
parameters in handler::create() call and be able to create table only
from its DD representation.


B) DROP TABLES
--------------

Current approach to dropping tables looks like (simplified):

1) For each table in the table list:
   1.1) Try to drop table in SE by calling handler::ha_delete_table(),
        if error either proceed to next table or goto 2) depending
        on error type.
   1.2) Remove table from the DD and commit this change.
2) Write up-to 3 artificial DROP TABLES statements for tables which were
   successfully dropped to binary log -  we write separate statements for
   all transactional temporary tables, all non-transactional temporary
   tables and all base tables we have managed to drop.

While this scheme is not crash-safe, it at least ensures that
we get correct binary log in case when DROP TABLES statement cannot
be completed fully due to inability to drop some table (e.g. due to
foreign keys or some other error).

It works OK in cases when we are executing DROP TEMPORARY TABLES
statement in the middle of transaction.

It also works correctly in GTID mode. In this mode we never split
DROP TABLES statement into several statements in binary log, because
we prohibit DROP TABLES statements which mix temporary and
non-temporary tables, or temporary transactional and temporary
non-transactional tables.

Of course, the above means that statement user-visible behavior is
not atomic, i.e. that it can be partially executed, fail and still
have some side-effects. This is counter-intuitive for many users
and doesn't play well with replication.

With advent of atomic DDL it becomes possible to improve DROP TABLES
implementation. Some important points to consider while working on this
are:

a) DROP TABLES should be atomic both from crash-safety and user-visible
   behavior points of view when all tables which are dropped are in SEs
   which support atomic DDL.
b) When we have a mix of engines we still should try to be as crash-safe
   as possible. Atomicity from user-visible perspective is also nice.
   It is probably a bad idea to have side effect from failed DROP TABLES
   on tables in SEs which support atomic DDL.
c) It should be possible to replicate DROP TABLES statements even from
   older servers, possibly sacrificing some corner cases and/or
   crash-safety for them.
d) GTID mode should work, even when we replicate from older servers.
   Again some compromises are possible.

After discussion with Replication Team the following algorithm for
improved DROP TABLES was suggested (somewhat simplified):

1)  For each table in the table list check to which one of 5 classes
    it belongs:
    a) non-existent table
    b) base table in SE which doesn't support atomic DDL
    c) base table in SE which supports atomic DDL
    d) temporary non-transactional
    e) temporary transactional
   
    In the process check if temporary tables to be dropped are used by
    some outer statement. Report and abort execution if there are any.

2)  If this DROP TABLES doesn't have IF EXISTS clause and there are
    non-existent tables report appropriate error. This could have been
    done on the previous step if we did not need to include the list of
    all missing tables in the error message.

2') Once WL#6929 is implemented we can check if we are trying to drop
    parent tables in some FK without dropping child in the same statement
    and report an error here.

Note that this way DROP TABLES will be able to handle most common error
cases without having any side-effect.
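
Steps 1) and 2) can be sketched as below. DropClass, DropTarget and
missing_tables() are hypothetical names for this example; how a table is
assigned to a class is not shown, only the check that runs before
anything is dropped.

```cpp
#include <string>
#include <vector>

// The five classes from step 1).
enum class DropClass {
  NON_EXISTENT,     // a)
  BASE_NON_ATOMIC,  // b)
  BASE_ATOMIC,      // c)
  TEMP_NON_TRANS,   // d)
  TEMP_TRANS        // e)
};

struct DropTarget {
  std::string name;
  DropClass cls;
};

// Step 2): without IF EXISTS, collects the names of all missing tables so
// that a single error can list every one of them. An empty result means
// the statement may proceed. Nothing has been dropped at this point.
std::vector<std::string> missing_tables(
    const std::vector<DropTarget> &targets, bool if_exists) {
  std::vector<std::string> missing;
  if (if_exists) return missing;
  for (const DropTarget &t : targets) {
    if (t.cls == DropClass::NON_EXISTENT) missing.push_back(t.name);
  }
  return missing;
}
```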

3)  For each table from class b) (base table in SE which doesn't support
    atomic DDL).

    3.1) Call the handler::delete_table(const dd::Table) to delete
         the table in SE.
    3.2) Remove table description from DD and commit the change
    3.3) If {we are not in GTID mode} OR
            {we are in GTID mode AND there is only one table in class b) AND
             classes a) and c) are empty}
         write DROP TABLE statement for the table to binary log.
         Else we need to construct a single DROP TABLES statement for the
         GTID and write it to the binary log later.

    In case of error during any of the above steps, report it and abort
    statement execution.

4)  For each table from class c) (in SE supporting atomic DDL) and class a)
    (non-existent).
    4.1) Call the handler::delete_table(const dd::Table) method to mark
         table as dropped in SE.
    4.2) Update DD to remove the table from it. Do not commit this change.

5)  If {we are not in GTID mode} OR
       {we are in GTID mode AND class b) is empty}
       write DROP TABLES statement including all tables from classes a)
       and c) to binary log
    Else we will need to construct a single DROP TABLES statement for the
    GTID and write it to the binary log later.

6)  Commit or rollback
7)  Call new handlerton post_ddl() method in order to wait until SEs
    supporting atomic DDL complete real removal of their tables.
    Concurrent DDL operations on these tables should be blocked at this
    stage.

If any error occurs on steps 4) .. 6) report it and abort statement
execution immediately.

Note that we handle non-existent tables in the same way as tables in SEs
supporting atomic DDL in order to have single nice DROP TABLES statement
for the "main" InnoDB-only case.

Note that we handle non-atomic tables first and then tables in SEs
supporting atomic DDL in order to avoid situation when DROP TABLE
fails while dropping non-atomic table and also drops some atomic
tables as side-effect.

Also note that with exception of problems with writing to binary log
the DROP TABLES statement can't really fail after this point (see
comments explaining why below).

8)  If we are in GTID mode and had to postpone writing to binary log
    on steps 3.3) and 5) because of this, write DROP TABLES statement
    containing all tables we have managed to drop to the binary log.
  
The above is necessary to handle replication in GTID mode from older
servers or in cases when master and slave have different SEs for the
same tables. Obviously we sacrifice crash-safety to compatibility here.
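
The binary logging conditions from steps 3.3), 5) and 8) can be expressed
as three predicates. This is one reading of the rules above, not the
server code; DropCounts and the function names are hypothetical.

```cpp
#include <cstddef>

// Number of tables per class in the statement (temporary tables omitted).
struct DropCounts {
  std::size_t non_existent;     // class a)
  std::size_t base_non_atomic;  // class b)
  std::size_t base_atomic;      // class c)
};

// Step 3.3): may each class b) table be logged right after it is dropped?
bool log_non_atomic_drop_immediately(bool gtid_mode, const DropCounts &c) {
  return !gtid_mode || (c.base_non_atomic == 1 && c.non_existent == 0 &&
                        c.base_atomic == 0);
}

// Step 5): may the statement for classes a) and c) be logged as part of
// the transaction?
bool log_atomic_drops_immediately(bool gtid_mode, const DropCounts &c) {
  return !gtid_mode || c.base_non_atomic == 0;
}

// Step 8): is a single deferred DROP TABLES statement needed because
// logging was postponed on step 3.3) or 5)?
bool needs_single_deferred_statement(bool gtid_mode, const DropCounts &c) {
  const bool has_b = c.base_non_atomic > 0;
  const bool has_a_or_c = (c.non_existent + c.base_atomic) > 0;
  return (has_b && !log_non_atomic_drop_immediately(gtid_mode, c)) ||
         (has_a_or_c && !log_atomic_drops_immediately(gtid_mode, c));
}
```

In the InnoDB-only case class b) is empty, so nothing is deferred and the
statement is logged together with the DD changes.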

9)  For each table from class d) (non-transactional temporary) call
    close_temporary_table() function to drop the table (this function
    will call handler::delete_table() in SE).

Note that close_temporary_table() can't fail if the check which was done
on step 1) was successful.

10) Construct DROP TEMPORARY TABLES statement for tables from class d)
    and write it to binary log.

Note that we don't have problem with GTIDs here since DROP TABLES statement
doesn't allow mixing tables from class d) with any others in GTID mode.

11) For each table from class e) (transactional temporary) call
    close_temporary_table() function to drop the table (this function
    will call handler::delete_table() in SE).

Again close_temporary_table() can't fail here if the check which was done
on step 1) was successful.

12) Construct DROP TEMPORARY TABLES statement for tables from class e)
    and write it to binary log (actually to its transaction cache).

Same comment about absence of problem with GTIDs as above applies here.

C) CREATE TABLE ... SELECT
--------------------------

New approach to implementing this statement:

1)  Create dd::Table object describing table to be created
2)  Use dummy handler object to call new handler::add_extra_columns_and_keys()
    method to add additional hidden columns and keys which will be created
    by SE to the dd::Table object.
3)  Store dd::Table object in DD tables.
4)  Commit this change if engine is not capable of atomic DDL.
    The latter is necessary to ensure that in case of crash we
    won't get "orphan" tables in SE which do not have entries
    in DD.
5)  "open" table, construct TABLE_SHARE, TABLE and handler objects for it.
6)  Call handler::create(name, TABLE, HA_CREATE_INFO, dd::Table*) method
    for the table.
    Note that this method can update se_private_* fields in in-memory DD
    object. It also can create additional objects in DD like dd::Tablespace
    for file-per-table tablespaces or hidden dd::Table for auxiliary tables
    needed for FTS. These additional changes are not to be committed yet.
    Long-term such updates will be allowed only for engines which support
    atomic DDL.
7)  Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Long-term this step will be executed only for engines
    supporting atomic DDL. Short-term, for engines not capable of atomic
    DDL this change will be committed.

The above steps are the same as for simple CREATE TABLE.

8)  If we are in RBR mode write CREATE TABLE statement describing table
    structure into binary log (note that in reality at this point statement
    should end up in transactional cache and not in on-disk binary log).
9)  Insert data into table (by reading from source tables and doing
    handler::write_row() on newly created table).
    This should be part of the same transaction as above calls to
    handler::create() and upcoming writes to the binary log.
    In RBR mode this also writes events to binary log transactional
    cache.
10) In SBR mode write statement in binary log (for engines supporting
    atomic DDL to transactional cache).
11) Transaction is committed or rolled back. (Once support for atomic DDL
    in InnoDB is implemented handler::create() call, changes to on-disk DD,
    writing to binary log are going to be part of the same transaction,
    i.e. will be atomic even if crash occurs).
12) Handlerton post_ddl() hook is called to let SE do the necessary
    steps which should happen after transaction commit (e.g. in case
    of rollback we might want to wait for deletion of files belonging
    to table we failed to create).

Note that to handle an error (e.g. during row insertion phase) for
engines supporting atomic DDL it is enough to rollback the transaction.
For engines without such support table needs to be dropped explicitly
by calling handler::delete_table(), removing it from the DD and committing
this change.
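
The two error-handling paths can be contrasted in a few lines. The
function name and the action labels are hypothetical; the sketch only
restates the rule from the paragraph above.

```cpp
#include <string>
#include <vector>

// Returns the cleanup actions needed when CREATE TABLE ... SELECT fails
// after the table has been created (e.g. during row insertion).
std::vector<std::string> ctas_error_cleanup(bool atomic_ddl) {
  if (atomic_ddl) {
    // Creation, inserted rows and binlog cache share one transaction.
    return {"rollback transaction"};
  }
  // Earlier steps were already committed, so undo them explicitly.
  return {"handler::delete_table", "remove table from DD", "commit"};
}
```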


D) ALTER TABLE ALGORITHM=COPY
-----------------------------

New approach to implementing this statement:

1)  Create dd::Table object describing new version of the table
2)  Use dummy handler object to call new handler::add_extra_columns_and_keys()
    method to add additional hidden columns and keys which will be created
    by SE for new table version to the dd::Table object.
3)  Store dd::Table object in DD tables.
4)  Commit this change if engine of new version is not capable of atomic
    DDL. The latter is necessary to ensure that in case of crash we
    won't get "orphan" tables in SE which do not have entries in DD.
5)  "open" table, construct TABLE_SHARE, TABLE and handler objects for it.
6)  Call handler::create(name, TABLE, HA_CREATE_INFO, dd::Table*) method
    for the new version of the table.
    Note that this method can update se_private_* fields in in-memory DD
    object. It also can create additional objects in DD like dd::Tablespace
    for file-per-table tablespaces or hidden dd::Table for auxiliary tables
    needed for FTS. These additional changes are not to be committed yet.
    Long-term such updates will be allowed only for engines which support
    atomic DDL.
7)  Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Long-term this step will be executed only if engine
    of new version supports atomic DDL. Short-term, for engines not capable
    of atomic DDL this change will be committed.

Again the above is pretty similar to the first part of CREATE TABLE
implementation.

8)  Copy the contents from old version of table to new version of table.
    Note that unlike in current code this should not commit the transaction
    if engine of new version supports atomic DDL.

Note that if engine of new table version supports atomic DDL the
error on any of the above steps can be handled by simply rolling back
transaction. For other engines explicit deletion will be required.

9)  Replace old table version with a new table version. To do this
    we need:

    9.1) Commit the transaction if engine of the old version of the
         table is not capable of atomic DDL.
    9.2) Inform engines about old version of table being replaced with
         new version. This is done through a series of
         handler::rename_table() calls. Update data in DD tables in the
         process accordingly.
    9.3) If engine of old version or new version doesn't support atomic
         DDL commit changes after each rename operation during step 9.2).

10) Call handler::delete_table() for old version of the table. Remove it
    from DD.
11) Again if either of engines doesn't support atomic DDL it makes sense
    to commit the above change to minimize DD <-> SE discrepancy in case
    of crash.
12) Write to binary log

Again if engines of both old and new table versions support atomic DDL
it is possible to handle errors during the above steps by simple rollback.
If at least one of them does not, then we need to take explicit actions,
like reverting renames and deleting the new version. Moreover after
point 10) totally correct error handling becomes impossible.

13) Commit or rollback (with advent of atomic DDL all the above
    should be part of single atomic transaction)
14) Call handlerton post_ddl() hook to wait until SE completes real
    removal of old version of the table (or new version if rollback
    has happened). Concurrent DDL on the table should be blocked at
    this stage.


E) ALTER TABLE ALGORITHM=INPLACE
--------------------------------

1)  Create dd::Table object describing new version of the table
2)  Use dummy handler object to call new handler::add_extra_columns_and_keys()
    method to add additional hidden columns and keys which will be created
    by SE for new table version to the dd::Table object.
3)  Store dd::Table object in DD tables.
4)  Commit this change if engine of new version is not capable of atomic
    DDL. The latter is necessary to ensure that in case of crash we
    won't get "orphan" tables in SE which do not have entries in DD.
5)  "open" table, construct TABLE_SHARE, TABLE and handler objects for it.
6)  Construct Alter_inplace_info object by comparing old and new versions
    of table.
7)  Call handler::check_if_inplace_alter_supported() to figure out if
    in-place algorithm is applicable.
8)  Call handler::ha_prepare_inplace_alter_table(Alter_inplace_info).
    This method can adjust se_private_* fields in dd::Table object
    describing new version of the table and do other modifications to DD
    if necessary. Long-term this will be allowed only if SE supports
    atomic DDL.
9)  Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Long-term this step will be executed only if engine
    supports atomic DDL and should not commit transaction. Short-term, for
    engines not capable of atomic DDL this change will be committed.

10) Call handler::inplace_alter_table() method for the table.
11) Call handler::commit_inplace_alter_table() method for the table.
    Similarly to step 8) this method can adjust dd::Table object and DD
    in general (long-term only for engines supporting atomic DDL).
12) Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Again long-term this step will be executed only if engine
    supports atomic DDL and should not commit transaction.
    Short-term, for engines not capable of atomic DDL this change will be
    committed immediately.
13) Replace old table version in DD with a new version.
14) If table engine doesn't support atomic DDL, the above change
    needs to be committed to reduce chances of DD and SE getting out of sync.
15) Inform storage engine about possibly required table rename by calling
    handler::rename_table().
16) Update DD accordingly. Again if SE is not atomic-DDL-capable this
    change should be committed.
17) Write statement to binary log
18) Commit or rollback transaction. Note that once atomic DDL is supported
    for InnoDB all of the above steps will be part of one atomic transaction.
19) Call handlerton post_ddl() method to wait until SE completes real removal
    of indexes which were dropped and other similar operations which should
    happen post commit. Concurrent DDL on the table should be blocked at
    this stage.
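
The intermediate commits in steps 4), 9), 12), 14) and 16) all follow one
pattern: store the change in the DD, then commit right away only if the
engine cannot do atomic DDL. The sketch below models the short-term variant
of that pattern and nothing else. Every name in it (Engine, Ddl_trace,
store_in_dd, alter_inplace_sketch) is hypothetical and is not part of the
server or SE API; error handling is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not server APIs.
struct Engine {
  bool supports_atomic_ddl;
};

struct Ddl_trace {
  std::vector<std::string> events;  // ordered log of what the sketch did
  int intermediate_commits = 0;     // commits issued before the final one
};

// Store a DD change. Commit immediately only for engines without atomic
// DDL, so that DD and SE are less likely to get out of sync on a crash.
void store_in_dd(const Engine &se, Ddl_trace &t, const std::string &what) {
  t.events.push_back("store: " + what);
  if (!se.supports_atomic_ddl) {
    t.events.push_back("commit");
    ++t.intermediate_commits;
  }
}

// Condensed walk through the outline above (success path only).
Ddl_trace alter_inplace_sketch(const Engine &se, bool needs_rename) {
  Ddl_trace t;
  store_in_dd(se, t, "new table version");    // steps 3), 4)
  t.events.push_back("se: prepare");          // step 8)
  store_in_dd(se, t, "adjusted by prepare");  // step 9)
  t.events.push_back("se: inplace alter");    // step 10)
  t.events.push_back("se: commit inplace");   // step 11)
  store_in_dd(se, t, "adjusted by commit");   // step 12)
  store_in_dd(se, t, "replace old version");  // steps 13), 14)
  if (needs_rename) {
    t.events.push_back("se: rename");         // step 15)
    store_in_dd(se, t, "renamed table");      // step 16)
  }
  t.events.push_back("binlog");               // step 17)
  t.events.push_back("final commit");         // step 18)
  t.events.push_back("se: post_ddl");         // step 19)
  return t;
}
```

For an engine with atomic DDL the trace contains no commit before the final
one, which is what makes the whole statement a single transaction.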

F. TRUNCATE TABLE
-----------------

There are two paths in TRUNCATE TABLE implementation, one for
HTON_CAN_RECREATE engines and another for other engines.
Here we will cover the latter as it is the only one which is relevant
for engines which will support atomic DDL/InnoDB:

1)  Call handler::truncate() for the table. SE is allowed to adjust
    "se_private_*" attributes for the table and do other DD modifications
    during this call. Long-term this will be allowed only for SEs which
    support atomic DDL.
2)  Store dd::Table object (which was possibly adjusted on previous step)
    into DD tables. Long-term this step will be executed only if engine
    supports atomic DDL and should not commit transaction. Short-term, for
    engines not capable of atomic DDL this change will be committed.
3)  Write statement to the binary log
4)  Commit transaction or rollback it.
5)  Call handlerton post_ddl() method in order to wait until SE
    really finishes truncation (e.g. remove old tablespace in case of
    commit, remove new tablespace in case of rollback). Concurrent
    DDL operations on the table should be blocked at this stage.

As in previous cases once support for atomic DDL is implemented in InnoDB
steps 1) .. 4) will become part of single atomic and crash-safe
transaction from SQL-layer point of view.

Note that new implementation of TRUNCATE PARTITION will be pretty similar
to the one described above.
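
To illustrate why step 5) is needed after both commit and rollback, here is
a small model of an SE that truncates by creating a fresh tablespace and
keeping the old one until the outcome is known. This is only a sketch under
that assumption; Truncate_state and the three functions are hypothetical
names, and the file names are made up.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model, not server code.
struct Truncate_state {
  std::set<std::string> tablespaces{"t1.old"};  // files present in the SE
  std::string dd_points_to = "t1.old";          // tablespace recorded in DD
  bool committed = false;
};

// Steps 1) and 2): the SE creates the new tablespace and the adjusted
// se_private_* data is stored in the DD, still uncommitted.
void truncate_and_store(Truncate_state &s) {
  s.tablespaces.insert("t1.new");
  s.dd_points_to = "t1.new";
}

// Step 4): commit keeps the DD change, rollback restores the old DD state.
void finish_transaction(Truncate_state &s, bool commit) {
  s.committed = commit;
  if (!commit) s.dd_points_to = "t1.old";
}

// Step 5): post_ddl() removes whichever tablespace is no longer referenced.
void post_ddl(Truncate_state &s) {
  s.tablespaces.erase(std::string(s.committed ? "t1.old" : "t1.new"));
}
```

In both outcomes exactly one tablespace survives and it is the one the DD
points to.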

E. RENAME TABLES
----------------

1) For each element in rename list

   1.1) Call handler::rename_table(). Again SE is allowed to adjust
        dd::Table object describing new version of table and do other
        DD modifications during this call. And again long-term this
        will be allowed for engines supporting atomic DDL only.
   1.2) Store dd::Table object describing new version of table in the DD
        (including updates to it on the previous stage). If engine
        doesn't support atomic DDL or we have met such engine on
        previous iterations of the loop commit the transaction.
2) Write to binary log
3) Commit or rollback transaction (this is only relevant if all
   engines participating in RENAME support atomic DDL).
4) Use handlerton post_ddl() method to complete renaming in the storage
   engine (might be no-op).

Note that if all engines involved in RENAME TABLE support atomic DDL
steps 1) - 3) become part of single atomic and crash-safe transaction
from SQL-layer point of view. Also error handling in such case boils
down to simple transaction rollback.

If at least one engine involved doesn't support atomic DDL RENAME TABLE
becomes non-atomic. Handling of error requires renaming of tables
in reverse order by calling handler::rename_table() and updating
DD accordingly.
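
The two error-handling modes can be sketched as follows. The model keeps a
map from table name to "engine supports atomic DDL" and uses a copy of that
map as a stand-in for the open transaction. Catalog, Rename_list, rename_one
and rename_tables_sketch are hypothetical names for illustration; binary
logging and post_ddl() are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: table name -> engine supports atomic DDL.
using Catalog = std::map<std::string, bool>;
using Rename_list = std::vector<std::pair<std::string, std::string>>;

// Rename one table; fails if the source is missing or the target exists.
bool rename_one(Catalog &c, const std::string &from, const std::string &to) {
  auto it = c.find(from);
  if (it == c.end() || c.count(to)) return false;
  const bool atomic = it->second;
  c.erase(it);
  c[to] = atomic;
  return true;
}

// Returns true on success. On failure the catalog is restored either by
// "rollback" (all engines atomic) or by renaming back in reverse order.
bool rename_tables_sketch(Catalog &c, const Rename_list &list) {
  const Catalog snapshot = c;    // stands in for the open transaction
  bool seen_non_atomic = false;  // sticky once a non-atomic SE is met
  std::size_t done = 0;
  for (; done < list.size(); ++done) {
    auto it = c.find(list[done].first);
    if (it != c.end() && !it->second) seen_non_atomic = true;
    if (!rename_one(c, list[done].first, list[done].second)) break;
    // Step 1.2): with seen_non_atomic set, a commit would happen here.
  }
  if (done == list.size()) return true;  // steps 2), 3): binlog and commit
  if (!seen_non_atomic) {
    c = snapshot;                        // simple transaction rollback
  } else {
    while (done-- > 0)                   // undo completed renames, last first
      rename_one(c, list[done].second, list[done].first);
  }
  return false;
}
```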

F. CREATE/ALTER/DROP TABLESPACE
-------------------------------

1) Prepare DD objects for operation:
   1.1) If we are processing CREATE TABLESPACE construct dd::Tablespace
        object for tablespace being created. Save the object in the DD.
        Do not commit this change if SE supports atomic DDL. Commit the
        change otherwise.
   1.2) If we are processing ALTER TABLESPACE prepare dd::Tablespace
        objects describing new and old versions of tablespace.
   1.3) In case of DROP TABLESPACE prepare dd::Tablespace object
        describing tablespace to be dropped.
2) Call handlerton::alter_tablespace() method. SE is allowed to adjust
   attributes of tablespace being created/altered during it. Long-term
   this will be allowed only for SEs which support atomic DDL.
3) Store updated version of dd::Tablespace object (this includes
   adjustments during step 2)). Delete the tablespace from the DD
   if it is DROP TABLESPACE. Commit the changes right away if SE
   doesn't support atomic DDL.
4) Write statement to the binary log.
5) Commit or rollback transaction.
6) Use handlerton post_ddl() method to complete operation in SE
   (e.g. to remove files of tablespace being dropped).
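
As a compact restatement of the steps above, the following sketch returns
the ordered list of actions for each kind of statement. It is only a reading
aid under the short-term rules; Ts_op and tablespace_ddl_plan are
hypothetical names and the action strings are informal labels, not API
calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names for illustration only.
enum class Ts_op { CREATE, ALTER, DROP };

// Ordered actions for one tablespace DDL statement, per the outline above.
std::vector<std::string> tablespace_ddl_plan(Ts_op op, bool atomic_se) {
  std::vector<std::string> plan;
  auto commit_if_non_atomic = [&] {
    if (!atomic_se) plan.push_back("commit");
  };
  if (op == Ts_op::CREATE) {  // step 1.1)
    plan.push_back("dd: store new tablespace");
    commit_if_non_atomic();
  }
  plan.push_back("se: alter_tablespace");  // step 2)
  plan.push_back(op == Ts_op::DROP ? "dd: delete tablespace"
                                   : "dd: store updated tablespace");  // 3)
  commit_if_non_atomic();
  plan.push_back("binlog");              // step 4)
  plan.push_back("commit or rollback");  // step 5)
  plan.push_back("se: post_ddl");        // step 6)
  return plan;
}
```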


G. DROP DATABASE
----------------

Similarly to DROP TABLES it makes sense to change user-visible behavior
of DROP DATABASE to more atomic one. And indeed replication compatibility
considerations are important for DROP DATABASE as well.

Here is the description of new DROP DATABASE implementation:

1)  Check if database directory contains any extra files which are not
    safe to remove directly and which will not be removed by dropping
    tables, fail if it does.
    Check if server has enough privileges to remove database directory,
    fail if it does not.
1') Once WL#6929 is implemented we can check here if we would be dropping
    parent tables of some FK without dropping child tables and report an
    error if so.
2)  Remove files which do not belong to tables and which are known to
    be safe to delete.
3)  Drop all tables in SEs which don't support atomic DDL one-by-one:
    3.1) Call handler::delete_table() to remove table in SE
    3.2) Remove table from the DD and commit the change immediately.
    3.3) Unless we are in GTID mode write DROP TABLES IF EXISTS statement
         for the table dropped to binary log.

   Note that the goal of item 3.3) is to improve crash-safety. One possible
   alternative which sacrifices it but makes binary log more compact is to
   delay write to the binary log until we can write successful DROP DATABASE
   to it, or when we know that there was some error during it and can write
   artificial DROP TABLES IF EXISTS statement for all tables which we have
   managed to drop.

4) In a single atomic transaction:
  
   4.1) Drop all tables belonging to SE supporting atomic DDL by calling
        for each table handler::delete_table() and then removing it from
        the DD.
   4.2) Remove all stored functions and procedures in the database.
   4.3) Remove all events in the database.
   4.4) Write DROP DATABASE statement to the binary log
   4.5) Commit or rollback the transaction

   Any error in the process is handled by rolling back the transaction.
   If this happens and, because of GTID mode, we have delayed writing to
   the binary log the deletion of some table in SE not capable of atomic
   DDL, report a special error (this is what happens now in similar
   situation).
  
5) Call post_ddl() handlerton method to let SEs finalize deletion of the
   tables.

6) Delete database directory from the filesystem.

Of course, the above means that there is a hole in atomicity if a crash
occurs after 4.5) and before 6). This problem requires introduction of
a redo log for database directory removal and will be solved outside of
this WL.
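
The visible outcome of steps 3) and 4) can be sketched as below: what stays
in the database, what reaches the binary log, and when the special GTID
error applies. This is a simplified model, not the implementation. Table,
Drop_db_result and drop_database_sketch are hypothetical names, and routines,
events and the directory removal are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model, not server code.
struct Table {
  std::string name;
  bool atomic_se;  // engine supports atomic DDL
};

struct Drop_db_result {
  std::vector<std::string> remaining;  // tables still present afterwards
  std::vector<std::string> binlog;     // statements written to binary log
  bool ok;                             // statement succeeded
  bool gtid_error;                     // special error from step 4)
};

Drop_db_result drop_database_sketch(const std::vector<Table> &tables,
                                    bool gtid_mode, bool fail_in_txn) {
  Drop_db_result r{{}, {}, false, false};
  bool dropped_non_atomic = false;
  // Step 3): tables in non-atomic SEs are dropped and committed one by one.
  for (const Table &t : tables) {
    if (t.atomic_se) continue;
    dropped_non_atomic = true;
    if (!gtid_mode)  // step 3.3)
      r.binlog.push_back("DROP TABLES IF EXISTS " + t.name);
  }
  // Step 4): everything else happens in one transaction.
  if (fail_in_txn) {
    // Rollback: tables in atomic SEs are still there, the others are gone.
    for (const Table &t : tables)
      if (t.atomic_se) r.remaining.push_back(t.name);
    r.gtid_error = gtid_mode && dropped_non_atomic;
    return r;
  }
  r.binlog.push_back("DROP DATABASE");  // step 4.4)
  r.ok = true;
  return r;
}
```

Note how a failure in GTID mode leaves a dropped table with no binary log
record, which is the situation the special error reports.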

H. ALTER TABLE EXCHANGE PARTITION
---------------------------------

There is an additional problem with current implementation of this statement.
It breaks encapsulation of partitioning support in SEs since it swaps table
and partitions by simple rename of tables in SE, thus disclosing the
fact that partitions are just another kind of tables.

We solve this problem by introducing new 

Partition_handler::exchange_partition[_low](const char *part_table_path,
                                            const char *swap_table_path,
                                            uint part_id,
                                            dd::Table *part_table_def,
                                            dd::Table *swap_table_def)

method. SEs which support native partitioning need to implement this
method. Non-native partitioning will no longer be supported thanks to
WL#8971.

After that new implementation of this statement starts looking like:

1) Check if table and partition have compatible metadata and can be
   exchanged.
2) Call Partition_handler::exchange_partition() method to exchange
   table and partition. SE can adjust dd::Table objects for both
   non-partitioned and partitioned table as well as do other DD
   modifications during this step. Long-term this will be allowed
   only for SEs which support atomic DDL.
3) Save adjusted table definitions to the DD. Long-term this will be
   done only for SEs which support atomic DDL. Short-term for other
   SEs we will commit these changes immediately.
4) Write statement to the binary log
5) Commit or rollback the transaction
6) Call handlerton post_ddl() method to let SE finalize
   exchange (might be no-op).

Similarly to other statements if SE supports atomic DDL any error can be
handled by simple rollback. For SEs which do not support it, exchange of
table and partition in opposite direction might be required to do this.
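
The two recovery paths can be pictured with a toy model in which the
exchange is a plain swap and therefore its own inverse. This is a sketch
under that assumption only. Storage, exchange and exchange_partition_sketch
are hypothetical names; the real method is the
Partition_handler::exchange_partition() described above, which is not
modelled here.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical model: the data held by one partition and by the table.
struct Storage {
  std::string partition_data;
  std::string table_data;
};

// Plays the role of the SE exchange call; applying it twice is a no-op.
void exchange(Storage &s) { std::swap(s.partition_data, s.table_data); }

// Returns true on success. On a later failure the state is restored by
// rollback for atomic SEs, or by exchanging back for the others.
bool exchange_partition_sketch(Storage &s, bool atomic_se,
                               bool later_step_fails) {
  const Storage before = s;  // stands in for the transactional state
  exchange(s);               // step 2)
  if (!later_step_fails) return true;  // steps 3) to 5) succeeded
  if (atomic_se)
    s = before;              // simple rollback
  else
    exchange(s);             // exchange in the opposite direction
  return false;
}
```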