WL#3931: Multi-table statement involving self-logging engines

Affects: Server-5.1   —   Status: Complete

Some engines, e.g., NDB Cluster, take care of their own logging by setting the
``HA_HAS_OWN_BINLOGGING`` table flag.

When attempting to log a statement involving such an engine, it is necessary to
ensure that all changes to the database are logged in the correct order.

RELATED ISSUES
--------------

:BUG#28722: Multi-engine statements on has_own_binlogging engine

SITUATION
---------

For tables with the table flag ``HA_HAS_OWN_BINLOGGING`` set, the server
currently logs no rows itself. The assumption is that the engine uses the
injector interface to insert the rows into the binary log. For the purpose of
this worklog, we use NDB to exemplify a self-logging engine, since it is
currently the only one.


PROBLEM DESCRIPTION
-------------------

Suppose that we are trying to execute the following statement, where ``tbl`` is
a non-NDB table and ``ndb`` is an NDB table, with row-based binary logging active::

  UPDATE tbl, ndb SET ndb.x = tbl.x, tbl.y = ndb.y;

Since the changes to ``tbl`` will be logged separately, while the changes to the
NDB table will be logged through the injector interface as a single transaction,
it is possible that the slave ends up in a state where ``ndb.x == tbl.x`` and
``ndb.y != tbl.y``, or ``ndb.x != tbl.x`` and ``ndb.y == tbl.y``.
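
The divergence can be traced by hand with a small model. The C++ sketch below
is illustrative only: the ``Row`` and ``SlaveState`` structs and the ``replay``
helper are hypothetical names, not server code. It models one row per table and
lets the caller choose which half of the update has reached the slave.

```cpp
#include <cassert>

// Hypothetical one-row model of each table; not a server data structure.
struct Row { int x; int y; };

struct SlaveState { Row tbl; Row ndb; };

// Replays the effect of
//   UPDATE tbl, ndb SET ndb.x = tbl.x, tbl.y = ndb.y;
// on a slave. The two flags select which half of the statement has been
// applied: the changes injected by NDB, or the changes to tbl that the
// server logged separately.
SlaveState replay(Row tbl, Row ndb, bool ndb_part_applied, bool tbl_part_applied) {
  // Both assignments read the values the rows had before the statement.
  const Row tbl_before = tbl;
  const Row ndb_before = ndb;
  if (ndb_part_applied) ndb.x = tbl_before.x;
  if (tbl_part_applied) tbl.y = ndb_before.y;
  return {tbl, ndb};
}
```

With arbitrary starting rows ``tbl = (1, 2)`` and ``ndb = (3, 4)``, applying only
the NDB half leaves ``ndb.x == tbl.x`` but ``ndb.y != tbl.y``, and applying only
the ``tbl`` half leaves ``ndb.y == tbl.y`` but ``ndb.x != tbl.x``. Only when both
halves are applied do both equalities hold.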


ALTERNATIVE SOLUTIONS
---------------------

1. Allow the changes to non-NDB engines to be logged together with the changes
   injected into the binary log.

2. Forbid execution of statements for which different parts would be logged
   separately, i.e., statements that cannot be logged atomically.


PROPOSED SOLUTION
-----------------

We should not allow statements that cannot be logged atomically to execute,
since this will produce a faulty binary log. Hence this worklog will focus on
the second solution: producing proper error messages when the atomicity
constraint is violated for a statement.

To be able to log the statement as a single atomic unit, the following
requirements have to be fulfilled for each statement:

R1. When statement-based binary logging is in effect, then no engine
    for the statement may be self-logging, i.e., the table flag
    ``HA_HAS_OWN_BINLOGGING`` must be clear.

R2. When row-based binary logging is used and the engine of at least one
    table is self-logging, then all tables for the statement have to be
    handled by the same engine. If no engine is self-logging, then there
    are no requirements on the tables.
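
As a rough illustration, the two requirements could be checked along the
following lines. This is a sketch under stated assumptions, not the server
implementation: ``BinlogFormat``, ``TableInfo``, ``Verdict`` and
``check_atomic_logging`` are hypothetical names, and the ``self_logging`` member
stands in for testing the ``HA_HAS_OWN_BINLOGGING`` table flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; the real server types differ.
enum class BinlogFormat { STATEMENT, ROW };

struct TableInfo {
  std::string engine;   // name of the storage engine handling the table
  bool self_logging;    // stands in for the HA_HAS_OWN_BINLOGGING table flag
};

enum class Verdict { OK, VIOLATES_R1, VIOLATES_R2 };

// Decides whether a statement touching `tables` satisfies R1 and R2.
Verdict check_atomic_logging(BinlogFormat format,
                             const std::vector<TableInfo> &tables) {
  bool any_self_logging = false;
  bool multi_engine = false;
  for (const TableInfo &t : tables) {
    any_self_logging = any_self_logging || t.self_logging;
    multi_engine = multi_engine || (t.engine != tables.front().engine);
  }
  // Without a self-logging engine there are no requirements on the tables.
  if (!any_self_logging) return Verdict::OK;
  // R1: statement-based logging admits no self-logging engine.
  if (format == BinlogFormat::STATEMENT) return Verdict::VIOLATES_R1;
  // R2: with row-based logging, a single engine must handle all tables.
  return multi_engine ? Verdict::VIOLATES_R2 : Verdict::OK;
}
```

Under this sketch, a row-based statement over one self-logging table and one
table from another engine violates R2, while the same statement over tables
that all belong to the self-logging engine is accepted.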

Note that the table flags are fetched before each statement starts as part of
locking the tables. This allows a storage engine to dynamically decide if it
will handle the binary logging.

We meet these requirements by generating an error message when the statement
cannot be logged as an atomic unit, that is, if and only if more than one engine
is involved in the statement and at least one is self-logging.

No LLD is needed since the code is simple.
-- Lars, 2007-07-05