WL#4284: Transactional DDL locking

Affects: Server-5.5 — Status: Complete

Split from WL#3726.
Implement transaction-long metadata locks. I.e. if a table is used in a
transaction, a metadata lock should be kept for this table until the transaction
is committed. This will prevent concurrent transactions from issuing a DDL
against this table. Necessary to fix BUG#989 If DROP TABLE while there's an
active transaction, wrong binlog order

In order to ensure transaction serializability, 
the server must not allow a concurrent connection to perform a DDL
statement on a table that is used in an uncompleted transaction.

Currently, for each table used in a statement, the server
acquires a shared metadata lock till the end of the statement.
When the statement is complete, the shared metadata lock is released.
A simple deadlock avoidance technique is employed to address the
problem of deadlocks in case of a lock conflict (see
http://forge.mysql.com/wiki/MySQL_Internals_Data_and_meta-data_locking for
details). 

In order to ensure serializability, the server must defer the
release of metadata locks from the end of a statement till the end
of the encompassing transaction.

For a statement that doesn't start a transaction, e.g. a statement
on a non-transactional table/tables, this brings no change.

For a connection that runs in autocommit mode, the change also has
no effect.

Since each DDL statement commits the transaction, if any, in the
beginning, the change has no effect on lock conflicts between
concurrent DDL statements.

The other class of conflicts is between DDL and DML statements.
When a connection doesn't use autocommit and utilizes
multi-statement transactions, a new scenario
of lock conflicts is introduced. A conflict  may happen between
the acquirer of a shared lock and a concurrent DDL, at the time
when the acquirer has executed several statements and possibly
modified or retrieved data.

Let's consider an example.

Transaction 1 modifies table t1, then table t2, and then commits.
The first statement of the transaction will acquire a shared
metadata lock on table t1, and it will be kept till COMMIT, in
order to ensure serializability.

At the moment when the second statement attempts to acquire a
shared metadata lock on t2, a concurrent ALTER or DROP statement
may have locked t2 exclusively.
The prescription of the current locking protocol is that the
acquirer of the shared lock backs off -- gives up all his current
locks and retries.
For a multiple-statement scenario that implies that the entire
multi-statement transaction has to be rolled back.
Indeed, imagine that the concurrent connection issued
RENAME t2 to t3, t1 to t2, t3 to t1, rather than a simple ALTER.

In that case it will have acquired an exclusive lock on t2 at the
moment it encounters that t1 is locked in shared mode, and the
owner is not about to give away its locks. 
Unless the DML transaction back off, a deadlock is unavoidable.

In future we may consider implementing a deadlock detection
algorithm that would allow us to detect the scenario described
above while allowing for other scenarios of lock conflicts.

One possible solution is given in Appendix A.

The scope of this task, however, is to simply abort the
transaction that attempts to acquire a shared lock.
Note, that the abortion only takes place if the transaction has
executed statements -- in other words, has acquired shared
metadata locks that ought to be released. If a lock conflict
happens in the first statement of the transaction, the locking
algorithm back off and retries as before.

If a user has set a savepoint, ROLLBACK TO SAVEPOINT should release
all metadata locks acquired after the savepoint has been set.

Apart from these two changes, the locking protocol should be left
intact.

Appendix A: a possible approach to deadlock detection
-----------------------------------------------------

If a shared/exclusive lock conflict occurs when attempting to acquire an
exclusive lock, a "pending exclusive" lock is placed instead.
This lock is "upgraded" to exclusive when all transactions 
that own the conflicting lock have committed.
Existence of a pending exclusive lock prevents all new transactions
from acquiring the lock: they back off. Existing transaction, still,
can proceed, as long as new locks that they attempt to acquire do
not lead to deadlocks.
This latter fact can be checked easily by maintaining the list of
waiters on a given lock, and checking if some transaction is
present twice in the list.

Extend the metadata locking subsystem to keep metadata locks till the
end of a transaction.

Changes to existing APIs:

 - none

Changed files and functions:

1) Introduce ha_in_transaction() that would return TRUE 
   if there is a started transaction in a given connection.
2) Store metadata locks in transaction memory root, rather than
   statement memory root.
3) Modify close_thread_tables() to call mdl_release_locks() only
   if we're outside the context of some transaction. Since
   close_thread_tables() is called after each statement,
   including COMMIT and ROLLBACK, that will ensure that metadata
   locks are released automatically whenever a transaction has
   either ended automatically, or was committed/aborted
   manually.
 
   BEGIN word is a no-op at MySQL. Operations on non-transactional
   engines do not start a transaction either.  Therefore, for 
   statements that use non-transactional tables outside the 
   context of some transaction, locks will be released at the end of each
   statement, just like before.

 3) Modify open_table() to raise rollback the transaction and
    return ER_LOCK_DEADLOCK error in case of a shared/exclusive
    lock conflict in a non-first statement of a transaction.

 4) Modify ha_savepoint() to remember, and ha_rollback_to_savepoint()
    to rollback to the remembered set of metadata locks.

Effect on LOCK TABLES TRANSACTIONAL

Effect on other statements

None

Bug reports:
BUG#989 If DROP TABLE while there's an active transaction, wrong binlog order
BUG#12347 InnoDB: DROP TABLE ignores locks
BUG#39945 "WITH CONSISTENT SNAPSHOT" does not isolate against table changes
(i.e. RENAME)