WL#9335: Enable MDL Locking for Recovered and Detached Prepared XA Transactions

Affects: Server-8.0   —   Status: Complete

This WL addresses two scenarios related to XA where MDL locks are not properly
taken/held.

Scenario 1:
If a client connection is broken or ended while an XA transaction is in PREPARED
state, the transaction is detached from the THD and stays in the XA transaction
cache. MDL locks associated with such transactions are released during
connection cleanup. This breaks XA transaction integrity, because participating
tables are open for DDL by other connections.

To fix this incorrect behaviour, MDL locks for a prepared XA transaction must
remain held when the client disconnects.

Scenario 2:
If the server goes down (gracefully or by crashing) while an XA transaction is
in PREPARED state, InnoDB recovers such XA transactions during server startup
and puts them into the server XA transaction cache. But currently no MDL locks
are taken on participating tables. This leads to integrity problems.

To fix this incorrect behavior it is required to take proper MDL locks
during XA transaction recovery.

For both use cases it is required to implement an MDL backup mechanism
that holds MDL locks in the absence of an active connection (THD object),
as described in the scenarios above.

This work fixes BUG#9335.

Functional Requirements:
============================
Requirement 1:
MDL locks for an XA transaction in PREPARED state MUST remain held after the
client disconnects.

Requirement 1.1: MDL locks MUST stay in the granted state, and no conflicting
lock requests may be granted until XA COMMIT/XA ROLLBACK is done for the
unfinished XA transactions.

Requirement 2:
All MDL locks held by XA transaction(s) in PREPARED state before server
restart/crash MUST be acquired on server startup as part of the InnoDB
recovery procedure.

Requirement 2.1: All XA transactions in PREPARED state MUST take proper
MDL locks during recovery, and the server MUST NOT grant MDL locks on
participating tables to a new connection until the unfinished XA transactions
are committed or rolled back.

Non-Functional Requirements:
============================
Requirement 1. Implement MDL locks for multiple XA transactions during recovery:
Multiple XA transactions in recovery MUST be handled, and proper locks MUST be
taken in association with each recovered XA transaction.

Requirement 2. The new solution MUST be implemented as independently as
possible: The new system should be extensible to other features where MDL locks
need to be held in the absence of an active connection, not just to XA.

General remarks about availability of a prepared XA transaction from another connection.
----------------------------------------------------------------------------------------
An XA transaction that was prepared from one connection can be committed or
rolled back from another connection only if the XA transaction is in
recovery mode. An XA transaction is in recovery mode when it was prepared and
the original session (from which the transaction was initiated) has been
disconnected.

If an XA transaction was initiated from some connection and moved to PREPARED
state (using the statement XA PREPARE) but the connection is still alive (no
disconnection or server restart happened), then such an XA transaction is
unknown to other connections.

This means that it is safe to back up MDL locks on a session disconnect.

High-Level Specification / Interface Specification
----------------------------------------------------
This WL introduces two new classes named MDL_context_backup and
MDL_context_backup_manager.

MDL_context_backup_manager implements the following functionality:

1. Take backup of MDL_locks from MDL_context to MDL_context_backup object.

2. Restore backed up MDL_locks from MDL_context_backup object to a given
MDL_context.

3. Store and manage multiple objects of MDL_context_backup class based on a
given key (transaction id).

The MDL_context_backup class is an implementation detail and is hidden from
external access.

In order to resolve the issue described in Scenario 1 of the High-Level
description, the following steps are done:

1. During client disconnect, MDL_context_backup_manager::create_backup() is
called with current XID as key to take MDL_lock backup.

2. MDL_context_backup_manager::create_backup() creates a new MDL_context_backup
object to hold all the locks in the backup.

3. It then calls MDL_context::clone_tickets() to clone all tickets from the
current context to the new backup object. This new function calls
MDL_context::clone_ticket() for each ticket present in the current MDL_context.

4. MDL_context_backup_manager::create_backup() adds the backup object to its
object map in association with transaction XID on completion of the above steps.

5. Any connection attempt to resume the transaction to either commit or rollback
calls MDL_context_backup_manager::restore_backup() with XID to retrieve all
relevant locks.

6. MDL_context_backup_manager::restore_backup() finds the MDL_context_backup
object associated with the given XID, clones all locks in it to the active
MDL_context and returns.

7. On completion of Commit/Rollback MDL_context_backup_manager::delete_backup()
is called to release all relevant MDL_locks in the backup system.
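
The seven steps above can be sketched with a small self-contained C++ model.
This is not the server code: ToyLockTable, ToyContext and ToyBackupManager are
hypothetical stand-ins for the MDL subsystem, MDL_context and
MDL_context_backup_manager, and a lock is reduced to a per-table holder count.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for the MDL subsystem: counts lock holders per table.
struct ToyLockTable {
  std::map<std::string, int> holders;
  bool is_locked(const std::string &table) const {
    auto it = holders.find(table);
    return it != holders.end() && it->second > 0;
  }
};

// Hypothetical stand-in for MDL_context: owns a set of granted locks.
class ToyContext {
 public:
  explicit ToyContext(ToyLockTable *locks) : m_locks(locks) {}
  ~ToyContext() { release_all(); }
  ToyContext(const ToyContext &) = delete;
  ToyContext &operator=(const ToyContext &) = delete;

  void acquire(const std::string &table) {
    if (m_held.insert(table).second) ++m_locks->holders[table];
  }
  // Analogue of clone_tickets(): take our own copy of every lock of 'owner'.
  void clone_from(const ToyContext &owner) {
    for (const std::string &table : owner.m_held) acquire(table);
  }
  void release_all() {
    for (const std::string &table : m_held) --m_locks->holders[table];
    m_held.clear();
  }

 private:
  ToyLockTable *m_locks;
  std::set<std::string> m_held;
};

// Hypothetical stand-in for MDL_context_backup_manager, keyed by XID.
// Methods return true on error, following the convention used in this WL.
class ToyBackupManager {
 public:
  explicit ToyBackupManager(ToyLockTable *locks) : m_locks(locks) {}
  bool create_backup(const ToyContext &ctx, const std::string &xid) {
    if (m_backups.count(xid) != 0) return true;  // key already in use
    auto backup = std::make_unique<ToyContext>(m_locks);
    backup->clone_from(ctx);
    m_backups[xid] = std::move(backup);
    return false;
  }
  bool restore_backup(ToyContext *ctx, const std::string &xid) {
    auto it = m_backups.find(xid);
    if (it == m_backups.end()) return true;  // nothing stored for this XID
    ctx->clone_from(*it->second);
    return false;
  }
  // Destroying the backup context releases the locks it holds.
  void delete_backup(const std::string &xid) { m_backups.erase(xid); }

 private:
  ToyLockTable *m_locks;
  std::map<std::string, std::unique_ptr<ToyContext>> m_backups;
};

// Walks through steps 1-7; returns true if db.t1 stayed locked throughout
// and was released only at the end.
bool demo_disconnect_and_resume() {
  ToyLockTable locks;
  ToyBackupManager manager(&locks);
  {
    ToyContext session1(&locks);
    session1.acquire("db.t1");  // the prepared XA transaction uses db.t1
    if (manager.create_backup(session1, "xid1")) return false;  // steps 1-4
  }  // session 1 disconnects, its own locks are released
  bool locked = locks.is_locked("db.t1");  // the backup still holds the lock
  ToyContext session2(&locks);
  if (manager.restore_backup(&session2, "xid1")) return false;  // steps 5-6
  manager.delete_backup("xid1");                                // step 7
  locked = locked && locks.is_locked("db.t1");  // now held by session 2
  session2.release_all();                       // transaction has finished
  return locked && !locks.is_locked("db.t1");
}
```

In this model the table is never unlocked between the disconnect of the first
session and the end of the transaction in the second one, which is the property
Requirement 1.1 asks for.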


In order to resolve the issue described in Scenario 2 of the High-Level
description, the following steps are done:

1. During XA transaction recovery, the list of tables modified by the
transaction is retrieved from InnoDB along with the XA transaction id.

2. For each transaction being recovered, a new MDL_request_list is created
with a request for each modified table.

3. Then MDL_context_backup_manager::create_backup() is called with the
MDL_request_list and XID to take the necessary locks and preserve them until
the transaction is complete.

4. MDL_context_backup_manager::create_backup() creates a new MDL_context_backup
object to hold all the locks in the backup.

5. It then calls MDL_context::acquire_locks() to take locks based on given
request_list.

6. Finally, MDL_context_backup_manager::create_backup() adds the backup object
to its object map in association with the given transaction XID.

7. Any connection attempt to resume the transaction to either commit or rollback
calls MDL_context_backup_manager::restore_backup() with XID to retrieve all
relevant locks.

8. MDL_context_backup_manager::restore_backup() finds the MDL_context_backup
object associated with the given XID, clones all locks in it to the active
MDL_context and returns.

9. On completion of Commit/Rollback MDL_context_backup_manager::delete_backup()
is called to release all relevant MDL_locks in the backup system.
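
The recovery flow above can be illustrated with the following sketch. It assumes
a simplified model in which locks taken for different recovered transactions do
not conflict with each other. RecoveredTxn, ToyRecoveryBackups and
take_locks_for_recovered() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one recovered transaction reported by the engine.
struct RecoveredTxn {
  std::string xid;
  std::vector<std::string> mod_tables;  // "db.table" names
};

// Hypothetical stand-in for the backup store: XID -> tables locked for it.
class ToyRecoveryBackups {
 public:
  // Analogue of create_backup(MDL_request_list *, key, ...).
  // Returns true on error (here: the XID is already present).
  bool create_backup(const std::vector<std::string> &requests,
                     const std::string &xid) {
    if (m_backups.count(xid) != 0) return true;
    // Steps 4-5: new backup context, then "acquire" every requested lock.
    std::set<std::string> granted(requests.begin(), requests.end());
    // Step 6: register the backup under the XID.
    m_backups.emplace(xid, std::move(granted));
    return false;
  }
  // A table counts as locked while any stored backup references it.
  bool is_locked(const std::string &table) const {
    for (const auto &entry : m_backups)
      if (entry.second.count(table) != 0) return true;
    return false;
  }
  void delete_backup(const std::string &xid) { m_backups.erase(xid); }

 private:
  std::map<std::string, std::set<std::string>> m_backups;
};

// Steps 1-6 for every recovered transaction; returns how many backups failed.
std::size_t take_locks_for_recovered(const std::vector<RecoveredTxn> &txns,
                                     ToyRecoveryBackups *backups) {
  std::size_t failures = 0;
  for (const RecoveredTxn &txn : txns) {
    // Step 2: one lock request per modified table.
    std::vector<std::string> requests(txn.mod_tables);
    // Steps 3-6: take the locks and keep them under the transaction's XID.
    if (backups->create_backup(requests, txn.xid)) ++failures;
  }
  return failures;
}
```

With two recovered transactions that both modified the same table, the table
stays locked in this model until the backups of both transactions are deleted.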

New Supporting Function
=======================
/**
  Create copy of all granted tickets of particular duration from given
  context to current context.
  Used by XA for preserving locks during client disconnect.

  @retval TRUE   Out of memory.
  @retval FALSE  Success.

*/                                                                              
bool MDL_context::clone_tickets(const MDL_context *ticket_owner,
                                enum_mdl_duration duration)


This function will go through each ticket in the given context and clone it
to the current context using MDL_context::clone_ticket().
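
A minimal sketch of this cloning loop is shown below. ToyTicket, ToyDuration and
ToyTicketContext are hypothetical stand-ins for MDL_ticket, enum_mdl_duration
and MDL_context. The sketch only mirrors the documented contract (copy tickets
of one duration, return true on out of memory) and assumes that ticket_owner is
a different context than the one being filled.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for enum_mdl_duration.
enum ToyDuration { TOY_STATEMENT, TOY_TRANSACTION, TOY_EXPLICIT };

// Hypothetical stand-in for a granted MDL ticket.
struct ToyTicket {
  std::string object;
  ToyDuration duration;
};

// Hypothetical stand-in for MDL_context.
class ToyTicketContext {
 public:
  void add(const std::string &object, ToyDuration duration) {
    m_tickets.push_back(ToyTicket{object, duration});
  }

  // Copies every ticket of the given duration from ticket_owner into this
  // context. Returns true on out of memory, false on success. Tickets copied
  // before a failure stay in this context; the caller is expected to discard
  // the whole context in that case.
  bool clone_tickets(const ToyTicketContext *ticket_owner,
                     ToyDuration duration) {
    try {
      for (const ToyTicket &ticket : ticket_owner->m_tickets)
        if (ticket.duration == duration) m_tickets.push_back(ticket);
    } catch (const std::bad_alloc &) {
      return true;
    }
    return false;
  }

  std::size_t count() const { return m_tickets.size(); }

 private:
  std::vector<ToyTicket> m_tickets;
};
```

The owner context is taken as a pointer to const, so cloning never changes the
locks held by the disconnecting session.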

MDL_context_backup_manager
============================
MDL_context_backup_manager is a singleton class, and its single object is
created during server startup from init_server_components(). Failure to create
it results in server shutdown.

The rest of the system should use MDL_context_backup_manager::instance() to get
a reference to the singleton object to create, restore and delete MDL lock
backups.

During server shutdown MDL_context_backup_manager::destroy() will be called
from clean_up() to destroy and clean up any objects or locks residing in
MDL_context_backup_manager.

MDL_context_backup_manager uses a std::map to store and manage MDL_context_backup
objects and uses a mutex (m_LOCK_mdl_context_backup) to protect the collection
from concurrency issues.
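
The init()/instance()/destroy() life cycle and the mutex-protected map can be
sketched as follows. ToySingletonManager is a hypothetical class that uses
std::mutex and an int payload in place of mysql_mutex_t and MDL_context_backup;
it shows the pattern only and is not the server implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <new>
#include <string>

class ToySingletonManager {
 public:
  // Creates the single object. Returns true on error (already initialised
  // or out of memory), false on success.
  static bool init() {
    if (m_single != nullptr) return true;
    m_single = new (std::nothrow) ToySingletonManager();
    return m_single == nullptr;
  }

  // Must only be called between init() and destroy().
  static ToySingletonManager &instance() {
    assert(m_single != nullptr);
    return *m_single;
  }

  // Deletes the single object together with everything stored in the map.
  static void destroy() {
    delete m_single;
    m_single = nullptr;
  }

  // Returns true on error (key already present), false on success.
  bool add(const std::string &key, int payload) {
    std::lock_guard<std::mutex> guard(m_lock);
    return !m_map.emplace(key, std::make_unique<int>(payload)).second;
  }

  void remove(const std::string &key) {
    std::lock_guard<std::mutex> guard(m_lock);
    m_map.erase(key);
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(m_lock);
    return m_map.size();
  }

 private:
  ToySingletonManager() = default;  // objects are created through init() only
  ~ToySingletonManager() = default;

  static ToySingletonManager *m_single;
  std::mutex m_lock;  // protects m_map
  std::map<std::string, std::unique_ptr<int>> m_map;
};

ToySingletonManager *ToySingletonManager::m_single = nullptr;
```

Storing std::unique_ptr values means that erasing a map entry, or destroying the
manager at shutdown, frees the stored objects without separate tracking.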

MDL_context_backup is created in two scenarios listed below.
------------------------------------------
1. During client disconnect.
If a client disconnect happens while there is a prepared XA transaction,
THD::cleanup() will call MDL_context_backup_manager::instance().create_backup()
with the XID as key to create a backup of all MDL locks associated with the
current transaction. A failure is written to the error log.

2. During recovery
During transaction recovery, xarecover_handlerton() performs the following
operations for each recovered XA transaction.

   a. Create MDL_request_list with MDL lock request for every table involved
   in a transaction being recovered.

   b. Invoke MDL_context_backup_manager::instance().create_backup() with
   MDL_request_list and XID of XA transaction as key.

Restoring and Deleting MDL_context_backup is done at two points listed below.
-----------------------------------------------------------------------
When XA COMMIT or XA ROLLBACK for a disconnected XA transaction is issued from a
new session, the MDL_context_backup is restored by trans_xa_commit() before
proceeding with the commit/rollback in the engine. After successful completion
of the operation, trans_xa_commit() deletes the MDL_context_backup from
MDL_context_backup_manager.

Restore using: MDL_context_backup_manager::instance().restore_backup() with XID
as key.
Delete backup using: MDL_context_backup_manager::instance().delete_backup() with
XID as key.
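
The ordering described above (restore first, run the engine operation, delete
the backup only after success) can be sketched as follows. ToyBackupStore and
toy_finish_detached_xa() are hypothetical, and the sketch assumes that a missing
backup is treated as an error; the server code may handle that case differently.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for the backup manager: only remembers which XIDs
// currently have a backup.
struct ToyBackupStore {
  std::set<std::string> xids;
  // Returns true on error (no backup stored for this XID).
  bool restore_backup(const std::string &xid) const {
    return xids.count(xid) == 0;
  }
  void delete_backup(const std::string &xid) { xids.erase(xid); }
};

// Finishes a detached XA transaction. engine_op performs the commit or
// rollback in the storage engine and returns true on error.
// Returns true on error, false on success.
bool toy_finish_detached_xa(ToyBackupStore *store, const std::string &xid,
                            const std::function<bool()> &engine_op) {
  // Restore the locks before touching the engine.
  if (store->restore_backup(xid)) return true;
  // If the engine operation fails, keep the backup for a later retry.
  if (engine_op()) return true;
  // Success: the backup is no longer needed.
  store->delete_backup(xid);
  return false;
}
```

Keeping the backup when the engine operation fails means the tables stay
protected until the transaction is finally committed or rolled back.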


Handler Layer Changes
---------------------
innobase_xa_recover() used to return only the list of recovered transaction IDs.
This is not enough to take proper MDL locks during recovery. The
innobase_xa_recover() interface is changed to include the list of table names in
the out parameter, and an extra in parameter, mem_root, is added for allocating
memory for the table name list.

The innobase_xa_recover() definition is changed to create the list of tables,
attach it to the transaction XID and pass both via the out parameter.

The trx_recover_for_mysql() function is removed and its functionality is moved
into innobase_xa_recover() to isolate handler and SQL layer data types from
InnoDB.

Existing Interface Definition
-------------------------------
int innobase_xa_recover(
	handlerton*	hton,	/*!< in: InnoDB handlerton */
	XID*		xid_list,/*!< in/out: prepared transactions */
	uint		len);	/*!< in: number of slots in xid_list */	

New Interface Definition
---------------------------

struct st_xarecover_txn
{
  XID id;
  List	mod_tables;
};

typedef struct st_xarecover_txn XA_recover_txn;

int innobase_xa_recover(
	handlerton*	hton,	/*!< in: InnoDB handlerton */
	XA_recover_txn*	txn_list,/*!< in/out: prepared transactions */
	uint		len,   /*!< in: number of slots in txn_list */
	MEM_ROOT*       mem_root); /*!< in: memory for table names */
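
The following sketch illustrates how a caller-provided array of slots can be
filled with an XID and its table list. ToyRecoverTxn, ToyPreparedTrx and
toy_xa_recover() are hypothetical; std::string and std::vector replace XID, List
and MEM_ROOT allocation, and the return value (number of filled slots) is an
assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for XA_recover_txn: XID plus modified tables.
struct ToyRecoverTxn {
  std::string id;
  std::vector<std::string> mod_tables;
};

// Hypothetical stand-in for a prepared transaction inside the engine.
struct ToyPreparedTrx {
  std::string xid;
  std::vector<std::string> tables;
};

// Fills at most len slots of txn_list from the engine's prepared
// transactions and returns the number of slots that were filled.
int toy_xa_recover(const std::vector<ToyPreparedTrx> &engine_trxs,
                   ToyRecoverTxn *txn_list, unsigned len) {
  if (txn_list == nullptr || len == 0) return 0;
  unsigned count = 0;
  for (const ToyPreparedTrx &trx : engine_trxs) {
    if (count == len) break;  // no free slot left
    txn_list[count].id = trx.xid;
    txn_list[count].mod_tables = trx.tables;  // table names travel with XID
    ++count;
  }
  return static_cast<int>(count);
}
```

Because each slot carries its own table list, the caller can build the lock
requests for a recovered transaction without asking the engine again.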

MDL_context_backup_manager Interface Definition is listed below.
=======================================

/**
  MDL_context_backup_manager holds MDL locks when there is no THD object
  present to hold the same. This is a singleton class. Each element represents
  one MDL_context and holds related locks.

  This is used by XA transaction for storing MDL locks for PREPARED XA
  transaction during client disconnect. Later these locks will be handled by
  a thread which attempts to process the pending XA transaction.
*/
class MDL_context_backup_manager
{
  private:
    /*
    Key for uniquely identifying MDL_context in the MDL_context_backup map.
    */
    typedef std::basic_string<uchar> MDL_context_backup_key;

  private:
    class MDL_context_backup;

  public:
    /* Uses pointer to store MDL_context_backup to reduce memory requirement
       of map. And using unique pointer to avoid object tracking.
    */
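    typedef std::map<MDL_context_backup_key,
                     std::unique_ptr<MDL_context_backup>,
                     std::less<MDL_context_backup_key>,
                     Malloc_allocator<std::pair<const MDL_context_backup_key,
                       std::unique_ptr<MDL_context_backup> > > >
                     Element_map_type;              // Real map type.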

  private:
    /* Singleton. No explicit Objects */
    MDL_context_backup_manager(PSI_memory_key key);

    /* Singleton Object */
    static MDL_context_backup_manager *m_single;

  public:
    /*
    Initialize member variables and singleton object
    */
    static bool init();

    /*
    Return singleton object
    */
    static MDL_context_backup_manager& instance();

    /*
    Cleanup and delete singleton object
    */
    static void destroy();

    /*
    destroy mutex and clear backup map
    */
    ~MDL_context_backup_manager();

  private:
    // PSI
    static void init_psi_keys(void);


  public:

    /**
      Create backup from given MDL_context by cloning all transactional
      locks to backup context and adds to backup context manager collection.

      This function does not set error codes beyond what is set by the
      functions it calls.

      @param[in]  context      MDL_context from which backup is created.
      @param[in]  key             Key to identify MDL_context
      @param[in]  keylen       Key Length

      @retval     true            Error, e.g. Fail to create backup object, fail
                                            to clone locks.
      @retval     false           Success
    */
    bool create_backup(const MDL_context *context,
                       const uchar* key,
                       const size_t keylen);

    /**
      Create backup MDL_context, process request on it and add
      to backup context manager collection.

      This function does not set error codes beyond what is set by the
      functions it calls.

      @param[in]  mdl_requests   Requests that need to be processed and backed up.
      @param[in]  key             Key to identify MDL_context
      @param[in]  keylen       Key Length
      @param[in]  lock_wait_timeout   Timeout for MDL_requests.

      @retval     true            Error, e.g. Fail to create backup object, fail
                                            to acquire locks.
      @retval     false           Success
    */
    bool create_backup(MDL_request_list *mdl_requests,
                       const uchar* key,
                       const size_t keylen,
                       const ulong lock_wait_timeout);

    /**
      Restore locks from backup to a given MDL_context.

      This function does not set error codes beyond what is set by the
      functions it calls.

      @param[out] context     MDL_context to which locks will be restored.
      @param[in]  key             Key to identify MDL_context
      @param[in]  keylen       Key Length

      @retval     true            Error, e.g. There is no element in the
                                            collection matching given key, fail
                                            to restore locks.
      @retval     false           Success
    */
    bool restore_backup(MDL_context *context,
                       const uchar* key,
                       const size_t keylen);

    /**
      Delete backup context and release associated locks.

      This function does not set error codes beyond what is set by the
      functions it calls.

      @param[in]  key             Key to identify MDL_context
      @param[in]  keylen       Key Length
    */
    void delete_backup(const uchar* key, const size_t keylen);


  private:
    //Collection for holding MDL_context_backup elements
    Element_map_type *m_backup_map;

    //Mutex to protect m_backup_map
    mysql_mutex_t m_LOCK_mdl_context_backup;
};