WL#8003: Server support for attachable transactions

Affects: Server-5.7   —   Status: Complete

This WL is a server counterpart for WL#7828 (InnoDB: attachable
transaction support). The server code should be updated to be
able to use InnoDB attachable transactions.

This WL is a first step to full server support of nested transaction
contexts.
FR-1. The server must be able to initiate an attachable transaction
when the user (outer) transaction has been already started.

FR-2. The server must be able to mark the end of an active attachable
transaction.

FR-3. The server must make sure there is no more than 1 active attachable
transaction at a time.

FR-4. The server must be able to resume the user (outer) transaction
when an active attachable transaction is ended.

FR-5. The server must be able to open and lock system tables in an attachable
transaction even though user tables have been already opened and locked.
Terms
=====

"InnoDB attachable transaction" is introduced in WL#7828. It is proposed to
keep using this that term in this WL so that we clearly distinct it from
other transaction kinds.

Why?
====

This section briefly outlines why we need attachable transactions.

There are the following reasons why attachable transactions are needed:

  - We should read from the Data Dictionary using different snapshot and
    isolation mode than the one used by the main transaction;

    The isolation level must be READ COMMITTED

  - These reads should be non-locking.

  - We should be able to open the Data Dictionary tables at any point
    during statement execution, even though user tables have been opened
    and locked.

An attachable transaction is a read-only transaction dedicated to perform
reads from the Data Dictionary. Updates to the Data Dictionary are handled
using different technique.

High-level interface
====================

THD should provide two main operations:

  - begin a new attachable transaction:

    THD::begin_attachable_transaction()

  - end attachable transaction;

    THD::end_attachable_transaction()

There can be no more than one attachable transaction at a time. Starting a
new attachable transaction when there is already active one is a
programming error and results in assertion failure (crash) in the debug
build and undefined behavior in the release build.

In addition to that THD should provide auxiliary operation to check if
there is an active attachable transaction:

  THD::is_attachable_transaction_active()

Attachable transaction life cycle
=================================

Overview
--------

The attachable transaction life cycle looks like the following:

1. THD::begin_attachable_transaction()

  1.1. Save the state of the current regular transaction
  (including THD::ha_data);

  1.2. Prepare new THD-transaction-state
  (reset some members, set others to required values);

  1.3. Reset THD::ha_data -- that will make InnoDB to create a new
  transaction context on the next operation.

2. Perform a first read from any [system] table.

  2.1. When the control flow reaches InnoDB, InnoDB does not find
  its own transaction state for the given THD (as THD::ha_data was reset in
  1.3);

  2.2. InnoDB creates a new transaction state;

  2.3. InnoDB calls trans_register_ha() to notify the server that a new
  InnoDB-transaction has been created;

  2.4. InnoDB performs the requested read operation;

3. Perform a subsequent read from any [system] table.

  3.1. When the control flow reaches InnoDB, there is already
  InnoDB transaction state for the given THD (from the (2) step);

  3.2. InnoDB performs the requested read operation;

4. THD::end_attachable_transaction()

  4.1. Call close_thread_tables() in order to close all tables which have
  been opened in the attachable transaction;

  4.2. Call the close_connection() handlerton operation in order to
  instruct InnoDB to destroy its transaction context for this attachable
  transaction.

    4.2.1. InnoDB calls trx_commit_in_memory(). This function just closes
    InnoDB internal "read view" and throws the transaction context away. No
    operation from the SQL-level is called.

  4.3. Restore the THD-transaction-state to resume the regular transaction.

The life cycle from InnoDB view point
-------------------------------------

This section provides details on what steps are important for InnoDB to
begin or to end attachable transactions. In other words: how to instruct
InnoDB to begin or end attachable transactions.

See also WL#7828.

This is InnoDB specific and should be re-considered if other SE is going to
be used.

Beginning a new attachable transaction
++++++++++++++++++++++++++++++++++++++

This is THD::begin_attachable_transaction() operation.

A new attachable transaction suspends the current regular transaction. In
order to be able resume the regular transaction InnoDB requires the
THD::ha_data array to be saved.

The main thing to initiate a new attachable transaction is to reset the
THD::ha_data array.

Other actions in THD::begin_attachable_transaction() do not matter to
InnoDB and are mainly divided into two groups:
  - backup the state of the current regular transaction;
  - reset the state and set the needed transaction properties.

The only thing that matters is resetting of THD::ha_data array.

Committing attachable transaction
+++++++++++++++++++++++++++++++++

Attachable transactions in InnoDB can not be instructed to commit or
rollback. Instead, InnoDB can be instructed to end (or forget) an
attachable transaction.

Technically, InnoDB does the following steps to end an attachable
transaction:

  - trx_commit_in_memory() is called

  - since attachable transactions are AUTOCOMMIT and read-only, the only
    essential thing trx_commit_in_memory() is doing is closing the read
    view

  - closing the read view is an InnoDB internal operation, no call to the
    server layer is made

  - the InnoDB attachable transaction context is just thrown away

In other words, the ending of an attachable transaction will not include
calling neither of ha_commit_trans() / ha_rollback_trans() /
trans_commit*() / trans_rollback*().

As mentioned above, THD will provide an explicit call
(THD::end_attachable_transaction) to get out of scope of the active
attachable transaction, but the transaction will not be "committed" per
say.

Ending attachable transaction
+++++++++++++++++++++++++++++

This is THD::end_attachable_transaction() operation.

The following sequence of operations is required to close the active
transaction context in InnoDB:

  1. Close the tables opened in the attachable transaction:
  
    close_thread_tables(thd);
  
  2. Call close_connection() method of InnoDB handlerton:
  
    ht->close_connection(ht, thd);

After that, the regular transaction must be resumed (it was suspended when
the attachable transaction started). In order to do that THD::ha_data must
be restored.

Other actions in THD::end_attachable_transaction() do not matter to InnoDB.
They just restore the THD-transaction-state.

The life cycle from the SQL level view point
--------------------------------------------

Beginning a new attachable transaction
++++++++++++++++++++++++++++++++++++++

Some THD members which constitute the transaction state are saved and
reset. For the extensive list of these member, see the "Attachable
transaction context" section.

It's worth to check the THD::transaction_rollback_request flag (it is a
flag set by SE indicating that SE requests a transaction rollback). The
flag must be unset. If it is set, that means we're starting an attachable
transaction when the regular (outer) transaction should be reverted.

InnoDB requirement for starting a new attachable transaction is that
THD::ha_data array is cleared. We are going to save/reset and restore the
whole THD::ha_data array, and not only its elements for InnoDB.

This puts an important limitation on any other Storage Engine which can be
accessed in an attachable transaction: that SE should support the same
semantics for attachable transactions as InnoDB. That means that:

  - either SE detects the fact that THD::ha_data was reset and start a new
    attachable transaction, closes attachable transaction on
    close_connection and resumes regular (outer) transaction when
    THD::ha_data is restored;

  - or SE completely ignores THD::ha_data and close_connection like MyISAM
    does.

An SE that conforms with that limitation is "compatible with the attachable
transaction requirements".

Opening new tables
++++++++++++++++++

In order to open new [system] tables to read from them in the scope of
an attachable transaction common open_and_lock_tables() can be used.

NOTE: there is a refactoring that makes sure that system tables can be
opened via special calls only:
  - open_nontrans_system_tables_for_read()
  - open_trans_system_tables_for_read()

Committing statement-level transaction
++++++++++++++++++++++++++++++++++++++

The functions we use for dealing with the tables call trans_commit_stmt()
and trans_rollback_stmt() under some circumstances.

Calling neither of trans_commit_stmt() / trans_rollback_stmt() functions
should happen when we're dealing with the system tables in the scope of
attachable transaction.

In order to ensure that, we will add debug-asserts to these functions
checking that there is no active attachable transaction.

Ending attachable transaction
+++++++++++++++++++++++++++++

There are 3 steps:

  - close the tables opened in the attachable transaction by
    close_thread_tables();

  - instruct InnoDB to end the attachable transaction;
  
  - restore the saved state of the regular transaction.

In general case close_thread_tables() requires a call to
trans_commit_stmt() or trans_rollback_stmt() prior to it. This is not
relevant in the case of attachable transaction since this requirement is
relaxed when we close tables outside of main open-tables-state.

It's again worth to check the THD::transaction_rollback_request flag. The
flag must be still unset as InnoDB is not expected to request a rollback
during attachable transaction.

Attachable transaction context
==============================

The main goal of this WL is to provide an attachable transaction context so
that operations done in scope of that transaction do not affect the regular
(outer) transaction.

The set of THD members which should be saved/restored largely consists of
the following subsets:

  1. the members which need to be changed in order to essentially start an
    attachable transaction;

    - HA-data array (THD::ha_data)
    
      That's the requirement from InnoDB.

    - The current SQL-command (THD::lex->sql_command)

      An InnoDB attachable transaction must be SQLCOM_SELECT. InnoDB uses
      the sql-command in ha_innobase::store_lock() to determine the
      required locks to be taken.

    - SQL_MODE value (THD::variables.sql_mode)

      SQL_MODE should be reset to default to avoid getting weird results.

    - Transaction parameters:

      - Transaction isolation level (THD::tx_isolation)

        An InnoDB attachable transaction must be of READ COMMITTED
        isolation level. That's also required by the Data Dictionary (see
        NOTE in WL#7751).

      - Transaction read-only flag (THD::tx_read_only)

        An InnoDB attachable transaction must be READ ONLY.

      - Transaction options (THD::variables.option_bits)

        An InnoDB attachable transaction must be AUTOCOMMIT.

  2. the members which can be read or updated by code which opens, locks,
  reads from/closes system tables in attachable transaction.

  Failure to save/restore updated members might break user transaction.
  Failure to save/reset/restore members which are read might cause problem
  for execution of attached transaction as well.

  This big subset in turn consists from a few smaller subsets:

    a) Members used/updated by open_tables() and lock_tables()

      - Open-tables state

        The open-tables state should be reset to be able to open another
        set of tables (data dictionary tables) when the user tables have
        already been opened.

      - Query-tables state

    b) Members used/updated by trans_register_ha()

      - The transaction state (THD::m_transaction)

        The transaction state contains transaction specific information
        such as savepoint data (allocated on the transaction memory root),
        transaction memory root and others.

        In order to simplify saving, resetting and restoring of the
        transaction state THD::m_transaction should be made a pointer so
        that it can be easily replaced.

      - Server status flags (THD::m_server_status)

      - Transaction instrumentation object (THD::m_transaction_psi)

    c) Members used/updated by close_thread_tables()

      (no members here except the ones listed above)

other places/operations which happen within attachable transaction are not
important/relevant in the context of this WL. Thus, consideration of only
the above operations is required.

Changes to Handler API
======================

A new flag will be added to the Handler API:

  HA_ATTACHABLE_TRX_COMPATIBLE

In short, the flag means that the SE is "compatible with the attachable
transaction requirements" (see the section "The life cycle from the SQL
level view point" above).

To recap, being "compatible with the attachable transaction requirements"
means that the SE

  - either SE detects the fact that THD::ha_data was reset and start a new
    attachable transaction, closes attachable transaction on
    close_connection and resumes regular (outer) transaction when
    THD::ha_data is restored;

  - or SE completely ignores THD::ha_data and close_connection like MyISAM
    does.

The detailed semantics of HA_ATTACHABLE_TRX_COMPATIBLE flag is as follows:

  - if HA_ATTACHABLE_TRX_COMPATIBLE is set, that means that the SE is
    compatible with the attachable transaction requirements.
    
    In other words, it means that the tables from the SE can be used within
    an attachable transaction. It does NOT mean that the SE supports
    attachable transactions per say:

      - InnoDB supports attachable transactions;

      - MyISAM does not support attachable transactions, but it is
        compatible with the attachable transaction requirements because it
        is simply unaware of THD::ha_data and close_connection handlerton
        call. Thus, resetting THD::ha_data and calling close_connection
        handlerton call does not affect MyISAM behavior.

  - if HA_ATTACHABLE_TRX_COMPATIBLE is reset, that means that the SE is not
    compatible with the attachable transaction requirements.
    
    In turn, that means that the tables from the SE can NOT be used within
    an attachable transaction. Using tables of such SE in an attachable
    transaction will result in a crash in the debug build, and an error in
    the release build.