WL#6470: InnoDB: optimize DML operations for temporary table

Status: Complete   —   Priority: Medium

DML temp-table optimization is part of effort to optimize InnoDB temptable.
DML operations form major part of operations that are done on 
temp-tables and so optimizing DML is important.

Scope of DML stmt covered by this work-log is:

How do we plan to optimize ?

- Experiment suggested that InnoDB is doing an extensive IO when compare to
  MyISAM. Detailed look revealed that some of this IO can be turned off
  for temp-table case. (For example: redo logging can be turned-off).
  Idea is to findout such blocks of code that are not needed for temp-table.

Visible difference:

- This WL aims at optimizing DML and so there shouldn't be any
  semantic or behavioral change except faster DMLs.



- w/o optimization:
  TOTALS                               393.00    9.06    0.16    9.22 20000000 

- w/ optimization:
TOTALS                               257.00    9.42    0.13    9.55 20000000

Note: workload involves insert/update/delete.


Ideally, given that temp-table are not shared across connections there shouldn't
be any contention except some for common resource which is thin level.

Here are results of multi-threaded suite just to rule out any last minute suprise.

Threads	Inserts	InnoDB	MyISAM	Memory

10	300K	14	15	5

100	30K	14	20	6

1000	3K	18	15	11

Looks like InnoDB continue to perform at stable rate.


Problem Statement:

- Improve temp-table DML performance (INSERT/UPDATE/DELETE).

High Level Approach Description:
Let's look at the code block that can be turned-off for temp-table.

- Logging: Undo + Redo.
  Redo logging can be turned off given that temp-table lifetime is bounded
  by server lifetime. On restart temp-table are not-restored and so no 
  need of redo logging.
  Undo logging is needed to ensure rollback (trx level or stmt level).
  (Note: undo logging would also result in redo logs generation. This redo
   log generation is required for undo logging correctness so left un-touched.)

- Change buffering can be disabled for temp-table given that most of the
  temp-table will be smaller in size (and might reside in memory).
  Also, this will reduce logging + IO associated which is big save 
  for temp-table given the saving we incur from change buffering.

- Locking can be disabled for temp-table as temp-table are visible only
  to the connection that created it.

- Semantics difference:
  - No difference. All DMLs continue to operate with same semantic.

- Behavioural difference:
  - Increased DML speed for DML operating on temp-table only.

This section describes low level changes 
(including code/api changes) that needs to be done.

DML ops (I/U/D) consist of following main actions: 
- insert/update entry to the clustered-index.
- insert/update entry to secondary index[es].


Logging for these actions is controlled by a mtr (mini-trx) that is 
created at each logical level that can result in db state change.

Each of these actions would result in redo logging as state of DB
is being updated. Parent level mtr is spawned at index level 
and then propogated for all related changes involving the completion
of the action.

MTR framework allows turn-off of REDO logging by setting appropriate
mode (MTR_LOG_NO_REDO). 

Turn-off REDO logging for mtr created in following APIs:

row_ins_clust_index_entry_low - clustered index insert.
row_ins_sec_index_entry_low - secondary index insert.
row_upd_clust_rec_by_insert - clustered index record update (including delete).
row_upd_sec_index_entry - secondary index update (including delete).

Special/Corner cases:

- DML may involve insert/update of big-record column (viz. BLOB column).
  General DML path identifies these columns presence in table and follows
  a separate path for insert/update.
  Ensure mtr used for these columns also has REDO logging turned off.

  btr_store_big_rec_extern_fields(): store big record and put pointer
  in original record.
  Note: there can be a separate mtr in existence for allocation and
  modification. Ensure REDO logging is turned off for both.

  row_ins_clust_index_entry_low(), row_upd_clust_rec(): follows 
  a separate path to update big-record.

- Besides normal (user triggered) dml operation, these operations
  can also be triggered as part of undo/rollback which can lead to 
  redo logging too.

  row_undo_ins_remove_clust_rec(): remove entry from clustered index.
  row_undo_ins_remove_sec_low(): remove secondary index entry.
  row_undo_mod_clust(): undo modification to clustered index.
  row_undo_mod_del_mark_or_remove_sec_low(): remove or delete-mark sec-index

- As part of insert during split of page, records are moved from
  one page to newly allocate page. This record movement is tracked using
  a special MTR_LOG_SHORT_INSERTS logging level. For temp-table
  as there is no logging needed we can avoid this special logging too.


Locking during DML can be controlled by setting appropriate flag:

In insert path following function will ensure that there is no other
trx trying to block insert of the data into table by current trx.

For temp-table this condition can't exist and so we can safely ignore
locking by setting appropriate flag.

        ulint           flags,  /*!< in: if BTR_NO_LOCKING_FLAG bit is
                                set, does nothing */

Set BTR_NO_LOCKING_FLAG at index level for temp-table to avoid locking.

        if (dict_table_is_temporary(index->table)) {
                flags |= BTR_NO_LOCKING_FLAG;

In delete/update path following function ensures there is no active trx
blocking from deleting current record.


Again, for temp-table this lock check can be turnedd off.


API to turn-off logging:

Turn-off redo-logging if temporary table. */
        const dict_table_t*     table,  /*!< in: table to check */
        mtr_t*                  mtr)    /*!< out: mini-transaction */