WL#6469: InnoDB: Improve CREATE/DROP performance for temporary tables.
Introduction: ------------- Who will be benefited ? ------------------------ - MySQL InnoDB user community that uses InnoDB temp-tables. How can we achieve it ? ------------------------ Temp-table is a special case with lifetime bounded to connection lifetime or server lifetime + limited visibility (only visible to trxs executing on the connection that created it). These 2 features will help us limit lot of actions that we have to execute for general-tables (non-temp-tables) especially logging. Validation of optimizaton ? -------------------------- sql-bench (mysql native performance utility) can be used to validate if optimization has really helped. test-create test-case of this utility try to create/drop tons of table (load is configurable) using various scenario. (Note: sql-bench by default will create/drop non-temp-table. Just modify server-cfg (a single line change) to enable create/drop of temp-table).
Optimize performance of InnoDB temp-table. There are 2 parts to this problem viz. - DDL optimization (this worklog) - DML optimization (WL#6470) DDL statements that will be affected: * CREATE * DROP * TRUNCATE * IMPORT/DISCARD TABLESPACE * ALTER TABLE ... Temporary tables are only visible in the connection/session in which they are created. They are also bound by the lifetime of the server. They are discarded on startup. For DDL changes we will not make the table definitions persistent. This will eliminate all IO during the table create/drop phase. Caveats ======= - Temp-table definitions will not be written to SYS_XXXX tables. This will limit visibility of temp-tables through SYS_XXXX tables. Parallely we have added I_S.INNODB_TEMP_TABLE_INFO table to view all temp-tables active in system when query is fired. - Number of Temp-Tables will be limited by the available memory on the system running the server process. This is because their definition cannot be put on the LRU list and swapped out. This memory is only needed for the table definitions. - As temp-tables information is not logged to SYS_XXXX table while recovering from crash it is not possible to identify orphaned temp-table objects left over by crash. This issue is addressed by WL#6560. - Insert Buffering is disabled for temp-table given the overhead vs its usage in temp-table sceanrio. ============================================================================ - Important Note from documenation perspective: - temp-tablespace default named as ibtmp1 will be located in data-dir. (parallel to ibdata1). (as against tmpdir). - user can change location of temp-tablespace using relative path syntax while specifying innodb_temp_data_file_path (Same as innodb_data_file_path)
DDL optimization is made up off optimizing multiple DDL stmts viz. CREATE/DROP/TRUNCATE/.... etc. We will go over each stmt and will see general actions that are performed in each stmt. That will help us markout actions that can be avoided for temp-tables. For some of these that we need for temp-table we may direct inerface for some we might need to develop a new one. ================================================================ CREATE: ------ - creation of table (temp + non-temp) involves following steps: - create in-memory table (dict_table_t) object along with fields. - create and assign innodb wide unique table-id. - create tablespace (per-table tablespace if configured). - create and insert entry to SYS_TABLES and SYS_FIELDS. - add in-memory object to dictionary cache. - create default pk if user hasn't specified one. - create all indexes as specified by user. this involves: - create in-memory index (dict_index_t) object. - create and assign innodb wide unique index-id. - create and insert entry to SYS_INDEXES/SYS_FIELDS table. - add in-memory object to dictionary cache. - create b-tree object for storing index data and update SYS_XXXX table accordingly. Importantly root-page-no. (root-page-no is also available in index->page_no in-memory structure). - looking at above steps from temp-table perspective we can reduce it to following: - create in-memory table object along with fields - create and assign table-id - add in-memory object to dictionary cache. - create indexes (in-memory objects) and related b-tree object. We are skipping adding of entries to SYS_XXXX tables as it further results in IO + logging actions. Temp-table can be maintained in memory given their lifetime is bounded by connection or server lifetime. While this limit their view through SYS_XXXX tables which is just an informative view and is not needed for correctness. Besides this even from existing action we do avoid logging for temp-table cases viz. during generation of table-id. As temp-table is not persistent to disk (in form of SYS_XXXX) tables we need to ensure temp-table object are not evicted from the memory and so added to non-LRU list. interface changes: Normal create-path will create and execute create-ins-graph-plan. Given temp-table has reduced actions we can directly invoke these actions from create_table_def()/create_index() This decision demands addition or modification of interfaces so that they can be invoked directly from create_XXXX() level. - build_tablespace(): Add build_tablespace interface. This interface will just create tablespace, if configured. Currently this action is embbeded within function that also update SYS_XXXX and so we segregated this as independent interface. Temp-Table path will just invoke this interface. - dict_build_index_def(): Update index information viz. index-id, space-id to in-memory structure. - dict_create_index_tree(): Create B-tree and update root-page-no to in-memory index object. dict_create_index_tree_step() also update root-page-no to SYS_XXXX table. ====================================================================== DROP ---- - dropping of table (temp + non-temp) involves following: - Remove entries from all SYS_XXXX tables. - drop indexes. this include freeing btree nodes too. - remove in-memory object from dictionary cache. - delete tablespace. For temp-table we just need to execute following actions: - drop indexes. - remove in-memory object. - delete tablespace. Dropping of indexes involves freeing of btree-node which is logged action. We avoid logging this action as we haven't logged creation of btree. For triggering this action we importantly need root-page-no. This is still accessible from dict_index_t object (in general case it is loaded from SYS_XXXX table) as temp-table is created in same session. Post these 2 actions we remove the per-table tablespace file if configured. interface change: - dict_drop_index_tree(): drop index tree (free btree-nodes). There is no corresponding SYS_XXXX table entry to update. - existing interface named dict_drop_index_tree() has been renamed to dict_drop_index_tree_step(). This naming is inline with other naming convention as this is just a step as part of drop-plan. ====================================================================== TRUNCATE -------- - truncate of table involves following: - drop and recreate the indexes. this will ensure clean 0 entries. - assign and update new table-id. - if re-create space, is configured then re-create the space and update SYS_XXXX tables. For temp-table same actions are present except way of getting page_no is different. for temp-table we get it directly from in-memory index object. For general table it is loaded by querying SYS_XXXX table. interface change: - dict_truncate_index_tree(): drop index tree (free btree nodes) and re-create new index tree using same index-id. Update index with new root-page-no. - existing interface named dict_truncate_index_tree() has been renamed to dict_truncate_index_tree_step(). This naming is inline with other naming convention as this is just a step as part of truncate-plan. ======================================================================= DISCARD/IMPORT TABLESPACE: -------------------------- - discard tablespace will de-associate per-table tablespace from table. - remove insert buffering reserved buffers. - assign undo segment. - update SYS_XXXX table to reflect discard status. - update SYS_XXXX table to reflect table/index id. - generate and assign new table-id. - remove tablespace. - for temp-table: - for temp-table discard/import of tablespace is blocked. ========================================================================= TURNING OFF LOGGING: -------------------- - Logging is are of 2 types: undo/redo. UNDO logging: undo logging is done to undo/rollback segment allotted within shared tablespace. For DDL stmt there is minimal or no undo logging. If present, we have avoided that by using separate path for temp-table. REDO logging: REDO logging is done to redo-log-files created in data-dir during server start-up. REDO-logging is done on each action (especially during create or drop of objects). Blocking redo logging at logging level when commit is being done is difficult as it is system-wide operation and so difficult to determine which log-stmt corresonds to which object. Instead we do block redo-logging at logical level viz. during creation of objects. For example: dict_hdr_get_new_id() Generation of new table/index id is logged. For temp-table we avoid this logging. Logging is controlled through a mini-trx (mtr). mtr has facility to select logging level. std logging level: MTR_LOG_ALL, MTR_LOG_NONE, MTR_LOG_NO_REDO, MTR_LOG_SHORT_INSERTS with default set to MTR_LOG_ALL. for temp-table we select MTR_LOG_NONE. ====================================================================== HOW TO CHECK FOR TEMP-TABLE? ---------------------------- - dict_table_t (table) object set flag for temporary table. We can test this flag using dict_table_is_temporary(). flags2 & DICT_TF2_TEMPORARY. ====================================================================== COMMON INTERFACE CHANGES: ------------------------- - dict_hdr_get_new_id(): generates table/index/space id. We now pass-in table object so that function can determine if action is for temp-table and accordingly turn-off logging.
Copyright (c) 2000, 2022, Oracle Corporation and/or its affiliates. All rights reserved.