WL#6560: InnoDB: separate tablespace for innodb-temp-tables.
Introduction: ------------ As part of upcoming release we are optimizing temp-tables for performance. To accompolish this goal we are trying to make temp-tables creation and removal light-weight by avoiding persisting temp-table meta-information to disk. To support this move, we need to move temp-tables to a separate tablespace so that recovery stage of temp-tables is single shot and stateless process. As per current flow, db-objects viz. tables and indexes are identified during server start-up using persisted meta-information. For temp-tables this info doesn't exist also there is no needed to recover temp-tables but same time to ensure we clean the system and avoid leaking db-objects we need stateless recovery process. By creating separate tablespace we can simply get rid of this tablespace on each start-up and create it new for upcoming server session. This change indenpendently will not make any +ve/-ve difference in server working but is needed for upcoming temp-table optimization. Visible difference: ------------------- - New separate tablespace named ibtmp1 in data-dir. This tablespace is re-created on each startup (except in case of --innodb_read_only). All non-compressed temp-table resides in this tablespace. (Compressed temp-table continue to use existing approach).
Problem Statement: ------------------ Create a new separate tablespace similar to system-tablespace for persisting innodb-temp-table data. Let's understand current configuration of tablespace: - by default all temp-tables goes to system-tablespace. (except compressed for which file-per-table is must). - user can configure file-per-table option and create a separate tablespace for each temp-table. What are we looking at ? - We need separate tablespace for temp-table so that it can be re-created on each server startup (normal or crash-recovery mode startup). This is needed so that we can avoid persisting information of temp-table to SYS_XXXX tables. - If we continue to use system-tablespace and no information is persisted to temp-table then in crash-recovery scenario we will leak db-objects as there is no reference to find these objects existed and so we can't clean them. - Alternative suggested is to have separate tablespace. So why no use file-per-table option ? - While this help achieve the goal it takes toll on performance. Creating tablespace per table and extended it indiviudually and then removing it is costly operation as against maintaining a common tablespace and extending it. High Level Approach Description: -------------------------------- - Temp-tablespace is created during server startup along with other tablespace. Single tablespace is created for storing all non-compressed temp-tables. User can't dictate creation of this tablespace. It is must to have a temp-tablespace. - Temp-tablespace is configurable similar to system-tablespace. Newly added cnf option for configuring temp-table: --innodb_temp_data_file_path. For example: --innodb_temp_data_file_path="ibtmp1:24M:autoextend" Note: Min. size of temp-tablespace is 12 MB. - Name of data-file(s) and temp-data-file(s) can't be same This is wrong: --innodb_data_file_path="ibdata1:20M;ibdata2:30M:autoextend" --innodb_temp_data_file_path="ibdata1:20M;ibdatatmp2:30M:autoextend" and would result in error in innodb_temp_data_file_path parsing. (Error message: system shared and system temp tablespace file name seems to be same) - Temp-tablespace can't reside on raw device. There is no added advantage of residing temp-tablespace on raw device and so this decision. (Error message: InnoDB: Tablespace doesn't support raw devices) - Inability or error during creation of temp-tablespace is treated as FATAL and will refuse server startup. - Temp-tablespace by default resides in data-dir parallel to ibdata1. User can specify relative path to change the location of temp-tablespace. - Temp-tablespace uses dynamically generate space-id. This space-id can change on each server re-start but that is fine as new temp-tablespace is created on every re-start. With this alternative, we also safe guard this enhancement with previous version compatibility. Using existing database with this-enhancement-folded mysql will continue to operate as space-id for temp-tablespace is dynamically generated. (Ensures no conflicts with existing space-ids). - All non-compressed temp-table (user or system generated) by default get assigned to this new tablespace. This also means all other related objects viz. indexes, etc... co-exist in same tablespace. - Compressed temp-tables continue to operate using existing semantics. - Temp-Tablespace being special kind of tablespace (similar to system-tablespace) there is no entry made to SYS_DATAFILES table about this tablespace. - Under readonly mode user is not allowed to create temp-table. There is no concept of existing temp-tables and so under this mode we don't touch temp-tablespace (leave it as is). - file-per-table=on by default. For non-compressed temp-tables we just ignore this flag and ensures non-compressed temp-tables resides in shared temp-tablespace. - Temp Tablespace is removed: - on normal shutdown (ir-respective of mode of shutdown 0,1,2) - on init abort (this can happen if user specfies wrong startup options). - On crash, temp-tablespace is not removed. DBA is free to remove it or if DBA re-starts server with same configuration then it will be removed on next restart.
Low Level Design: ----------------- Draft below helps understand changes needed to enable separate tablespace for temp-tables. Parse and store Temp-Tablespace options: innobase_init(): Parses options specified by user (through cnf or commandline) for innodb_data_file_path. Parallel to this add a parsing logic of innodb_temp_data_file_path. We can re-use the same function as used by innodb_data_file_path. Post parsing add logic to perfom quick sanity check to catch out file-name conflicts between system and temp tablespace files. If intersection is detected server start is aborted. It is must to do these sanity checks before we start the server to ensure we don't land up in irreversible state. - Sanity checks: - Ensure there is no name conflicts between system and temp-tablespace. File-names list of system and temp should mutually exclusive. For example: if file name are same then we might delete ibdata1 assuming it as temp-tablespace file which means we might loose lot of important data and put system in irreversible state. On completing all sanity check it is safe to delete temp-tablespace. (Note: Avoid deleting temp-tablespace on shutdown? Why? This will not take care of its removal during crash/recovery.) Create temp-tablespace innobase_start_or_create_for_mysql(): This function has existing hooks to open system-tablespace followed by other tablespace and then recovery. As temp-tablespace doesn't have anything to recovery, its creation is delayed till recovery is completed. Once recovery is done we create temp-tablespace. srv_open_tmp_tablespace(): - Get new space-id using dict_hdr_get_new_id interface. This ensures a next free space-id is alloted. Assign it to temp-tablespace. Will continue to use this space-id for all temp-tables (including dependent objects) for current server life-time. - Create temp-tablespace. (Physically create files on disk as configured by user). - Initialize temp-tablespace by writing header information. (fsp_header_init()) - Also, create space object in central space structure with respective space-id. (fil_space_create). - On successfully completion of this function temp-tablespace is ready for use. /******************************************************************** Create the temporary file tablespace. @return DB_SUCCESS or error code. */ static dberr_t srv_open_tmp_tablespace( Ignore File Per Table for non-compressed temp-table With recent release (I guess 5.6) file-per-table is default. For non-compressed temp-tables we need to ignore this flag and continue to reside then in temp-tablespace. Disabling of this feature is goverened at top-level function that dictates and set dict_table_t flag accordingly. innobase_table_is_noncompressed_temporary() - ignore file-per-table (use_tablespace=false). /*********************************************************************//** Check if table is non-compressed temporary table. @return true if non-compressed temporary table. */ UNIV_INTERN bool innobase_table_is_noncompressed_temporary( /*======================================*/ HA_CREATE_INFO* create_info, TABLE* form) Assigning of temp-space-id: dict_build_table_def_step(): is used to create per-table tablespace if configured. For other table space id is defaulted to 0 which means system-tablespace. We want non-compressed temp-table to reside in temp-tablespace and so we add conditional check to update space-id accordingly. Index and other related objects inherit table space-id and so resides in temp-tablespace.
Copyright (c) 2000, 2023, Oracle Corporation and/or its affiliates. All rights reserved.