WL#6560: InnoDB: separate tablespace for innodb-temp-tables.

Status: Complete   —   Priority: Medium


As part of upcoming release we are optimizing temp-tables for performance.
To accompolish this goal we are trying to make temp-tables creation and removal
light-weight by avoiding persisting temp-table meta-information to disk.

To support this move, we need to move temp-tables to a separate tablespace
so that recovery stage of temp-tables is single shot and stateless process.

As per current flow, db-objects viz. tables and indexes are identified during
server start-up using persisted meta-information. For temp-tables this info 
doesn't exist also there is no needed to recover temp-tables but same time 
to ensure we clean the system and avoid leaking db-objects we need stateless
recovery process. By creating separate tablespace we can simply get rid of this
tablespace on each start-up and create it new for upcoming server session.

This change indenpendently will not make any +ve/-ve difference in server
working but is needed for upcoming temp-table optimization.

Visible difference:

- New separate tablespace named ibtmp1 in data-dir.
  This tablespace is re-created on each startup (except in case of
  All non-compressed temp-table resides in this tablespace.
  (Compressed temp-table continue to use existing approach).

Problem Statement:

Create a new separate tablespace similar to system-tablespace for persisting
innodb-temp-table data.

Let's understand current configuration of tablespace:
- by default all temp-tables goes to system-tablespace.
  (except compressed for which file-per-table is must).
- user can configure file-per-table option and create a separate tablespace
  for each temp-table.

What are we looking at ?
- We need separate tablespace for temp-table so that it can be re-created on 
  each server startup (normal or crash-recovery mode startup).
  This is needed so that we can avoid persisting information of temp-table
  to SYS_XXXX tables.

- If we continue to use system-tablespace and no information is persisted to 
  temp-table then in crash-recovery scenario we will leak db-objects 
  as there is no reference to find these objects existed and so we can't 
  clean them.

- Alternative suggested is to have separate tablespace. 
  So why no use file-per-table option ?
  - While this help achieve the goal it takes toll on performance.
  Creating tablespace per table and extended it indiviudually and then removing it
  is costly operation as against maintaining a common tablespace and extending it.

High Level Approach Description:

- Temp-tablespace is created during server startup along with other tablespace.
  Single tablespace is created for storing all non-compressed temp-tables.
  User can't dictate creation of this tablespace. It is must to have a 

- Temp-tablespace is configurable similar to system-tablespace.
  Newly added cnf option for configuring temp-table: --innodb_temp_data_file_path.
  For example: --innodb_temp_data_file_path="ibtmp1:24M:autoextend"
  Note: Min. size of temp-tablespace is 12 MB.

- Name of data-file(s) and temp-data-file(s) can't be same 
  This is wrong:
  and would result in error in innodb_temp_data_file_path parsing.
(Error message: system shared and system temp tablespace file name seems to be same)

- Temp-tablespace can't reside on raw device. There is no added advantage of
  residing temp-tablespace on raw device and so this decision.
(Error message: InnoDB: Tablespace doesn't support raw devices)

- Inability or error during creation of temp-tablespace is treated as FATAL
  and will refuse server startup.

- Temp-tablespace by default resides in data-dir parallel to ibdata1.
  User can specify relative path to change the location of temp-tablespace.

- Temp-tablespace uses dynamically generate space-id. This space-id can change
  on each server re-start but that is fine as new temp-tablespace is created 
  on every re-start.
  With this alternative, we also safe guard this enhancement with previous version
  compatibility. Using existing database with this-enhancement-folded mysql will
  continue to operate as space-id for temp-tablespace is dynamically generated.
  (Ensures no conflicts with existing space-ids).

- All non-compressed temp-table (user or system generated) by default 
  get assigned to this new tablespace. 
  This also means all other related objects viz. indexes, etc... 
  co-exist in same tablespace.

- Compressed temp-tables continue to operate using existing semantics.

- Temp-Tablespace being special kind of tablespace 
  (similar to system-tablespace) there is no entry made to SYS_DATAFILES 
  table about this tablespace.

- Under readonly mode user is not allowed to create temp-table. 
  There is no concept of existing temp-tables and so under this mode we 
  don't touch temp-tablespace (leave it as is).

- file-per-table=on by default. For non-compressed temp-tables we just ignore
  this flag and ensures non-compressed temp-tables resides in 
  shared temp-tablespace.

- Temp Tablespace is removed:
  - on normal shutdown (ir-respective of mode of shutdown 0,1,2)
  - on init abort (this can happen if user specfies wrong startup options).
  - On crash, temp-tablespace is not removed. DBA is free to remove it
    or if DBA re-starts server with same configuration then it will be
    removed on next restart.

Low Level Design:

Draft below helps understand changes needed to enable separate tablespace
for temp-tables.

Parse and store Temp-Tablespace options:
    Parses options specified by user (through cnf or commandline) for 
    Parallel to this add a parsing logic of innodb_temp_data_file_path. 
    We can re-use the same function as used by innodb_data_file_path.

    Post parsing add logic to perfom quick sanity check to catch out 
    file-name conflicts between system and temp tablespace files. 
    If intersection is detected server start is aborted. 
    It is must to do these sanity checks before we start the server
    to ensure we don't land up in irreversible state.
    - Sanity checks:
  	- Ensure there is no name conflicts between system and temp-tablespace.
          File-names list of system and temp should mutually exclusive.
	    For example: if file name are same then we might delete ibdata1 
	    assuming it as temp-tablespace file which means we might loose 
	    lot of important data and put system in irreversible state.

    On completing all sanity check it is safe to delete temp-tablespace.
    (Note: Avoid deleting temp-tablespace on shutdown?
     Why? This will not take care of its removal during crash/recovery.)

Create temp-tablespace
    This function has existing hooks to open system-tablespace followed
    by other tablespace and then recovery. As temp-tablespace doesn't 
    have anything to recovery, its creation is delayed till recovery is 
    completed. Once recovery is done we create temp-tablespace.
	- Get new space-id using dict_hdr_get_new_id interface. This 
          ensures a next free space-id is alloted. 
          Assign it to temp-tablespace. Will continue to use this
	  space-id for all temp-tables (including dependent objects) 
          for current server life-time.
	- Create temp-tablespace. 
          (Physically create files on disk as configured by user).
	- Initialize temp-tablespace by writing header information. 
	- Also, create space object in central space structure with 
	  respective space-id. (fil_space_create).
	- On successfully completion of this function temp-tablespace 
	  is ready for use.

	Create the temporary file tablespace.
	@return DB_SUCCESS or error code. */

Ignore File Per Table for non-compressed temp-table
    With recent release (I guess 5.6) file-per-table is default. 
    For non-compressed temp-tables we need to ignore this flag and continue
    to reside then in temp-tablespace.
    Disabling of this feature is goverened at top-level function that dictates
    and set dict_table_t flag accordingly.

    innobase_table_is_noncompressed_temporary() - 
      ignore file-per-table (use_tablespace=false).

	Check if table is non-compressed temporary table.
	@return true if non-compressed temporary table. */
		HA_CREATE_INFO* create_info,
		TABLE*          form)

Assigning of temp-space-id:
	 is used to create per-table tablespace if configured.
    For other table space id is defaulted to 0 which means system-tablespace.
    We want non-compressed temp-table to reside in temp-tablespace and so we 
    add conditional check to update space-id accordingly.
    Index and other related objects inherit table space-id and so resides 
    in temp-tablespace.