MySQL 8.3.0
Source Code Documentation
sql_tmp_table.h File Reference

Temporary table handling functions. More...

#include <sys/types.h>
#include "my_base.h"
#include "my_inttypes.h"
#include "sql/item.h"
#include "sql/temp_table_param.h"

Go to the source code of this file.


enum  enum_internal_tmp_mem_storage_engine { TMP_TABLE_MEMORY , TMP_TABLE_TEMPTABLE }


TABLEcreate_tmp_table (THD *thd, Temp_table_param *param, const mem_root_deque< Item * > &fields, ORDER *group, bool distinct, bool save_sum_fields, ulonglong select_options, ha_rows rows_limit, const char *table_alias)
bool open_tmp_table (TABLE *table)
TABLEcreate_tmp_table_from_fields (THD *thd, List< Create_field > &field_list, bool is_virtual=true, ulonglong select_options=0, const char *alias=nullptr)
 Create an, optionally reduced, TABLE object with properly set up Field list from a list of field definitions. More...
bool create_ondisk_from_heap (THD *thd, TABLE *table, int error, bool insert_last_record, bool ignore_last_dup, bool *is_duplicate)
 If a MEMORY table gets full, create a disk-based table and copy all rows to this. More...
void close_tmp_table (TABLE *table)
 Close a temporary table at end of preparation or execution. More...
void free_tmp_table (TABLE *table)
 Free temporary table. More...
TABLEcreate_duplicate_weedout_tmp_table (THD *thd, uint uniq_tuple_length_arg, SJ_TMP_TABLE *sjtbl)
 Create a temporary table to weed out duplicate rowid combinations. More...
bool setup_tmp_table_handler (THD *thd, TABLE *table, ulonglong select_options, bool force_disk_table=false, bool schema_table=false)
 Helper function to create_tmp_table_* family for setting up table's SE. More...
bool instantiate_tmp_table (THD *thd, TABLE *table)
 Instantiates temporary table. More...
Fieldcreate_tmp_field (THD *thd, TABLE *table, Item *item, Item::Type type, Func_ptr_array *copy_func, Field **from_field, Field **default_field, bool group, bool modify_item, bool table_cant_handle_bit_fields, bool make_copy_field)
 Create field for temporary table. More...
Fieldcreate_tmp_field_from_field (THD *thd, const Field *org_field, const char *name, TABLE *table, Item_field *item)
 Lifecycle management of internal temporary tables. More...
void get_max_key_and_part_length (uint *max_key_length, uint *max_key_part_length, uint *max_key_parts)
 Get the minimum of max_key_length and max_key_part_length between HEAP engine and internal_tmp_disk_storage_engine. More...
void init_cache_tmp_engine_properties ()
 Initialize the storage engine properties for the alternative temporary table storage engines. More...
void encode_innodb_position (uchar *rowid_bytes, uint length, ha_rows row_num)
 Encode an InnoDB PK in 6 bytes, high-byte first; like InnoDB's dict_sys_write_row_id() does. More...
bool reposition_innodb_cursor (TABLE *table, ha_rows row_num)
 Helper function for create_ondisk_from_heap(). More...

Detailed Description

Temporary table handling functions.

Enumeration Type Documentation

◆ enum_internal_tmp_mem_storage_engine


Function Documentation

◆ close_tmp_table()

void close_tmp_table ( TABLE table)

Close a temporary table at end of preparation or execution.

Any buffers associated with the table will be released. When tmp_open_count reaches zero, the following will happen:

  • If table contents has been created, it will be deleted. When tmp_handler_count reaches zero, the following will happen:
  • The storage handler will be deleted and the plugin will be released.
tableTable reference

◆ create_duplicate_weedout_tmp_table()

TABLE * create_duplicate_weedout_tmp_table ( THD thd,
uint  uniq_tuple_length_arg,

Create a temporary table to weed out duplicate rowid combinations.

thdThread handle
uniq_tuple_length_argLength of the table's column
sjtblUpdate sjtbl->[start_]recinfo values which will be needed if we'll need to convert the created temptable from HEAP to MyISAM/Maria.


Create a temporary table to weed out duplicate rowid combinations. The table has a single column that is a concatenation of all rowids in the combination.

Depending on the needed length, there are two cases:

  1. When the length of the column < max_key_length:


  2. Otherwise (not a valid SQL syntax but internally supported):


The code in this function was produced by extraction of relevant parts from create_tmp_table().

created table NULL on error

◆ create_ondisk_from_heap()

bool create_ondisk_from_heap ( THD thd,
TABLE wtable,
int  error,
bool  insert_last_record,
bool  ignore_last_dup,
bool *  is_duplicate 

If a MEMORY table gets full, create a disk-based table and copy all rows to this.

[in]thdTHD reference
[in]wtableTable reference being written to
[in]errorReason why inserting into MEMORY table failed.
[in]insert_last_recordIf true, the last record(table->record[0]) is inserted into the newly created table after copying all the records from the temp table. If false, the last record is not inserted and the parameters ignore_last_dup, is_duplicate are ignored.
[in]ignore_last_dupIf true, ignore duplicate key error for last inserted key (see detailed description below).
[out]is_duplicateIf non-NULL and ignore_last_dup is true, return true if last key was a duplicate, and false otherwise.

Function can be called with any error code, but only HA_ERR_RECORD_FILE_FULL will be handled, all other errors cause a fatal error to be thrown. The function creates a disk-based temporary table, copies all records from the MEMORY table into this new table, deletes the old table and switches to use the new table within the table handle. The function uses table->record[1] as a temporary buffer while copying.

If the parameter insert_last_record is true, this function assumes that table->record[0] contains the row that caused the error when inserting into the MEMORY table (the "last row"). After all existing rows have been copied to the new table,the last row is attempted to be inserted as well. If ignore_last_dup is true, this row can be a duplicate of an existing row without throwing an error. If is_duplicate is non-NULL, an indication of whether the last row was a duplicate is returned.

If the parameter insert_last_record is false, this function makes no assumptions on the operation and will not try an insert of the last record(table->record[0]). The caller is expected to handle the operation after moving to disk.

that any index/scan access initialized on the MEMORY 'wtable' is not replicated to the on-disk table - it's the caller's responsibility. However, access initialized on other TABLEs, is replicated.

If 'wtable' has other TABLE clones (example: a multi-referenced or a recursive CTE), we convert all clones; if an error happens during conversion of clone B after successfully converting clone A, clone A and B will exit from the function with a TABLE_SHARE corresponding to the pre-conversion table ("old" TABLE_SHARE). So A will be inconsistent (for example s->db_type() will say "MEMORY" while A->file will be a disk-based engine). However, as all callers bail out, it is reasonable to think that they won't be using the TABLE_SHARE except in free_tmp_table(); and free_tmp_table() only uses properties of TABLE_SHARE which are common to the old and new object (reference counts, MEM_ROOT), so that should work. Solutions to fix this cleanly:

  • allocate new TABLE_SHARE on heap instead of on stack, to be able to exit with two TABLE_SHAREs (drawback: more heap memory consumption, and need to verify all exit paths are safe),
  • close all TABLEs if error (but then callers and cleanup code may be surprised to find already-closed tables so they would need fixing). To lower the risk of error between A and B: we expect most errors will happen when copying rows (e.g. read or write errors); so we convert 'wtable' (which does the row copying) first; if it fails, the A-B situation is avoided and we can properly exit with the old TABLE_SHARE.
true if error.

◆ create_tmp_field()

Field * create_tmp_field ( THD thd,
TABLE table,
Item item,
Item::Type  type,
Func_ptr_array copy_func,
Field **  from_field,
Field **  default_field,
bool  group,
bool  modify_item,
bool  table_cant_handle_bit_fields,
bool  make_copy_field 

Create field for temporary table.

thdThread handler
tableTemporary table
itemItem to create a field for
typeType of item (normally item->type)
copy_funcIf set and item is a function, store copy of item in this array
from_fieldif field will be created using other field as example, pointer example field will be written here
default_fieldIf field has a default value field, store it here
group1 if we are going to do a relative group by on result
modify_item1 if item->result_field should point to new item. This is relevant for how fill_record() is going to work: If modify_item is 1 then fill_record() will update the record in the original table. If modify_item is 0 then fill_record() will update the temporary table
table_cant_handle_bit_fieldsif table can't handle bit-fields and bit-fields shall be converted to long
See also
make_copy_fieldif true, a pointer of the result field should be stored in from_field, otherwise the item should be wrapped in Func_ptr and stored in copy_func
Return values
NULLOn error.

◆ create_tmp_field_from_field()

Field * create_tmp_field_from_field ( THD thd,
const Field org_field,
const char *  name,
TABLE table,
Item_field item 

Lifecycle management of internal temporary tables.

An internal temporary table is represented by a TABLE_SHARE object.

The interface to an internal temporary table is through one or more TABLE objects, of which at most one TABLE object is a writer object, the remaining TABLE objects are reader objects. Each TABLE object points to the TABLE_SHARE. TABLE_SHARE::ref_count counts the number of TABLE objects that points to it.

The TABLE, TABLE_SHARE and associated objects (e.g Field objects) are created in a dedicated mem_root. This mem_root is deleted when the TABLE_SHARE object is deleted.

Initially, an internal temporary table is created with one TABLE_SHARE object and one TABLE object. The table is created with no file handler (storage engine) and in the "deleted" state. Later, more TABLE objects may be created against the table, and TABLE_SHARE::ref_count is increased.

An internal temporary table may be instantiated and used multiple times, typically once per execution of a statement.

To instantiate a table, call instantiate_tmp_table(). This function will first assign and lock a storage engine using setup_tmp_table_handler(). The locked engine is assigned to TABLE_SHARE::db_plugin and the file handler is assigned to TABLE::file. After this, calling TABLE::has_storage_handler() reports true.

After this, the table contents is created by calling TABLE::file->create() and the table is opened by calling open_tmp_table(), which itself calls TABLE::file->ha_open(), and sets the TABLE::created flag.

Thus, opening a temporary table is a two-stage operation:

  1. assign and lock a storage engine, and
  2. create the table contents.

Since a temporary table may be in any of the two stages, we use two counter members in the TABLE_SHARE to count the number of TABLEs in each of the stages: tmp_handler_count and tmp_open_count. tmp_handler_count is incremented in setup_tmp_table_handler(). tmp_open_count is incremented in open_tmp_table().

To open an already instantiated table, assign a storage handler by calling setup_tmp_table_handler(), then call open_tmp_table() which will again increment TABLE_SHARE::tmp_open_count and set TABLE::created.

Insert, update, delete and read rows using the active TABLE handlers.

After use, close all active TABLE handlers by calling close_tmp_table(). For simplicity, we may also call close_tmp_table() on a non-active TABLE, as it will check whether a storage handler has been assigned.

If the table is created, TABLE_SHARE::tmp_open_count is decremented. If there are no remaining active TABLE objects, delete the table contents by calling TABLE::file->ha_drop_table(), otherwise close it by calling TABLE::file->ha_close(). Set status of the TABLE to deleted and delete the storage handler. If there are no remaining active tables and the storage engine is still locked, unlock the plugin and disassociate it from the TABLE_SHARE object, and decrement TABLE_SHARE::tmp_handler_count.

After the final instantiation of an internal temporary table, call free_tmp_table() for all associated TABLE objects.

free_tmp_table() can only be called on a non-instantiated temporary table (but handlers may be assigned for other TABLE objects to the same table).. It will decrement TABLE_SHARE::ref_count and the final call will also remove the temporary table's mem_root object. Create field for temporary table from given field.

thdThread handler
org_fieldField from which new field will be created
nameNew field name
tableTemporary table
itemIf item != NULL then fill_record() will update the record in the original table. If item == NULL then fill_record() will update the temporary table
Return values
NULLon error

◆ create_tmp_table()

TABLE * create_tmp_table ( THD thd,
Temp_table_param param,
const mem_root_deque< Item * > &  fields,
ORDER group,
bool  distinct,
bool  save_sum_fields,
ulonglong  select_options,
ha_rows  rows_limit,
const char *  table_alias 

When true, enforces unique constraint (by adding a hidden hash field and creating a key over this field) when: (1) unique key is too long, or (2) number of key parts in distinct key is too big, or (3) the caller has requested it. (4) we have INTERSECT or EXCEPT, i.e. not UNION.

◆ create_tmp_table_from_fields()

TABLE * create_tmp_table_from_fields ( THD thd,
List< Create_field > &  field_list,
bool  is_virtual,
ulonglong  select_options,
const char *  alias 

Create an, optionally reduced, TABLE object with properly set up Field list from a list of field definitions.

When is_virtual arg is true: The created table doesn't have a table handler associated with it, has no keys, no group/distinct, no copy_funcs array. The sole purpose of this TABLE object is to use the power of Field class to read/write data to/from table->record[0]. Then one can store the record in any container (RB tree, hash, etc). The table is created in THD mem_root, so are the table's fields. Consequently, if you don't BLOB fields, you don't need to free it. When is_virtual is false: This function creates a normal tmp table out of fields' definitions, rather than from lst of items. This is the main difference with create_tmp_table. Also the table created here doesn't do grouping, doesn't have indexes and copy_funcs/fields. The purpose is to be able to create result table for table functions out of fields' definitions without need in intermediate list of items.

thdconnection handle
field_listlist of column definitions
is_virtualif true, then it's effectively only a record buffer with wrapper, used e.g to store vars in SP if false, then a normal table, which can hold records, is created
select_optionsoptions for non-virtual tmp table
aliastable's alias
0 if out of memory, TABLE object in case of success

◆ encode_innodb_position()

void encode_innodb_position ( uchar rowid_bytes,
uint  length,
ha_rows  row_num 

Encode an InnoDB PK in 6 bytes, high-byte first; like InnoDB's dict_sys_write_row_id() does.

rowid_byteswhere to store the result
lengthhow many available bytes in rowid_bytes
row_numPK to encode

◆ free_tmp_table()

void free_tmp_table ( TABLE table)

Free temporary table.

When ref_count reaches zero, the table's mem_root allocator is deleted.

tableTable reference

◆ get_max_key_and_part_length()

void get_max_key_and_part_length ( uint *  max_key_length,
uint *  max_key_part_length,
uint *  max_key_parts 

Get the minimum of max_key_length and max_key_part_length between HEAP engine and internal_tmp_disk_storage_engine.

Get the minimum of max_key_length and max_key_part_length between HEAP engine and internal_tmp_disk_storage_engine.

The minimum is between HEAP engine and internal_tmp_disk_storage_engine.

[out]max_key_lengthMinimum of max_key_length
[out]max_key_part_lengthMinimum of max_key_part_length
[out]max_key_partsMinimum of max_key_parts

◆ init_cache_tmp_engine_properties()

void init_cache_tmp_engine_properties ( )

Initialize the storage engine properties for the alternative temporary table storage engines.

◆ instantiate_tmp_table()

bool instantiate_tmp_table ( THD thd,
TABLE table 

Instantiates temporary table.

thdThread handler
tableTable object that describes the table to be instantiated Creates temporary table and opens it.
false if success, true if error

◆ open_tmp_table()

bool open_tmp_table ( TABLE table)

◆ reposition_innodb_cursor()

bool reposition_innodb_cursor ( TABLE table,
ha_rows  row_num 

Helper function for create_ondisk_from_heap().

Our InnoDB on-disk intrinsic table uses an autogenerated auto-incrementing primary key:

  • first inserted row has pk=1 (see dict_table_get_next_table_sess_row_id()), second has pk=2, etc
  • ha_rnd_next uses a PK index scan so returns rows in PK order
  • position() returns the PK
  • ha_rnd_pos() takes the PK in input.
tabletable read by cursor
row_numfunction should position on the row_num'th row in insertion order.

◆ setup_tmp_table_handler()

bool setup_tmp_table_handler ( THD thd,
TABLE table,
ulonglong  select_options,
bool  force_disk_table,
bool  schema_table 

Helper function to create_tmp_table_* family for setting up table's SE.

thdThread handler
tabletable to allocate SE for
select_optionsOptions that may control storage engine selection
force_disk_tabletrue <=> Use InnoDB
schema_tablewhether the table is a schema table
false if success, true if error
In a prepared statement, both preparation and execution may use this function, for a same TABLE. Execution always uses force_disk_table=schema_table=false; this may be inconsistent with what was used at preparation, but it's ok in fact:
  • force_disk_table=true is for semijoin duplicate elimination table, which is execution-only
  • schema_table=true is for schema tables, and they're re-created at each execution.