MySQL 9.0.1
Source Code Documentation
Common_table_expr Class Reference

After parsing, a Common Table Expression is accessed through a Table_ref. More...

#include <table.h>

Public Member Functions

 Common_table_expr (MEM_ROOT *mem_root)
 
TABLEclone_tmp_table (THD *thd, Table_ref *tl)
 Produces, from the first tmp TABLE object, a clone TABLE object for Table_ref 'tl', to have a single materialization of multiple references to a CTE. More...
 
bool substitute_recursive_reference (THD *thd, Query_block *sl)
 Replaces the recursive reference in query block 'sl' with a clone of the first tmp table. More...
 
void remove_table (Table_ref *tr)
 Remove one table reference. More...
 
bool clear_all_references ()
 Empties the materialized CTE and informs all of its clones. More...
 

Public Attributes

Mem_root_array< Table_ref * > references
 All references to this CTE in the statement, except those inside the query expression defining this CTE. More...
 
bool recursive
 True if it's a recursive CTE. More...
 
Mem_root_array< Table_ref * > tmp_tables
 List of all TABLE_LISTSs reading/writing to the tmp table created to materialize this CTE. More...
 
LEX_STRING name
 Name of the WITH block. Used only for EXPLAIN FORMAT=tree. More...
 

Detailed Description

After parsing, a Common Table Expression is accessed through a Table_ref.

This class contains all information about the CTE which the Table_ref needs.

Note
that before and during parsing, the CTE is described by a PT_common_table_expr.

Constructor & Destructor Documentation

◆ Common_table_expr()

Common_table_expr::Common_table_expr ( MEM_ROOT mem_root)
inline

Member Function Documentation

◆ clear_all_references()

bool Common_table_expr::clear_all_references ( )

Empties the materialized CTE and informs all of its clones.

◆ clone_tmp_table()

TABLE * Common_table_expr::clone_tmp_table ( THD thd,
Table_ref tl 
)

Produces, from the first tmp TABLE object, a clone TABLE object for Table_ref 'tl', to have a single materialization of multiple references to a CTE.

How sharing of a single tmp table works

There are several scenarios. (1) Non-recursive CTE referenced only once: nothing special. (2) Non-recursive CTE referenced more than once:

  • multiple TABLEs, one TABLE_SHARE.
  • The first ref in setup_materialized_derived() calls create_tmp_table(); others call open_table_from_share().
  • The first ref in create_derived() calls instantiate_tmp_table() (which calls handler::create() then open_tmp_table()); others call open_tmp_table(). open_tmp_table() calls handler::open().
  • The first ref in materialize_derived() evaluates the subquery and does all writes to the tmp table.
  • Finally all refs set up a read access method (table scan, index scan, index lookup, etc) and do reads, possibly interlaced (example: a nested-loop join of two references to the CTE).
  • The storage engine (MEMORY or InnoDB) must be informed of the uses above; this is done by having TABLE_SHARE::ref_count>=2 for every handler::open() call. (3) Recursive CTE, referenced once or more than once: All of (2) applies, where the set of refs is the non-recursive ones (a recursive ref is a ref appearing in the definition of a recursive CTE). Additionally:
  • recursive refs do not call setup_materialized_derived(), create_derived(), materialize_derived().
  • right after a non-recursive ref has been in setup_materialized_derived(), its recursive refs are replaced with clones of that ref, made with open_table_from_share().
  • the first non-recursive ref in materialized_derived() initiates the with-recursive algorithm: its recursive refs call open_tmp_table(). Then writes (to the non-recursive ref) and reads (from the recursive refs) happen interlaced.
  • a particular recursive ref is the UNION table, if UNION DISTINCT is present in the CTE's definition: there is a single TABLE for it, writes/reads to/from it happen interlaced (writes are done by Query_result_union::send_data(); reads are done by the fake_query_block's JOIN).
  • Finally all non-recursive refs set up a read access method and do reads, possibly interlaced.
  • The storage engine (MEMORY or InnoDB) must be informed of the uses above; this is done by having TABLE_SHARE::ref_count>=2 for every handler::open() call.
  • The Server code handling tmp table creation must also be informed: see how Query_result_union::create_result_table() disables PK promotion.

How InnoDB manages the uses above

The storage engine needs to take measures so that inserts and reads don't corrupt each other's behaviour. In InnoDB that means two things (

See also
row_search_no_mvcc()): (a) A certain way to use its cursor when reading (b) Making the different handlers inform each other when one insertion modifies the structure of the index tree (e.g. splits a page; this triggers a refreshing of all read cursors).

Requirements on tmp tables used to write/read CTEs

The internal tmp table must support a phase where table scans and insertions happen interlaced, either issued from a single TABLE or from multiple TABLE clones. If from a single TABLE, that object does repetitions of {"write rows" then "init scan / read rows / close scan"}. If from multiple TABLEs, one does "write rows", every other one does "init scan / read rows / close scan". During this, neither updates, nor deletes, nor any other type of read access than table scans, are allowed on this table (they are allowed after the phase's end). Any started table scan on this table:

  • must remember its position between two read operations, without influence from other scans/inserts;
  • must return rows inserted before and after it started (be catching up continuously) (however, when it reports EOF it is allowed to stop catching up and report EOF until closed).
  • must return rows in insertion order.
  • may be started from the first record (ha_rnd_init, ha_rnd_next) or from the record where the previous scan was ended (position(), ha_rnd_end, [...], ha_rnd_init, ha_rnd_pos(saved position), ha_rnd_next).
  • must return positions (handler::position()) which are stable if a write later occurs, so that a handler::rnd_pos() happening after the write finds the same record.

Cursor re-positioning when MEMORY is converted to InnoDB

See create_ondisk_from_heap(). A requirement is that InnoDB is able to start a scan like this: rnd_init, rnd_pos(some PK value), rnd_next.

Parameters
thdThread handler
tlTable reference wanting the copy
Returns
New clone, or NULL if error

◆ remove_table()

void Common_table_expr::remove_table ( Table_ref tr)

Remove one table reference.

◆ substitute_recursive_reference()

bool Common_table_expr::substitute_recursive_reference ( THD thd,
Query_block sl 
)

Replaces the recursive reference in query block 'sl' with a clone of the first tmp table.

Parameters
thdThread handler
slQuery block
Returns
true if error

Member Data Documentation

◆ name

LEX_STRING Common_table_expr::name

Name of the WITH block. Used only for EXPLAIN FORMAT=tree.

◆ recursive

bool Common_table_expr::recursive

True if it's a recursive CTE.

◆ references

Mem_root_array<Table_ref *> Common_table_expr::references

All references to this CTE in the statement, except those inside the query expression defining this CTE.

In other words, all non-recursive references.

◆ tmp_tables

Mem_root_array<Table_ref *> Common_table_expr::tmp_tables

List of all TABLE_LISTSs reading/writing to the tmp table created to materialize this CTE.

Due to shared materialization, only the first one has a TABLE generated by create_tmp_table(); other ones have a TABLE generated by open_table_from_share().


The documentation for this class was generated from the following files: