WL#7540: Move properties from TABLE to TABLE_LIST: table number and map
Status: Complete
The lifetime of a TABLE object is one execution, whereas a TABLE_LIST object lasts as long as the prepared statement it is used in. Hence, it may be reasonable to collect table-specific properties that are constant through all executions of a query in the TABLE_LIST object. Examples of such properties are table number and table map. This will also focus the TABLE_LIST object as the optimizer's view of a table throughout query preparation and optimization. The benefit of this change is that we only need to calculate these properties once, and less work needs to be done for each execution. User Documentation ================== Code refactoring. No user documentation needed.
This is a refactoring worklog. There are no functional changes. Because some property calculations are carried out when a query is prepared, rather than for every execution, a small performance gain is expected. However, it may be too small to be noticed in tests.
Table number and table map -------------------------- Table number and table map are two properties of a table used solely by the optimizer. They are used to determine order of expression evaluation, and to transform outer join to inner join. The properties are constant throughout the lifetime of a query, but as they are located in struct TABLE, they must be calculated for each execution. Notice however that not all tables have a corresponding TABLE_LIST object. Due to the way we calculate used tables information, all fields represented in the optimizer require a table number. If there is no corresponding TABLE_LIST, we always assign table number zero, and the corresponding table map 1. This is OK, as it is the sole table accessed on this "level". These tables have no corresponding TABLE_LIST object: 1. Intrinsic temporary table. These are helper tables used during sorting and grouping. When read, these tables are always used alone, except for references to outer tables, hence it is OK to assign the table number zero. Example of use: We have a subquery with three tables that are joined together. The join also includes an outer reference. The subquery is grouped, thus the result of the join must be collected in an intrinsic table, which is then grouped, before the result can be used. During join, the three tables are numbered 0, 1 and 2. Outer references are also read as part of the join. After join, the intrinsic table is populated with the result, and the grouping is performed. After grouping, data is then read from this intrinsic table, which is now given the table number zero. As we see, this table is never accessed together with the original table zero used in the subquery, hence it is acceptable to use table number zero for this table. 2. Stored procedure variables collection The local variables in a stored procedure are represented as a collection of fields in an imaginary table. This imaginary table is represented with a TABLE object but no TABLE_LIST object. JOIN_TAB changes ---------------- JOIN_TAB is the class that represents a table with its position in the join order. It contains a pointer "table" that references the TABLE representation of a table. A TABLE_LIST representation of the table can be reached through table->pos_in_table_list. In order to have quicker access to the TABLE_LIST object, JOIN_TAB is extended with a field table_ref pointing to the TABLE_LIST. Notice that table_ref is NULL for intrinsic temporary tables. Item_field changes ------------------ An essential property of an Item (and hence Item_field) is used_tables(), which makes it possible to calculate the optimal position for the evaluation of the Item. used_tables() for Item_field is currently calculated based on the field pointer, which references the TABLE object the field belongs to. The table map information is found here. When table map information is moved to TABLE_LIST, it makes sense to access TABLE_LIST directly from Item_field through a new table_ref pointer. Notice that this pointer may be NULL, e.g. for intrinsic temporary tables used by the optimizer. We assume that this table is the sole one on this "level" of execution (it is certainly so for an intrinsic table), and it will represent table number zero on that "level". Removal of references to pos_in_table_list. ------------------------------------------- Ideally, we should access class objects in a hierarchy, from top to bottom. For tables and items, these might mean that TABLE is always accessed through TABLE_LIST, and that the reverse pointer pos_in_table_list is not needed. It is a goal of this WL to eliminate as many as possible of these pointer references, as long as no major refactoring is needed. Hopefully, this reverse pointer can be completely removed later. - opt_sum_query() may use TABLE_LIST more internally. This can be done by minor changes to some internal interfaces: find_key_for_maxmin(), reckey_in_range(), maxmin_in_range() and matching_cond(). - Replace TABLE pointer in Item_func_match with TABLE_LIST pointer This can be done by a minor change to the interface allows_search_on_nonindexed_columns(), and by storing a TABLE_LIST pointer in class Item_func_match instead o a TABLE pointer. - Replace TABLE pointer in Key_use with TABLE_LIST pointer. Also replace the Field pointer in Key_field with Item_field, as this is everywhere directly available and TABLE_LIST cam easily be extracted from Item_field.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.