WL#5275: Process subqueries in FROM clause in the same way as view
Status: In-Progress
Both derived tables and views (also known as named derived tables) are represented in the server in the same way, but the processing is a bit different in the preparation stage. These differences are caused partly by slightly different nature, but mostly by mis-design. Some design flaws were addressed in WL#5274. The purpose of this worklog is to unify handling of derived tables and views as much as possible. Currently, subqueries in the FROM clause (derived tables) are unconditionally materialized while views created from the same query expressions are sometimes materialized and sometimes merged into the outer query. This behaviour, beside being inconsistent, can lead to a serious performance penalty. Effectively, materialization means that a derived table or view is materialized into an intermediate temporary table. This is often causing significant slowdown of query execution. One of numerous examples is the BUG#59203 - SERIOUS OPTIMIZER FAILURE WHEN QUERY IS ENCAPSULATED IN ADDITIONAL SELECT. In this bug, a user reports slowdown of almost 1000 times - from 0.03s to 29s. Another problem with unnecessary materialization is that currently the optimizer is unable to push any optimizations down to derived tables, e.g. constant and equality propagation do not work trough inner-outer query boundary. This could be illustrated with the following example: a condition in the outer query that could significantly reduce amount of data needed to be processed by subquery can be applied too late - server has to materialize a large derived table first, and use the restrictive condition when reading from this materialized derived table. Obviously, this can lead to a significant performance penalty. Users work around this flaw by wrapping derived tables into views, which don't have such problems. This isn't always possible due to a variety of reasons, e.g such queries could be auto-generated by some software that cannot be altered. The solution to this problem is to unify the preparation stage of views and derived tables so that the merge procedure which is available for views can also be applied to derived tables. Currently, outer references are not allowed inside derived tables. To process outer references in derived tables, resolving order needs to be changed so that outer references can be resolved in the context of a resolved outer query. This means that column references of the outer query may be looked up when resolving the derived table, by following outer context pointers of name resolution contexts used by the derived table to locate the outer references. This is an example of a query involving a derived table with an outer reference: SELECT * FROM t1 JOIN t2 ON t1.a=t2.a WHERE t1.a IN (SELECT it.a FROM (SELECT a FROM t3 WHERE t1.b=t3.c) AS it); Notice that a derived table may not refer to tables on the same level. Thus, this query is invalid: SELECT * FROM t1, (SELECT a FROM t3 WHERE t1.b=t3.c) AS dt; This WL will make it possible to resolve outer references in derived tables by changing the resolver order, but actual implementation will be done in WL#461. Bug reports that may be affected by this work: BUG#59203 SERIOUS OPTIMIZER FAILURE WHEN QUERY IS ENCAPSULATED IN ADDITIONAL SELECT BUG#70220 GROUPING WITH A VIEW MAY REPORT 'INVALID USE OF GROUP FUNCTION' BUG#67631 Subquery in view works slower than if run without view BUG#36802 MySQL only use covered index when selecting from a view (or subquery) with union (Pushdown of conditions into derived tables, not scheduled) BUG#41994 Optimizer do full table scan when use distinct with group by in subquery (Need to check if this is affected) BUG#67631 Subquery in view works slower than if run without view (Need to check if this is affected) TODO list / functionality not implemented in this worklog: ---------------------------------------------------------- - Resolve outer references for derived tables. - Select only relevant columns from materialized derived tables When preparing a derived table or view for materialization, only the selected columns that are also referenced in the outer query should be marked in the read set. - Do not propagate ordering for derived tables to outer query (see F16). - Allow sub queries in the SELECT list for merged derived tables (see F9).
Functional requirements ======================= F1. This worklog applies to derived tables, which are defined by a query expression over a set of other tables (derived tables and base tables), and may be ordered using an ORDER BY clause and constrained using LIMIT and OFFSET clauses. Derived tables may be unnamed or named. Named derived tables are commonly referred to as views. For simplicity, we refer to views and unnamed derived tables as just "derived tables" when this is unlikely to be confused. F2. Allow both named and unnamed derived tables to be merged into outer query block. F3. These statement types allow derived tables to be merged into an outer query block: SELECT, UPDATE, DELETE, INSERT, REPLACE, LOAD, CREATE TABLE with SELECT. Unnamed derived tables cannot be used as update target tables in UPDATE, DELETE, INSERT, REPLACE and LOAD statements. F4. Derived tables with a UNION in its outer-most query expression cannot be merged into the outer query block. Thus, only query expressions consisting of a single query block can be merged (however, that query block may itself contain subqueries). F5. Derived tables that are aggregated cannot be merged into the outer query block. A derived table is aggregated if it consists of a single query block and contains an aggregation function in the SELECT list, or has a GROUP BY clause, or has a HAVING clause, or specifies DISTINCT in the SELECT list. F6. Derived tables with zero table references cannot be merged. F7. Derived tables that have OFFSET or LIMIT clauses cannot be merged. F8. Derived tables that cause side-effects, such as setting user variables, cannot be merged. (See WL#3274 revno 2815). F9. Derived tables that have subqueries in the SELECT list cannot be merged. F10. Views that are created with ALGORITHM=TEMPTABLE will never be merged. Views that are created with ALGORITHM=MERGE will be merged unless some other rule prevents them from being merged. F11. The optimizer_switch derived_merge controls whether derived tables and views will be merged into the outer query block, unless some other rule prevents them from being merged. For a view, an ALGORITHM directive takes precedence over the derived_merge setting. The value "on" enables merging, and the value "off" disables merging. The default value is "on", and this can also be selected with the value "default". F12. After merging, a query block cannot reference more than 61 base tables. This is an implementation limit, mandated by the use of a 64-bit field to represent tables in a query block. If merging a derived table/view would cause more than 61 tables in a query block, materialization will be chosen instead. F13. Merging of derived tables into outer query blocks should be processed at the same time as other query block merging strategies, such as semi-join transformation. F14. Merging of derived tables should be processed in a bottom up manner, just like other query preparation strategies. E.g in this query: SELECT * FROM (SELECT * FROM (SELECT * FROM t1)); table t1 is first merged into its immediately containing query block, then it is merged into the outer-most query block. F15. All derived tables that cannot be merged are materialized before they are needed into a temporary table. This temporary table takes part in query optimization and execution as if it was a regular table. (see WL#5274 for execution details). F16. An ORDER BY clause that is contained within a derived table or view definition is ignored, unless it is also accompanied by a LIMIT or OFFSET clause. Non-functional Requirements =========================== NF1. Performance that does not involve views or derived tables should not be affected. NF2. Performance degradation up to 3% is expected for queries involving merged views or derived tables. This is because all views and derived tables are fully resolved before merging. A major part of this is creating a JOIN object, which is completely unnecessary for merged derived tables, but still required due to our resolved code.
1. Functionality changes 1.1 Merging of unnamed derived tables The major functionality change is that named and unnamed derived tables are treated equally regarding how they are merged with the outer query block. Before this worklog, only named views could be merged with the outer query block, a derived table would always be materialized. This worklog also describes the criteria for when a derived table can be merged into the outer query block and when it will be materialized. The criteria are largely unchanged, but are specified for documentation purposes. In addition to existing criteria, an optimizer_switch that specifies whether to merge or materialize derived tables will be implemented. 1.2 Allowing full resolution of view references Currently, view references are expanded in the function mysql_make_view() which is called during opening of tables. If the view conforms to the rules for view merging, its contents are merged directly into the query block of the embedding query. After this worklog, view merging no longer takes place in mysql_make_view(). Instead, a full resolution is performed for all derived tables underlying a query block. This makes sure that the view is actually syntactically sane (earlier processing relied on separate code to validate this), and it gives us the opportunity to consider the full context of the query. This means that we can resolve outer references properly (see below) and take decisions on which derived tables to merge more intelligently than a pure heuristics based approach as is done currently. 1.3 Consolidation of query block merging We currently support two ways of query block merging: - Derived tables merging - Semi-join merging The techniques are quite similar and in some ways the two techniques may compete and prevent each other. Currently, the two techniques are applied at very different stages in the query optimization. Derived table merging is applied when opening tables, whereas semi-join merging is applied during the optimization phase. The goal of this worklog is to prepare for query block merging to take place at logically the same place. Currently, the natural place is in the function that resolves the outer query block, ie SELECT_LEX::prepare(). When all subqueries below this query block have been resolved, look at them all and determine which to merge into the outer query block. This goal will be accomplished for derived table and view merging in this worklog. The same goal has already been accomplished for semi-join transformation in move permanent transformation from JOIN::optimize() to JOIN::prepare(). 1.4 Column privileges are validated differently Privileges to columns of a view have been validated in two phases: first, a want_privilege has been set up for each table and view, then each column has been validated against this privilege, during resolving of columns. This practice is sufficient for e.g SELECT, as all SELECT_ACL privilege is required uniformly for all columns. However, for INSERT we may want INSERT_ACL for columns that are inserted and UPDATE_ACL for columns specified in ON DUPLICATE UPDATE. And for multi-table UPDATE, SELECT_ACL privilege is required for referenced columns and UPDATE_ACL is required for columns from those tables that are updated. This two-phase approach is error-prone, as there is quite a long distance between when the want_privilege is set and the column privileges are checked. A second challenge is that we want to resolve a view without a specific operation context, ie. without knowing whether the view is selected from, delete from, updated or inserted into. Neither do we know which columns from the view that are being used. Example: CREATE VIEW v AS SELECT a, b FROM t; INSERT INTO v(a) VALUES (10); In this INSERT statement, we need INSERT privileges for column a, but we can ignore column b. However, the resolver needs to validate that both a and b are in fact valid columns in the underlying table of view v. We try to solve this problem by delaying the privilege checking of view columns until the outer query is resolved. A third challenge is that we want to prepare for a future worklog where prepared statements are resolved once only. Currently, the resolver is called for each execution, and it is simple to check column privileges while resolving columns. When resolving occurs only once, we have to perform a dedicated privilege check for each execution. This is reasonably simple, but sometimes, "non-wrapped" view column references are used. Wrapped column references means that an Item_ref object is used for a view column, which points to an Item_field object describing the base table column, but non-wrapped view column references only contain the Item_field. Privilege checking needs all this information, as we have to ensure that we have proper privileges both to the view and to the base table. However, with "non-wrapped" items, we can only check column privileges for the base table, and not for view columns. We solve this problem by eliminating the use of "non-wrapped" view columns. 2.0 Test changes 2.1 Functionality coverage for derived tables and views All functionality specified in Functional Requirements is covered in derived.test and view.test. 2.2 Improved functionality coverage of column view privileges. Extensive testing of column privilege checking for all SELECT and data change operations has been added to view_grant.test. 2.3 Existing coverage tests Existing coverage tests with derived tables have several plan changes because they now use merging instead of materialization. Generally, these changes lead to less cost and less handler operations being executed. 2.4 Existing bug fix tests Bug fix tests that test materialization-related issues are executed with optimizer_switch='derived_merge=off' to preserve old behaviour. 2.5 Change in handling of CHECK TABLE A CHECK TABLE test in view.test around line 1970 changes the message from "error: Corrupt" to "status: Operation failed" for a non-existing function referenced inside a view. This change is due to the way views are resolved now: They used to be resolved when opening a table, now they are resolved within the regular resolver hierarchy. One way to get back the old message would be to report an error when bringing function into cache and the function does not exist. Now the code returns silently and wait until resolver time to set error.
1. Overall design 1.1 Resolving derived tables and view references A reference to a derived table or a view (hereafter only referred to as derived tables) is resolved during the regular recursive resolver stage implemented in SELECT_LEX::prepare(): 1. Derived tables are resolved after tables have been opened and after the local tables of a query block have been set up. 2. Transformations, such as merging derived tables into the enclosing query block, are done using a postfix traversal of the tree. This way, we ensure that all outer information is available when resolving a derived table, and when a derived table has been resolved, it has also been transformed appropriately. Example: A SELECT query with a derived table containing an IN subquery. The derived table can be merged and the IN subquery can be transformed to a semi-join: SELECT t1.a, dt.a from t1 JOIN (SELECT * FROM t2 WHERE t2.c IN (SELECT t3.d FROM t3 ) ) AS dt ON t1.a=dt.b WHERE ... Processing will be carried out like this: Resolve outer-most query block (JOIN of t1 and derived table) Resolve derived table dt Resolve inner query (subquery) Transform inner query into a semi-join, merge with derived table dt Transform derived table, merge with outer-most query block Indentations are meant to describe recursion levels. UPDATE, DELETE and INSERT operations are represented with a hierarchy of SELECT_LEX nodes, where the root node represents the UPDATE, DELETE or INSERT, and the inner nodes represent subqueries. The root nodes in these operations are processed similar to a SELECT query, in the sense that the root node is resolved first, followed by recursively resolving the inner nodes. This means that tables referred by the data change operation are set up before resolving the inner nodes. The difference lies in the way tables are processed (for a SELECT, tables are selected from, for an UPDATE, tables are updated, for a DELETE, tables are deleted from...), and the different clauses supported by said operations. Generally, we process derived tables of a query block using the function SELECT_LEX::resolve_derived(). This function operates in three steps (here we assume that there is at least one derived table to process): 1. Resolve the derived tables of the query block. Involves looking up tables, columns, resolving expressions, checking semantics, applying global transformations. The function TABLE_LIST::resolve_derived() does this. 2. Choose the derived tables to be merged into the current query block. This is currently done in the listed order, but could later be done in a more optimal way. This is carried out by SELECT_LEX::merge_derived(). 3. Set up the remaining derived tables for materialization, using TABLE_LIST::setup_materialized_derived(). See section 1.3.7 for more details. Operations that handle only one table, ie single-table UPDATE, DELETE and INSERT are simplified: If the table to be handled is a view, that view is resolved using TABLE_LIST::resolve_derived() and merged using SELECT_LEX::merge_derived() unconditionally. 1.2 Privilege handling For functional changes, see HLS section 1.4. 1.2.1 Current privilege handling This is how privilege handling is done in versions up to 5.6: - All statements have a precheck function that does rudimentary privilege checking on all tables and views referenced directly in the statement. SELECT statements require SELECT_ACL privilege, INSERT statements require INSERT_ACL privilege, etc. - View expansion is done recursively in open_tables(), through function mysql_make_view(). Underlying tables are opened. Table references for views are created. View text is expanded into table expressions. Security contexts are defined for the looked up table references. Rudimentary privilege check for these tables is performed (rudimentary = any privilege is accepted). - SELECT_LEX::prepare() further checks table privileges for the underlying tables of views, with privileges assigned based on the operation (ie SELECT, UPDATE, INSERT or DELETE). In addition, there are some specific table-level privilege checks for e.g multi-table UPDATE operations. - Finally, column privileges are checked twice for each column, first for the underlying table and second for the view. The wanted privilege is determined based on how the column is used in the current operation, (In a SELECT, the privilege is always SELECT_ACL. In an UPDATE it is UPDATE_ACL for columns in the SET clause and SELECT_ACL for all other columns, etc.) Notice that we have two levels of privilege checking: 1. Check table-level privileges by using check_access() and check_grant(). check_grant() will also set a want_privilege for later column privilege checks. These checks are done using precheck functions and scattered across resolving. 2. Check column-level privileges during resolving of conditions and expression lists, using the want_privilege assigned from the latest table privilege check on the table that the column belongs to. 1.2.2 Problems with the current privilege handling 1.2.2.1 Column privilege checks are dependent upon table privileges Column privilege checking does not supply the required privilege. Instead, it relies on the proper table privileges being set up earlier. As these calls may be stretched out across the optimizer code, maintaining the proper dependencies between these calls can be challenging. This procedure can be simplified if column privilege checking directly provide the privileges that are needed. 1.2.2.2 Column list of view will be resolved earlier than current practice Currently, views are resolved with SELECT_LEX::prepare() only when they are materialized. Merged views are resolved as part of the context they are used within. This WL proposes to determine whether to materialize or merge a view at a later stage. As a consequence, it is necessary to resolve all view references earlier. However, column privilege checking is currently closely associated with this function. This is OK for all conditions and groupings of the view, but for the select list, we do not know the context in which the selected columns will be used. In fact, we do not even know which of the columns of the view that will be used later. Thus, we have to change the way privileges are checked for the view columns, by delaying the privilege checking of the view until the place where the columns are used. For an expression belonging to the view definition, this is still the same place as before. For a view column that is used by an outer query, this privilege checking occurs when the outer query is resolved. Example: CREATE VIEW v1 AS SELECT a FROM t1 WHERE b > 0; UPDATE v1 SET a=5; Column b is checked for SELECT_ACL privilege when the view v1 is resolved. Column a is checked for UPDATE_ACL privilege when the outer query is resolved. Notice also that in the latter case, we actually check column privileges twice, both for v1.a and for t1.a, and both must support the UPDATE_ACL privilege. 1.2.2.3 Lack of support for repeated privilege checking Currently, privilege checking is interspersed with query resolving. However, we have a WL#6570 that attempts to refactor statement planning so that statement resolving happens only during statement preparation, even though the query is executed multiple times. Nevertheless, we will have to perform privilege checking for each execution, because the query planner is not signalled when privileges are granted or revoked. Thus, privilege checking might be performed like this: On query preparation, check privileges while resolving the query. (This will also ensure that non-prepared statement execution is still performant). On each execution, resolving is not necessary. Thus, we need to perform an explicit step that checks privileges for all involved tables, views and columns. (We must check privileges for the first execution, because we have no guarantee that some privileges have not been revoked between preparation and first execution). There is also another complication with repeated privilege checking: For the UPDATE query above, we need to check privileges both for v1.a and for t1.a. This is fairly simple when a column is represented by a "wrapped item", essentially an Item_field pointed to by an Item_ref for each recursively defined view. In the above case, Item_field would reference the table t1, and there would exist an Item_ref object that references this Item_field and also the view v1. With this structure, it is quite simple to check that one has the proper privileges for all involved tables and views. However, with non-wrapped items, only the Item_field object is exposed, and this does not carry sufficient information to checking privileges against views. 1.2.3 Privilege handling This is the new design for privilege handling: - All statements have a precheck function that does rudimentary privilege checking on all tables and views referenced in the statement. This is identical to the current handling. For fast column privilege checking, make sure that table privileges are copied to the TABLE_LIST::grant struct. - During view expansion, check privileges for all referenced views and tables. This is also the same as before, and does a SELECT_ACL check only. For fast column privilege checking, make sure that table privileges are copied to the TABLE_LIST::grant struct. - When resolving columns, make sure that we have proper privileges for each column. The desired privilege is determined from the context (ie select list, set clause, etc.) The privilege is checked against all involved views and tables. This check is similar both for an outer query and for subqueries, except that the select list of a subquery is not checked. Instead, columns selected from a view are privilege-checked in the context they are used in an outer query block. In order to have sufficient context for view privilege checking during execution of prepared statements, all view column references will be fully wrapped. See section 1.2.2.3 for explanation). In the future, when we have prepared statements that are resolved once, privilege checking will take place like this for each execution: - First, precheck all tables and views, just like for the preparation. - Second, precheck all tables and views referenced from views, just like for the preparation. - Third, check privileges for all involved columns, just like for the preparation. The following sections outline the privilege handling for DML statements that contain view references that are merged into the outer query. Assume a table "t": CREATE TABLE t(a INTEGER, b INTEGER); Assume a view "v": CREATE VIEW v AS SELECT * FROM t; 1.2.4 Processing SELECT statement: SELECT * FROM v WHERE a > 100; 1. Precheck query ==> Make sure that we have some SELECT_ACL privilege for view "v" 2. Open view "v" ==> Expand view, create query block and table reference for table "t". ==> Set security context for table "t" ==> Check ANY privileges for table "t" 3. Prepare query block for the view "v" ==> Check SELECT_ACL privileges for columns referenced in WHERE, GROUP BY, HAVING, ORDER BY. 4. Prepare query block for outer query: ==> Merge query block for the view "v" into outer query. ==> Check SELECT_ACL privileges for columns referenced in select list, WHERE, GROUP BY, HAVING, ORDER BY. 1.2.5 Processing INSERT statement: INSERT INTO v(a, b) VALUES (10, 100); 1. Precheck query ==> Make sure that we have some INSERT_ACL privilege for the view "v" ==> Make sure that we have some SELECT_ACL privilege for any other tables. 2. Open view "v" ==> Expand view, create query block and table reference for table "t". ==> Set security context for table "t" ==> Check that we have ANY privileges for table "t" 3. Prepare query block for view "v": ==> Check that we have SELECT_ACL privileges for columns referenced in WHERE. 4. Prepare query block for outer query: ==> Merge query block for the view "v" into outer query. ==> If INSERT ... SELECT, do as for SELECT for selected tables ==> Check INSERT_ACL privileges for columns in INSERT list ==> Check UPDATE_ACL privileges for columns in ON DUPLICATE list 1.2.6 Processing UPDATE statement UPDATE v SET b=5 WHERE a > 100; 1. Precheck query ==> Make sure that we have some UPDATE_ACL privilege for the view "v" ==> Make sure that we have some SELECT_ACL privilege for tables in subqueries. 2. Open view "v" ==> Expand view, create query block and table reference for table "t". ==> Set security context for table "t" ==> Check that we have ANY privileges for table "t" 3. Prepare query block for view "v" ==> Check SELECT_ACL privileges for columns referenced in WHERE clause. 4. Prepare query block for outer query: ==> Merge query block for view "v" into outer query. ==> Check UPDATE_ACL privileges for columns in SET clause. ==> Check SELECT_ACL privileges for columns referenced in SET expressions and in WHERE. 1.2.7 Processing DELETE statement DELETE FROM v WHERE a > 100; 1. Precheck query ==> Make sure that we have DELETE_ACL privilege for the view "v" ==> Make sure that we have some SELECT_ACL privilege for tables in subqueries. 2. Open view "v" ==> Expand view, create query block and table reference for table "t". ==> Set security context for table "t" ==> Check that we have ANY privileges for table "t" 3. Prepare query block for view "v" ==> Check SELECT_ACL privileges for columns referenced in WHERE clause. 4. Prepare query block for outer query: ==> Merge query block for view into outer query. ==> Check SELECT_ACL privileges for columns referenced in WHERE. 1.3 Processing a SELECT query This section outlines the overall processing of a SELECT query, as well as multi-table DELETE and UPDATE, which use basically the same data structures and the same processing functions. 1.3.1 Precheck of privileges After command dispatching, the first function called is select_precheck() which basically checks that we have some SELECT_ACL privileges for all referenced tables. The precheck function records a want_privilege in each table's GRANT_INFO object, which is later used for column privilege checking. (Multi-table UPDATE and DELETE have similar but different functions). 1.3.2 Opening tables Tables are opened by calling open_tables_for_query(). This function was formerly called open_normal_and_derived_tables(), but had to be renamed because it no longer opens and processes derived tables. The function opens base tables, views and referenced functions. There are no changes here, except that derived tables are processed later. For each view, the function mysql_make_view() is called (see section 1.3.3). 1.3.3 Creating view descriptors mysql_make_view() creates a LEX object and a SELECT_LEX object for a view reference, before attaching the LEX to the TABLE_LIST object of the view. In addition, SELECT_LEX objects are created for all subqueries of the view, and TABLE_LIST objects are created for each referenced table. Referenced tables may in turn be view references, which are opened recursively by the same function. This function is heavily refactored, see section 10.1 for details. In addition, the function no longer merges views into the outer query. This is now done as a part of resolving the outer query. The function checks that we have some privileges to the underlying tables of the view. 1.3.4 Creating query result object After tables have been opened and view descriptors have been created, a query result object of class select_send() is created. This is basically what distinguishes a SELECT_LEX object for a SELECT query from one that is created for an UPDATE, DELETE or INSERT statement. 1.3.5 Resolving the query After the above steps have been carried out, we have a complete representation of the query, and access to all metadata, including table and column information. We now call SELECT_LEX::prepare() on the outer-most query block(s) of the query, which will resolve the query and perform global transformations on it. ::prepare() is recursive and will resolve subqueries in prefix/postfix order. This means that we first prepare the tables referenced in the FROM clause in prefix order, before recursing down the hierarchy of subqueries and repeat the process. When resolving a subquery, tables in all outer contexts have been prepared, which means that we are able to resolve outer references (although actual resolving of outer references is delayed to a later worklog). After the tables have been prepared, the other clauses (SELECT list, WHERE, GROUP BY, HAVING, ORDER BY) are resolved. When these clauses involve subqueries, these subqueries are resolved by calling SELECT_LEX::prepare() as well. This resolver process has always been available for subqueries in the SELECT list, WHERE, GROUP BY, HAVING and the ORDER BY clauses. What is new is that we also apply the same resolver process to subqueries in the FROM clause, aka derived tables. This resolver process is further described in subsequent sections. 1.3.6 Preparing tables This is part of the resolver process for a query block. It is done by calling SELECT_LEX::setup_tables(), which will prepare the base tables and views referenced into a list of leaf tables. The list of tables is created by traversing the list of local tables of the query block, in make_leaf_tables(). At this stage, derived tables have not yet been merged into the outer query, so derived table references are represented just like regular base tables, meaning that some of the leaf tables are still not expanded into base tables. After creating the list of leaf tables, these tables are prepared for resolving. Basically, this means calling TABLE_LIST::reset(), which initializes a few fields and assigns a table number to the table. The table number ranges from 0 to MAX_TABLES-1 and is unique within this query block. In addition, we call TABLE_LIST::process_index_hints() for each table. We also count the number of leaf tables and the number of leaf tables that are partitioned. 1.3.7 Resolving derived tables After local tables have been prepared, we can resolve any derived tables and views referenced inside the query block. This is done with SELECT_LEX::resolve_derived(). This function loops over all derived tables three times: - The first loop resolves all references inside the derived table by calling TABLE_LIST::resolved_derived(). - The second loop merges derived tables that can be merged into the local query block, using SELECT_LEX::merge_derived(). - The third loop prepares the remaining (non-merged) derived tables for materialization, using TABLE_LIST::setup_materialized_derived(). After this processing, and if some derived tables have been merged into this query block, table numbers must be remapped. Consider e.g. a local query block with three tables numbered 0, 1 and 2. Table number 1 is a view over a join of two tables and that view was merged into the local query block. The two joined tables are replacing the view in the leaf table list. Thus, these two tables are mapped to table numbers 1 and 2, and the original table number 2 is mapped to 3. leaf_table_count is increased from 3 to 4. After remapping the table numbers, we need to update Item_field objects of the resolved join conditions of the join nests of the local query block so that they get correct table numbers. The resolved join conditions are the WHERE clauses and join conditions of the merged views. This is done in function update_child_used_tables(). All other Item_field objects are yet unresolved and will be resolved with correct table numbers later. 1.3.8 Resolving one derived table TABLE_LIST::resolve_derived() is responsible for resolving one derived table. This is done by calling prepare() on the underlying query expression (SELECT_LEX_UNIT) of the derived table, which again calls prepare() on the query blocks of the query expression. As these calls are recursive, it means that when this function is done, all underlying derived tables of every subquery of the derived table will be fully resolved and either merged up to the parent query or prepared for materialization. While doing this, set THD::derived_tables_processing flag. This is used to prevent privilege checking for the SELECT list of views during resolving. At this point in time, we do not know the context of the view, so we want to delay privilege checking until we have completed all merging of derived tables. But notice that columns referenced in all other clauses of the subquery, such as the WHERE clause, all need SELECT_ACL privilege. There is one extra argument to this function: apply_semijoin. For regular query expressions, this argument is always true, meaning that we allow semi-join transformation to take place for subqueries inside the derived table. However, sometimes semi-join transformation is impossible or unnecessary. E.g for single-table UPDATE and DELETE, semi-join transformation is impossible because we do not allow these operations to process more than one table in the outer query block. For multi-table UPDATE it is impossible because we cannot use a semi-join-transformed subquery predicate to construct a check option. For some administration queries, a semi-join transformation is unnecessary. 1.3.9 Merging one derived table SELECT_LEX::merge_derived() is responsible for merging a derived table into its enclosing query block. Before merging a derived table, the function checks that it is mergeable according to the functional requirements: . derived table is mergeable, and . we are not just analyzing a view, and . there is room for the added tables from the derived table in the outer query. - Populate merge_underlying_list of the TABLE_LIST with the list of tables inside the derived table. - Change the name resolution contexts of all join nests of the derived table so that they reference the query block of the enclosing query. - Create a Nested_join object containing references to the top-level tables and join nests of the derived table. Attach this structure to the TABLE_LIST object of the derived table. - Add the base tables of the derived table as leaf tables of the enclosing query block, replacing the derived table reference in the leaf table list. (As derived tables have been resolved bottom-up, there are no unresolved derived table references left). - If the derived table reference is on the inner side of an outer join, propagate nullability to all tables of the derived table (see section 1.3.16 for details). - Merge the WHERE condition of the derived table with the join condition of the table reference representing the derived table. Join conditions inside the derived table are left where they are (they will later be processed by simplify_joins()). - Create field translations for the columns in the select list of the derived tables, which are later used for resolving references to these columns in the outer query blocks. - fix_after_pullout() is called on WHERE and join conditions to correct the resolver data for the column references. This is mostly needed because of the need for secondary resolving for prepared statements. - Merge ORDER BY clause from derived table with the ORDER BY clause of the enclosing query block. Notice that the final ORDER BY list will be ordered according to the position of derived tables in the query: First, any ORDER BY clauses from the query will be listed, followed by ORDER BY clauses from the first derived table, followed by clauses from the second derived table, etc. This process merges multiple query blocks into one. Some redundant TABLE_LIST objects exist after the merge, but these are later removed by the call to simplify_joins(), which is called after all query block merging is done. This is an example of how data structures are transformed. Consider these tables and views: create table t1(a integer, b integer); create table t2(c integer, d integer); create table t3(e integer, f integer); create view v3 as select * from t2 join t3 on t2.c = t3.e where t2.c > 0; Then we issue the following SELECT statement: select * from t1 join v3 on t1.a = v3.c where t1.a > 0; Representation of view v3: +-------------+ | | where_cond +----------+ | view-query |----------->| t2.c > 0 | | | +----------+ +-----+-------+ | +------------------+ +----------->| (nest_last_join) | top_join_list +--------+---------+ | +------+ +------+ +---->| t3 +---->| t2 | +--+---+ +------+ | +-------------+ +------->| t2.c = t3.e | join_cond +-------------+ Representation of outer query before merging: +-------------+ | | where_cond +----------+ | outer-query |----------->| t1.a > 0 | | | +----------+ +-----+-------+ | +------------------+ +----------->| (nest_last_join) | top_join_list +--------+---------+ | +------+ +------+ +---->| v3 +---->| t1 | +--+---+ +------+ | +-------------+ +------->| t1.a = v3.c | join_cond +-------------+ Representation of outer query after merging: +-------------+ | | where_cond +----------+ | outer-query |----------->| t1.a > 0 | | | +----------+ +-----+-------+ top | +------------------+ +--------------------------+ join +-->| (nest_last_join) | +------>| t1.a = v3.c and t2.c > 0 | list +--------+---------+ | +--------------------------+ | +-----++ +------+ +---->| v3 +---->| t1 | +--+---+ +------+ | +------------------+ | (nest_last_join) | +---------+--------+ | | +------+ +------+ +---->| t3 +---->| t2 | +--+---+ +------+ | +-------------+ +------>| t2.c = t3.e | join_cond +-------------+ After simplify_joins() has normalized this, we have: +-------------+ | | where_cond +-----------------------------------------------+ | outer-query |----------->| t1.a>0 and t1.a=t2.c and t2.c>0 and t2.c=t3.e | | | +-----------------------------------------------+ +-----+-------+ | +-----+ +------+ +------+ +----------->| t3 +---->| t2 +---->| t1 | top_join_list +-----+ +------+ +------+ The original list of tables and views is still preserved in SELECT_LEX::table_list. 1.3.10 Setup derived table for materialization TABLE_LIST::setup_materialized_derived() sets up the non-merged derived tables for materialization. The function select_union::create_result_table() is called to create a TABLE object for the temporary table, which is then attached to the TABLE_LIST object for the derived table. Some processing is then done for the underlying query blocks of the derived table: - As the involved tables are read before the outer query is executed, these tables may be freely used in data change operations involving the same tables. The function propagate_unique_test_exclusion() is called to ensure this. - SELECT_ACL privileges are required for all tables of the derived tables. Column privileges need not be checked here, as they either have been checked already (if part of WHERE, GROUP BY or HAVING clauses) or will be checked as part of outer query privilege checking. - Loop over all selected expressions in the select list of the derived table and record them in the table read sets. 1.3.11 Checking privileges for views After derived tables have been properly resolved, call SELECT_LEX::check_privs_for_views() to prepare any views for column privilege checking, by setting the want_privilege of every table and view in a chain of TABLE_LIST descriptions, by following the "belong_to_view" pointer and calling the check_single_table_access() function for each. 1.3.12 Setup natural join row types The function setup_natural_join_row_types() was previously called as part of setup_tables(). But then, derived tables were resolved prior to resolving the enclosing query block, and columns referred in join conditions may be referring to derived tables, hence this function had to be taken out of setup_tables() and must be called after resolving the derived tables. 1.3.13 Processing check options Check option processing is applicable to INSERT and UPDATE against views. Check option processing can be cascading or non-cascading. This property determines how we deal with check options defined for a view referenced from within another view. Preparation of check option is similar to how it was before implementing this WL: After preparation and merging of derived tables and views is complete, call TABLE_LIST::prepare_check_option(), which will recursively descend down the tree of view TABLE_LIST objects and prepare a check option condition based on the views' WHERE conditions and join conditions. For a prepared statement, the check option tree is built and resolved when the statement is prepared. For subsequent executions, the check option tree is simply resolved. The change is that TABLE_LIST::where has been eliminated. This is now available as the join condition of the join nest created for the view reference, meaning that the check option tree can be generated simply by merging join conditions. Due to the way we now propagate WHERE clauses of derived tables into the join nest of the table reference, the calls to TABLE_LIST::prepare_where() inside SELECT_LEX::setup_conds() have been eliminated. 1.3.14 Remaining part of resolver process The remaining part of the resolver process is unaffected, except for: - Partition pruning is moved into SELECT_LEX::apply_local_transforms(), meaning that it is done after query block merging (slightly more efficient and there are possibly more equality conditions available for pruning). See also section 1.3.17 - SELECT_LEX::apply_local_transforms() is no longer called explicitly for derived tables, instead derived tables are processed recursively by following child SELECT_LEX_UNIT pointers. - Note also that SELECT_LEX::apply_local_transforms() is extended with a THD argument and a prune argument. The THD argument is needed as prerequisite for the case when JOIN object is not available during preparation. The prune argument is necessary because partition pruning here should only take place for regular (sub)query resolving. UPDATE and DELETE processing has a specific pruning step, so it cannot be performed in this general function. Checking column privileges is unaffected for columns referenced in WHERE clause, join conditions, GROUP BY clause, HAVING clause and ORDER BY clause. 1.3.15 View error processing Errors when checking privileges to views need some special treatment, as we do not want to reveal too much of the structure of a view to unprivileged users. Therefore, an error processor is registered for the query block of a view, which when called will transform some specific error messages into a more "neutral" form. This error processor is called e.g when checking column privileges and resolving column references, and a problem occurs inside a view definition. 1.3.16 Nullability propagation Nullability propagation is applied in two cases: 1. For all inner tables of an outer join, these tables are always "nullable" 2. When query blocks are merged into the inner side of an outer join, as part of derived tables merging and semi-join transformation, all tables from the merged query block will become "nullable". Such nullability propagation has been consolidated into one recursive function: propagate_nullability(). 1.3.17 Partition pruning Partition pruning before locking tables is now applied within the apply_local_transforms() function. This ensure that query block merging has taken place, and hence all equalities between the merged query block and the outer query block have been considered. This function handles all necessary pruning for SELECT statements, the SELECT part of INSERT ... SELECT statements and multi-table DELETE and UPDATE statements. For single-table DELETE and UPDATE statements, apply_local_transforms() affects any subquery tables, but it is still needed to perform explicit partition pruning before locking tables on the table being deleted from or updated. Partition pruning after locking tables is not affected. 1.3.18 Optimization changes There are no changes in the optimization process, except that the handle_derived() call with mysql_derived_optimize as argument is replaced with an explicit loop over all leaf tables and a call to TABLE_LIST::optimize_derived() for each derived table. 1.3.19 Execution changes There are no changes in the execution process, except that the handle_derived() call with mysql_derived_create as argument is replaced with TABLE_LIST::create_derived(), and in join_materialize_derived(), the calls to mysql_handle_single_derived() have been replaced with calls to TABLE_LIST::materialize_derived() and TABLE_LIST::cleanup_derived(). 1.4 Processing a prepared SELECT query Preparing a prepared statement goes through the same steps as when preparing a regular statement, so the changes are the same here. Executing a prepared statement goes through the same steps as when executing a regular statement, but there are some differences in the preparation part: - All permanent transformations have already been done and need not be repeated. - In SELECT_LEX::setup_tables(), mergeable derived tables will have been merged, meaning that setup of leaf tables is complete after the function call. - In SELECT_LEX::resolve_derived(), no derived tables will be merged, as all merging has been performed as a permanent transformation. Only the derived tables that are marked for materialization need to be prepared. However, as we do not traverse the query blocks belonging to derived tables that have been merged, another loop is necessary to setup for materialization those derived tables that are contained within other merged derived tables. Luckily, these tables have already been identified as leaf tables, so we only need to traverse the list of leaf tables to find them. 1.5 Processing an INSERT statement 1.5.1 Opening tables When opening tables for INSERT, we call SELECT_LEX::setup_tables(). If the table to insert into is a view, call TABLE_LIST::resolve_derived() and TABLE_LIST::merge_derived() on it, as we know it should be a mergeable view. There is a call to SELECT_LEX::resolve_derived() that will process derived tables in the query expression of an INSERT ... SELECT statement. (This seems a bit redundant, as there is partial overlap with the call to TABLE_LIST::resolve_derived(), but it is needed because the first table in the list of tables is special for INSERT statements. It might be easier if the table to be inserted into was separate from the list of tables). 1.5.2 Preparing check option The call to TABLE_LIST::prepare_check_option() was moved to the end of mysql_prepare_insert(). This way it will cover the needs both for INSERT ... VALUES and for INSERT ... SELECT statements. 1.6 Processing a single-table UPDATE statement Opening the table to be updated had to be extracted into a new function mysql_update_prepare_table() and called before dispatching to either mysql_update() or mysql_multi_update(). This was needed because merging derived tables might cause the query block to contain multiple base tables, which requires execution as a multi-table UPDATE statement. setup_fields_with_no_wrap() is replaced with setup_fields(). See section 1.2.2.3 for explanation. check_fields() is called unconditionally and returns original_update_fields. SELECT_LEX::apply_local_transforms() is called at the end of the resolver phase. There is also new separate code for checking update privileges for columns to be updated. 1.7 Processing a multi-table UPDATE statement mysql_multi_update_prepare() now calls SELECT_LEX::setup_tables(), SELECT_LEX::resolve_derived() and setup_natural_join_columns(). setup_fields_with_no_wrap() is replaced with setup_fields(). See section 1.2.2.3 for explanation. check_fields() is called unconditionally and returns original_update_fields. A new privilege check is added for columns that are updated: We loop over all fields to be updated and validate privileges using Item::check_column_privileges() applied to a set of wrapped Item_field objects. In addition, these columns are registered in the write map of the appropriate tables. These column privilege checks are done just after tables are "prepared" for privilege checks through multi_update_check_table_access(). These privilege checks were earlier done implicitly through Item_field::fix_fields() when resolving columns to be updated. This was not sufficient because we kept non-wrapped fields, so privilege checks for prepared statements might be wrong if view privileges were changed. Doing the privilege check explicitly on wrapped Item_field objects means that we track all privilege changes throughout the lifetime of a prepared statement. Preparation of such statements goes through SELECT:LEX::prepare(), which will call functions like SELECT_LEX::apply_local_transforms(). 1.8 Processing a single-table DELETE statement After opening tables, we process tables by calling SELECT_LEX::setup_tables(), SELECT_LEX::resolve_derived() (if we have derived tables). We also need to check view privileges with SELECT_LEX::check_privs_for_views() if we are deleting from a view. SELECT_LEX::apply_local_transforms() is done as last part of preparation. 1.9 Processing a multi-table DELETE statement After opening tables, we process tables by calling SELECT_LEX::setup_tables(), SELECT_LEX::resolve_derived() (if we have derived tables). We also need to check view privileges with SELECT_LEX::check_privs_for_views() if we are deleting from one or more views. 1.10 Processing administrative statements This is applicable to: SHOW CREATE TABLE SHOW COLUMNS Commands calling "get_all_tables". For SHOW CREATE TABLE, it is necessary to resolve views, but not prepare for materialization. For SHOW COLUMNS, we need to resolve views and prepare for materialization of such, in order to list column information. This is also done for commands calling "get_all_tables" (although I do not think that preparing a materialization is needed for all those commands).For major design decisions, see HLS chapter 2. The LLD is a summary of low-level design changes with references back to the design chapter. 1.11 Query property interfaces This WL also adds some high-level property interfaces for the SELECT_LEX and TABLE_LIST class. For class SELECT_LEX, properties have been added that let us check whether a query block is aggregated, is grouped (explicitly or implicitly), is distinct, is ordered or has a limit. Note that these functions can be applied both before and after transformations, so their result may vary based on order of calls. Example: We may delete an ORDER BY clause for an aggregated query because it does not make sense to order a result set with only one row. Calling is_ordered() before this transformation will return true, calling it after transformation will return false. Notice that these interfaces are not implemented throughout the code base. For class TABLE_LIST, properties have been added that let us set or check whether a table is updatable, check whether a table is a derived table or a view, whether a derived table or view is mergeable, whether it has been merged, or whether it has been set up for materialization. 2. Class TABLE_LIST changes There are extensive changes made to this class. Some members have been made private and access is replaced with setters and getters. New functionality for handling derived tables has been added. See sections 1.3 through 1.8 for an overview of these design changes. 2.1 void set_updatable() This function sets the table as updatable (ie it is the target of an INSERT, UPDATE or a DELETE). All tables start as non-updatable, so a setter for non-updatable is not necessary. 2.2 bool is_updatable() Returns true if the table is updatable, false otherwise. 2.3 bool is_view_or_derived() Returns true if this represents a view or a derived table (i.e a named or an unnamed derived table), and false otherwise. 2.4 bool is_derived() Returns true if this represents a derived table, and false otherwise. 2.5 bool is_view() Returns true if this represents a view (ie. a named derived table), and false otherwise. 2.6 bool is_multiple_tables() Returns true if this represents multiple base tables, and false otherwise. Can be true only for views and derived tables. 2.7 bool is_mergeable() Returns true if this represents a derived table or view that can be merged into the outer query block. A view is mergeable if its query expression is mergeable and it does not contain an explicit materialization directive (ALGORITHM_TEMPTABLE). A derived table is mergeable if its query expression is mergeable. 2.8 bool is_merged() Returns true if this represents a derived table or view that is merged into the outer query block. This property is permanent for the lifetime of a prepared statement. 2.9 bool is_placeholder() Returns true if this is a "placeholder", which seems to mean that it is a derived table, a view, a schema table, or does not have a TABLE object for some reason. 2.10 uint leaf_tables_count() Should be called only for a derived table or view. Returns the number of underlying base tables. 2.11 void set_view_query(LEX *lex) Set the view query (a LEX object) for a derived table or view. The special value (LEX *)1 is used to indicate a view for which the definition is not available. 2.12 LEX *view_query() Get the view query (a LEX object) for a derived table or view. 2.13 void set_derived_unit(st_select_lex_unit *unit) Set the query expression for a derived table or view. 2.14 st_select_lex_unit *derived_unit() Get the query expression for a derived table or view. 2.15 void reset(uint tableno) Reset table before resolving: Assign table number and map for table. Reset optimizer data for table. Reset execution status for table. 2.16 bool resolve_derived(THD *thd, bool apply_semijoin) Resolve a reference to a view or a derived table, including the query expression defining the view. This function is a prerequisite for merging or materializing a derived table, see section 1.3.8 for an overview. 2.17 bool merge_derived(st_select_lex *target_select) Merge a derived table/view reference into an outer query block. The derived table must be prepared before merge is attempted. If the derived table is not mergeable or some other constraints prevent the merging, the request is silently ignored. Check the is_merged() property to obtain the status of the merge request. See also section 1.3.9. 2.18 bool setup_materialized_derived(THD *thd) Setup a derived table/view reference for materialization into a temporary table. The derived table must be resolved before merge is attempted. This function must not be called for a derived table that is merged. Notice that unlike for merging, materialization setup has to be performed for every execution of a prepared statement. See also section 1.3.10. 2.19 bool create_field_translation(THD *thd) Create field translations for the columns defined by a derived table or view. Field translations are required so that the columns of a merged derived table can be resolved in the context of the outer query block. 2.20 bool merge_where(THD *thd) Merge the WHERE clause of a derived table or view into the condition of the outer query. Within the outer query, a nested join object is created for this derived table, and the WHERE clause is attached as the join condition of that nested join. 2.21 bool prepare_check_option(THD *thd) Prepare a combined check option condition for a view based on check options of this and underlying views, and of the effective check option indicator. 2.22 void set_tableno(uint tableno) Assign a unique table number and a corresponding table map for a table within one query block. 2.23 bool merge_underlying_tables(select_lex) Merge the tables from a query block into a nested join struct. The nested join will represent a merged view in the outer query. 2.24 ALGORITHM specifications have been revised: ALGORITHM_UNDEFINED - algorithm has not been determined yet. ALGORITHM_TEMPTABLE - derived table/view will be materialized in a temporary table. ALGORITHM_MERGE - derived table/view will be merged into outer query. 3. Class TABLE changes Some utility functions have been added. 3.1 void set_nullable() Set table as nullable, used for inner tables in an outer join. 3.2 bool is_nullable() Returns true if table is nullable (see set_nullable()). 3.3 void mark_column_used(THD *thd, Field *field, enum enum_mark_used mark_used) Mark a column as read or written. 4. Class Item changes 4.1 void register_field_in_write_map(uchar *arg) Mark columns inside an expression tree as written. Implemented for sub-classes: - Item_field - Item_func_set_user_var 4.2 void check_column_privileges() This function accepts a class of type Column_priv_check and checks column privileges for an Item_field object, or an Item_field object wrapped in one or more Item_direct_view_ref objects. Column_priv_check identifies the set of required privileges. Implemented for sub-classes: - Item_field - Item_direct_view_ref For a wrapped field, privileges must be validated both for the base table and for any views wrapping that field. Error messages are reported through the name resolution context of the item, which may have been set up to convert specific privilege error messages into a more generic one, such as "View references invalid table(s) or column(s) or function(s) or definer/invoker of view lacks rights to use them". 5. Class SELECT_LEX changes The changes are made up of a set of utility functions that describe properties of the query block. Notice that the utility functions are not always used when possible. Implementing full usage is a task that is postponed to some later stage. 5.1 bool setup_tables(THD *thd, TABLE_LIST *tables, bool select_insert) This function is moved from sql_base.cc. It is also called from more places, as the function setup_tables_and_check_access(), which used to call it, has been eliminated. 5.2 bool resolve_derived(THD *thd) This is a new function that is used to resolve all derived tables and views referenced in a query block. It will make three passes over the derived tables: 1. Prepare all derived tables. 2. Merge applicable derived tables into the query block. 3. Set up the remaining derived tables for materialization. After merging, table numbers of the query block will be re-mapped, as the set of leaf tables may have changed due to merging derived tables. See also section 1.3.7. 5.3 bool check_privs_for_views(thd, false, DELETE_ACL, SELECT_ACL) Check privileges for views referenced from a query block. Called from SELECT_LEX::prepare(). 5.4 bool is_grouped() Return true if query block is grouped, ie whether it - is explicitly grouped, or - contains one or more set functions (implicitly grouped) 5.5 bool is_explicitly_grouped() Return true if query block contains a non-empty GROUP BY specification, false otherwise. Return false if a GROUP BY specification has been determined to be redundant. 5.6 bool is_implicitly_grouped() Return true if query block contains one or more set functions but is not explicitly grouped, and false otherwise. 5.7 bool is_single_grouped() Return true if query block is implicitly grouped and does not contain a HAVING clause. 5.8 bool is_distinct() Return true if query block contains a DISTINCT flag, false otherwise. Return false 5.9 bool is_ordered() Return true if query block contains an ORDER BY specification, false otherwise. Return false if an ORDER BY specification has been determined to be redundant. 5.10 bool has_limit() Return true if query block contains a LIMIT specification. 5.11 leaf_table_count This field counts the number of leaf tables in query block after merging 5.12 derived_table_count This field counts the number of original derived tables and views. 5.13 materialized_derived_table_count This field counts the number of derived tables and views which are materialized. 6. Class SELECT_LEX_UNIT changes 6.1 bool is_mergeable() This function returns true if the query expression it represents - does not contain a UNION clause, and - is not aggregated, and - does not contain a DISTINCT keyword, and - does not have a LIMIT or OFFSET clause, and - does not assign any variables. 7. Resolver changes 7.1 setup_natural_join_row_types() This function is no longer static, as it must be called from SELECT_LEX::prepare(). The reason for this is that it may refer to derived tables, which are now resolved after setup_tables() is called. (Otherwise it would not be possible to resolve outer references in derived tables). 7.2 setup_tables_and_check_access() This function has been replaced with SELECT_LEX::setup_tables() and SELECT_LEX::check_privs_for_views(). 7.3 void update_child_used_tables(List*tables) This function walks through the WHERE condition and the join conditions of a merged view and updates table number information to be correct after merging. 7.4 TABLE_LIST **make_leaves_list(TABLE_LIST **list, TABLE_LIST *tables) This function has been moved from sql_base.cc to sql_resolver.cc, and renamed to make_leaf_tables(). 7.5 void propagate_nullability(List *tables, bool nullable) This function propagates the nullability attribute into join nests on the inner side of an outer join. This is a recursive function that navigates through the join nest information, thus simplifying the implementation. 7.6 void repoint_contexts_of_join_nests() This function is moved from sql_view.cc to sql_resolver.cc. 7.7 setup_fields() changes Instead of a mark_used_columns argument, this function now has an operation argument, which is the same as the privilege required for the searched columns. The applicable mark argument (MARK_COLUMNS_READ or MARK_COLUMNS_WRITE) is derived from the operation. 7.8 setup_conds() This function has been made a member function of class SELECT_LEX. As derived tables and views are now resolved before resolving the outer query, it is no longer needed to call TABLE_LIST::prepare_where() for join nests. The label err_no_arena has been eliminated, thus simplifying error handling (we will not need to preserve environment in case of error, as it will be reset for each resolver attempt. 7.9 SELECT_LEX::propagate_unique_test_exclusion() This is a new function that propagates unique test exclusion into a tree of query expression objects. 7.10 Column privilege propagation See section 1.2 for an overview. This is applicable to the functions find_field_in_tables() and find_field_in_table_ref() and their users: Item_field::fix_fields(), Item_field::fix_outer_field(), Item_ref::fix_fields() and mysql_table_grant(). When checking privileges for a column, the desired privileges are passed directly to these functions. The resolver function SELECT_LEX::prepare() sets thd->want_privilege with the desired privilege before calling the Item resolver functions, which will pick up the desired privilege and pass it to the inner functions. Notice that when checking the select list of a derived table, the desired privilege is empty. The actual privilege checking takes place when the query block containing references to view/derived table fields is resolved. To check privileges of columns in an arbitrary expression, the interface Item::check_column_privileges() is added. See section 4.2 for details. Specific implementations are provided for the classes Item_field and Item_direct_view_ref. The latter will suppress specific errors and replace them with a generic "view privilege" message. 8. Data change processing changes (UPDATE, DELETE, INSERT) 8.1 bool mysql_update_prepare_table(THD *thd, st_select_lex *select) This is a new function that resolves the single target table of an UPDATE statement. If this table is a view, it is prepared and merged so that a single base table is ready. 8.2 mysql_prepare_insert_check_table() This function does derived tables processing for INSERT statements. It is a bit special, as we first call TABLE_LIST::resolve_derived() and TABLE_LIST::merge_derived() on the table being inserted into, and then call SELECT_LEX::resolve_derived() on the query block. This last function will also cover the inserted table, but as this is already processed, it will be ignored, and will only handle derived tables in the SELECT part of the INSERT statement. This procedure might be streamlined, but it seems the best solution given the current data structures. Separating the inserted table completely out of the SELECT_LEX tables list may be an interesting alternative. 9. Privilege check changes Privilege checking is mostly changed by changing the places where the check functions are called. There are only two interface changes, but see also the Item changes in section 4.2. 9.1 bool check_grant_column() This function is extended with an argument want_privilege, which is used when checking the privilege for that column. Earlier, this value was taken from grant.want_privilege of of the TABLE_LIST object of the table that the column belongs to. To reviewer: We still assert that this is the same, but this check can be removed when we are confident with this change. 9.2 bool check_column_grant_in_table_ref() A want_privilege argument is added, which is passed on to check_grant_column(). 10. View resolution changes 10.1. mysql_make_view() - creating internal representation from view See also section 1.3.3. This function reads the definition of a view and transforms it into a LEX object with an underlying query expression (SELECT_LEX_UNIT), which optionally may contain subqueries. The function is heavily refactored - here is a list of changes: - argument table is changed to view_ref - so it is easy to recognize as a view and not a table. - a Make_view_tracker RAII class is created to simplify cleanup after error handling inside the function. - local variables are declared when needed and scoped when possible. - views are no longer merged into the outer query if possible. Instead, a view is marked as mergeable when possible and the actual merging takes place in TABLE_LIST::merge_derived(). Here are the main processing steps: - If the view is already read in a prepared statement context, the function just prepares a security descriptor and returns. - Create LEX object for the view and parse the view, using this LEX as context. - Check that we have some privileges to the view. Detailed operation and column privilege checking is performed later. - Prepare underlying tables so they properly belong to a view. - Attach the underlying tables of the view in the global list of tables of the enclosing query. - If view can be merged into the outer query, set lock type for view so that it can be used in update operations. - For views with definer's rights, create a security context for the view, otherwise use the user's current security context. Then assign this context to all underlying tables of the view. - Setup an error processor to hide detailed errors when occurring inside a view. - Finally, link the query expression of the view into the list of derived tables and views of the enclosing query. 11. Base functions These functions are mostly about opening tables, see also section 1.3.2. 11.1 open_and_lock_tables() This function has lost the "derived" argument, because derived tables are no longer processed when opening tables. 11.2 open_normal_and_derived_tables() This function no longer processes derived tables. It has been renamed to open_tables_for_query(), as it is used generally to open all the tables required for processing of a query (or statement). 12. Removed functions 12.1 update_field_dependencies() This function has been removed from the resolver. It is replaced with calls to TABLE::mark_column_used(). 12.2 setup_fields_with_no_wrap() As we no longer use "unwrapped" fields, this function has been deleted, and calls replaced with setup_fields(). 12.3 mysql_handle_single_derived() This was brief wrapper over the derived table "processor" function, so it was easy to eliminate. 12.4 mysql_handle_derived() We no longer process all derived tables through the global list of query blocks, so this can be eliminated. 12.5 mysql_derived_prepare() This function is replaced by TABLE_LIST::resolve_derived(). 12.6 mysql_derived_optimize() This function is replaced by TABLE_LIST::optimize_derived(). 12.7 mysql_derived_create() This function is replaced by TABLE_LIST::create_derived(). 12.8 mysql_derived_materialize() This function is replaced by TABLE_LIST::materialize_derived(). 12.9 mysql_derived_cleanup() This function is replaced by TABLE_LIST::cleanup_derived().
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.