WL#5275: Process subqueries in FROM clause in the same way as view

Status: In-Progress   —   Priority: Medium

Both derived tables and views (also known as named derived tables) are
represented in the server in the same way, but the processing is a bit different
in the preparation stage. These differences are caused partly by slightly
different nature, but mostly by mis-design.

Some design flaws were addressed in WL#5274. The purpose of this worklog is to
unify handling of derived tables and views as much as possible.

Currently, subqueries in the FROM clause (derived tables) are unconditionally 
materialized while views created from the same query expressions are sometimes
materialized and sometimes merged into the outer query. This behaviour, beside
being inconsistent, can lead to a serious performance penalty.

Effectively, materialization means that a derived table or view is materialized
into an intermediate temporary table. This is often causing significant slowdown
of query execution. One of numerous examples is the BUG#59203 -
SERIOUS OPTIMIZER FAILURE WHEN QUERY IS ENCAPSULATED IN ADDITIONAL SELECT.

In this bug, a user reports slowdown of almost 1000 times - from 0.03s to 29s.

Another problem with unnecessary materialization is that currently the optimizer
is unable to push any optimizations down to derived tables, e.g. constant and
equality propagation do not work trough inner-outer query boundary. This could
be illustrated with the following example: a condition in the outer query that
could significantly reduce amount of data needed to be processed by subquery
can be applied too late - server has to materialize a large derived table first,
and use the restrictive condition when reading from this materialized derived
table.

Obviously, this can lead to a significant performance penalty.

Users work around this flaw by wrapping derived tables into views, which don't
have such problems. This isn't always possible due to a variety of reasons, e.g
such queries could be auto-generated by some software that cannot be altered.

The solution to this problem is to unify the preparation stage of views and
derived tables so that the merge procedure which is available for views can
also be applied to derived tables.

Currently, outer references are not allowed inside derived tables.

To process outer references in derived tables, resolving order needs to be
changed so that outer references can be resolved in the context of a resolved
outer query. This means that column references of the outer query may be looked
up when resolving the derived table, by following outer context pointers of name
resolution contexts used by the derived table to locate the outer references.

This is an example of a query involving a derived table with an outer reference:

SELECT *
FROM t1 JOIN t2 ON t1.a=t2.a
WHERE t1.a IN (SELECT it.a
               FROM (SELECT a FROM t3 WHERE t1.b=t3.c) AS it);

Notice that a derived table may not refer to tables on the same level.
Thus, this query is invalid:

SELECT * FROM t1, (SELECT a FROM t3 WHERE t1.b=t3.c) AS dt;

This WL will make it possible to resolve outer references in derived tables by
changing the resolver order, but actual implementation will be done in WL#461.

Bug reports that may be affected by this work:

BUG#59203 SERIOUS OPTIMIZER FAILURE WHEN QUERY IS ENCAPSULATED IN ADDITIONAL SELECT

BUG#70220 GROUPING WITH A VIEW MAY REPORT 'INVALID USE OF GROUP FUNCTION'

BUG#67631 Subquery in view works slower than if run without view

BUG#36802 MySQL only use covered index when selecting from a view (or subquery)
with union

(Pushdown of conditions into derived tables, not scheduled)

BUG#41994 Optimizer do full table scan when use distinct with group by in subquery

(Need to check if this is affected)

BUG#67631 Subquery in view works slower than if run without view

(Need to check if this is affected)

TODO list / functionality not implemented in this worklog:
----------------------------------------------------------

- Resolve outer references for derived tables.

- Select only relevant columns from materialized derived tables
  When preparing a derived table or view for materialization, only the
  selected columns that are also referenced in the outer query should be
  marked in the read set.

- Do not propagate ordering for derived tables to outer query (see F16).

- Allow sub queries in the SELECT list for merged derived tables (see F9).
Functional requirements
=======================

F1. This worklog applies to derived tables, which are defined by a query
    expression over a set of other tables (derived tables and base tables),
    and may be ordered using an ORDER BY clause and constrained using LIMIT
    and OFFSET clauses.

    Derived tables may be unnamed or named. Named derived tables are commonly
    referred to as views.

    For simplicity, we refer to views and unnamed derived tables as just
    "derived tables" when this is unlikely to be confused.

F2. Allow both named and unnamed derived tables to be merged into outer query
    block.

F3. These statement types allow derived tables to be merged into an outer
    query block:
    SELECT, UPDATE, DELETE, INSERT, REPLACE, LOAD, CREATE TABLE with SELECT.
    Unnamed derived tables cannot be used as update target tables in UPDATE,
    DELETE, INSERT, REPLACE and LOAD statements. 

F4. Derived tables with a UNION in its outer-most query expression cannot be
    merged into the outer query block. Thus, only query expressions consisting
    of a single query block can be merged (however, that query block may itself
    contain subqueries).

F5. Derived tables that are aggregated cannot be merged into the outer query
    block. A derived table is aggregated if it consists of a single query block
    and contains an aggregation function in the SELECT list, or has a GROUP BY
    clause, or has a HAVING clause, or specifies DISTINCT in the SELECT list.

F6. Derived tables with zero table references cannot be merged.

F7. Derived tables that have OFFSET or LIMIT clauses cannot be merged.

F8. Derived tables that cause side-effects, such as setting user variables,
    cannot be merged. (See WL#3274 revno 2815).

F9. Derived tables that have subqueries in the SELECT list cannot be merged.

F10. Views that are created with ALGORITHM=TEMPTABLE will never be merged.
     Views that are created with ALGORITHM=MERGE will be merged unless
     some other rule prevents them from being merged.

F11. The optimizer_switch derived_merge controls whether derived tables and
     views will be merged into the outer query block, unless some other rule
     prevents them from being merged. For a view, an ALGORITHM directive
     takes precedence over the derived_merge setting.
     The value "on" enables merging, and the value "off" disables merging.
     The default value is "on", and this can also be selected with the value
     "default".

F12. After merging, a query block cannot reference more than 61 base tables.
     This is an implementation limit, mandated by the use of a 64-bit field
     to represent tables in a query block.
     If merging a derived table/view would cause more than 61 tables in a
     query block, materialization will be chosen instead.

F13. Merging of derived tables into outer query blocks should be processed
     at the same time as other query block merging strategies, such as
     semi-join transformation.

F14. Merging of derived tables should be processed in a bottom up manner,
     just like other query preparation strategies. E.g in this query:
      SELECT * FROM (SELECT * FROM (SELECT * FROM t1));
     table t1 is first merged into its immediately containing query block,
     then it is merged into the outer-most query block.

F15. All derived tables that cannot be merged are materialized before
     they are needed into a temporary table. This temporary table takes
     part in query optimization and execution as if it was a regular table.
     (see WL#5274 for execution details).

F16. An ORDER BY clause that is contained within a derived table or view
     definition is ignored, unless it is also accompanied by a LIMIT or OFFSET
     clause.

Non-functional Requirements
===========================

NF1. Performance that does not involve views or derived tables should not be
     affected.

NF2. Performance degradation up to 3% is expected for queries involving merged
     views or derived tables. This is because all views and derived tables are
     fully resolved before merging. A major part of this is creating a
     JOIN object, which is completely unnecessary for merged derived tables,
     but still required due to our resolved code.
     
1. Functionality changes

1.1 Merging of unnamed derived tables

The major functionality change is that named and unnamed derived tables
are treated equally regarding how they are merged with the outer query block.
Before this worklog, only named views could be merged with the outer query
block, a derived table would always be materialized.

This worklog also describes the criteria for when a derived table can be
merged into the outer query block and when it will be materialized.
The criteria are largely unchanged, but are specified for documentation
purposes. In addition to existing criteria, an optimizer_switch that
specifies whether to merge or materialize derived tables will be implemented.

1.2 Allowing full resolution of view references

Currently, view references are expanded in the function mysql_make_view()
which is called during opening of tables. If the view conforms to the rules
for view merging, its contents are merged directly into the query block
of the embedding query.

After this worklog, view merging no longer takes place in mysql_make_view().
Instead, a full resolution is performed for all derived tables underlying
a query block. This makes sure that the view is actually syntactically sane
(earlier processing relied on separate code to validate this), and it gives
us the opportunity to consider the full context of the query. This means that
we can resolve outer references properly (see below) and take decisions
on which derived tables to merge more intelligently than a pure heuristics
based approach as is done currently.

1.3 Consolidation of query block merging

We currently support two ways of query block merging:
- Derived tables merging
- Semi-join merging

The techniques are quite similar and in some ways the two techniques may
compete and prevent each other. Currently, the two techniques are applied
at very different stages in the query optimization. Derived table merging
is applied when opening tables, whereas semi-join merging is applied during
the optimization phase.

The goal of this worklog is to prepare for query block merging to take place
at logically the same place. Currently, the natural place is in the function
that resolves the outer query block, ie SELECT_LEX::prepare(). When all
subqueries below this query block have been resolved, look at them all and
determine which to merge into the outer query block.

This goal will be accomplished for derived table and view merging in this
worklog.

The same goal has already been accomplished for semi-join transformation in
move permanent transformation from JOIN::optimize() to JOIN::prepare().

1.4 Column privileges are validated differently

Privileges to columns of a view have been validated in two phases: first,
a want_privilege has been set up for each table and view, then each column
has been validated against this privilege, during resolving of columns.
This practice is sufficient for e.g SELECT, as all SELECT_ACL privilege is
required uniformly for all columns. However, for INSERT we may want INSERT_ACL
for columns that are inserted and UPDATE_ACL for columns specified in
ON DUPLICATE UPDATE. And for multi-table UPDATE, SELECT_ACL privilege is
required for referenced columns and UPDATE_ACL is required for columns from
those tables that are updated. This two-phase approach is error-prone, as there
is quite a long distance between when the want_privilege is set and the column
privileges are checked.

A second challenge is that we want to resolve a view without a specific
operation context, ie. without knowing whether the view is selected from,
delete from, updated or inserted into. Neither do we know which columns from
the view that are being used. Example:

CREATE VIEW v AS SELECT a, b FROM t;
INSERT INTO v(a) VALUES (10);

In this INSERT statement, we need INSERT privileges for column a, but we
can ignore column b. However, the resolver needs to validate that both a and
b are in fact valid columns in the underlying table of view v.

We try to solve this problem by delaying the privilege checking of view columns
until the outer query is resolved.

A third challenge is that we want to prepare for a future worklog where
prepared statements are resolved once only. Currently, the resolver is called
for each execution, and it is simple to check column privileges while
resolving columns. When resolving occurs only once, we have to perform
a dedicated privilege check for each execution. This is reasonably simple,
but sometimes, "non-wrapped" view column references are used. Wrapped
column references means that an Item_ref object is used for a view column,
which points to an Item_field object describing the base table column, but
non-wrapped view column references only contain the Item_field.

Privilege checking needs all this information, as we have to ensure that we
have proper privileges both to the view and to the base table. However, with
"non-wrapped" items, we can only check column privileges for the base table,
and not for view columns.

We solve this problem by eliminating the use of "non-wrapped" view columns.
 
2.0 Test changes

2.1 Functionality coverage for derived tables and views

All functionality specified in Functional Requirements is covered in
derived.test and view.test.

2.2 Improved functionality coverage of column view privileges.

Extensive testing of column privilege checking for all SELECT and data change
operations has been added to view_grant.test.

2.3 Existing coverage tests

Existing coverage tests with derived tables have several plan changes because
they now use merging instead of materialization. Generally, these changes lead
to less cost and less handler operations being executed.

2.4 Existing bug fix tests

Bug fix tests that test materialization-related issues are executed with
optimizer_switch='derived_merge=off' to preserve old behaviour.

2.5 Change in handling of CHECK TABLE

A CHECK TABLE test in view.test around line 1970 changes the message from
"error: Corrupt" to "status: Operation failed" for a non-existing function
referenced inside a view. This change is due to the way views are resolved now:
They used to be resolved when opening a table, now they are resolved within the
regular resolver hierarchy.

One way to get back the old message would be to report an error when bringing
function into cache and the function does not exist. Now the code returns
silently and wait until resolver time to set error.
1. Overall design

1.1 Resolving derived tables and view references

A reference to a derived table or a view (hereafter only referred to as derived
tables) is resolved during the regular recursive resolver stage implemented in
SELECT_LEX::prepare():

1. Derived tables are resolved after tables have been opened and after the
   local tables of a query block have been set up.

2. Transformations, such as merging derived tables into the enclosing query
   block, are done using a postfix traversal of the tree.

This way, we ensure that all outer information is available when resolving a
derived table, and when a derived table has been resolved, it has also been
transformed appropriately.

Example: A SELECT query with a derived table containing an IN subquery. The
derived table can be merged and the IN subquery can be transformed to a semi-join:

SELECT t1.a, dt.a
from t1 JOIN (SELECT * FROM t2
              WHERE t2.c IN (SELECT t3.d
                             FROM t3
                             )
              ) AS dt
     ON t1.a=dt.b
WHERE ...

Processing will be carried out like this:

Resolve outer-most query block (JOIN of t1 and derived table)
  Resolve derived table dt
    Resolve inner query (subquery)
  Transform inner query into a semi-join, merge with derived table dt
Transform derived table, merge with outer-most query block

Indentations are meant to describe recursion levels.

UPDATE, DELETE and INSERT operations are represented with a hierarchy of
SELECT_LEX nodes, where the root node represents the UPDATE, DELETE or INSERT,
and the inner nodes represent subqueries.

The root nodes in these operations are processed similar to a SELECT query, in
the sense that the root node is resolved first, followed by recursively
resolving the inner nodes. This means that tables referred by the data change
operation are set up before resolving the inner nodes.

The difference lies in the way tables are processed (for a SELECT, tables are
selected from, for an UPDATE, tables are updated, for a DELETE, tables are
deleted from...), and the different clauses supported by said operations.

Generally, we process derived tables of a query block using the function
SELECT_LEX::resolve_derived(). This function operates in three steps (here we
assume that there is at least one derived table to process):

1. Resolve the derived tables of the query block. Involves looking up tables,
   columns, resolving expressions, checking semantics, applying global
   transformations. The function TABLE_LIST::resolve_derived() does this.

2. Choose the derived tables to be merged into the current query block. This is
   currently done in the listed order, but could later be done in a more
   optimal way. This is carried out by SELECT_LEX::merge_derived().

3. Set up the remaining derived tables for materialization, using
   TABLE_LIST::setup_materialized_derived().

See section 1.3.7 for more details.

Operations that handle only one table, ie single-table UPDATE, DELETE and INSERT
are simplified: If the table to be handled is a view, that view is resolved
using TABLE_LIST::resolve_derived() and merged using SELECT_LEX::merge_derived()
unconditionally.

1.2 Privilege handling

For functional changes, see HLS section 1.4.

1.2.1 Current privilege handling

This is how privilege handling is done in versions up to 5.6:

- All statements have a precheck function that does rudimentary
  privilege checking on all tables and views referenced directly in
  the statement. SELECT statements require SELECT_ACL privilege, INSERT
  statements require INSERT_ACL privilege, etc.

- View expansion is done recursively in open_tables(), through function
  mysql_make_view(). Underlying tables are opened. Table references for
  views are created. View text is expanded into table expressions.
  Security contexts are defined for the looked up table references.
  Rudimentary privilege check for these tables is performed
  (rudimentary = any privilege is accepted).
 
- SELECT_LEX::prepare() further checks table privileges for the underlying
  tables of views, with privileges assigned based on the operation (ie SELECT,
  UPDATE, INSERT or DELETE). In addition, there are some specific table-level
  privilege checks for e.g multi-table UPDATE operations.

- Finally, column privileges are checked twice for each column, first for
  the underlying table and second for the view. The wanted privilege is
  determined based on how the column is used in the current operation,
  (In a SELECT, the privilege is always SELECT_ACL. In an UPDATE it is
  UPDATE_ACL for columns in the SET clause and SELECT_ACL for all other
  columns, etc.)

Notice that we have two levels of privilege checking:

1. Check table-level privileges by using check_access() and check_grant().
   check_grant() will also set a want_privilege for later column privilege
   checks.
   These checks are done using precheck functions and scattered across
   resolving.

2. Check column-level privileges during resolving of conditions and
   expression lists, using the want_privilege assigned from the latest
   table privilege check on the table that the column belongs to.

1.2.2 Problems with the current privilege handling

1.2.2.1 Column privilege checks are dependent upon table privileges

Column privilege checking does not supply the required privilege. Instead,
it relies on the proper table privileges being set up earlier.
As these calls may be stretched out across the optimizer code, maintaining
the proper dependencies between these calls can be challenging.

This procedure can be simplified if column privilege checking directly provide
the privileges that are needed.

1.2.2.2 Column list of view will be resolved earlier than current practice

Currently, views are resolved with SELECT_LEX::prepare() only when they are
materialized. Merged views are resolved as part of the context they
are used within.

This WL proposes to determine whether to materialize or merge a view at a later
stage. As a consequence, it is necessary to resolve all view references earlier.
However, column privilege checking is currently closely associated with this
function. This is OK for all conditions and groupings of the view, but for the
select list, we do not know the context in which the selected columns will be
used. In fact, we do not even know which of the columns of the view that will
be used later.

Thus, we have to change the way privileges are checked for the view columns, by
delaying the privilege checking of the view until the place where the columns
are used.

For an expression belonging to the view definition, this is still the same
place as before.

For a view column that is used by an outer query, this privilege checking
occurs when the outer query is resolved.

Example:
CREATE VIEW v1 AS SELECT a FROM t1 WHERE b > 0;
UPDATE v1 SET a=5;

Column b is checked for SELECT_ACL privilege when the view v1 is resolved.
Column a is checked for UPDATE_ACL privilege when the outer query is resolved.

Notice also that in the latter case, we actually check column privileges twice,
both for v1.a and for t1.a, and both must support the UPDATE_ACL privilege.

1.2.2.3 Lack of support for repeated privilege checking

Currently, privilege checking is interspersed with query resolving.

However, we have a WL#6570 that attempts to refactor statement planning
so that statement resolving happens only during statement preparation, even
though the query is executed multiple times. Nevertheless, we will have to
perform privilege checking for each execution, because the query planner is
not signalled when privileges are granted or revoked.

Thus, privilege checking might be performed like this:

On query preparation, check privileges while resolving the query.
(This will also ensure that non-prepared statement execution is still
performant).

On each execution, resolving is not necessary. Thus, we need to perform
an explicit step that checks privileges for all involved tables, views and
columns. (We must check privileges for the first execution, because we have no
guarantee that some privileges have not been revoked between preparation and
first execution).

There is also another complication with repeated privilege checking:
For the UPDATE query above, we need to check privileges both for v1.a and
for t1.a. This is fairly simple when a column is represented by a "wrapped
item", essentially an Item_field pointed to by an Item_ref for each
recursively defined view. In the above case, Item_field would reference
the table t1, and there would exist an Item_ref object that references this
Item_field and also the view v1.

With this structure, it is quite simple to check that one has the proper
privileges for all involved tables and views.

However, with non-wrapped items, only the Item_field object is exposed,
and this does not carry sufficient information to checking privileges
against views.

1.2.3 Privilege handling

This is the new design for privilege handling:

- All statements have a precheck function that does rudimentary
  privilege checking on all tables and views referenced in the statement.
  This is identical to the current handling.
  For fast column privilege checking, make sure that table privileges are
  copied to the TABLE_LIST::grant struct.

- During view expansion, check privileges for all referenced views and tables.
  This is also the same as before, and does a SELECT_ACL check only.
  For fast column privilege checking, make sure that table privileges are
  copied to the TABLE_LIST::grant struct.

- When resolving columns, make sure that we have proper privileges for each
  column. The desired privilege is determined from the context (ie select list,
  set clause, etc.) The privilege is checked against all involved views and
  tables.

  This check is similar both for an outer query and for subqueries, except
  that the select list of a subquery is not checked. Instead, columns selected
  from a view are privilege-checked in the context they are used in
  an outer query block.

In order to have sufficient context for view privilege checking during
execution of prepared statements, all view column references will be fully
wrapped. See section 1.2.2.3 for explanation).

In the future, when we have prepared statements that are resolved once,
privilege checking will take place like this for each execution:

- First, precheck all tables and views, just like for the preparation.

- Second, precheck all tables and views referenced from views, just like
  for the preparation.

- Third, check privileges for all involved columns, just like for the
  preparation.

The following sections outline the privilege handling for DML statements
that contain view references that are merged into the outer query.

Assume a table "t": CREATE TABLE t(a INTEGER, b INTEGER);
Assume a view "v": CREATE VIEW v AS SELECT * FROM t;

1.2.4 Processing SELECT statement:

SELECT * FROM v WHERE a > 100;

1. Precheck query
   ==> Make sure that we have some SELECT_ACL privilege for view "v"

2. Open view "v"
   ==> Expand view, create query block and table reference for table "t".
   ==> Set security context for table "t"
   ==> Check ANY privileges for table "t"

3. Prepare query block for the view "v"
   ==> Check SELECT_ACL privileges for columns referenced in WHERE, GROUP BY,
       HAVING, ORDER BY.

4. Prepare query block for outer query:
   ==> Merge query block for the view "v" into outer query.
   ==> Check SELECT_ACL privileges for columns referenced in select list,
       WHERE, GROUP BY, HAVING, ORDER BY.

1.2.5 Processing INSERT statement:

INSERT INTO v(a, b) VALUES (10, 100);

1. Precheck query
   ==> Make sure that we have some INSERT_ACL privilege for the view "v"
   ==> Make sure that we have some SELECT_ACL privilege for any other tables.

2. Open view "v"
   ==> Expand view, create query block and table reference for table "t".
   ==> Set security context for table "t"
   ==> Check that we have ANY privileges for table "t"

3. Prepare query block for view "v":
   ==> Check that we have SELECT_ACL privileges for columns referenced in WHERE.

4. Prepare query block for outer query:
   ==> Merge query block for the view "v" into outer query.
   ==> If INSERT ... SELECT, do as for SELECT for selected tables
   ==> Check INSERT_ACL privileges for columns in INSERT list
   ==> Check UPDATE_ACL privileges for columns in ON DUPLICATE list

1.2.6 Processing UPDATE statement

UPDATE v SET b=5 WHERE a > 100;

1. Precheck query
   ==> Make sure that we have some UPDATE_ACL privilege for the view "v"
   ==> Make sure that we have some SELECT_ACL privilege for tables in
       subqueries.

2. Open view "v"
   ==> Expand view, create query block and table reference for table "t".
   ==> Set security context for table "t"
   ==> Check that we have ANY privileges for table "t"

3. Prepare query block for view "v"
   ==> Check SELECT_ACL privileges for columns referenced in WHERE clause.

4. Prepare query block for outer query:
   ==> Merge query block for view "v" into outer query.
   ==> Check UPDATE_ACL privileges for columns in SET clause.
   ==> Check SELECT_ACL privileges for columns referenced in SET expressions and
       in WHERE.

1.2.7 Processing DELETE statement

DELETE FROM v WHERE a > 100;

1. Precheck query
   ==> Make sure that we have DELETE_ACL privilege for the view "v"
   ==> Make sure that we have some SELECT_ACL privilege for tables in
       subqueries.

2. Open view "v"
   ==> Expand view, create query block and table reference for table "t".
   ==> Set security context for table "t"
   ==> Check that we have ANY privileges for table "t"

3. Prepare query block for view "v"
   ==> Check SELECT_ACL privileges for columns referenced in WHERE clause.

4. Prepare query block for outer query:
   ==> Merge query block for view into outer query.
   ==> Check SELECT_ACL privileges for columns referenced in WHERE.

1.3 Processing a SELECT query

This section outlines the overall processing of a SELECT query, as well as
multi-table DELETE and UPDATE, which use basically the same data structures and
the same processing functions.

1.3.1 Precheck of privileges

After command dispatching, the first function called is select_precheck() which
basically checks that we have some SELECT_ACL privileges for all referenced
tables. The precheck function records a want_privilege in each table's
GRANT_INFO object, which is later used for column privilege checking.

(Multi-table UPDATE and DELETE have similar but different functions).

1.3.2 Opening tables

Tables are opened by calling open_tables_for_query(). This function was formerly
called open_normal_and_derived_tables(), but had to be renamed because it no
longer opens and processes derived tables.

The function opens base tables, views and referenced functions. There are no
changes here, except that derived tables are processed later.

For each view, the function mysql_make_view() is called (see section 1.3.3).

1.3.3 Creating view descriptors

mysql_make_view() creates a LEX object and a SELECT_LEX object for a view
reference, before attaching the LEX to the TABLE_LIST object of the view.

In addition, SELECT_LEX objects are created for all subqueries of the view, and
TABLE_LIST objects are created for each referenced table. Referenced tables may
in turn be view references, which are opened recursively by the same function.

This function is heavily refactored, see section 10.1 for details.

In addition, the function no longer merges views into the outer query. This is
now done as a part of resolving the outer query.

The function checks that we have some privileges to the underlying tables of the
view.

1.3.4 Creating query result object

After tables have been opened and view descriptors have been created, a query
result object of class select_send() is created. This is basically what
distinguishes a SELECT_LEX object for a SELECT query from one that is created
for an UPDATE, DELETE or INSERT statement.

1.3.5 Resolving the query

After the above steps have been carried out, we have a complete representation
of the query, and access to all metadata, including table and column information.

We now call SELECT_LEX::prepare() on the outer-most query block(s) of the query,
which will resolve the query and perform global transformations on it.
::prepare() is recursive and will resolve subqueries in prefix/postfix order.
This means that we first prepare the tables referenced in the FROM clause in
prefix order, before recursing down the hierarchy of subqueries and repeat the
process. When resolving a subquery, tables in all outer contexts have been
prepared, which means that we are able to resolve outer references (although
actual resolving of outer references is delayed to a later worklog).

After the tables have been prepared, the other clauses (SELECT list, WHERE,
GROUP BY, HAVING, ORDER BY) are resolved. When these clauses involve subqueries,
these subqueries are resolved by calling SELECT_LEX::prepare() as well.

This resolver process has always been available for subqueries in the SELECT
list, WHERE, GROUP BY, HAVING and the ORDER BY clauses. What is new is that we
also apply the same resolver process to subqueries in the FROM clause, aka
derived tables.

This resolver process is further described in subsequent sections.

1.3.6 Preparing tables

This is part of the resolver process for a query block. It is done by calling
SELECT_LEX::setup_tables(), which will prepare the base tables and views
referenced into a list of leaf tables. The list of tables is created by
traversing the list of local tables of the query block, in make_leaf_tables().

At this stage, derived tables have not yet been merged into the outer query, so
derived table references are represented just like regular base tables, meaning
that some of the leaf tables are still not expanded into base tables.

After creating the list of leaf tables, these tables are prepared for resolving.
Basically, this means calling TABLE_LIST::reset(), which initializes a few
fields and assigns a table number to the table. The table number ranges from 0
to MAX_TABLES-1 and is unique within this query block. In addition, we call
TABLE_LIST::process_index_hints() for each table.

We also count the number of leaf tables and the number of leaf tables that are
partitioned.

1.3.7 Resolving derived tables

After local tables have been prepared, we can resolve any derived tables and
views referenced inside the query block. This is done with
SELECT_LEX::resolve_derived().

This function loops over all derived tables three times:

- The first loop resolves all references inside the derived table by calling
  TABLE_LIST::resolved_derived().

- The second loop merges derived tables that can be merged into the local
  query block, using SELECT_LEX::merge_derived().

- The third loop prepares the remaining (non-merged) derived tables for
  materialization, using TABLE_LIST::setup_materialized_derived().

After this processing, and if some derived tables have been merged into this
query block, table numbers must be remapped. Consider e.g. a local query block
with three tables numbered 0, 1 and 2. Table number 1 is a view over a join of
two tables and that view was merged into the local query block. The two joined
tables are replacing the view in the leaf table list. Thus, these two tables are
mapped to table numbers 1 and 2, and the original table number 2 is mapped to 3.
leaf_table_count is increased from 3 to 4.

After remapping the table numbers, we need to update Item_field objects of the
resolved join conditions of the join nests of the local query block so that they
get correct table numbers. The resolved join conditions are the WHERE clauses
and join conditions of the merged views. This is done in function
update_child_used_tables().

All other Item_field objects are yet
unresolved and will be resolved with correct table numbers later.

1.3.8 Resolving one derived table

TABLE_LIST::resolve_derived() is responsible for resolving one derived table.
This is done by calling prepare() on the underlying query expression
(SELECT_LEX_UNIT) of the derived table, which again calls prepare() on the query
blocks of the query expression. As these calls are recursive, it means that when
this function is done, all underlying derived tables of every subquery of the
derived table will be fully resolved and either merged up to the parent query or
prepared for materialization.

While doing this, set THD::derived_tables_processing flag. This is used to
prevent privilege checking for the SELECT list of views during resolving. At
this point in time, we do not know the context of the view, so we want to delay
privilege checking until we have completed all merging of derived tables. But
notice that columns referenced in all other clauses of the subquery, such as the
WHERE clause, all need SELECT_ACL privilege.

There is one extra argument to this function: apply_semijoin. For regular query
expressions, this argument is always true, meaning that we allow semi-join
transformation to take place for subqueries inside the derived table. However,
sometimes semi-join transformation is impossible or unnecessary. E.g for
single-table UPDATE and DELETE, semi-join transformation is impossible because
we do not allow these operations to process more than one table in the outer
query block. For multi-table UPDATE it is impossible because we cannot use a
semi-join-transformed subquery predicate to construct a check option. For some
administration queries, a semi-join transformation is unnecessary.

1.3.9 Merging one derived table

SELECT_LEX::merge_derived() is responsible for merging a derived table into its
enclosing query block.

Before merging a derived table, the function checks that it is mergeable
according to the functional requirements:
 . derived table is mergeable, and
 . we are not just analyzing a view, and
 . there is room for the added tables from the derived table in the outer query.

- Populate merge_underlying_list of the TABLE_LIST with the list of tables
  inside the derived table.

- Change the name resolution contexts of all join nests of the derived table
  so that they reference the query block of the enclosing query.

- Create a Nested_join object containing references to the top-level tables
  and join nests of the derived table. Attach this structure to the TABLE_LIST
  object of the derived table.

- Add the base tables of the derived table as leaf tables of the enclosing
  query block, replacing the derived table reference in the leaf table list.
  (As derived tables have been resolved bottom-up, there are no unresolved
   derived table references left).

- If the derived table reference is on the inner side of an outer join,
  propagate nullability to all tables of the derived table (see section 1.3.16
  for details).

- Merge the WHERE condition of the derived table with the join condition of
  the table reference representing the derived table. Join conditions inside
  the derived table are left where they are (they will later be processed by
  simplify_joins()).

- Create field translations for the columns in the select list of the derived
  tables, which are later used for resolving references to these columns in
  the outer query blocks.

- fix_after_pullout() is called on WHERE and join conditions to correct the
  resolver data for the column references. This is mostly needed because of
  the need for secondary resolving for prepared statements. 

- Merge ORDER BY clause from derived table with the ORDER BY clause of the
  enclosing query block. Notice that the final ORDER BY list will be ordered
  according to the position of derived tables in the query: First, any ORDER
  BY clauses from the query will be listed, followed by ORDER BY clauses from
  the first derived table, followed by clauses from the second derived table,
  etc.

This process merges multiple query blocks into one. Some redundant TABLE_LIST
objects exist after the merge, but these are later removed by the call to
simplify_joins(), which is called after all query block merging is done.

This is an example of how data structures are transformed. Consider these tables
and views:

create table t1(a integer, b integer);
create table t2(c integer, d integer);
create table t3(e integer, f integer);
create view v3 as select * from t2 join t3 on t2.c = t3.e where t2.c > 0;

Then we issue the following SELECT statement:

select * from t1 join v3 on t1.a = v3.c where t1.a > 0;

Representation of view v3:

+-------------+
|             | where_cond +----------+
| view-query  |----------->| t2.c > 0 |
|             |            +----------+
+-----+-------+
      |            +------------------+
      +----------->| (nest_last_join) |
     top_join_list +--------+---------+
                            |     +------+     +------+
                            +---->|  t3  +---->|  t2  |
                                  +--+---+     +------+
                                     |        +-------------+
                                     +------->| t2.c = t3.e |
                                    join_cond +-------------+

Representation of outer query before merging:

+-------------+
|             | where_cond +----------+
| outer-query |----------->| t1.a > 0 |
|             |            +----------+
+-----+-------+
      |            +------------------+
      +----------->| (nest_last_join) |
     top_join_list +--------+---------+
                            |     +------+     +------+
                            +---->|  v3  +---->|  t1  |
                                  +--+---+     +------+
                                     |        +-------------+
                                     +------->| t1.a = v3.c |
                                    join_cond +-------------+

Representation of outer query after merging:

+-------------+
|             | where_cond +----------+
| outer-query |----------->| t1.a > 0 |
|             |            +----------+
+-----+-------+
  top |   +------------------+         +--------------------------+
 join +-->| (nest_last_join) | +------>| t1.a = v3.c and t2.c > 0 |
 list     +--------+---------+ |       +--------------------------+
                   |     +-----++     +------+
                   +---->|  v3  +---->|  t1  |
                         +--+---+     +------+
                            |
                   +------------------+
                   | (nest_last_join) |
                   +---------+--------+
                             |
                             |     +------+     +------+
                             +---->|  t3  +---->|  t2  |
                                   +--+---+     +------+
                                      |       +-------------+
                                      +------>| t2.c = t3.e |
                                    join_cond +-------------+


After simplify_joins() has normalized this, we have:

+-------------+
|             | where_cond +-----------------------------------------------+
| outer-query |----------->| t1.a>0 and t1.a=t2.c and t2.c>0 and t2.c=t3.e |
|             |            +-----------------------------------------------+
+-----+-------+
      |            +-----+     +------+     +------+
      +----------->| t3  +---->|  t2  +---->|  t1  |
     top_join_list +-----+     +------+     +------+

The original list of tables and views is still preserved in SELECT_LEX::table_list.

1.3.10 Setup derived table for materialization

TABLE_LIST::setup_materialized_derived() sets up the non-merged derived tables
for materialization.

The function select_union::create_result_table() is called to create a TABLE
object for the temporary table, which is then attached to the TABLE_LIST object
for the derived table.

Some processing is then done for the underlying query blocks of the derived table:

- As the involved tables are read before the outer query is executed, these
  tables may be freely used in data change operations involving the same tables.
  The function propagate_unique_test_exclusion() is called to ensure this.

- SELECT_ACL privileges are required for all tables of the derived tables.
  Column privileges need not be checked here, as they either have been checked
  already (if part of WHERE, GROUP BY or HAVING clauses) or will be checked
  as part of outer query privilege checking.

- Loop over all selected expressions in the select list of the derived table
  and record them in the table read sets.

1.3.11 Checking privileges for views

After derived tables have been properly resolved, call
SELECT_LEX::check_privs_for_views() to prepare any views for column privilege
checking, by setting the want_privilege of every table and view in a chain of
TABLE_LIST descriptions, by following the "belong_to_view" pointer and calling
the check_single_table_access() function for each.

1.3.12 Setup natural join row types

The function setup_natural_join_row_types() was previously called as part of
setup_tables(). But then, derived tables were resolved prior to resolving the
enclosing query block, and columns referred in join conditions may be referring
to derived tables, hence this function had to be taken out of setup_tables() and
must be called after resolving the derived tables.

1.3.13 Processing check options

Check option processing is applicable to INSERT and UPDATE against views.
Check option processing can be cascading or non-cascading. This property
determines how we deal with check options defined for a view referenced from
within another view.

Preparation of check option is similar to how it was before implementing this
WL: After preparation and merging of derived tables and views is complete, call
TABLE_LIST::prepare_check_option(), which will recursively descend down the tree
of view TABLE_LIST objects and prepare a check option condition based on the
views' WHERE conditions and join conditions.

For a prepared statement, the check option tree is built and resolved when the
statement is prepared. For subsequent executions, the check option tree is
simply resolved.

The change is that TABLE_LIST::where has been eliminated. This is now available
as the join condition of the join nest created for the view reference, meaning
that the check option tree can be generated simply by merging join conditions.

Due to the way we now propagate WHERE clauses of derived tables into the join
nest of the table reference, the calls to TABLE_LIST::prepare_where() inside
SELECT_LEX::setup_conds() have been eliminated.

1.3.14 Remaining part of resolver process

The remaining part of the resolver process is unaffected, except for:

- Partition pruning is moved into SELECT_LEX::apply_local_transforms(), meaning
  that it is done after query block merging (slightly more efficient and there
  are possibly more equality conditions available for pruning).
  See also section 1.3.17

- SELECT_LEX::apply_local_transforms() is no longer called explicitly for
  derived tables, instead derived tables are processed recursively by following
  child SELECT_LEX_UNIT pointers.

- Note also that SELECT_LEX::apply_local_transforms() is extended with a THD
  argument and a prune argument. The THD argument is needed as prerequisite
  for the case when JOIN object is not available during preparation.
  The prune argument is necessary because partition pruning here should only
  take place for regular (sub)query resolving. UPDATE and DELETE processing has
  a specific pruning step, so it cannot be performed in this general function.

Checking column privileges is unaffected for columns referenced in WHERE clause,
join conditions, GROUP BY clause, HAVING clause and ORDER BY clause.

1.3.15 View error processing

Errors when checking privileges to views need some special treatment, as we do
not want to reveal too much of the structure of a view to unprivileged users.
Therefore, an error processor is registered for the query block of a view, which
when called will transform some specific error messages into a more "neutral" form.

This error processor is called e.g when checking column privileges and resolving
column references, and a problem occurs inside a view definition.

1.3.16 Nullability propagation

Nullability propagation is applied in two cases:

1. For all inner tables of an outer join, these tables are always "nullable"

2. When query blocks are merged into the inner side of an outer join, as
   part of derived tables merging and semi-join transformation, all tables
   from the merged query block will become "nullable".

Such nullability propagation has been consolidated into one recursive function:
propagate_nullability().

1.3.17 Partition pruning

Partition pruning before locking tables is now applied within the
apply_local_transforms() function. This ensure that query block merging has
taken place, and hence all equalities between the merged query block and the
outer query block have been considered.

This function handles all necessary pruning for SELECT statements, the SELECT
part of INSERT ... SELECT statements and multi-table DELETE and UPDATE
statements. For single-table DELETE and UPDATE statements,
apply_local_transforms() affects any subquery tables, but it is still needed to
perform explicit partition pruning before locking tables on the table being
deleted from or updated.

Partition pruning after locking tables is not affected.

1.3.18 Optimization changes

There are no changes in the optimization process, except that the
handle_derived() call with mysql_derived_optimize as argument is replaced with
an explicit loop over all leaf tables and a call to
TABLE_LIST::optimize_derived() for each derived table.

1.3.19 Execution changes

There are no changes in the execution process, except that the handle_derived()
call with mysql_derived_create as argument is replaced with
TABLE_LIST::create_derived(), and in join_materialize_derived(), the calls to
mysql_handle_single_derived() have been replaced with calls to
TABLE_LIST::materialize_derived() and TABLE_LIST::cleanup_derived().

1.4 Processing a prepared SELECT query

Preparing a prepared statement goes through the same steps as when preparing a
regular statement, so the changes are the same here.

Executing a prepared statement goes through the same steps as when executing a
regular statement, but there are some differences in the preparation part:

- All permanent transformations have already been done and need not be repeated.

- In SELECT_LEX::setup_tables(), mergeable derived tables will have been merged,
  meaning that setup of leaf tables is complete after the function call.

- In SELECT_LEX::resolve_derived(), no derived tables will be merged, as all
  merging has been performed as a permanent transformation. Only the derived
  tables that are marked for materialization need to be prepared.
  However, as we do not traverse the query blocks belonging to derived tables
  that have been merged, another loop is necessary to setup for materialization
  those derived tables that are contained within other merged derived tables.
  Luckily, these tables have already been identified as leaf tables, so we only
  need to traverse the list of leaf tables to find them.

1.5 Processing an INSERT statement

1.5.1 Opening tables

When opening tables for INSERT, we call SELECT_LEX::setup_tables(). If the table
to insert into is a view, call TABLE_LIST::resolve_derived() and
TABLE_LIST::merge_derived() on it, as we know it should be a mergeable view.

There is a call to SELECT_LEX::resolve_derived() that will process derived
tables in the query expression of an INSERT ... SELECT statement. (This seems a
bit redundant, as there is partial overlap with the call to
TABLE_LIST::resolve_derived(), but it is needed because the first table in the
list of tables is special for INSERT statements. It might be easier if the table
to be inserted into was separate from the list of tables).

1.5.2 Preparing check option

The call to TABLE_LIST::prepare_check_option() was moved to the end of
mysql_prepare_insert(). This way it will cover the needs both for INSERT ...
VALUES and for INSERT ... SELECT statements.

1.6 Processing a single-table UPDATE statement

Opening the table to be updated had to be extracted into a new function
mysql_update_prepare_table() and called before dispatching to either
mysql_update() or mysql_multi_update(). This was needed because merging derived
tables might cause the query block to contain multiple base tables, which
requires execution as a multi-table UPDATE statement.

setup_fields_with_no_wrap() is replaced with setup_fields(). See section 1.2.2.3
for explanation.

check_fields() is called unconditionally and returns original_update_fields.

SELECT_LEX::apply_local_transforms() is called at the end of the resolver phase.

There is also new separate code for checking update privileges for columns to be
updated.

1.7 Processing a multi-table UPDATE statement

mysql_multi_update_prepare() now calls SELECT_LEX::setup_tables(),
SELECT_LEX::resolve_derived() and setup_natural_join_columns().

setup_fields_with_no_wrap() is replaced with setup_fields(). See section 1.2.2.3
for explanation.

check_fields() is called unconditionally and returns original_update_fields.

A new privilege check is added for columns that are updated: We loop over all
fields to be updated and validate privileges using
Item::check_column_privileges() applied to a set of wrapped Item_field objects.
In addition, these columns are registered in the write map of the appropriate
tables. These column privilege checks are done just after tables are "prepared"
for privilege checks through multi_update_check_table_access().

These privilege checks were earlier done implicitly through
Item_field::fix_fields() when resolving columns to be updated. This was not
sufficient because we kept non-wrapped fields, so privilege checks for prepared
statements might be wrong if view privileges were changed. Doing the privilege
check explicitly on wrapped Item_field objects means that we track all privilege
changes throughout the lifetime of a prepared statement.

Preparation of such statements goes through SELECT:LEX::prepare(), which will
call functions like SELECT_LEX::apply_local_transforms().

1.8 Processing a single-table DELETE statement

After opening tables, we process tables by calling SELECT_LEX::setup_tables(),
SELECT_LEX::resolve_derived() (if we have derived tables).

We also need to check view privileges with SELECT_LEX::check_privs_for_views()
if we are deleting from a view.

SELECT_LEX::apply_local_transforms() is done as last part of preparation.

1.9 Processing a multi-table DELETE statement

After opening tables, we process tables by calling SELECT_LEX::setup_tables(),
SELECT_LEX::resolve_derived() (if we have derived tables).

We also need to check view privileges with SELECT_LEX::check_privs_for_views()
if we are deleting from one or more views.

1.10 Processing administrative statements

This is applicable to:

SHOW CREATE TABLE
SHOW COLUMNS
Commands calling "get_all_tables".

For SHOW CREATE TABLE, it is necessary to resolve views, but not prepare for
materialization.

For SHOW COLUMNS, we need to resolve views and prepare for materialization of
such, in order to list column information.

This is also done for commands calling "get_all_tables" (although I do not think
that preparing a materialization is needed for all those commands).For major
design decisions, see HLS chapter 2. The LLD is a summary of low-level
design changes with references back to the design chapter.

1.11 Query property interfaces

This WL also adds some high-level property interfaces for the SELECT_LEX and
TABLE_LIST class.

For class SELECT_LEX, properties have been added that let us check whether a
query block is aggregated, is grouped (explicitly or implicitly), is distinct,
is ordered or has a limit.

Note that these functions can be applied both before and after transformations,
so their result may vary based on order of calls.

Example: We may delete an ORDER BY clause for an aggregated query because it
does not make sense to order a result set with only one row. Calling
is_ordered() before this transformation will return true, calling it after
transformation will return false.

Notice that these interfaces are not implemented throughout the code base.

For class TABLE_LIST, properties have been added that let us set or check
whether a table is updatable, check whether a table is a derived table or a
view, whether a derived table or view is mergeable, whether it has been merged,
or whether it has been set up for materialization.

2. Class TABLE_LIST changes

There are extensive changes made to this class.
Some members have been made private and access is replaced with setters and
getters.
New functionality for handling derived tables has been added.

See sections 1.3 through 1.8 for an overview of these design changes.

2.1 void set_updatable()

This function sets the table as updatable (ie it is the target of an
INSERT, UPDATE or a DELETE). All tables start as non-updatable, so a
setter for non-updatable is not necessary.

2.2 bool is_updatable()

Returns true if the table is updatable, false otherwise.

2.3 bool is_view_or_derived()

Returns true if this represents a view or a derived table (i.e a named
or an unnamed derived table), and false otherwise.

2.4 bool is_derived()

Returns true if this represents a derived table, and false otherwise.

2.5 bool is_view()

Returns true if this represents a view (ie. a named derived table), and
false otherwise.

2.6 bool is_multiple_tables()

Returns true if this represents multiple base tables, and false otherwise.
Can be true only for views and derived tables.

2.7 bool is_mergeable()

Returns true if this represents a derived table or view that can be merged
into the outer query block.
A view is mergeable if its query expression is mergeable and it does not
contain an explicit materialization directive (ALGORITHM_TEMPTABLE).
A derived table is mergeable if its query expression is mergeable.

2.8 bool is_merged()

Returns true if this represents a derived table or view that is merged
into the outer query block. This property is permanent for the lifetime of
a prepared statement.

2.9 bool is_placeholder()

Returns true if this is a "placeholder", which seems to mean that it is a
derived table, a view, a schema table, or does not have a TABLE object for
some reason.

2.10 uint leaf_tables_count()

Should be called only for a derived table or view.
Returns the number of underlying base tables.

2.11 void set_view_query(LEX *lex)

Set the view query (a LEX object) for a derived table or view.
The special value (LEX *)1 is used to indicate a view for which the definition
is not available.

2.12 LEX *view_query()

Get the view query (a LEX object) for a derived table or view.

2.13 void set_derived_unit(st_select_lex_unit *unit)

Set the query expression for a derived table or view.

2.14 st_select_lex_unit *derived_unit()

Get the query expression for a derived table or view.

2.15 void reset(uint tableno)

Reset table before resolving:
Assign table number and map for table.
Reset optimizer data for table.
Reset execution status for table.

2.16 bool resolve_derived(THD *thd, bool apply_semijoin)

Resolve a reference to a view or a derived table, including the query
expression defining the view.

This function is a prerequisite for merging or materializing a derived table,
see section 1.3.8 for an overview.

2.17 bool merge_derived(st_select_lex *target_select)

Merge a derived table/view reference into an outer query block.
The derived table must be prepared before merge is attempted.
If the derived table is not mergeable or some other constraints prevent
the merging, the request is silently ignored.
Check the is_merged() property to obtain the status of the merge request.

See also section 1.3.9.

2.18 bool setup_materialized_derived(THD *thd)

Setup a derived table/view reference for materialization into a temporary table.
The derived table must be resolved before merge is attempted.
This function must not be called for a derived table that is merged.

Notice that unlike for merging, materialization setup has to be performed for
every execution of a prepared statement.

See also section 1.3.10.

2.19 bool create_field_translation(THD *thd)

Create field translations for the columns defined by a derived table or view.
Field translations are required so that the columns of a merged derived table
can be resolved in the context of the outer query block.

2.20 bool merge_where(THD *thd)

Merge the WHERE clause of a derived table or view into the condition of
the outer query. Within the outer query, a nested join object is created
for this derived table, and the WHERE clause is attached as the join condition
of that nested join.

2.21 bool prepare_check_option(THD *thd)

Prepare a combined check option condition for a view based on check options of
this and underlying views, and of the effective check option indicator.

2.22 void set_tableno(uint tableno)

Assign a unique table number and a corresponding table map for a table
within one query block.

2.23 bool merge_underlying_tables(select_lex)

Merge the tables from a query block into a nested join struct.
The nested join will represent a merged view in the outer query.

2.24 ALGORITHM specifications have been revised:

ALGORITHM_UNDEFINED - algorithm has not been determined yet.
ALGORITHM_TEMPTABLE - derived table/view will be materialized in
                      a temporary table.
ALGORITHM_MERGE     - derived table/view will be merged into outer query.

3. Class TABLE changes

Some utility functions have been added.

3.1 void set_nullable()

Set table as nullable, used for inner tables in an outer join.

3.2 bool is_nullable()

Returns true if table is nullable (see set_nullable()).

3.3 void mark_column_used(THD *thd, Field *field, enum enum_mark_used mark_used)

Mark a column as read or written.

4. Class Item changes

4.1 void register_field_in_write_map(uchar *arg)

Mark columns inside an expression tree as written.

Implemented for sub-classes:

- Item_field
- Item_func_set_user_var

4.2 void check_column_privileges()

This function accepts a class of type Column_priv_check and checks column
privileges for an Item_field object, or an Item_field object wrapped in one or
more Item_direct_view_ref objects.

Column_priv_check identifies the set of required privileges.

Implemented for sub-classes:

- Item_field
- Item_direct_view_ref

For a wrapped field, privileges must be validated both for the base table and
for any views wrapping that field.

Error messages are reported through the name resolution context of the item,
which may have been set up to convert specific privilege error messages into a
more generic one, such as "View references invalid table(s) or column(s) or
function(s) or definer/invoker of view lacks rights to use them".

5. Class SELECT_LEX changes

The changes are made up of a set of utility functions that describe properties
of the query block. Notice that the utility functions are not always used when
possible. Implementing full usage is a task that is postponed to some later stage.

5.1 bool setup_tables(THD *thd, TABLE_LIST *tables, bool select_insert)

This function is moved from sql_base.cc. It is also called from more places, as
the function setup_tables_and_check_access(), which used to call it, has
been eliminated.

5.2 bool resolve_derived(THD *thd)

This is a new function that is used to resolve all derived tables and views
referenced in a query block. It will make three passes over the derived tables:
1. Prepare all derived tables.
2. Merge applicable derived tables into the query block.
3. Set up the remaining derived tables for materialization.

After merging, table numbers of the query block will be re-mapped, as the
set of leaf tables may have changed due to merging derived tables.

See also section 1.3.7.

5.3  bool check_privs_for_views(thd, false, DELETE_ACL, SELECT_ACL)

Check privileges for views referenced from a query block.
Called from SELECT_LEX::prepare().

5.4 bool is_grouped()

Return true if query block is grouped, ie whether it
- is explicitly grouped, or
- contains one or more set functions (implicitly grouped)

5.5 bool is_explicitly_grouped()

Return true if query block contains a non-empty GROUP BY specification,
false otherwise.
Return false if a GROUP BY specification has been determined to be redundant.

5.6 bool is_implicitly_grouped()

Return true if query block contains one or more set functions but is not
explicitly grouped, and false otherwise.

5.7 bool is_single_grouped()

Return true if query block is implicitly grouped and does not contain a HAVING
clause.

5.8 bool is_distinct()

Return true if query block contains a DISTINCT flag, false otherwise.
Return false 

5.9 bool is_ordered()

Return true if query block contains an ORDER BY specification, false otherwise.
Return false if an ORDER BY specification has been determined to be redundant.

5.10 bool has_limit()

Return true if query block contains a LIMIT specification.

5.11 leaf_table_count

This field counts the number of leaf tables in query block after merging

5.12 derived_table_count

This field counts the number of original derived tables and views.

5.13 materialized_derived_table_count

This field counts the number of derived tables and views which are materialized.

6. Class SELECT_LEX_UNIT changes

6.1 bool is_mergeable()

This function returns true if the query expression it represents
- does not contain a UNION clause, and
- is not aggregated, and
- does not contain a DISTINCT keyword, and
- does not have a LIMIT or OFFSET clause, and
- does not assign any variables.

7. Resolver changes

7.1 setup_natural_join_row_types()

This function is no longer static, as it must be called from
SELECT_LEX::prepare(). The reason for this is that it may refer to derived
tables, which are now resolved after setup_tables() is called. (Otherwise
it would not be possible to resolve outer references in derived tables).

7.2 setup_tables_and_check_access()

This function has been replaced with SELECT_LEX::setup_tables() and
SELECT_LEX::check_privs_for_views().

7.3 void update_child_used_tables(List<TABLE_LIST> *tables)

This function walks through the WHERE condition and the join conditions of a
merged view and updates table number information to be correct after merging.

7.4 TABLE_LIST **make_leaves_list(TABLE_LIST **list, TABLE_LIST *tables)

This function has been moved from sql_base.cc to sql_resolver.cc, and renamed
to make_leaf_tables().

7.5 void propagate_nullability(List<TABLE_LIST> *tables, bool nullable)

This function propagates the nullability attribute into join nests on the
inner side of an outer join. This is a recursive function that navigates through
the join nest information, thus simplifying the implementation.

7.6 void repoint_contexts_of_join_nests()

This function is moved from sql_view.cc to sql_resolver.cc.

7.7 setup_fields() changes

Instead of a mark_used_columns argument, this function now has an operation
argument, which is the same as the privilege required for the searched columns.
The applicable mark argument (MARK_COLUMNS_READ or MARK_COLUMNS_WRITE) is
derived from the operation.

7.8 setup_conds()

This function has been made a member function of class SELECT_LEX.

As derived tables and views are now resolved before resolving the outer query,
it is no longer needed to call TABLE_LIST::prepare_where() for join nests.

The label err_no_arena has been eliminated, thus simplifying error handling (we
will not need to preserve environment in case of error, as it will be reset for
each resolver attempt.

7.9 SELECT_LEX::propagate_unique_test_exclusion()

This is a new function that propagates unique test exclusion into a tree of
query expression objects.

7.10 Column privilege propagation

See section 1.2 for an overview.

This is applicable to the functions find_field_in_tables() and
find_field_in_table_ref() and their users: Item_field::fix_fields(),
Item_field::fix_outer_field(), Item_ref::fix_fields() and mysql_table_grant().

When checking privileges for a column, the desired privileges are passed
directly to these functions. The resolver function SELECT_LEX::prepare() sets
thd->want_privilege with the desired privilege before calling the Item resolver
functions, which will pick up the desired privilege and pass it to the inner
functions.

Notice that when checking the select list of a derived table, the desired
privilege is empty. The actual privilege checking takes place when the query
block containing references to view/derived table fields is resolved.

To check privileges of columns in an arbitrary expression, the interface
Item::check_column_privileges() is added. See section 4.2 for details. Specific
implementations are provided for the classes Item_field and
Item_direct_view_ref. The latter will suppress specific errors and replace them
with a generic "view privilege" message.

8. Data change processing changes (UPDATE, DELETE, INSERT)

8.1 bool mysql_update_prepare_table(THD *thd, st_select_lex *select)

This is a new function that resolves the single target table of an UPDATE
statement. If this table is a view, it is prepared and merged so that a
single base table is ready.

8.2 mysql_prepare_insert_check_table()

This function does derived tables processing for INSERT statements. It is a bit
special, as we first call TABLE_LIST::resolve_derived() and
TABLE_LIST::merge_derived() on the table being inserted into, and then call
SELECT_LEX::resolve_derived() on the query block. This last function will also
cover the inserted table, but as this is already processed, it will be ignored,
and will only handle derived tables in the SELECT part of the INSERT statement.

This procedure might be streamlined, but it seems the best solution given the
current data structures. Separating the inserted table completely out of the
SELECT_LEX tables list may be an interesting alternative.

9. Privilege check changes

Privilege checking is mostly changed by changing the places where the
check functions are called. There are only two interface changes, but see also
the Item changes in section 4.2.

9.1 bool check_grant_column()

This function is extended with an argument want_privilege, which is used
when checking the privilege for that column. Earlier, this value was
taken from grant.want_privilege of of the TABLE_LIST object of the table
that the column belongs to.

To reviewer: We still assert that this is the same, but this check can be
removed when we are confident with this change.

9.2 bool check_column_grant_in_table_ref()

A want_privilege argument is added, which is passed on to check_grant_column().

10. View resolution changes

10.1. mysql_make_view() - creating internal representation from view

See also section 1.3.3.

This function reads the definition of a view and transforms it into a LEX object
with an underlying query expression (SELECT_LEX_UNIT), which optionally may
contain subqueries.

The function is heavily refactored - here is a list of changes:

- argument table is changed to view_ref - so it is easy to recognize as a view
  and not a table.

- a Make_view_tracker RAII class is created to simplify cleanup after error
  handling inside the function.

- local variables are declared when needed and scoped when possible.

- views are no longer merged into the outer query if possible. Instead, a view
  is marked as mergeable when possible and the actual merging takes place in
  TABLE_LIST::merge_derived().

Here are the main processing steps:

- If the view is already read in a prepared statement context, the function
  just prepares a security descriptor and returns.

- Create LEX object for the view and parse the view, using this LEX as context.

- Check that we have some privileges to the view. Detailed operation and column
  privilege checking is performed later.

- Prepare underlying tables so they properly belong to a view.

- Attach the underlying tables of the view in the global list of tables of the
  enclosing query.

- If view can be merged into the outer query, set lock type for view so that
  it can be used in update operations.

- For views with definer's rights, create a security context for the view,
  otherwise use the user's current security context. Then assign this context
  to all underlying tables of the view.

- Setup an error processor to hide detailed errors when occurring inside a view.

- Finally, link the query expression of the view into the list of derived tables
  and views of the enclosing query.

11. Base functions

These functions are mostly about opening tables, see also section 1.3.2.

11.1 open_and_lock_tables()

This function has lost the "derived" argument, because derived tables are no
longer processed when opening tables.

11.2 open_normal_and_derived_tables()

This function no longer processes derived tables. It has been renamed to
open_tables_for_query(), as it is used generally to open all the tables required
for processing of a query (or statement).

12. Removed functions

12.1 update_field_dependencies()

This function has been removed from the resolver. It is replaced with calls to
TABLE::mark_column_used().

12.2 setup_fields_with_no_wrap()

As we no longer use "unwrapped" fields, this function has been deleted, and
calls replaced with setup_fields().

12.3 mysql_handle_single_derived()

This was brief wrapper over the derived table "processor" function, so it was
easy to eliminate.

12.4 mysql_handle_derived()

We no longer process all derived tables through the global list of query blocks,
so this can be eliminated.

12.5 mysql_derived_prepare()

This function is replaced by TABLE_LIST::resolve_derived().

12.6 mysql_derived_optimize()

This function is replaced by TABLE_LIST::optimize_derived().

12.7 mysql_derived_create()

This function is replaced by TABLE_LIST::create_derived().

12.8 mysql_derived_materialize()

This function is replaced by TABLE_LIST::materialize_derived().

12.9 mysql_derived_cleanup()

This function is replaced by TABLE_LIST::cleanup_derived().