WL#9571: Remove trivial conditions before outer to inner join transformation

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Currently mysql does not remove trivial conditions (expressions over literals)
during preparation. This results in poor plans for some queries.

For ex:
There are two queries of the form
1.Select * from t1 left join t2 on "cond1" where cond2 or 0 = 1;
2.Select * from t1 left join t2 on "cond1" where cond2 ;

As 0 = 1 is always false, it is optimized away. So both
the queries would be identical. But yet optimizer gives a better
plan for the latter. This is because, it converts outer join to
inner join and re-writes the above query to "Select ..... where
cond2 and cond1" and thereby resulting in a good plan.

While for the first query, we cannot convert the outer join to
inner join as not_null_tables_cache is different than the used
tables ( thanks to "0 = 1" condition). As a result, the join
condition is left as is. But 0 = 1 gets optimized away in removal
of trivial conditions which happens later.

So removing these trivial conditions during preparation is much more
advantageous than in optimization phase. The following logic can be adapted to
remove these conditions:
For an OR expression, if the result of any constant condition is TRUE, then
the entire condition can be collapsed and replaced with a ALWAYS TRUE item.
Else only the expression over literals is removed.
Similarly for an AND condition, if the result of any constant condition is
FALSE, then the entire condition can be collapsed and replaced with ALWAYS FALSE
item. Else only the expreseion over literals is removed.


Ref.BUG#69359
F1: A constant expression in a condition (WHERE, HAVING, ON) will be used to
determine the result of the condition if possible during preparation phase.

F2: Any expression in a condition which cannot be constant for all executions
will not be considered as a candidate for this optimization.


NF1: Performance of all the queries which do not have conditions with constant
expressions will remain the same.

NF2: Performance of all the queries having constant expressions in a condition
can see improvements in some cases (like in the example mentioned in executive
summary).

NF3: Any errors found during evaluation of a constant expression in a condition
during "execution" can be thrown during "prepare" phase now.
Parser:
No changes

Resolver:
Item_cond::fix_fields checks for constant expressions and removes them based on
the following logic:
If an item is a trivial condition like a literal or an operation
on literal(s), the item is evaluated and based on the result, decides
if the entire condition can be replaced with an ALWAYS TRUE or
ALWAYS FALSE item.
For every constant expression, if the result is found to be “true” after
evaluation and is part of an OR condition, we can replace the entire condition
with an ALWAYS TRUE item. If it's part of an AND condition we just have to
remove this constant expression from the list of arguments to the condition (if
it's not the only argument in the condition).
If the result of constant expression evaluation is false, and it is found to be
part of an AND condition it is replaced with an ALWAYS FALSE item and if it's
part of an OR condition we remove just this constant expression from the list of
arguments to the condition (if it's not the only argument in the condition).
This results in a simplified condition which can be used to determine the
used_tables_cache and not_null_tables_cache.
Also, this optimization is not applied for expressions having parameters, stored
procedure arguments, views (and some special cases). Care is taken to handle
subqueries too.

To determine if an expression is trivial or not, a new member function
"is_non_const_over_literals()" for class Item is introduced. If true, then the
expression does not contain only literals and therefore cannot be evaluated
during preparation. All the Item classes which cannot be constant during
preparation have been modified to return true for this function. For ex: Item_param

The above transformation results in conditions (WHERE, HAVING and ON) becoming
ALWAYS TRUE conditions which can be removed or ALWAYS FALSE which can result in
early return from the query execution. Function "simplify_const_conds()" checks
the same and does the necessary changes to conditions.

Some changes are done to handle cleanup after removal of subqueries (if found
constant).

Optimizer:
None

Executor:
None