WL#12358: Ensure that all predicates in SQL conditions are complete
Affects: Server-8.0
—
Status: Complete
The purpose of this worklog is to enforce that all conditions in MySQL query expressions consist of complete predicates, possibly combined using AND, OR, XOR and NOT. This means that the resolver, optimizer and executor components do not need to care about incomplete predicates. Incomplete predicates are non-boolean expressions that are interpreted as comparison predicates by extending the expression with "<> 0". Ie, specifying a single column or constant value where a predicate or condition is expected implies returning true when that column or constant value is different from zero. The incomplete predicates will be substituted for non-equalities during the contextualization phase, thus resolver, optimizer and executor will only have to deal with complete predicates.
FR1. Ensure that all predicates in WHERE, HAVING and ON clauses are complete, ie not on the form of a simple non-boolean expression. FR2. No semantic changes are expected, except that some marginal comparisons are now performed after converting the operands to double. FR3. Only complete predicates are written to the dictionary as part of conditions in generated columns, default expressions and view definitions. NF1. No performance regressions are expected.
This is mostly a refactoring effort. Incomplete predicates that are extended with "<> 0" still behave almost the same. The exception is a few cases where the comparison operation is carried out using floating point operations after extension. This is, however, a minor problem and will probably not be noticed in practice.
In the contextualization phase, all predicates produced by the parser are made complete predicates in the new function make_condition(). If the function Item::is_bool_func() returns true, the predicate is already complete and is returned as-is. Otherwise, the incomplete predicate is wrapped in an object of class Item_func_ne, and a constant zero value is provided as the second operand for that object. make_condition() is applied to WHERE and HAVING clauses, as well as ON clauses for joined tables. Item::is_bool_func() returns true for all boolean functions and operators, ie condition operators (AND, OR, XOR, NOT), regular predicates (comparison predicates, BETWEEN, LIKE, IN etc) and subquery predicates (IN, EXISTS, ANY/SOME). Note that the latter do not inherit from Item_bool_func, but their base class (Item_exists_subselect) implements is_bool_func() to return true. In addition, a new class Item_func_bool_const with subclasses Item_func_true and Item_func_false are implemented. These new classes are helper classes used in resolving and optimization to represent predicates that are always true or always false. The function negate_condition() (former negate_expression()) could be simplified slightly: when called with the condition NOT (NOT (<simple_condition>)) will now simply return <simple_condition>, since <simple_condition> can no longer be an incomplete predicate. The function neg_transformer() of class Item is renamed to negate_transformer(). The function neg_arguments() of class Item_cond is renamed to negate_arguments(). Wherever an "always true" or "always false" is required during resolving or optimization, an Item_func_true or Item_func_false object is inserted. Previously, a one or zero integer constant were used to represents these values. Extensive asserts for is_bool_func() are added wherever an Item * that represents a condition value is required. Notice that some places, conditions wrapped in Item_aggregate_ref objects are accepted, meaning that a real_item() function must be applied to the condition within the assert. The Item_func_match function needs special treatment: The MATCH clause can be used as a function returning a floating point value in the SELECT list or the WHERE clause. However, it may also be used as a boolean function in the WHERE clause, where it has different semantics than when used together with a comparison operator. With a comparison operator, the match operation is performed with ranking. To preserve this behavior, the simplest solution was to wrap the Item_func_match object inside an object of the new class Item_func_match_predicate, which effectively transforms the function into a predicate. The overridden functions implemented in this class generally forward all evaluation to the underlying object. In EXPLAIN output, the boolean constants are now written as "true" and "false", changed from the earlier "1" and "0", because we now use the classes Item_func_true and Item_func_false to represent conditions that are always true and always false, respectively. This is a readability enhancement, but causes numerous changes in test result files. The test innodb_fts.fulltext_misc has a corrected result.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.