Optimising of MIN(), MAX() and COUNT(*) queries without 'group by' clause by replacing the aggregate expression with a constant.
More...
|
static bool | find_key_for_maxmin (bool max_fl, Index_lookup *ref, Item_field *item_field, Item *cond, uint *range_fl, uint *prefix_len) |
| Check whether we can get value for {max|min}(field) by using a key. More...
|
|
static bool | reckey_in_range (bool max_fl, Index_lookup *ref, Item_field *item_field, Item *cond, uint range_fl, uint prefix_len) |
| Check whether found key is in range specified by conditions. More...
|
|
static bool | maxmin_in_range (bool max_fl, Item_field *item_field, Item *cond) |
| Check whether {MAX|MIN}(field) is in range specified by conditions. More...
|
|
static ulonglong | get_exact_record_count (Table_ref *tables) |
| Get exact count of rows in all tables. More...
|
|
static int | get_index_min_value (TABLE *table, Index_lookup *ref, Item_field *item_field, uint range_fl, uint prefix_len) |
| Use index to read MIN(field) value. More...
|
|
static int | get_index_max_value (TABLE *table, Index_lookup *ref, uint range_fl) |
| Use index to read MAX(field) value. More...
|
|
bool | optimize_aggregated_query (THD *thd, Query_block *select, const mem_root_deque< Item * > &fields, Item *conds, aggregate_evaluated *decision) |
| Substitute constants for some COUNT(), MIN() and MAX() functions in an aggregated (implicitly grouped) query. More...
|
|
bool | is_simple_predicate (Item_func *func_item, Item **args, bool *inv_order) |
| Test if the predicate compares a field with constants. More...
|
|
static bool | matching_cond (bool max_fl, Index_lookup *ref, KEY *keyinfo, KEY_PART_INFO *field_part, Item *cond, table_map map, key_part_map *key_part_used, uint *range_fl, uint *prefix_len) |
| Check whether a condition matches a key to get {MAX|MIN}(field):. More...
|
|
Optimising of MIN(), MAX() and COUNT(*) queries without 'group by' clause by replacing the aggregate expression with a constant.
Given a table with a compound key on columns (a,b,c), the following types of queries are optimised (assuming the table handler supports the required methods)
SELECT COUNT(*) FROM t1[,t2,t3,...]
SELECT MIN(b) FROM t1 WHERE a=const
SELECT MAX(c) FROM t1 WHERE a=const AND b=const
SELECT MAX(b) FROM t1 WHERE a=const AND b<const
SELECT MIN(b) FROM t1 WHERE a=const AND b>const
SELECT MIN(b) FROM t1 WHERE a=const AND b BETWEEN const AND const
SELECT MAX(b) FROM t1 WHERE a=const AND b BETWEEN const AND const
Instead of '<' one can use '<=', '>', '>=' and '=' as well. Instead of 'a=const' the condition 'a IS NULL' can be used.
If all selected fields are replaced then we will also remove all involved tables and return the answer without any join. Thus, the following query will be replaced with a row of two constants:
SELECT MAX(b), MIN(d) FROM t1,t2
WHERE a=const AND b<const AND d>const
(assuming a index for column d of table t2 is defined)
Check whether a condition matches a key to get {MAX|MIN}(field):.
For the index specified by the keyinfo parameter and an index that contains the field as its component (field_part), the function checks whether
- the condition cond is a conjunction,
- all of its conjuncts refer to columns of the same table, and
- each conjunct is on one of the following forms:
- f_i = const_i or const_i = f_i or f_i IS NULL, where f_i is part of the index
- field {<|<=|>=|>|=} const
- const {<|<=|>=|>|=} field
- field BETWEEN const_1 AND const_2
As a side-effect, the key value to be used for looking up the MIN/MAX value is actually stored inside the Field object. An interesting feature is that the function will find the most restrictive endpoint by over-eager evaluation of the WHERE
condition. It continually stores the current endpoint inside the Field object. For a query such as
#define MIN(a, b)
Definition: crypt_genhash_impl.cc:88
const std::string SELECT("SELECT")
Name of the static privileges.
@ WHERE
Definition: sql_yacc.h:679
@ FROM
Definition: sql_yacc.h:242
the algorithm will recurse over the conjunction, storing first a 3 in the field. In the next recursive invocation the expression a > 5 is evaluated as 3 > 5 (Due to the dual nature of Field objects as value carriers and field identifiers), which will obviously fail, leading to 5 being stored in the Field object.
- Parameters
-
[in] | max_fl | Set to true if we are optimizing MAX(), false means we are optimizing MIN() |
[in,out] | ref | Reference to the structure where the function stores the key value |
[in] | keyinfo | Reference to the key info |
[in] | field_part | Pointer to the key part for the field |
[in] | cond | WHERE condition |
[in] | map | Table map for the key |
[in,out] | key_part_used | Map of matchings parts. The function will output the set of key parts actually being matched in this set, yet it relies on the caller to initialize the value to zero. This is due to the fact that this value is passed recursively. |
[in,out] | range_fl | Says whether endpoints use strict greater/less than. |
[out] | prefix_len | Length of common key part for the range where MAX/MIN is searched for |
- Return values
-
false | Index can't be used. |
true | We can use the index to get MIN/MAX value |
Substitute constants for some COUNT(), MIN() and MAX() functions in an aggregated (implicitly grouped) query.
- Parameters
-
[in] | thd | thread handler |
[in] | select | query block |
[in] | fields | All fields to be returned |
[in] | conds | WHERE clause |
[out] | decision | outcome for successful execution = AGGR_REGULAR regular execution required = AGGR_COMPLETE values available = AGGR_DELAYED execution with ha_records() required = AGGR_EMPTY source tables empty, aggregates are NULL or zero (for COUNT) |
- Returns
- false if success, true if error
This function is called for queries with aggregate functions and no GROUP BY, thus the result set will contain a single row only.
First, the function will analyze the source tables and WHERE clause to see whether the query qualifies for optimization. If not, the decision AGGR_REGULAR is returned.
Second, the function walks over all expressions in the SELECT list. If the expression can be optimized with a storage engine operation that is O(1) (MIN or MAX) or O(0) (instant COUNT), the value is looked up and inserted in the value buffer, and the corresponding Item is marked as being const. If the expression is a COUNT operation that can be evaluated efficiently by the storage manager (but still O(N)), indicated with HA_COUNT_ROWS_INSTANT, it will be marked as such.
When all SELECT list expressions have been processed, there are four possible outcomes:
- An empty result from the source tables is indicated, and the output state is AGGR_EMPTY. Notice that the result of aggregation is still one row, filled with zero for COUNT operations and NULL values for all other expressions.
- All expressions have been given values, indicated with output state AGGR_COMPLETE.
- All expressions have been given values, except for one or more COUNT operations that will be evaluated in execution. This is indicated with AGGR_DELAYED.
- Some expressions must be evaluated as part of a regular execution, indicated with AGGR_REGULAR. Notice that some of the expressions may have been given values and are marked as const, but no expressions will be candidates for delayed execution.