WL#13929: Introduce new optimizer switch to disable limit optimization
Affects: Server-8.0 — Status: Complete
The optimization to switch from a non-ordering index to an ordering index for group by and order by when there is a limit clause goes very wrong for certain queries. Some of the problems with the algorithm w.r.t this aspect are: 1.Algorithm assumes that data distribution is uniform. For ex: If a query qualifies 10 rows out of 100 rows, it presumes that every 10th row will qualify. However if all the 10 qualifying rows are not distributed uniformly and if the algorithm picks a ordering index over ref/range index, then server ends up reading way too many rows than its estimation. This results in a bad plan compared to the original selection made by the optimizer. 2.Algorithm presumes that non-ordering index and the ordering index being compared are not-correlated. 3.Algorithm is not aware of additional predicates that could be pushed later using index condition pushdown for a range/ref access method. This would further result in reduction of qualifying rows for non-ordering index. So the estimations for the final rows to be read for an ordering index is not right resulting in picking ordering index while the non-ordering index would have performed much better. 4.The cost model used in the algorithm is flawed in arriving at index scan time for the ordering index.Also, its preference to pick covering index even when the ordering index scan time is no-better and preferring to switch to ordering index when group by is present without taking into consideration other parameters results in bad plans for certain queries. There are also other cases where it does not make the right decision of choosing ordering index. W.r.t this aspect, there are problems too. 1.The cost model used in the algorithm is flawed. It does not take into consideration the external sort cost for a non-ordering index. 2.Picking of ordering index is done very late. Ideally it should be part of the join order decision making which could take advantage of the requested order and also the limit clause. As seen in the many bugs reported over years, this is an optimization that can pick good plans but also very bad plans too. The optimizer switch which is introduced as part of this patch enables users to disable this optimization if the switch to ordering index results in bad plan. However the other problem of not switching to ordering index needs a change in the algorithm itself and cannot be addressed with the introduction of this switch. We have tested and analyzed most of the bugs reported in this area and identified the above problems with the algorithm. We will not be solving these problems with this change, but giving user an option to not use the optimization.
Functional Requirements: F1 - The new optimizer switch "prefer_ordering_index" will be similar to other switchable optimizations in its usability. Please refer https://dev.mysql.com/doc/refman/8.0/en/switchable-optimizations.html for the details of existing switchable optimizations. F2 - Hint for "order by/group by" (force index for orderby/group by) will override the value for switch prefer_ordering_index i.e if an index is forced for order by, even if prefer_ordering_index is "off", optimizer will use the limit optimization to pick an ordering index Non-functional requirements: NF1 - There should not be any change in plans for any queries when using the default value (on) for the switch. NF2 - When the switch prefer_ordering_index is set to "off", "limit optimization" will not be used. So all the queries which were switching to ordering index because of this optimization should not switch to ordering index.
A new optimizer switch "prefer_ordering_index" is introduced to enable/disable switching to an ordering index over the already chosen access plan because of limit clause. By default the switch is on which means optimizer will switch to ordering index over a non-ordering one if found better by the limit optimization. Users can disable if the switch to the ordering index results in bad performance. The new optimizer switch can be used in similar lines to existing switchable optimizations. SET [GLOBAL|SESSION] optimizer_switch='command[,command]...'; Please refer https://dev.mysql.com/doc/refman/8.0/en/switchable-optimizations.html for details of existing switchable optimizations and how they can be used. Example: SET optimizer_switch = "prefer_ordering_index=on"; EXPLAIN SELECT non_covered_column FROM t WHERE other_id > 3 ORDER BY id ASC LIMIT 2; id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE t NULL index index_other_id_covered_column PRIMARY 8 NULL 2 70.00 Using where SET optimizer_switch = "prefer_ordering_index=off"; EXPLAIN SELECT non_covered_column FROM t WHERE other_id > 3 ORDER BY id ASC LIMIT 2; id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE t NULL range index_other_id_covered_column index_other_id_covered_column 8 NULL 7 100.00 Using index condition; Using filesort With prefer_ordering_index being "off", ordering index "index_other_id_covered_column" is not picked. Hence the need for external sort.
Copyright (c) 2000, 2021, Oracle Corporation and/or its affiliates. All rights reserved.