MySQL 9.4.0
Source Code Documentation
interesting_orders.h
Go to the documentation of this file.
1/* Copyright (c) 2021, 2025, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is designed to work with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have either included with
13 the program or referenced in the documentation.
14
15 This program is distributed in the hope that it will be useful,
16 but WITHOUT ANY WARRANTY; without even the implied warranty of
17 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 GNU General Public License, version 2.0, for more details.
19
20 You should have received a copy of the GNU General Public License
21 along with this program; if not, write to the Free Software
22 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
23
24#ifndef SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
25#define SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
26
27/**
28 @file
29
30 Tracks which tuple streams follow which orders, and in particular whether
31 they follow interesting orders.
32
33 An interesting order (and/or grouping) is one that we might need to sort by
34 at some point during query execution (e.g. to satisfy an ORDER BY predicate);
35 if the rows already are produced in that order, for instance because we
36 scanned along the right index, we can skip the sort and get a lower cost.
37
38 We generally follow these papers:
39
40 [Neu04] Neumann and Moerkotte: “An efficient framework for order
41 optimization”
42 [Neu04b] Neumann and Moerkotte: “A Combined Framework for
43 Grouping and Order Optimization”
44
45 [Neu04b] is an updated version of [Neu04] that also deals with interesting
46 groupings but omits some details to make more space, so both are needed.
47 A combined and updated version of the same material is available in
48 Moerkotte's “Query compilers” PDF.
49
50 Some further details, like order homogenization, come from
51
52 [Sim96] Simmen et al: “Fundamental Techniques for Order Optimization”
53
54 All three papers deal with the issue of _logical_ orderings, where any
55 tuple stream may follow more than one order simultaneously, as inferred
56 through functional dependencies (FDs). For instance, if we have an ordering
57 (ab) but also an active FD {a} → c (c is uniquely determined by a,
58 for instance because a is a primary key in the same table as c), this means
59 we also implicitly follow the orders (acb) and (abc). In addition,
60 we trivially follow the orders (a), (ac) and (ab). However, note that we
61 do _not_ necessarily follow the order (cab).
62
63 Similarly, equivalences, such as WHERE conditions and joins, give rise
64 to a stronger form of FDs. If we have an ordering (ab) and the FD b = c,
65 we can be said to follow (ac), (acb) or (abc). The former would not be
66 inferable from {b} → c and {c} → b alone. Equivalences with constants
67 are perhaps even stronger, e.g. WHERE x=3 would give rise to {} → x,
68 which could extend (a) to (xa), (ax) or (x).
69
70 Neumann et al solve this by modelling which ordering we're following as a
71 state in a non-deterministic finite state machine (NFSM). By repeatedly
72 applying FDs (which become edges in the NFSM), we can build up all possible
73 orderings from a base (which can be either the empty ordering, ordering from
74 scanning along an index, or one produced by an explicit sort) and then
75 checking whether we are in a state matching the ordering we are interested
76 in. (There can be quite a bit of states, so we need a fair amount of pruning
77 to keep the count manageable, or else performance will suffer.) Of course,
78 since NFSMs are nondeterministic, a base ordering and a set of FDs can
79 necessarily put us in a number of states, so we need to convert the NFSM
80 to a DFSM (using the standard powerset construction for NFAs; see
81 ConvertNFSMToDFSM()). This means that the ordering state for an access path
82 is only a single integer, the DFSM state number. When we activate more FDs,
83 for instance because we apply joins, we will move throughout the DFSM into
84 more attractive states. By checking simple precomputed lookup tables,
85 we can quickly find whether a given DFSM state follows a given ordering.
86
87 The other kind of edges we follow are from the artificial starting state;
88 they represent setting a specific ordering (e.g. because we sort by that
89 ordering). This is set up in the NFSM and preserved in the DFSM.
90
91 The actual collection of FDs and interesting orders happen outside this
92 class, in the caller.
93
94 A weakness in the approach is that transitive FDs are not always followed
95 correctly. E.g., if we have an ordering (a), and FDs {a} → b and {b} → c,
96 we will create (ab) and (abc), but _not_ (ac). This is not a problem for
97 equivalences, though, and most of the FDs we collect are equivalences.
98 We do have some heuristics to produce a new FD {a} → c where it is relevant,
99 but they are not always effective.
100
101 Neumann and Moerkotte distinguish between “tested-for” (O_T) and
102 “producing” (O_P) orderings, where all orders are interesting but only
103 some can be produced by explicit operators, such as sorts. Our implementation
104 is exactly opposite; we allow every ordering to be produced (by means of
105 sort-ahead), but there are orders that can be produced (e.g. when scanning
106 an index) that are not interesting in themselves. Such orders can be
107 pruned away early if we can show they do not produce anything interesting.
108
109
110 The operations related to interesting orders, in particular the concept
111 of functional dependencies, are related to the ones we are doing when
112 checking ONLY_FULL_GROUP_BY legality in sql/aggregate_check.h. However,
113 there are some key differences as well:
114
115 - Orderings are lexical, while groupings are just a bag of attributes.
116 This increases the state space for orderings significantly; groupings
117 can add elements at will and just grow the set freely, while orderings
118 need more care. In particular, this means that groupings only need
119 FDs on the form S → x (where S is a set), while orderings also benefit
120 from those of the type x = y, which replace an element instead of
121 adding a new one.
122
123 - ONLY_FULL_GROUP_BY is for correctness of rejecting or accepting the
124 query, while interesting orders is just an optimization, so not
125 recognizing rare cases is more acceptable.
126
127 - ONLY_FULL_GROUP_BY testing only cares about the set of FDs that hold
128 at one specific point (GROUP BY testing, naturally), while interesting
129 orders must be tracked throughout the entire operator tree. In particular,
130 if using interesting orders for merge join, the status at nearly every
131 join is relevant. Also, performance matters much more.
132
133 Together, these mean that the code ends up being fairly different,
134 and some cases are recognized by ONLY_FULL_GROUP_BY but not by interesting
135 orders. (The actual FD collection happens in BuildInterestingOrders in
136 join_optimizer.cc; see the comment there for FD differences.)
137
138 A note about nomenclature: Like Neumann et al, we use the term “ordering”
139 (and “grouping”) instead of “order”, with the special exception of the
140 already-established term “interesting order”.
141 */
142
143#include "my_table_map.h"
145#include "sql/key_spec.h"
146#include "sql/mem_root_array.h"
147#include "sql/sql_array.h"
148
149#include <bitset>
150#include <sstream>
151#include <string>
152
153class Field;
154class LogicalOrderings;
155class Window;
156
157/**
158 Represents a (potentially interesting) ordering, rollup or (non-rollup)
159 grouping.
160*/
161class Ordering final {
162 friend bool operator==(const Ordering &a, const Ordering &b);
163
164 public:
165 /// This type hold the individual elements of the ordering.
167
168 /// The kind of ordering that an Ordering instance may represent.
169 enum class Kind : char {
170 /// An ordering with no elements. Such an ordering is not useful in itself,
171 /// but may appear as an intermediate result.
172 kEmpty,
173
174 /// Specific sequence of m_elements, and specific direction of each element.
175 /// Needed for e.g. ORDER BY.
176 kOrder,
177
178 /// Specific sequence of m_elements, but each element may be ordered in any
179 /// direction. Needed for ROLLUP:
180 kRollup,
181
182 /// Elements may appear in any sequence and may be ordered in any direction.
183 /// Needed for GROUP BY (with out ROLLUP), DISCTINCT, semi-join etc.
184 kGroup
185 };
186
187 Ordering() : m_kind{Kind::kEmpty} {}
188
189 Ordering(Elements elements, Kind kind) : m_elements{elements}, m_kind{kind} {
190 assert(Valid());
191 }
192
193 /// Copy constructor. Only defined explicitly to check Valid().
194 Ordering(const Ordering &other)
195 : m_elements{other.m_elements}, m_kind{other.m_kind} {
196 assert(Valid());
197 }
198
199 /// Assignment operator. Only defined explicitly to check Valid().
200 Ordering &operator=(const Ordering &other) {
201 assert(Valid());
202 m_kind = other.m_kind;
203 m_elements = other.m_elements;
204 return *this;
205 }
206
207 /// Make a copy of *this. Allocate new memory for m_elements from mem_root.
209 assert(Valid());
211 }
212
213 Kind GetKind() const {
214 assert(Valid());
215 return m_kind;
216 }
217
218 const Elements &GetElements() const {
219 assert(Valid());
220 return m_elements;
221 }
222
224 assert(Valid());
225 return m_elements;
226 }
227
228 size_t size() const { return m_elements.size(); }
229
230 /**
231 Remove duplicate entries, in-place.
232 */
233 void Deduplicate();
234
235 private:
236 /// The ordering terms.
238
239 /// The kind of this ordering.
241
242 /// @returns true iff *this passes a consistency check.
243 bool Valid() const;
244};
245
246/// Check if 'a' and 'b' has the same kind and contains the same elements.
247inline bool operator==(const Ordering &a, const Ordering &b) {
248 assert(a.Valid());
249 assert(b.Valid());
250 return a.m_kind == b.m_kind &&
253}
254
255inline bool operator!=(const Ordering &a, const Ordering &b) {
256 return !(a == b);
257}
258
260 enum {
261 // A special “empty” kind of edge in the FSM that signifies
262 // adding no functional dependency, ie., a state we can reach
263 // with no further effort. This can happen in two ways:
264 //
265 // 1. An ordering can drop its last element, ie.,
266 // if a tuple stream is ordered on (a,b,c), it is also
267 // ordered on (a,b).
268 // 2. An ordering can be converted to a grouping, i.e,
269 // if a tuple stream is ordered on (a,b,c), it is also
270 // grouped on {a,b,c}.
271 //
272 // head must be empty, tail must be 0. Often called ϵ.
273 // Must be the first in the edge list.
275
276 // A standard functional dependency {a} → b; if a tuple stream
277 // is ordered on all elements of a and this FD is applied,
278 // it is also ordered on (a,b). A typical example is if {a}
279 // is an unique key in a table, and b is a column of the
280 // same table. head can be empty.
282
283 // An equivalence a = b; implies a → b and b → a, but is
284 // stronger (e.g. if ordered on (a,c), there is also an
285 // ordering on (b,c), which wouldn't be true just from
286 // applying FDs individually). head must be a single element.
289
292
293 // Whether this functional dependency can always be applied, ie.,
294 // there is never a point during query processing where it does not hold.
295 //
296 // Examples of not-always-active FDs include join conditions;
297 // e.g. for t1.x = t2.x, it is not true before the join has actually
298 // happened (and t1.x won't be the same order as t2.x before that,
299 // and thus cannot be used in e.g. a merge join).
300 //
301 // However, FDs that stem from unique indexes are always true; e.g. if
302 // t1.x is a primary key, {t1.x} → t1.y will always be true, and we can
303 // always reduce t1.y from an order if t1.x is present earlier.
304 // Similarly, WHERE conditions that are applied on the base table
305 // (ie., it is not delayed due to outer joins) will always be true,
306 // if t1.x = 3, we can safely assume {} → t1.x holds even before
307 // joining in t1, so a sort on (t1.x, t2.y) can be satisfied just by
308 // sorting t2 on y.
309 //
310 // Always-active FDs are baked into the DFSM, so that we need to follow
311 // fewer arcs during query processing. They can also be used for reducing
312 // the final order (to get more efficient sorting), but we don't do it yet.
313 bool always_active = false;
314};
315
318
319 public:
320 explicit LogicalOrderings(THD *thd);
321
322 // Maps the Item to an opaque integer handle. Deduplicates items as we go,
323 // inserting new ones if needed.
325
326 // Maps the Field to an opaque integer handle. Equivalent to calling
327 // GetHandle(new Item_field(field)), except the Item_field is only allocated
328 // if no existing handle can be reused.
329 ItemHandle GetHandle(Field *field);
330
331 Item *item(ItemHandle item) const { return m_items[item].item; }
332 int num_items() const { return m_items.size(); }
333
334 // These are only available before Build() has been called.
335
336 // Mark an interesting ordering (or grouping) as interesting,
337 // returning an index that can be given to SetOrder() later.
338 // Will deduplicate against previous entries; if not deduplicated
339 // away, a copy will be taken.
340 //
341 // Uninteresting orderings are those that can be produced by some
342 // operator (for instance, index scan) but are not interesting to
343 // test for. Orderings may be merged, pruned (if uninteresting)
344 // and moved around after Build(); see RemapOrderingIndex().
345 //
346 // If used_at_end is true, the ordering is assumed to be used only
347 // after all joins have happened, so all FDs are assumed to be
348 // active. This enables reducing the ordering more (which can in
349 // some cases help with better sortahead or the likes), but is not
350 // correct if the ordering wants to be used earlier on, e.g.
351 // in merge join or for semijoin duplicate removal. If it is false,
352 // then it is also only attempted homogenized onto the given set
353 // of tables (otherwise, it is ignored, and homogenization is over
354 // all tables).
355 //
356 // The empty ordering/grouping is always index 0.
357 int AddOrdering(THD *thd, Ordering order, bool interesting, bool used_at_end,
358 table_map homogenize_tables) {
359 return AddOrderingInternal(thd, order,
362 used_at_end, homogenize_tables);
363 }
364
365 // NOTE: Will include the empty ordering.
366 int num_orderings() const { return m_orderings.size(); }
367
368 const Ordering &ordering(int ordering_idx) const {
369 return m_orderings[ordering_idx].ordering;
370 }
371
372 bool ordering_is_relevant_for_sortahead(int ordering_idx) const {
373 return !m_orderings[ordering_idx].ordering.GetElements().empty() &&
374 m_orderings[ordering_idx].type != OrderingWithInfo::UNINTERESTING;
375 }
376
377 // Add a functional dependency that may be applied at some point
378 // during the query planning. Same guarantees as AddOrdering().
379 // The special “decay” FD is always index 0.
381
382 // NOTE: Will include the decay (epsilon) FD.
383 int num_fds() const { return m_fds.size(); }
384
385 // Set the list of GROUP BY expressions, if any. This is used as the
386 // head of the functional dependencies for all aggregate functions
387 // (which by definition are functionally dependent on the GROUP BY
388 // expressions, unless ROLLUP is active -- see below), and must be
389 // valid (ie., not freed or modified) until Build() has run.
390 //
391 // If none is set, and there are aggregates present in orderings,
392 // implicit aggregation is assumed (ie., all aggregate functions
393 // are constant).
395 m_aggregate_head = head;
396 }
397
398 // Set whether ROLLUP is active; if so, we can no longer assume that
399 // aggregate functions are functionally dependent on (nullable)
400 // GROUP BY expressions, as two NULLs may be for different reasons.
401 void SetRollup(bool rollup) { m_rollup = rollup; }
402
403 // Builds the actual FSMs; all information about orderings and FDs is locked,
404 // optimized and then the state machine is built. After this, you can no
405 // longer add new orderings or FDs, ie., you are moving into the actual
406 // planning phase.
407 //
408 // Build() may prune away orderings and FDs, and it may also add homogenized
409 // orderings, ie., orderings derived from given interesting orders but
410 // modified so that they only require a single table (but will become an
411 // actual interesting order later, after the FDs have been applied). These are
412 // usually at the end, but may also be deduplicated against uninteresting
413 // orders, which will then be marked as interesting.
414 void Build(THD *thd);
415
416 // These are only available after Build() has been called.
417 // They are stateless and used in the actual planning phase.
418
419 // Converts an index returned by AddOrdering() to one that can be given
420 // to SetOrder() or DoesFollowOrder(). They don't convert themselves
421 // since it's completely legitimate to iterate over all orderings using
422 // num_orderings() and orderings(), and those indexes should _not_ be
423 // remapped.
424 //
425 // If an ordering has been pruned away, will return zero (the empty ordering),
426 // which is a valid input to SetOrder().
427 int RemapOrderingIndex(int ordering_idx) const {
428 assert(m_built);
429 return m_optimized_ordering_mapping[ordering_idx];
430 }
431
432 using StateIndex = int;
433
434 StateIndex SetOrder(int ordering_idx) const {
435 assert(m_built);
436 return m_orderings[ordering_idx].state_idx;
437 }
438
439 // Get a bitmap representing the given functional dependency. The bitmap
440 // can be all-zero if the given FD is optimized away, or outside the range
441 // of the representable bits. The bitmaps may be ORed together, but are
442 // otherwise to be treated as opaque to the client.
445 int new_fd_idx = m_optimized_fd_mapping[fd_idx];
446 if (new_fd_idx >= 1 && new_fd_idx <= kMaxSupportedFDs) {
447 fd_set.set(new_fd_idx - 1);
448 }
449 return fd_set;
450 }
451
452 // For a given state, see what other (better) state we can move to given a
453 // set of active functional dependencies, e.g. if we are in state ((),a) and
454 // the FD a=b becomes active, we can set its bit (see GetFDSet()) in the FD
455 // mask and use that to move to the state ((),a,b,ab,ba). Note that “fds”
456 // should contain the entire set of active FDs, not just newly-applied ones.
457 // This is because “old” FDs can suddenly become relevant when new logical
458 // orderings are possible, and the DFSM is not always able to bake this in.
460
461 bool DoesFollowOrder(StateIndex state_idx, int ordering_idx) const {
462 assert(m_built);
463 if (ordering_idx == 0) {
464 return true;
465 }
466 if (ordering_idx >= kMaxSupportedOrderings) {
467 return false;
468 }
469 return m_dfsm_states[state_idx].follows_interesting_order.test(
470 ordering_idx);
471 }
472
473 // Whether "a" follows any interesting orders than "b" does not, or could
474 // do so in the future. If !MoreOrderedThan(a, b) && !MoreOrderedThan(b, a)
475 // the states are equal (they follow the same interesting orders, and could
476 // lead to the same interesting orders given the same FDs -- see below).
477 // It is possible to have MoreOrderedThan(a, b) && MoreOrderedThan(b, a), e.g.
478 // if they simply follow disjunct orders.
479 //
480 // This is used in the planner, when pruning access paths -- an AP A can be
481 // kept even if it has higher cost than AP B, if it follows orders that B does
482 // not. Why is it enough to check interesting orders -- must we also not check
483 // uninteresting orders, since they could lead to new interesting orders
484 // later? This is because in the planner, two states will only ever be
485 // compared if the same functional dependencies have been applied to both
486 // sides:
487 //
488 // The set of logical orders, and thus the state, is uniquely determined
489 // by the initial ordering and applied FDs. Thus, if A has _uninteresting_
490 // orders that B does not, the initial ordering must have differed -- but the
491 // initial states only contain (and thus differ in) interesting orders.
492 // Thus, the additional uninteresting orders must have been caused by
493 // additional interesting orders (that do not go away), so testing the
494 // interesting ones really suffices in planner context.
495 //
496 // Note that this also means that in planner context, !MoreOrderedThan(a, b)
497 // && !MoreOrderedThan(b, a) implies that a == b.
499 StateIndex a_idx, StateIndex b_idx,
500 std::bitset<kMaxSupportedOrderings> ignored_orderings) const {
501 assert(m_built);
502 std::bitset<kMaxSupportedOrderings> a =
503 m_dfsm_states[a_idx].follows_interesting_order & ~ignored_orderings;
504 std::bitset<kMaxSupportedOrderings> b =
505 m_dfsm_states[b_idx].follows_interesting_order & ~ignored_orderings;
506 std::bitset<kMaxSupportedOrderings> future_a =
507 m_dfsm_states[a_idx].can_reach_interesting_order & ~ignored_orderings;
508 std::bitset<kMaxSupportedOrderings> future_b =
509 m_dfsm_states[b_idx].can_reach_interesting_order & ~ignored_orderings;
510 return (a & b) != a || (future_a & future_b) != future_a;
511 }
512
513 // See comment in .cc file.
515 Ordering::Elements tmp) const;
516
517 private:
518 struct NFSMState;
519 class OrderWithElementInserted;
520
521 bool m_built = false;
522
523 struct ItemInfo {
524 // Used to translate Item * to ItemHandle and back.
526
527 // Points to the head of this item's equivalence class. (If the item
528 // is not equivalent to anything, points to itself.) The equivalence class
529 // is defined by EQUIVALENCE FDs, transitively, and the head is the one with
530 // the lowest index. So if we have FDs a = b and b = c, all three {a,b,c}
531 // will point to a here. This is useful for pruning and homogenization;
532 // if two elements have the same equivalence class (ie., the same canonical
533 // item), they could become equivalent after applying FDs. See also
534 // m_can_be_added_by_fd, which deals with non-EQUIVALENCE FDs.
535 //
536 // Set by BuildEquivalenceClasses().
538
539 // Whether the given item (after canonicalization by means of
540 // m_canonical_item[]) shows up as the tail of any functional dependency.
541 //
542 // Set by FindElementsThatCanBeAddedByFDs();
543 bool can_be_added_by_fd = false;
544
545 // Whether the given item ever shows up in orderings as ASC or DESC,
546 // respectively. Used to see whether adding the item in that direction
547 // is worthwhile or not. Note that this is propagated through equivalences,
548 // so if a = b and any ordering contains b DESC and a is the head of that
549 // equivalence class, then a is also marked as used_desc = true.
550 bool used_asc = false;
551 bool used_desc = false;
552 bool used_in_grouping = false;
553 };
554 // All items we have seen in use (in orderings or FDs), deduplicated
555 // and indexed by ItemHandle.
557
558 // Head for all FDs generated for aggregate functions.
559 // See SetHeadForAggregates().
561
562 // Whether rollup is active; if so, we need to take care not to create
563 // FDs for aggregates in some cases. See SetHeadForAggregates() and
564 // SetRollup().
565 bool m_rollup = false;
566
567 struct NFSMEdge {
568 // Which FD is required to follow this edge. Index into m_fd, with one
569 // exception; from the initial state (0), we have constructor edges for
570 // setting a specific order without following an FD. Such edges have
571 // required_fd_idx = INT_MIN + order_idx, ie., they are negative.
573
574 // Destination state (index into m_states).
576
578 const LogicalOrderings *orderings) const {
579 return &orderings->m_fds[required_fd_idx];
580 }
581 const NFSMState *state(const LogicalOrderings *orderings) const {
582 return &orderings->m_states[state_idx];
583 }
584 };
585
586 friend bool operator==(const NFSMEdge &a, const NFSMEdge &b);
587 friend bool operator!=(const NFSMEdge &a, const NFSMEdge &b);
588
589 struct NFSMState {
593 int satisfied_ordering_idx; // Only for type == INTERESTING.
594
595 // Indexed by ordering.
596 std::bitset<kMaxSupportedOrderings> can_reach_interesting_order{0};
597
598 // Used during traversal, to keep track of which states we have
599 // already seen (for fast deduplication). We use the standard trick
600 // of using a generational counter instead of a bool, so that we don't
601 // have to clear it every time; we can just increase the generation
602 // and treat everything with lower/different “seen” as unseen.
603 int seen = 0;
604 };
605 struct DFSMState {
606 Mem_root_array<int> outgoing_edges; // Index into dfsm_edges.
607 Mem_root_array<int> nfsm_states; // Index into states.
608
609 // Structures derived from the above, but in forms for faster access.
610
611 // Indexed by FD.
613
614 // Indexed by ordering.
615 std::bitset<kMaxSupportedOrderings> follows_interesting_order{0};
616
617 // Interesting orders that this state can eventually reach,
618 // given that all FDs are applied (a superset of follows_interesting_order).
619 // We track this instead of the producing orders (e.g. which homogenized
620 // order are we following), because it allows for more access paths to
621 // compare equal. See also OrderingWithInfo::Type::HOMOGENIZED.
622 std::bitset<kMaxSupportedOrderings> can_reach_interesting_order{0};
623
624 // Whether applying the given functional dependency will take us to a
625 // different state from this one. Used to quickly intersect with the
626 // available FDs to find out what we can apply.
628 };
629
630 struct DFSMEdge {
633
635 const LogicalOrderings *orderings) const {
636 return &orderings->m_fds[required_fd_idx];
637 }
638 const DFSMState *state(const LogicalOrderings *orderings) const {
639 return &orderings->m_dfsm_states[state_idx];
640 }
641 };
642
645
646 // Status of the ordering. Note that types with higher indexes dominate
647 // lower, ie., two orderings can be collapsed into the one with the higher
648 // type index if they are otherwise equal.
649 enum Type {
650 // An ordering that is interesting in its own right,
651 // e.g. because it is given to ORDER BY.
653
654 // An ordering that is derived from an interesting order, but refers to
655 // one table only (or conceptually, a subset of tables -- but we don't
656 // support that in this implementation). Guaranteed to reach some
657 // interesting order at some point, but we don't track it as interesting
658 // in the FSM states. This means that these orderings don't get state bits
659 // in follows_interesting_order for themselves, but they will always have
660 // one or more interesting orders in can_reach_interesting_order.
661 // This helps us collapse access paths more efficiently; if we have
662 // an interesting order t3.x and create homogenized orderings t1.x
663 // and t2.x (due to some equality with t3.x), an access path following one
664 // isn't better than an access path following the other. They will lead
665 // to the same given the same FDs anyway (see MoreOrderedThan()), and
666 // thus are equally good.
668
669 // An ordering that is just added because it is easy to produce;
670 // e.g. because it is produced by scanning along an index. Such orderings
671 // can be shortened or pruned away entirely (in
672 // PruneUninterestingOrders())
673 // unless we find that they may lead to an interesting order.
674 UNINTERESTING = 0
676
678
679 // Only used if used_at_end = false (see AddOrdering()).
681
682 // Which initial state to use for this ordering (in SetOrder()).
684 };
685
687
688 // The longest ordering in m_orderings.
690
692
693 // NFSM. 0 is the initial state, all others are found by following edges.
695
696 // DFSM. 0 is the initial state, all others are found by following edges.
699
700 // After PruneUninterestingOrders has run, maps from the old indexes to the
701 // new indexes.
703
704 // After PruneFDs() has run, maps from the old indexes to the new indexes.
706
707 /// We may potentially use a lot of Ordering::Elements objects, with short and
708 /// non-overlapping life times. Therefore we have a pool
709 /// to allow reuse and avoid allocating from MEM_ROOT each time.
711
712 // The canonical order for two items in a grouping
713 // (after BuildEquivalenceClasses() has run; enforced by
714 // RecanonicalizeGroupings()). The reason why we sort by
715 // canonical_item first is so that switching out one element
716 // with an equivalent one (ie., applying an EQUIVALENCE
717 // functional dependency) does not change the order of the
718 // elements in the grouing, which could give false negatives
719 // in CouldBecomeInterestingOrdering().
721 if (m_items[a].canonical_item != m_items[b].canonical_item)
722 return m_items[a].canonical_item < m_items[b].canonical_item;
723 return a < b;
724 }
725
726 inline bool ItemBeforeInGroup(const OrderElement &a,
727 const OrderElement &b) const {
728 return ItemHandleBeforeInGroup(a.item, b.item);
729 }
730
731 // Helper for AddOrdering().
733 bool used_at_end, table_map homogenize_tables);
734
735 // See comment in .cc file.
736 void PruneUninterestingOrders(THD *thd);
737
738 // See comment in .cc file.
739 void PruneFDs(THD *thd);
740
741 // See comment in .cc file.
743 bool all_fds) const;
744
745 // Populates ItemInfo::canonical_item.
747
748 // See comment in .cc file.
750
751 // See comment in .cc file.
752 void AddFDsFromComputedItems(THD *thd);
753
754 // See comment in .cc file.
755 void AddFDsFromAggregateItems(THD *thd);
756
757 // See comment in .cc file.
759 THD *thd, ItemHandle argument_item, Window *window);
760
761 // See comment in .cc file.
762 void AddFDsFromConstItems(THD *thd);
763
764 // Populates ItemInfo::can_be_added_by_fd.
766
767 void PreReduceOrderings(THD *thd);
772 THD *thd, const Ordering &reduced_ordering, bool used_at_end,
773 int table_idx,
774 Bounds_checked_array<std::pair<ItemHandle, ItemHandle>>
775 reverse_canonical);
776
777 /// Sort the elements so that a will appear before b if
778 /// ItemBeforeInGroup(a,b)==true.
779 void SortElements(Ordering::Elements elements) const;
780
781 // See comment in .cc file.
783
784 void BuildNFSM(THD *thd);
785 void AddRollupFromOrder(THD *thd, int state_idx, const Ordering &ordering);
786 void AddGroupingFromOrder(THD *thd, int state_idx, const Ordering &ordering);
787 void AddGroupingFromRollup(THD *thd, int state_idx, const Ordering &ordering);
788 void TryAddingOrderWithElementInserted(THD *thd, int state_idx, int fd_idx,
789 Ordering old_ordering,
790 size_t start_point,
791 ItemHandle item_to_add,
792 enum_order direction);
793 void PruneNFSM(THD *thd);
794 bool AlwaysActiveFD(int fd_idx);
795 void FinalizeDFSMState(THD *thd, int state_idx);
797 int *generation, int extra_allowed_fd_idx);
798 void ConvertNFSMToDFSM(THD *thd);
799
800 // Populates state_idx for every ordering in m_ordering.
802
803 // If a state with the given ordering already exists (artificial or not),
804 // returns its index. Otherwise, adds an artificial state with the given
805 // order and returns its index.
806 int AddArtificialState(THD *thd, const Ordering &ordering);
807
808 // Add an edge from state_idx to an state with the given ordering; if there is
809 // no such state, adds an artificial state with it (taking a copy, so does not
810 // need to take ownership).
811 void AddEdge(THD *thd, int state_idx, int required_fd_idx,
812 const Ordering &ordering);
813
814 // Returns true if the given (non-DECAY) functional dependency applies to the
815 // given ordering, and the index of the element from which the FD is active
816 // (ie., the last element that was part of the head). One can start inserting
817 // the tail element at any point _after_ this index; if it is an EQUIVALENCE
818 // FD, one can instead choose to replace the element at start_point entirely.
820 const Ordering &ordering,
821 int *start_point) const;
822
823 /**
824 Fetch an Ordering::Elements object with size()==m_longest_ordering.
825 Get it from m_elements_pool if that is non-empty, otherwise allocate
826 from mem_root.
827 */
829 if (m_elements_pool.empty()) {
831 } else {
833 m_elements_pool.pop_back();
835 }
836 }
837
838 /**
839 Return an Ordering::Elements object with size()==m_longest_ordering
840 to m_elements_pool.
841 */
843 // Overwrite the array with garbage, so that we have a better chance
844 // of detecting it if we by mistake access it afterwards.
845 TRASH(elements.data(), m_longest_ordering * sizeof(elements.data()[0]));
846 m_elements_pool.push_back(elements.data());
847 }
848 // Used for optimizer trace.
849
850 std::string PrintOrdering(const Ordering &ordering) const;
852 bool html) const;
853 void PrintFunctionalDependencies(std::ostream *trace);
854 void PrintInterestingOrders(std::ostream *trace);
855 void PrintNFSMDottyGraph(std::ostream *trace) const;
856 void PrintDFSMDottyGraph(std::ostream *trace) const;
857};
858
861 return a.required_fd_idx == b.required_fd_idx && a.state_idx == b.state_idx;
862}
863
866 return !(a == b);
867}
868
869#endif // SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
Element_type * data()
Definition: sql_array.h:117
static Bounds_checked_array Alloc(MEM_ROOT *mem_root, size_t size)
Definition: sql_array.h:71
const_iterator cbegin() const
Returns a pointer to the first element in the array.
Definition: sql_array.h:145
size_t size() const
Definition: sql_array.h:155
Bounds_checked_array Clone(MEM_ROOT *mem_root) const
Make a copy of '*this'. Allocate memory for m_array on 'mem_root'.
Definition: sql_array.h:76
const_iterator cend() const
Returns a pointer to the past-the-end element in the array.
Definition: sql_array.h:147
Definition: field.h:569
Base class that is used to represent any kind of expression in a relational query.
Definition: item.h:927
Definition: interesting_orders.h:316
void SetHeadForAggregates(Bounds_checked_array< ItemHandle > head)
Definition: interesting_orders.h:394
void AddRollupFromOrder(THD *thd, int state_idx, const Ordering &ordering)
Definition: interesting_orders.cc:1724
void AddEdge(THD *thd, int state_idx, int required_fd_idx, const Ordering &ordering)
Definition: interesting_orders.cc:1320
bool ItemHandleBeforeInGroup(ItemHandle a, ItemHandle b) const
Definition: interesting_orders.h:720
void CreateOrderingsFromGroupings(THD *thd)
We don't currently have any operators that only group and do not sort (e.g.
Definition: interesting_orders.cc:935
void AddFDsFromComputedItems(THD *thd)
Try to add new FDs from items that are not base items; e.g., if we have an item (a + 1),...
Definition: interesting_orders.cc:626
Mem_root_array< OrderingWithInfo > m_orderings
Definition: interesting_orders.h:686
bool ImpliedByEarlierElements(ItemHandle item, Ordering::Elements prefix, bool all_fds) const
Checks whether the given item is redundant given previous elements in the ordering; ie....
Definition: interesting_orders.cc:828
Bounds_checked_array< int > m_optimized_fd_mapping
Definition: interesting_orders.h:705
StateIndex SetOrder(int ordering_idx) const
Definition: interesting_orders.h:434
int num_orderings() const
Definition: interesting_orders.h:366
bool DoesFollowOrder(StateIndex state_idx, int ordering_idx) const
Definition: interesting_orders.h:461
bool m_rollup
Definition: interesting_orders.h:565
int m_longest_ordering
Definition: interesting_orders.h:689
Mem_root_array< DFSMState > m_dfsm_states
Definition: interesting_orders.h:697
bool m_built
Definition: interesting_orders.h:521
void AddHomogenizedOrderingIfPossible(THD *thd, const Ordering &reduced_ordering, bool used_at_end, int table_idx, Bounds_checked_array< std::pair< ItemHandle, ItemHandle > > reverse_canonical)
Helper function for CreateHomogenizedOrderings().
Definition: interesting_orders.cc:1119
friend bool operator==(const NFSMEdge &a, const NFSMEdge &b)
Definition: interesting_orders.h:859
std::string PrintOrdering(const Ordering &ordering) const
Definition: interesting_orders.cc:2134
Ordering ReduceOrdering(Ordering ordering, bool all_fds, Ordering::Elements tmp) const
Remove redundant elements using the functional dependencies that we have, to give a more canonical fo...
Definition: interesting_orders.cc:1102
void PrintFunctionalDependencies(std::ostream *trace)
Definition: interesting_orders.cc:2187
int RemapOrderingIndex(int ordering_idx) const
Definition: interesting_orders.h:427
void FinalizeDFSMState(THD *thd, int state_idx)
Definition: interesting_orders.cc:1915
void AddGroupingFromRollup(THD *thd, int state_idx, const Ordering &ordering)
Definition: interesting_orders.cc:1704
void PrintDFSMDottyGraph(std::ostream *trace) const
Definition: interesting_orders.cc:2285
friend bool operator!=(const NFSMEdge &a, const NFSMEdge &b)
Definition: interesting_orders.h:864
Mem_root_array< OrderElement * > m_elements_pool
We may potentially use a lot of Ordering::Elements objects, with short and non-overlapping life times...
Definition: interesting_orders.h:710
int num_items() const
Definition: interesting_orders.h:332
bool ordering_is_relevant_for_sortahead(int ordering_idx) const
Definition: interesting_orders.h:372
void PreReduceOrderings(THD *thd)
Do safe reduction on all orderings (some of them may get merged by PruneUninterestingOrders() later),...
Definition: interesting_orders.cc:895
void SetRollup(bool rollup)
Definition: interesting_orders.h:401
void CreateHomogenizedOrderings(THD *thd)
For each interesting ordering, see if we can homogenize it onto each table.
Definition: interesting_orders.cc:1023
Mem_root_array< NFSMState > m_states
Definition: interesting_orders.h:694
void SortElements(Ordering::Elements elements) const
Sort the elements so that a will appear before b if ItemBeforeInGroup(a,b)==true.
Definition: interesting_orders.cc:1185
int AddOrderingInternal(THD *thd, Ordering order, OrderingWithInfo::Type type, bool used_at_end, table_map homogenize_tables)
Definition: interesting_orders.cc:215
void AddFDsFromConstItems(THD *thd)
Try to add FDs from items that are constant by themselves, e.g.
Definition: interesting_orders.cc:739
int AddFunctionalDependency(THD *thd, FunctionalDependency fd)
Definition: interesting_orders.cc:270
bool FunctionalDependencyApplies(const FunctionalDependency &fd, const Ordering &ordering, int *start_point) const
Definition: interesting_orders.cc:1338
Mem_root_array< DFSMEdge > m_dfsm_edges
Definition: interesting_orders.h:698
void ReturnElements(Ordering::Elements elements)
Return an Ordering::Elements object with size()==m_longest_ordering to m_elements_pool.
Definition: interesting_orders.h:842
bool MoreOrderedThan(StateIndex a_idx, StateIndex b_idx, std::bitset< kMaxSupportedOrderings > ignored_orderings) const
Definition: interesting_orders.h:498
bool AlwaysActiveFD(int fd_idx)
Definition: interesting_orders.cc:1910
void BuildNFSM(THD *thd)
Definition: interesting_orders.cc:1497
void PrintInterestingOrders(std::ostream *trace)
Definition: interesting_orders.cc:2204
void BuildEquivalenceClasses()
Definition: interesting_orders.cc:523
int AddArtificialState(THD *thd, const Ordering &ordering)
Definition: interesting_orders.cc:1304
int StateIndex
Definition: interesting_orders.h:432
void PruneNFSM(THD *thd)
Try to prune away irrelevant nodes from the NFSM; it is worth spending some time on this,...
Definition: interesting_orders.cc:1784
Item * item(ItemHandle item) const
Definition: interesting_orders.h:331
void PrintNFSMDottyGraph(std::ostream *trace) const
Definition: interesting_orders.cc:2251
const Ordering & ordering(int ordering_idx) const
Definition: interesting_orders.h:368
void FindElementsThatCanBeAddedByFDs()
Definition: interesting_orders.cc:800
StateIndex ApplyFDs(StateIndex state_idx, FunctionalDependencySet fds) const
Definition: interesting_orders.cc:363
Bounds_checked_array< int > m_optimized_ordering_mapping
Definition: interesting_orders.h:702
int AddOrdering(THD *thd, Ordering order, bool interesting, bool used_at_end, table_map homogenize_tables)
Definition: interesting_orders.h:357
Ordering::Elements RetrieveElements(MEM_ROOT *mem_root)
Fetch an Ordering::Elements object with size()==m_longest_ordering.
Definition: interesting_orders.h:828
void ExpandThroughAlwaysActiveFDs(Mem_root_array< int > *nfsm_states, int *generation, int extra_allowed_fd_idx)
Definition: interesting_orders.cc:1932
void PruneFDs(THD *thd)
Definition: interesting_orders.cc:454
void PruneUninterestingOrders(THD *thd)
Try to get rid of uninteresting orders, possibly by discarding irrelevant suffixes and merging them w...
Definition: interesting_orders.cc:395
void AddFDsFromAggregateItems(THD *thd)
Definition: interesting_orders.cc:762
Bounds_checked_array< ItemHandle > CollectHeadForStaticWindowFunction(THD *thd, ItemHandle argument_item, Window *window)
Definition: interesting_orders.cc:585
Mem_root_array< ItemInfo > m_items
Definition: interesting_orders.h:556
ItemHandle GetHandle(Item *item)
Definition: interesting_orders.cc:1196
void FindInitialStatesForOrdering()
Definition: interesting_orders.cc:2122
void AddGroupingFromOrder(THD *thd, int state_idx, const Ordering &ordering)
Definition: interesting_orders.cc:1681
LogicalOrderings(THD *thd)
Definition: interesting_orders.cc:193
Mem_root_array< FunctionalDependency > m_fds
Definition: interesting_orders.h:691
void ConvertNFSMToDFSM(THD *thd)
From the NFSM, convert an equivalent DFSM.
Definition: interesting_orders.cc:1976
Bounds_checked_array< ItemHandle > m_aggregate_head
Definition: interesting_orders.h:560
void TryAddingOrderWithElementInserted(THD *thd, int state_idx, int fd_idx, Ordering old_ordering, size_t start_point, ItemHandle item_to_add, enum_order direction)
FunctionalDependencySet GetFDSet(int fd_idx) const
Definition: interesting_orders.h:443
void RecanonicalizeGroupings()
Definition: interesting_orders.cc:573
void Build(THD *thd)
Definition: interesting_orders.cc:301
std::string PrintFunctionalDependency(const FunctionalDependency &fd, bool html) const
Definition: interesting_orders.cc:2154
void CreateOrderingsFromRollups(THD *thd)
bool ItemBeforeInGroup(const OrderElement &a, const OrderElement &b) const
Definition: interesting_orders.h:726
bool CouldBecomeInterestingOrdering(const Ordering &ordering) const
For a given ordering, check whether it ever has the hope of becoming an interesting ordering.
Definition: interesting_orders.cc:1246
int num_fds() const
Definition: interesting_orders.h:383
A typesafe replacement for DYNAMIC_ARRAY.
Definition: mem_root_array.h:432
A scope-guard class for allocating an Ordering::Elements instance which is automatically returned to ...
Definition: interesting_orders.cc:69
Represents a (potentially interesting) ordering, rollup or (non-rollup) grouping.
Definition: interesting_orders.h:161
Elements & GetElements()
Definition: interesting_orders.h:223
size_t size() const
Definition: interesting_orders.h:228
Kind
The kind of ordering that an Ordering instance may represent.
Definition: interesting_orders.h:169
@ kOrder
Specific sequence of m_elements, and specific direction of each element.
@ kGroup
Elements may appear in any sequence and may be ordered in any direction.
@ kRollup
Specific sequence of m_elements, but each element may be ordered in any direction.
@ kEmpty
An ordering with no elements.
Ordering & operator=(const Ordering &other)
Assignment operator. Only defined explicitly to check Valid().
Definition: interesting_orders.h:200
Kind GetKind() const
Definition: interesting_orders.h:213
Ordering()
Definition: interesting_orders.h:187
Ordering(Elements elements, Kind kind)
Definition: interesting_orders.h:189
void Deduplicate()
Remove duplicate entries, in-place.
Definition: interesting_orders.cc:157
Bounds_checked_array< OrderElement > Elements
This type hold the individual elements of the ordering.
Definition: interesting_orders.h:166
Kind m_kind
The kind of this ordering.
Definition: interesting_orders.h:240
Ordering Clone(MEM_ROOT *mem_root) const
Make a copy of *this. Allocate new memory for m_elements from mem_root.
Definition: interesting_orders.h:208
Elements m_elements
The ordering terms.
Definition: interesting_orders.h:237
friend bool operator==(const Ordering &a, const Ordering &b)
Check if 'a' and 'b' has the same kind and contains the same elements.
Definition: interesting_orders.h:247
Ordering(const Ordering &other)
Copy constructor. Only defined explicitly to check Valid().
Definition: interesting_orders.h:194
bool Valid() const
Definition: interesting_orders.cc:168
const Elements & GetElements() const
Definition: interesting_orders.h:218
For each client connection we create a separate thread with THD serving as a thread/connection descri...
Definition: sql_lexer_thd.h:36
Represents the (explicit) window of a SQL 2003 section 7.11 <window clause>, or the implicit (inlined...
Definition: window.h:110
static MEM_ROOT mem_root
Definition: client_plugin.cc:114
static bool equal(const Item *i1, const Item *i2, const Field *f2)
Definition: sql_select.cc:3937
bool operator!=(const Ordering &a, const Ordering &b)
Definition: interesting_orders.h:255
bool operator==(const Ordering &a, const Ordering &b)
Check if 'a' and 'b' has the same kind and contains the same elements.
Definition: interesting_orders.h:247
std::bitset< kMaxSupportedFDs > FunctionalDependencySet
Definition: interesting_orders_defs.h:63
static constexpr int kMaxSupportedFDs
Definition: interesting_orders_defs.h:62
static constexpr int kMaxSupportedOrderings
Definition: interesting_orders_defs.h:65
int ItemHandle
Definition: interesting_orders_defs.h:39
enum_order
Definition: key_spec.h:65
void TRASH(void *ptr, size_t length)
Put bad content in memory to be sure it will segfault if dereferenced.
Definition: memory_debugging.h:71
uint64_t table_map
Definition: my_table_map.h:30
mutable_buffer buffer(void *p, size_t n) noexcept
Definition: buffer.h:418
required string type
Definition: replication_group_member_actions.proto:34
Definition: interesting_orders.h:259
ItemHandle tail
Definition: interesting_orders.h:291
bool always_active
Definition: interesting_orders.h:313
Bounds_checked_array< ItemHandle > head
Definition: interesting_orders.h:290
@ FD
Definition: interesting_orders.h:281
@ DECAY
Definition: interesting_orders.h:274
@ EQUIVALENCE
Definition: interesting_orders.h:287
enum FunctionalDependency::@115 type
Definition: interesting_orders.h:630
int state_idx
Definition: interesting_orders.h:632
int required_fd_idx
Definition: interesting_orders.h:631
const DFSMState * state(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:638
const FunctionalDependency * required_fd(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:634
Definition: interesting_orders.h:605
FunctionalDependencySet can_use_fd
Definition: interesting_orders.h:627
Mem_root_array< int > nfsm_states
Definition: interesting_orders.h:607
std::bitset< kMaxSupportedOrderings > follows_interesting_order
Definition: interesting_orders.h:615
Mem_root_array< int > outgoing_edges
Definition: interesting_orders.h:606
std::bitset< kMaxSupportedOrderings > can_reach_interesting_order
Definition: interesting_orders.h:622
Bounds_checked_array< int > next_state
Definition: interesting_orders.h:612
Definition: interesting_orders.h:523
bool used_desc
Definition: interesting_orders.h:551
bool used_in_grouping
Definition: interesting_orders.h:552
ItemHandle canonical_item
Definition: interesting_orders.h:537
bool can_be_added_by_fd
Definition: interesting_orders.h:543
bool used_asc
Definition: interesting_orders.h:550
Item * item
Definition: interesting_orders.h:525
Definition: interesting_orders.h:567
const NFSMState * state(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:581
int required_fd_idx
Definition: interesting_orders.h:572
const FunctionalDependency * required_fd(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:577
int state_idx
Definition: interesting_orders.h:575
Definition: interesting_orders.h:589
enum LogicalOrderings::NFSMState::@116 type
@ ARTIFICIAL
Definition: interesting_orders.h:590
@ DELETED
Definition: interesting_orders.h:590
@ INTERESTING
Definition: interesting_orders.h:590
Mem_root_array< NFSMEdge > outgoing_edges
Definition: interesting_orders.h:591
std::bitset< kMaxSupportedOrderings > can_reach_interesting_order
Definition: interesting_orders.h:596
int satisfied_ordering_idx
Definition: interesting_orders.h:593
int seen
Definition: interesting_orders.h:603
Ordering satisfied_ordering
Definition: interesting_orders.h:592
Definition: interesting_orders.h:643
StateIndex state_idx
Definition: interesting_orders.h:683
Ordering ordering
Definition: interesting_orders.h:644
bool used_at_end
Definition: interesting_orders.h:677
Type
Definition: interesting_orders.h:649
@ UNINTERESTING
Definition: interesting_orders.h:674
@ INTERESTING
Definition: interesting_orders.h:652
@ HOMOGENIZED
Definition: interesting_orders.h:667
enum LogicalOrderings::OrderingWithInfo::Type type
table_map homogenize_tables
Definition: interesting_orders.h:680
The MEM_ROOT is a simple arena, where allocations are carved out of larger blocks.
Definition: my_alloc.h:83
Definition: interesting_orders_defs.h:44
ItemHandle item
Definition: interesting_orders_defs.h:45