MySQL 8.0.29
Source Code Documentation
interesting_orders.h
Go to the documentation of this file.
1/* Copyright (c) 2021, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is also distributed with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have included with MySQL.
13
14 This program is distributed in the hope that it will be useful,
15 but WITHOUT ANY WARRANTY; without even the implied warranty of
16 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17 GNU General Public License, version 2.0, for more details.
18
19 You should have received a copy of the GNU General Public License
20 along with this program; if not, write to the Free Software
21 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
22
23#ifndef SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
24#define SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
25
26/**
27 @file
28
29 Tracks which tuple streams follow which orders, and in particular whether
30 they follow interesting orders.
31
32 An interesting order (and/or grouping) is one that we might need to sort by
33 at some point during query execution (e.g. to satisfy an ORDER BY predicate);
34 if the rows already are produced in that order, for instance because we
35 scanned along the right index, we can skip the sort and get a lower cost.
36
37 We generally follow these papers:
38
39 [Neu04] Neumann and Moerkotte: “An efficient framework for order
40 optimization”
41 [Neu04b] Neumann and Moerkotte: “A Combined Framework for
42 Grouping and Order Optimization”
43
44 [Neu04b] is an updated version of [Neu04] that also deals with interesting
45 groupings but omits some details to make more space, so both are needed.
46 A combined and updated version of the same material is available in
47 Moerkotte's “Query compilers” PDF.
48
49 Some further details, like order homogenization, come from
50
51 [Sim96] Simmen et al: “Fundamental Techniques for Order Optimization”
52
53 All three papers deal with the issue of _logical_ orderings, where any
54 row stream may follow more than one order simultaneously, as inferred
55 through functional dependencies (FDs). For instance, if we have an ordering
56 (ab) but also an active FD {a} → c (c is uniquely determined by a,
57 for instance because a is a primary key in the same table as c), this means
58 we also implicitly follow the orders (acb) and (abc). In addition,
59 we trivially follow the orders (a), (ac) and (ab). However, note that we
60 do _not_ necessarily follow the order (cab).
61
62 Similarly, equivalences, such as WHERE conditions and joins, give rise
63 to a stronger form of FDs. If we have an ordering (ab) and the FD b = c,
64 we can be said to follow (ac), (acb) or (abc). The former would not be
65 inferrable from {b} → c and {c} → b alone. Equivalences with constants
66 are perhaps even stronger, e.g. WHERE x=3 would give rise to {} → x,
67 which could extend (a) to (xa), (ax) or (x).
68
69 Neumann et al solve this by modelling which ordering we're following as a
70 state in a non-deterministic finite state machine (NFSM). By repeatedly
71 applying FDs (which become edges in the NFSM), we can build up all possible
72 orderings from a base (which can be either the empty ordering, ordering from
73 scanning along an index, or one produced by an explicit sort) and then
74 checking whether we are in a state matching the ordering we are interested
75 in. (There can be quite a bit of states, so we need a fair amount of pruning
76 to keep the count manageable, or else performance will suffer.) Of course,
77 since NFSMs are nondeterministic, a base ordering and a set of FDs can
78 necessarily put us in a number of states, so we need to convert the NFSM
79 to a DFSM (using the standard powerset construction for NFAs; see
80 ConvertNFSMToDFSM()). This means that the ordering state for an access path
81 is only a single integer, the DFSM state number. When we activate more FDs,
82 for instance because we apply joins, we will move throughout the DFSM into
83 more attractive states. By checking simple precomputed lookup tables,
84 we can quickly find whether a given DFSM state follows a given ordering.
85
86 The other kind of edges we follow are from the artificial starting state;
87 they represent setting a specific ordering (e.g. because we sort by that
88 ordering). This is set up in the NFSM and preserved in the DFSM.
89
90 The actual collection of FDs and interesting orders happen outside this
91 class, in the caller.
92
93 A weakness in the approach is that transitive FDs are not always followed
94 correctly. E.g., if we have an ordering (a), and FDs {a} → b and {b} → c,
95 we will create (ab) and (abc), but _not_ (ac). This is not a problem for
96 equivalences, though, and most of the FDs we collect are equivalences.
97 We do have some heuristics to produce a new FD {a} → c where it is relevant,
98 but they are not always effective.
99
100 Neumann and Moerkotte distinguish between “tested-for” (O_T) and
101 “producing” (O_P) orderings, where all orders are interesting but only
102 some can be produced by explicit operators, such as sorts. Our implementation
103 is exactly opposite; we allow every ordering to be produced (by means of
104 sort-ahead), but there are orders that can be produced (e.g. when scanning
105 an index) that are not interesting in themselves. Such orders can be
106 pruned away early if we can show they do not produce anything interesting.
107
108
109 The operations related to interesting orders, in particular the concept
110 of functional dependencies, are related to the ones we are doing when
111 checking ONLY_FULL_GROUP_BY legality in sql/aggregate_check.h. However,
112 there are some key differences as well:
113
114 - Orderings are lexical, while groupings are just a bag of attributes.
115 This increases the state space for orderings significantly; groupings
116 can add elements at will and just grow the set freely, while orderings
117 need more care. In particular, this means that groupings only need
118 FDs on the form S → x (where S is a set), while orderings also benefit
119 from those of the type x = y, which replace an element instead of
120 adding a new one.
121
122 - ONLY_FULL_GROUP_BY is for correctness of rejecting or accepting the
123 query, while interesting orders is just an optimization, so not
124 recognizing rare cases is more acceptable.
125
126 - ONLY_FULL_GROUP_BY testing only cares about the set of FDs that hold
127 at one specific point (GROUP BY testing, naturally), while interesting
128 orders must be tracked throughout the entire operator tree. In particular,
129 if using interesting orders for merge join, the status at nearly every
130 join is relevant. Also, performance matters much more.
131
132 Together, these mean that the code ends up being fairly different,
133 and some cases are recognized by ONLY_FULL_GROUP_BY but not by interesting
134 orders. (The actual FD collection happens in BuildInterestingOrders in
135 join_optimizer.cc; see the comment there for FD differences.)
136
137 A note about nomenclature: Like Neumann et al, we use the term “ordering”
138 (and “grouping”) instead of “order”, with the special exception of the
139 already-established term “interesting order”.
140 */
141
142#include "my_table_map.h"
144#include "sql/mem_root_array.h"
145#include "sql/sql_array.h"
146
147#include <bitset>
148#include <string>
149
150// Represents a (potentially interesting) ordering or grouping;
151// OrderElement::direction will signify which one. Immutable,
152// and usually lives on the MEM_ROOT.
154
155class Window;
156
158 enum {
159 // A special “empty” kind of edge in the FSM that signifies
160 // adding no functional dependency, ie., a state we can reach
161 // with no further effort. This can happen in two ways:
162 //
163 // 1. An ordering can drop its last element, ie.,
164 // if a tuple stream is ordered on (a,b,c), it is also
165 // ordered on (a,b).
166 // 2. An ordering can be converted to a grouping, i.e,
167 // if a tuple stream is ordered on (a,b,c), it is also
168 // grouped on {a,b,c}.
169 //
170 // head must be empty, tail must be 0. Often called ϵ.
171 // Must be the first in the edge list.
173
174 // A standard functional dependency {a} → b; if a row tuple
175 // is ordered on all elements of a and this FD is applied,
176 // it is also ordered on b. A typical example is if {a}
177 // is an unique key in a table, and b is a column of the
178 // same table. head can be empty.
180
181 // An equivalence a = b; implies a → b and b → a, but is
182 // stronger (e.g. if ordered on (a,c), there is also an
183 // ordering on (b,c), which wouldn't be true just from
184 // applying FDs individually). head must be a single element.
187
190
191 // Whether this functional dependency can always be applied, ie.,
192 // there is never a point during query processing where it does not hold.
193 //
194 // Examples of not-always-active FDs include join conditions;
195 // e.g. for t1.x = t2.x, it is not true before the join has actually
196 // happened (and t1.x won't be the same order as t2.x before that,
197 // and thus cannot be used in e.g. a merge join).
198 //
199 // However, FDs that stem from unique indexes are always true; e.g. if
200 // t1.x is a primary key, {t1.x} → t1.y will always be true, and we can
201 // always reduce t1.y from an order if t1.x is present earlier.
202 // Similarly, WHERE conditions that are applied on the base table
203 // (ie., it is not delayed due to outer joins) will always be true,
204 // if t1.x = 3, we can safely assume {} → t1.x holds even before
205 // joining in t1, so a sort on (t1.x, t2.y) can be satisfied just by
206 // sorting t2 on y.
207 //
208 // Always-active FDs are baked into the DFSM, so that we need to follow
209 // fewer arcs during query processing. They can also be used for reducing
210 // the final order (to get more efficient sorting), but we don't do it yet.
211 bool always_active = false;
212};
213
215 public:
216 explicit LogicalOrderings(THD *thd);
217
218 // Maps the Item to an opaque integer handle. Deduplicates items as we go,
219 // inserting new ones if needed.
221
222 Item *item(ItemHandle item) const { return m_items[item].item; }
223
224 // These are only available before Build() has been called.
225
226 // Mark an interesting ordering (or grouping) as interesting,
227 // returning an index that can be given to SetOrder() later.
228 // Will deduplicate against previous entries; if not deduplicated
229 // away, a copy will be taken.
230 //
231 // Uninteresting orderings are those that can be produced by some
232 // operator (for instance, index scan) but are not interesting to
233 // test for. Orderings may be merged, pruned (if uninteresting)
234 // and moved around after Build(); see RemapOrderingIndex().
235 //
236 // If used_at_end is true, the ordering is assumed to be used only
237 // after all joins have happened, so all FDs are assumed to be
238 // active. This enables reducing the ordering more (which can in
239 // some cases help with better sortahead or the likes), but is not
240 // correct if the ordering wants to be used earlier on, e.g.
241 // in merge join or for semijoin duplicate removal. If it is false,
242 // then it is also only attempted homogenized onto the given set
243 // of tables (otherwise, it is ignored, and homogenization is over
244 // all tables).
245 //
246 // The empty ordering/grouping is always index 0.
247 int AddOrdering(THD *thd, Ordering order, bool interesting, bool used_at_end,
248 table_map homogenize_tables) {
249 return AddOrderingInternal(thd, order,
252 used_at_end, homogenize_tables);
253 }
254
255 // NOTE: Will include the empty ordering.
256 int num_orderings() const { return m_orderings.size(); }
257
258 Ordering ordering(int ordering_idx) const {
259 return m_orderings[ordering_idx].ordering;
260 }
261
262 bool ordering_is_relevant_for_sortahead(int ordering_idx) const {
263 return m_orderings[ordering_idx].type != OrderingWithInfo::UNINTERESTING;
264 }
265
266 // Add a functional dependency that may be applied at some point
267 // during the query planning. Same guarantees as AddOrdering().
268 // The special “decay” FD is always index 0.
270
271 // NOTE: Will include the decay (epsilon) FD.
272 int num_fds() const { return m_fds.size(); }
273
274 // Set the list of GROUP BY expressions, if any. This is used as the
275 // head of the functional dependencies for all aggregate functions
276 // (which by definition are functionally dependent on the GROUP BY
277 // expressions, unless ROLLUP is active -- see below), and must be
278 // valid (ie., not freed or modified) until Build() has run.
279 //
280 // If none is set, and there are aggregates present in orderings,
281 // implicit aggregation is assumed (ie., all aggregate functions
282 // are constant).
284 m_aggregate_head = head;
285 }
286
287 // Set whether ROLLUP is active; if so, we can no longer assume that
288 // aggregate functions are functionally dependent on (nullable)
289 // GROUP BY expressions, as two NULLs may be for different reasons.
290 void SetRollup(bool rollup) { m_rollup = rollup; }
291
292 // Builds the actual FSMs; all information about orderings and FDs is locked,
293 // optimized and then the state machine is built. After this, you can no
294 // longer add new orderings or FDs, ie., you are moving into the actual
295 // planning phase.
296 //
297 // Build() may prune away orderings and FDs, and it may also add homogenized
298 // orderings, ie., orderings derived from given interesting orders but
299 // modified so that they only require a single table (but will become an
300 // actual interesting order later, after the FDs have been applied). These are
301 // usually at the end, but may also be deduplicated against uninteresting
302 // orders, which will then be marked as interesting.
303 //
304 // trace can be nullptr; if not, it get human-readable optimizer trace
305 // appended to it.
306 void Build(THD *thd, std::string *trace);
307
308 // These are only available after Build() has been called.
309 // They are stateless and used in the actual planning phase.
310
311 // Converts an index returned by AddOrdering() to one that can be given
312 // to SetOrder() or DoesFollowOrder(). They don't convert themselves
313 // since it's completely legitimate to iterate over all orderings using
314 // num_orderings() and orderings(), and those indexes should _not_ be
315 // remapped.
316 //
317 // If an ordering has been pruned away, will return zero (the empty ordering),
318 // which is a valid input to SetOrder().
319 int RemapOrderingIndex(int ordering_idx) const {
320 assert(m_built);
321 return m_optimized_ordering_mapping[ordering_idx];
322 }
323
324 using StateIndex = int;
325
326 StateIndex SetOrder(int ordering_idx) const {
327 assert(m_built);
328 return m_orderings[ordering_idx].state_idx;
329 }
330
331 // Get a bitmap representing the given functional dependency. The bitmap
332 // can be all-zero if the given FD is optimized away, or outside the range
333 // of the representable bits. The bitmaps may be ORed together, but are
334 // otherwise to be treated as opaque to the client.
337 int new_fd_idx = m_optimized_fd_mapping[fd_idx];
338 if (new_fd_idx >= 1 && new_fd_idx <= kMaxSupportedFDs) {
339 fd_set.set(new_fd_idx - 1);
340 }
341 return fd_set;
342 }
343
344 // For a given state, see what other (better) state we can move to given a
345 // set of active functional dependencies, e.g. if we are in state ((),a) and
346 // the FD a=b becomes active, we can set its bit (see GetFDSet()) in the FD
347 // mask and use that to move to the state ((),a,b,ab,ba). Note that “fds”
348 // should contain the entire set of active FDs, not just newly-applied ones.
349 // This is because “old” FDs can suddenly become relevant when new logical
350 // orderings are possible, and the DFSM is not always able to bake this in.
352
353 bool DoesFollowOrder(StateIndex state_idx, int ordering_idx) const {
354 assert(m_built);
355 if (ordering_idx == 0) {
356 return true;
357 }
358 if (ordering_idx >= kMaxSupportedOrderings) {
359 return false;
360 }
361 return m_dfsm_states[state_idx].follows_interesting_order.test(
362 ordering_idx);
363 }
364
365 // Whether "a" follows any interesting orders than "b" does not, or could
366 // do so in the future. If !MoreOrderedThan(a, b) && !MoreOrderedThan(b, a)
367 // the states are equal (they follow the same interesting orders, and could
368 // lead to the same interesting orders given the same FDs -- see below).
369 // It is possible to have MoreOrderedThan(a, b) && MoreOrderedThan(b, a), e.g.
370 // if they simply follow disjunct orders.
371 //
372 // This is used in the planner, when pruning access paths -- an AP A can be
373 // kept even if it has higher cost than AP B, if it follows orders that B does
374 // not. Why is it enough to check interesting orders -- must we also not check
375 // uninteresting orders, since they could lead to new interesting orders
376 // later? This is because in the planner, two states will only ever be
377 // compared if the same functional dependencies have been applied to both
378 // sides:
379 //
380 // The set of logical orders, and thus the state, is uniquely determined
381 // by the initial ordering and applied FDs. Thus, if A has _uninteresting_
382 // orders that B does not, the initial ordering must have differed -- but the
383 // initial states only contain (and thus differ in) interesting orders.
384 // Thus, the additional uninteresting orders must have been caused by
385 // additional interesting orders (that do not go away), so testing the
386 // interesting ones really suffices in planner context.
387 //
388 // Note that this also means that in planner context, !MoreOrderedThan(a, b)
389 // && !MoreOrderedThan(b, a) implies that a == b.
391 StateIndex a_idx, StateIndex b_idx,
392 std::bitset<kMaxSupportedOrderings> ignored_orderings) const {
393 assert(m_built);
394 std::bitset<kMaxSupportedOrderings> a =
395 m_dfsm_states[a_idx].follows_interesting_order & ~ignored_orderings;
396 std::bitset<kMaxSupportedOrderings> b =
397 m_dfsm_states[b_idx].follows_interesting_order & ~ignored_orderings;
398 std::bitset<kMaxSupportedOrderings> future_a =
399 m_dfsm_states[a_idx].can_reach_interesting_order & ~ignored_orderings;
400 std::bitset<kMaxSupportedOrderings> future_b =
401 m_dfsm_states[b_idx].can_reach_interesting_order & ~ignored_orderings;
402 return (a & b) != a || (future_a & future_b) != future_a;
403 }
404
405 private:
406 bool m_built = false;
407
408 struct ItemInfo {
409 // Used to translate Item * to ItemHandle and back.
411
412 // Points to the head of this item's equivalence class. (If the item
413 // is not equivalent to anything, points to itself.) The equivalence class
414 // is defined by EQUIVALENCE FDs, transitively, and the head is the one with
415 // the lowest index. So if we have FDs a = b and b = c, all three {a,b,c}
416 // will point to a here. This is useful for pruning and homogenization;
417 // if two elements have the same equivalence class (ie., the same canonical
418 // item), they could become equivalent after applying FDs. See also
419 // m_can_be_added_by_fd, which deals with non-EQUIVALENCE FDs.
420 //
421 // Set by BuildEquivalenceClasses().
423
424 // Whether the given item (after canonicalization by means of
425 // m_canonical_item[]) shows up as the tail of any functional dependency.
426 //
427 // Set by FindElementsThatCanBeAddedByFDs();
428 bool can_be_added_by_fd = false;
429
430 // Whether the given item ever shows up in orderings as ASC or DESC,
431 // respectively. Used to see whether adding the item in that direction
432 // is worthwhile or not. Note that this is propagated through equivalences,
433 // so if a = b and any ordering contains b DESC and a is the head of that
434 // equivalence class, then a is also marked as used_desc = true.
435 bool used_asc = false;
436 bool used_desc = false;
437 bool used_in_grouping = false;
438 };
439 // All items we have seen in use (in orderings or FDs), deduplicated
440 // and indexed by ItemHandle.
442
443 // Head for all FDs generated for aggregate functions.
444 // See SetHeadForAggregates().
446
447 // Whether rollup is active; if so, we need to take care not to create
448 // FDs for aggregates in some cases. See SetHeadForAggregates() and
449 // SetRollup().
450 bool m_rollup = false;
451
452 struct NFSMState {
456 int satisfied_ordering_idx; // Only for type == INTERESTING.
457
458 // Indexed by ordering.
459 std::bitset<kMaxSupportedOrderings> can_reach_interesting_order{0};
460
461 // Used during traversal, to keep track of which states we have
462 // already seen (for fast deduplication). We use the standard trick
463 // of using a generational counter instead of a bool, so that we don't
464 // have to clear it every time; we can just increase the generation
465 // and treat everything with lower/different “seen” as unseen.
466 int seen = 0;
467 };
468 struct DFSMState {
469 Mem_root_array<int> outgoing_edges; // Index into dfsm_edges.
470 Mem_root_array<int> nfsm_states; // Index into states.
471
472 // Structures derived from the above, but in forms for faster access.
473
474 // Indexed by FD.
476
477 // Indexed by ordering.
478 std::bitset<kMaxSupportedOrderings> follows_interesting_order{0};
479
480 // Interesting orders that this state can eventually reach,
481 // given that all FDs are applied (a superset of follows_interesting_order).
482 // We track this instead of the producing orders (e.g. which homogenized
483 // order are we following), because it allows for more access paths to
484 // compare equal. See also OrderingWithInfo::Type::HOMOGENIZED.
485 std::bitset<kMaxSupportedOrderings> can_reach_interesting_order{0};
486
487 // Whether applying the given functional dependency will take us to a
488 // different state from this one. Used to quickly intersect with the
489 // available FDs to find out what we can apply.
491 };
492
493 struct NFSMEdge {
494 // Which FD is required to follow this edge. Index into m_fd, with one
495 // exception; from the initial state (0), we have constructor edges for
496 // setting a specific order without following an FD. Such edges have
497 // required_fd_idx = INT_MIN + order_idx, ie., they are negative.
499
500 // Destination state (index into m_states).
502
504 const LogicalOrderings *orderings) const {
505 return &orderings->m_fds[required_fd_idx];
506 }
507 const NFSMState *state(const LogicalOrderings *orderings) const {
508 return &orderings->m_states[state_idx];
509 }
510 };
511 struct DFSMEdge {
514
516 const LogicalOrderings *orderings) const {
517 return &orderings->m_fds[required_fd_idx];
518 }
519 const DFSMState *state(const LogicalOrderings *orderings) const {
520 return &orderings->m_dfsm_states[state_idx];
521 }
522 };
523
526
527 // Status of the ordering. Note that types with higher indexes dominate
528 // lower, ie., two orderings can be collapsed into the one with the higher
529 // type index if they are otherwise equal.
530 enum Type {
531 // An ordering that is interesting in its own right,
532 // e.g. because it is given to ORDER BY.
534
535 // An ordering that is derived from an interesting order, but refers to
536 // one table only (or conceptually, a subset of tables -- but we don't
537 // support that in this implementation). Guaranteed to reach some
538 // interesting order at some point, but we don't track it as interesting
539 // in the FSM states. This means that these orderings don't get state bits
540 // in follows_interesting_order for themselves, but they will always have
541 // one or more interesting orders in can_reach_interesting_order.
542 // This helps us collapse access paths more efficiently; if we have
543 // an interesting order t3.x and create homogenized orderings t1.x
544 // and t2.x (due to some equality with t3.x), an access path following one
545 // isn't better than an access path following the other. They will lead
546 // to the same given the same FDs anyway (see MoreOrderedThan()), and
547 // thus are equally good.
549
550 // An ordering that is just added because it is easy to produce;
551 // e.g. because it is produced by scanning along an index. Such orderings
552 // can be shortened or pruned away entirely (in
553 // PruneUninterestingOrders())
554 // unless we find that they may lead to an interesting order.
555 UNINTERESTING = 0
557
559
560 // Only used if used_at_end = false (see AddOrdering()).
562
563 // Which initial state to use for this ordering (in SetOrder()).
565 };
566
568
569 // The longest ordering in m_orderings.
571
573
574 // NFSM. 0 is the initial state, all others are found by following edges.
577
578 // DFSM. 0 is the initial state, all others are found by following edges.
581
582 // After PruneUninterestingOrders has run, maps from the old indexes to the
583 // new indexes.
585
586 // After PruneFDs() has run, maps from the old indexes to the new indexes.
588
589 // The canonical order for two items in a grouping
590 // (after BuildEquivalenceClasses() has run; enforced by
591 // RecanonicalizeGroupings()). The reason why we sort by
592 // canonical_item first is so that switching out one element
593 // with an equivalent one (ie., applying an EQUIVALENCE
594 // functional dependency) does not change the order of the
595 // elements in the grouing, which could give false negatives
596 // in CouldBecomeInterestingOrdering().
598 if (m_items[a].canonical_item != m_items[b].canonical_item)
599 return m_items[a].canonical_item < m_items[b].canonical_item;
600 return a < b;
601 }
602
603 inline bool ItemBeforeInGroup(const OrderElement &a, const OrderElement &b) {
604 return ItemHandleBeforeInGroup(a.item, b.item);
605 }
606
607 // Helper for AddOrdering().
609 bool used_at_end, table_map homogenize_tables);
610
611 // See comment in .cc file.
612 void PruneUninterestingOrders(THD *thd);
613
614 // See comment in .cc file.
615 void PruneFDs(THD *thd);
616
617 // See comment in .cc file.
619 bool all_fds) const;
620
621 // Populates ItemInfo::canonical_item.
623
624 // See comment in .cc file.
626
627 // See comment in .cc file.
628 void AddFDsFromComputedItems(THD *thd);
629
630 // See comment in .cc file.
631 void AddFDsFromAggregateItems(THD *thd);
632
633 // See comment in .cc file.
635 THD *thd, ItemHandle argument_item, Window *window);
636
637 // See comment in .cc file.
638 void AddFDsFromConstItems(THD *thd);
639
640 // Populates ItemInfo::can_be_added_by_fd.
642
643 // See comment in .cc file.
645 OrderElement *tmpbuf) const;
646 void PreReduceOrderings(THD *thd);
650 THD *thd, Ordering reduced_ordering, bool used_at_end, int table_idx,
651 Bounds_checked_array<std::pair<ItemHandle, ItemHandle>> reverse_canonical,
652 OrderElement *tmpbuf);
653
654 // See comment in .cc file.
656
657 void BuildNFSM(THD *thd);
658 void AddGroupingFromOrdering(THD *thd, int state_idx, Ordering ordering,
659 OrderElement *tmpbuf);
660 void TryAddingOrderWithElementInserted(THD *thd, int state_idx, int fd_idx,
661 Ordering old_ordering,
662 size_t start_point,
663 ItemHandle item_to_add,
664 enum_order direction,
665 OrderElement *tmpbuf);
666 void PruneNFSM(THD *thd);
667 bool AlwaysActiveFD(int fd_idx);
668 void FinalizeDFSMState(THD *thd, int state_idx);
670 int *generation, int extra_allowed_fd_idx);
671 void ConvertNFSMToDFSM(THD *thd);
672
673 // Populates state_idx for every ordering in m_ordering.
675
676 // If a state with the given ordering already exists (artificial or not),
677 // returns its index. Otherwise, adds an artificial state with the given
678 // order and returns its index.
680
681 // Add an edge from state_idx to an state with the given ordering; if there is
682 // no such state, adds an artificial state with it (taking a copy, so does not
683 // need to take ownership).
684 void AddEdge(THD *thd, int state_idx, int required_fd_idx, Ordering ordering);
685
686 // Returns true if the given (non-DECAY) functional dependency applies to the
687 // given ordering, and the index of the element from which the FD is active
688 // (ie., the last element that was part of the head). One can start inserting
689 // the tail element at any point _after_ this index; if it is an EQUIVALENCE
690 // FD, one can instead choose to replace the element at start_point entirely.
692 const Ordering ordering,
693 int *start_point) const;
694
695 // Used for optimizer trace.
696
697 std::string PrintOrdering(Ordering ordering) const;
699 bool html) const;
700 void PrintFunctionalDependencies(std::string *trace);
701 void PrintInterestingOrders(std::string *trace);
702 void PrintNFSMDottyGraph(std::string *trace) const;
703 void PrintDFSMDottyGraph(std::string *trace) const;
704};
705
706bool IsGrouping(Ordering ordering);
707
708#endif // SQL_JOIN_OPTIMIZER_INTERESTING_ORDERS_H
Base class that is used to represent any kind of expression in a relational query.
Definition: item.h:802
Definition: interesting_orders.h:214
void SetHeadForAggregates(Bounds_checked_array< ItemHandle > head)
Definition: interesting_orders.h:283
void CreateOrderingsFromGroupings(THD *thd)
We don't currently have any operators that only group and do not sort (e.g.
Definition: interesting_orders.cc:837
Ordering ReduceOrdering(Ordering ordering, bool all_fds, OrderElement *tmpbuf) const
Remove redundant elements using the functional dependencies that we have, to give a more canonical fo...
Definition: interesting_orders.cc:1002
bool ItemHandleBeforeInGroup(ItemHandle a, ItemHandle b)
Definition: interesting_orders.h:597
void AddFDsFromComputedItems(THD *thd)
Try to add new FDs from items that are not base items; e.g., if we have an item (a + 1),...
Definition: interesting_orders.cc:511
Mem_root_array< OrderingWithInfo > m_orderings
Definition: interesting_orders.h:567
Bounds_checked_array< int > m_optimized_fd_mapping
Definition: interesting_orders.h:587
StateIndex SetOrder(int ordering_idx) const
Definition: interesting_orders.h:326
int num_orderings() const
Definition: interesting_orders.h:256
bool DoesFollowOrder(StateIndex state_idx, int ordering_idx) const
Definition: interesting_orders.h:353
bool m_rollup
Definition: interesting_orders.h:450
int m_longest_ordering
Definition: interesting_orders.h:570
int AddArtificialState(THD *thd, Ordering ordering)
Definition: interesting_orders.cc:1175
bool CouldBecomeInterestingOrdering(Ordering ordering) const
For a given ordering, check whether it ever has the hope of becoming an interesting ordering.
Definition: interesting_orders.cc:1120
Mem_root_array< DFSMState > m_dfsm_states
Definition: interesting_orders.h:579
bool m_built
Definition: interesting_orders.h:406
int RemapOrderingIndex(int ordering_idx) const
Definition: interesting_orders.h:319
void FinalizeDFSMState(THD *thd, int state_idx)
Definition: interesting_orders.cc:1592
bool ImpliedByEarlierElements(ItemHandle item, Ordering prefix, bool all_fds) const
Checks whether the given item is redundant given previous elements in the ordering; ie....
Definition: interesting_orders.cc:730
void PrintDFSMDottyGraph(std::string *trace) const
Definition: interesting_orders.cc:1941
bool ordering_is_relevant_for_sortahead(int ordering_idx) const
Definition: interesting_orders.h:262
void PreReduceOrderings(THD *thd)
Do safe reduction on all orderings (some of them may get merged by PruneUninterestingOrders() later),...
Definition: interesting_orders.cc:797
void SetRollup(bool rollup)
Definition: interesting_orders.h:290
void PrintFunctionalDependencies(std::string *trace)
Definition: interesting_orders.cc:1853
void CreateHomogenizedOrderings(THD *thd)
For each interesting ordering, see if we can homogenize it onto each table.
Definition: interesting_orders.cc:925
Mem_root_array< NFSMState > m_states
Definition: interesting_orders.h:575
int AddOrderingInternal(THD *thd, Ordering order, OrderingWithInfo::Type type, bool used_at_end, table_map homogenize_tables)
Definition: interesting_orders.cc:122
void AddFDsFromConstItems(THD *thd)
Try to add FDs from items that are constant by themselves, e.g.
Definition: interesting_orders.cc:625
int AddFunctionalDependency(THD *thd, FunctionalDependency fd)
Definition: interesting_orders.cc:175
void PrintNFSMDottyGraph(std::string *trace) const
Definition: interesting_orders.cc:1906
Mem_root_array< DFSMEdge > m_dfsm_edges
Definition: interesting_orders.h:580
bool MoreOrderedThan(StateIndex a_idx, StateIndex b_idx, std::bitset< kMaxSupportedOrderings > ignored_orderings) const
Definition: interesting_orders.h:390
bool AlwaysActiveFD(int fd_idx)
Definition: interesting_orders.cc:1587
void BuildNFSM(THD *thd)
Definition: interesting_orders.cc:1242
Mem_root_array< NFSMEdge > m_edges
Definition: interesting_orders.h:576
void BuildEquivalenceClasses()
Definition: interesting_orders.cc:405
int StateIndex
Definition: interesting_orders.h:324
void AddHomogenizedOrderingIfPossible(THD *thd, Ordering reduced_ordering, bool used_at_end, int table_idx, Bounds_checked_array< std::pair< ItemHandle, ItemHandle > > reverse_canonical, OrderElement *tmpbuf)
Helper function for CreateHomogenizedOrderings().
Definition: interesting_orders.cc:1017
Ordering ordering(int ordering_idx) const
Definition: interesting_orders.h:258
bool FunctionalDependencyApplies(const FunctionalDependency &fd, const Ordering ordering, int *start_point) const
Definition: interesting_orders.cc:1206
void PruneNFSM(THD *thd)
Try to prune away irrelevant nodes from the NFSM; it is worth spending some time on this,...
Definition: interesting_orders.cc:1457
Item * item(ItemHandle item) const
Definition: interesting_orders.h:222
void FindElementsThatCanBeAddedByFDs()
Definition: interesting_orders.cc:686
void PrintInterestingOrders(std::string *trace)
Definition: interesting_orders.cc:1870
StateIndex ApplyFDs(StateIndex state_idx, FunctionalDependencySet fds) const
Definition: interesting_orders.cc:250
Bounds_checked_array< int > m_optimized_ordering_mapping
Definition: interesting_orders.h:584
void AddGroupingFromOrdering(THD *thd, int state_idx, Ordering ordering, OrderElement *tmpbuf)
Definition: interesting_orders.cc:1378
int AddOrdering(THD *thd, Ordering order, bool interesting, bool used_at_end, table_map homogenize_tables)
Definition: interesting_orders.h:247
void ExpandThroughAlwaysActiveFDs(Mem_root_array< int > *nfsm_states, int *generation, int extra_allowed_fd_idx)
Definition: interesting_orders.cc:1609
void TryAddingOrderWithElementInserted(THD *thd, int state_idx, int fd_idx, Ordering old_ordering, size_t start_point, ItemHandle item_to_add, enum_order direction, OrderElement *tmpbuf)
Definition: interesting_orders.cc:1397
void PruneFDs(THD *thd)
Definition: interesting_orders.cc:336
void PruneUninterestingOrders(THD *thd)
Try to get rid of uninteresting orders, possibly by discarding irrelevant suffixes and merging them w...
Definition: interesting_orders.cc:282
void AddFDsFromAggregateItems(THD *thd)
Definition: interesting_orders.cc:648
std::string PrintOrdering(Ordering ordering) const
Definition: interesting_orders.cc:1806
bool ItemBeforeInGroup(const OrderElement &a, const OrderElement &b)
Definition: interesting_orders.h:603
Bounds_checked_array< ItemHandle > CollectHeadForStaticWindowFunction(THD *thd, ItemHandle argument_item, Window *window)
Definition: interesting_orders.cc:470
Mem_root_array< ItemInfo > m_items
Definition: interesting_orders.h:441
ItemHandle GetHandle(Item *item)
Definition: interesting_orders.cc:1082
void FindInitialStatesForOrdering()
Definition: interesting_orders.cc:1794
LogicalOrderings(THD *thd)
Definition: interesting_orders.cc:100
Mem_root_array< FunctionalDependency > m_fds
Definition: interesting_orders.h:572
void ConvertNFSMToDFSM(THD *thd)
From the NFSM, convert an equivalent DFSM.
Definition: interesting_orders.cc:1654
Bounds_checked_array< ItemHandle > m_aggregate_head
Definition: interesting_orders.h:445
FunctionalDependencySet GetFDSet(int fd_idx) const
Definition: interesting_orders.h:335
void RecanonicalizeGroupings()
Definition: interesting_orders.cc:455
std::string PrintFunctionalDependency(const FunctionalDependency &fd, bool html) const
Definition: interesting_orders.cc:1820
void AddEdge(THD *thd, int state_idx, int required_fd_idx, Ordering ordering)
Definition: interesting_orders.cc:1191
int num_fds() const
Definition: interesting_orders.h:272
void Build(THD *thd, std::string *trace)
Definition: interesting_orders.cc:205
A typesafe replacement for DYNAMIC_ARRAY.
Definition: mem_root_array.h:421
For each client connection we create a separate thread with THD serving as a thread/connection descri...
Definition: sql_class.h:945
Represents the (explicit) window of a SQL 2003 section 7.11 <window clause>, or the implicit (inlined...
Definition: window.h:104
bool IsGrouping(Ordering ordering)
Definition: interesting_orders.cc:96
std::bitset< kMaxSupportedFDs > FunctionalDependencySet
Definition: interesting_orders_defs.h:62
static constexpr int kMaxSupportedFDs
Definition: interesting_orders_defs.h:61
static constexpr int kMaxSupportedOrderings
Definition: interesting_orders_defs.h:64
int ItemHandle
Definition: interesting_orders_defs.h:38
enum_order
Definition: key_spec.h:64
uint64_t table_map
Definition: my_table_map.h:29
required string type
Definition: replication_group_member_actions.proto:33
Definition: interesting_orders.h:157
ItemHandle tail
Definition: interesting_orders.h:189
enum FunctionalDependency::@93 type
@ FD
Definition: interesting_orders.h:179
@ DECAY
Definition: interesting_orders.h:172
@ EQUIVALENCE
Definition: interesting_orders.h:185
bool always_active
Definition: interesting_orders.h:211
Bounds_checked_array< ItemHandle > head
Definition: interesting_orders.h:188
Definition: interesting_orders.h:511
int state_idx
Definition: interesting_orders.h:513
int required_fd_idx
Definition: interesting_orders.h:512
const DFSMState * state(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:519
const FunctionalDependency * required_fd(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:515
Definition: interesting_orders.h:468
FunctionalDependencySet can_use_fd
Definition: interesting_orders.h:490
Mem_root_array< int > nfsm_states
Definition: interesting_orders.h:470
std::bitset< kMaxSupportedOrderings > follows_interesting_order
Definition: interesting_orders.h:478
Mem_root_array< int > outgoing_edges
Definition: interesting_orders.h:469
std::bitset< kMaxSupportedOrderings > can_reach_interesting_order
Definition: interesting_orders.h:485
Bounds_checked_array< int > next_state
Definition: interesting_orders.h:475
Definition: interesting_orders.h:408
bool used_desc
Definition: interesting_orders.h:436
bool used_in_grouping
Definition: interesting_orders.h:437
ItemHandle canonical_item
Definition: interesting_orders.h:422
bool can_be_added_by_fd
Definition: interesting_orders.h:428
bool used_asc
Definition: interesting_orders.h:435
Item * item
Definition: interesting_orders.h:410
Definition: interesting_orders.h:493
const NFSMState * state(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:507
int required_fd_idx
Definition: interesting_orders.h:498
const FunctionalDependency * required_fd(const LogicalOrderings *orderings) const
Definition: interesting_orders.h:503
int state_idx
Definition: interesting_orders.h:501
Definition: interesting_orders.h:452
Mem_root_array< int > outgoing_edges
Definition: interesting_orders.h:454
enum LogicalOrderings::NFSMState::@94 type
std::bitset< kMaxSupportedOrderings > can_reach_interesting_order
Definition: interesting_orders.h:459
@ ARTIFICIAL
Definition: interesting_orders.h:453
@ DELETED
Definition: interesting_orders.h:453
@ INTERESTING
Definition: interesting_orders.h:453
int satisfied_ordering_idx
Definition: interesting_orders.h:456
int seen
Definition: interesting_orders.h:466
Ordering satisfied_ordering
Definition: interesting_orders.h:455
Definition: interesting_orders.h:524
StateIndex state_idx
Definition: interesting_orders.h:564
Ordering ordering
Definition: interesting_orders.h:525
bool used_at_end
Definition: interesting_orders.h:558
Type
Definition: interesting_orders.h:530
@ UNINTERESTING
Definition: interesting_orders.h:555
@ INTERESTING
Definition: interesting_orders.h:533
@ HOMOGENIZED
Definition: interesting_orders.h:548
enum LogicalOrderings::OrderingWithInfo::Type type
table_map homogenize_tables
Definition: interesting_orders.h:561
Definition: interesting_orders_defs.h:43
ItemHandle item
Definition: interesting_orders_defs.h:44