MySQL 9.2.0
Source Code Documentation
cost_model.h
/* Copyright (c) 2020, 2024, Oracle and/or its affiliates.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License, version 2.0,
   as published by the Free Software Foundation.

   This program is designed to work with certain software (including
   but not limited to OpenSSL) that is licensed under separate terms,
   as designated in a particular file or component or in included license
   documentation. The authors of MySQL hereby grant you an additional
   permission to link the program and your derivative works with the
   separately licensed software that they have either included with
   the program or referenced in the documentation.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License, version 2.0, for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */

#ifndef SQL_JOIN_OPTIMIZER_COST_MODEL_H_
#define SQL_JOIN_OPTIMIZER_COST_MODEL_H_

#include <algorithm>  // std::clamp

#include "my_base.h"
#include "my_bitmap.h"  // bitmap_bits_set

#include "sql/join_optimizer/access_path.h"
#include "sql/join_optimizer/cost_constants.h"
#include "sql/join_optimizer/relational_expression.h"
#include "sql/mem_root_array.h"
#include "sql/table.h"

struct AccessPath;
struct ContainedSubquery;
class Item;
class Query_block;
class THD;

/**
  When we make cost estimates, we use this as the maximal length the
  values we get from evaluating an Item (in bytes). Actual values of
  e.g. blobs may be much longer, but even so we use this as an upper
  limit when doing cost calculations. (For context, @see Item#max_length.)
*/
constexpr size_t kMaxItemLengthEstimate = 4096;

/// A fallback cardinality estimate that is used in case the storage engine
/// cannot provide one (like for table functions). It's a fairly arbitrary
/// non-zero value.
constexpr ha_rows kRowEstimateFallback = 1000;

/**
  We model the IO cost for InnoDB tables with the DYNAMIC row format. For
  other storage engines the IO cost is currently set to zero. For other
  InnoDB row formats, the model may not be a good fit.

  We only count the cost of accessing the leaf pages of indexes (clustered
  or unclustered) that are not already in the buffer pool. For tables/indexes
  that are (estimated to be) fully cached we get an IO cost of zero.
  Tables that are small relative to the buffer pool are assumed to be fully
  cached (all pages are in the buffer pool).
  The only operations for which we count IO cost are random and sequential
  page reads.

  The cost of a disk IO operation is modeled as an affine function:

  io_cost = kIOStartCost + no_of_bytes * kIOByteCost

  Note that the granularity of no_of_bytes is the storage engine page size.
  The values used here are derived from measurements in a cloud setting.
  This means that they may be wrong in another context, such as when
  running against a local SSD. Cloud measurements may also give inconsistent
  results due to heterogeneous cloud hardware and the impact of activity
  in other cloud VMs.
  Ideally we should be able to configure these parameters, or even set them
  dynamically based on observed behavior.
*/
constexpr double kIOStartCost{937.0};

/// The additional cost of reading an extra byte from disk.
constexpr double kIOByteCost{0.0549};

/// This is the estimated fraction of an (InnoDB) block that is in use
/// (i.e. not free for future inserts).
constexpr double kBlockFillFactor{0.75};
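
// A worked instance of the formula above (illustrative, assuming the default
// InnoDB page size of 16384 bytes): reading one page from disk costs
// io_cost = kIOStartCost + 16384 * kIOByteCost = 937.0 + 899.5 ≈ 1836.5,
// so for a single-page read the fixed start-up cost and the per-byte cost
// contribute roughly equally.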

/// See EstimateFilterCost.
struct FilterCost {
  /// Cost of evaluating the filter for all rows if subqueries are not
  /// materialized. (Note that this includes the contribution from
  /// init_cost_if_not_materialized.)
  double cost_if_not_materialized{0.0};

  /// Initial cost before the filter can be applied for the first time.
  /// Typically the cost of executing 'independent subquery' in queries like:
  /// "SELECT * FROM tab WHERE field = <independent subquery>".
  /// (That corresponds to the Item_singlerow_subselect class.)
  double init_cost_if_not_materialized{0.0};

  /// Cost of evaluating the filter for all rows if all subqueries in
  /// it have been materialized beforehand. If there are no subqueries
  /// in the condition, equals cost_if_not_materialized.
  double cost_if_materialized{0.0};

  /// Cost of materializing all subqueries present in the filter.
  /// If there are no subqueries in the condition, equals zero.
  double cost_to_materialize{0.0};
};

/// Used internally by EstimateFilterCost() only.
void AddCost(THD *thd, const ContainedSubquery &subquery, double num_rows,
             FilterCost *cost);

/**
  Estimate the cost of evaluating “condition”, “num_rows” times.
  This is a fairly rudimentary estimation, _but_ it includes the cost
  of any subqueries that may be present and that need evaluation.
*/
FilterCost EstimateFilterCost(THD *thd, double num_rows, Item *condition,
                              const Query_block *outer_query_block);

/**
  A cheaper overload of EstimateFilterCost() that assumes that all
  contained subqueries have already been extracted (i.e., it skips the
  walking, which can be fairly expensive). This data is typically
  computed by FindContainedSubqueries().
*/
inline FilterCost EstimateFilterCost(
    THD *thd, double num_rows,
    const Mem_root_array<ContainedSubquery> &contained_subqueries) {
  FilterCost cost;
  cost.cost_if_not_materialized = num_rows * kApplyOneFilterCost;
  cost.cost_if_materialized = num_rows * kApplyOneFilterCost;

  for (const ContainedSubquery &subquery : contained_subqueries) {
    AddCost(thd, subquery, num_rows, &cost);
  }
  return cost;
}
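
// A hedged usage sketch (variable names here are hypothetical): the intended
// pattern is to extract the contained subqueries once and then reuse them for
// several cost estimates, e.g.:
//
//   Mem_root_array<ContainedSubquery> subqueries(thd->mem_root);
//   FindContainedSubqueries(condition, outer_query_block,
//                           [&](const ContainedSubquery &subquery) {
//                             subqueries.push_back(subquery);
//                           });
//   FilterCost cost = EstimateFilterCost(thd, num_rows, subqueries);
//
// The exact signature of FindContainedSubqueries() is declared elsewhere; the
// callback form shown above is an assumption for illustration.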

/**
  Estimate costs and output rows for a SORT AccessPath.
  @param thd Current thread.
  @param path the AccessPath.
  @param distinct_rows An estimate of the number of distinct rows, if
  remove_duplicates==true and we have an estimate already.
*/
void EstimateSortCost(THD *thd, AccessPath *path,
                      double distinct_rows = kUnknownRowCount);

void EstimateMaterializeCost(THD *thd, AccessPath *path);

/**
  Estimate the number of rows with a distinct combination of values for
  'terms'. @see EstimateDistinctRowsFromStatistics for additional details.
  @param thd The current thread.
  @param child_rows The number of input rows.
  @param terms The terms for which we estimate the number of unique
  combinations.
  @returns The estimated number of output rows.
*/
double EstimateDistinctRows(THD *thd, double child_rows,
                            Bounds_checked_array<const Item *const> terms);
/**
  Estimate costs and result row count for an aggregate operation.
  @param[in,out] thd The current thread.
  @param[in,out] path The AGGREGATE path.
  @param[in] query_block The Query_block to which 'path' belongs.
*/
void EstimateAggregateCost(THD *thd, AccessPath *path,
                           const Query_block *query_block);
void EstimateDeleteRowsCost(AccessPath *path);
void EstimateUpdateRowsCost(AccessPath *path);

/// Estimate the costs and row count for a STREAM AccessPath.
void EstimateStreamCost(THD *thd, AccessPath *path);

/**
  Estimate the costs and row count for a WINDOW AccessPath. As described in
  @see AccessPath::m_init_cost, the cost to read k out of N rows would be
  init_cost + (k/N) * (cost - init_cost).
*/
void EstimateWindowCost(AccessPath *path);

/// Estimate the costs and row count for a Temp table Aggregate AccessPath.
void EstimateTemptableAggregateCost(THD *thd, AccessPath *path,
                                    const Query_block *query_block);

/// Estimate the costs and row count for a LIMIT_OFFSET AccessPath.
void EstimateLimitOffsetCost(AccessPath *path);

/**
  Estimate the fan out for a left semijoin or a left antijoin. The fan out
  is defined as the number of result rows, divided by the number of input
  rows from the left hand relation. For a semijoin, J1:

  SELECT ... FROM t1 WHERE EXISTS (SELECT ... FROM t2 WHERE predicate)

  we know that the fan out of the corresponding inner join J2:

  SELECT ... FROM t1, t2 WHERE predicate

  is: F(J2) = CARD(t2) * SELECTIVITY(predicate), where CARD(t2)=right_rows,
  and SELECTIVITY(predicate)=edge.selectivity. If 'predicate' is a
  deterministic function of t1 and t2 rows, then J1 is equivalent to an inner
  join J3:

  SELECT ... FROM t1 JOIN (SELECT DISTINCT f1,..fn FROM t2) d ON predicate

  where f1,..fn are those fields from t2 that appear in the predicate.

  Then F(J1) = F(J3) = F(J2) * CARD(d) / CARD(t2)
             = CARD(d) * SELECTIVITY(predicate).

  This function therefore collects f1..fn and estimates CARD(d). As a special
  case, 'predicate' may be independent of t2. The query is then equivalent to:

  SELECT ... FROM t1 WHERE predicate AND (SELECT COUNT(*) FROM t2) > 0

  The fan out is then the selectivity of 'predicate' multiplied by the
  probability of t2 having at least one row.

  @param thd The current thread.
  @param right_rows The number of input rows from the right hand relation.
  @param edge Join edge.
  @returns fan out.
*/
double EstimateSemijoinFanOut(THD *thd, double right_rows,
                              const JoinPredicate &edge);
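
// A worked example of the estimate above (illustrative numbers): if t2 has
// right_rows = 1000, edge.selectivity = 0.001, and the predicate references
// a single t2 column with 50 distinct values (CARD(d) = 50), then
// F(J1) = CARD(d) * SELECTIVITY(predicate) = 50 * 0.001 = 0.05, i.e. about
// one output row for every 20 rows of t1.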

/**
  Estimate the number of output rows from joining two relations.
  @param thd The current thread.
  @param left_rows Number of rows in the left hand relation.
  @param right_rows Number of rows in the right hand relation.
  @param edge The join between the two relations.
*/
inline double FindOutputRowsForJoin(THD *thd, double left_rows,
                                    double right_rows,
                                    const JoinPredicate *edge) {
  switch (edge->expr->type) {
    case RelationalExpression::LEFT_JOIN:
      // For outer joins, every outer row produces at least one row (if none
      // are matching, we get a NULL-complemented row).
      // Note that this can cause inconsistent row counts; see bug #33550360
      // and/or JoinHypergraph::has_reordered_left_joins.
      return left_rows * std::max(right_rows * edge->selectivity, 1.0);

    case RelationalExpression::SEMIJOIN:
      return left_rows * EstimateSemijoinFanOut(thd, right_rows, *edge);

    case RelationalExpression::ANTIJOIN:
      // Antijoins are estimated as simply the opposite of semijoins (see
      // above), but wrongly estimating 0 rows (or, of course, a negative
      // amount) could be really bad, so we assume at least 10% coming out as
      // a fudge factor. It's better to estimate too high than too low here.
      return left_rows *
             std::max(1.0 - EstimateSemijoinFanOut(thd, right_rows, *edge),
                      0.1);

    case RelationalExpression::INNER_JOIN:
    case RelationalExpression::STRAIGHT_INNER_JOIN:
      return left_rows * right_rows * edge->selectivity;

    case RelationalExpression::FULL_OUTER_JOIN:   // Not implemented.
    case RelationalExpression::MULTI_INNER_JOIN:  // Should not appear here.
    case RelationalExpression::TABLE:             // Should not appear here.
      assert(false);
      return 0;

    default:
      assert(false);
      return 0;
  }
}
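
// Worked examples (illustrative numbers): for an INNER_JOIN edge with
// left_rows = 1000, right_rows = 100 and selectivity = 0.01, the estimate is
// 1000 * 100 * 0.01 = 1000 rows. For a LEFT_JOIN edge with the same numbers,
// right_rows * selectivity = 1.0, so the result is also 1000 rows; with
// selectivity = 0.001 the std::max(..., 1.0) clamp keeps the estimate at
// left_rows = 1000 rather than dropping to 100.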

/**
  Determines whether a given key on a table is both clustered and primary.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].

  @return True if the key is clustered and primary, false otherwise.
*/
inline bool IsClusteredPrimaryKey(const TABLE *table, unsigned key_idx) {
  if (table->s->is_missing_primary_key()) return false;
  return key_idx == table->s->primary_key &&
         table->file->primary_key_is_clustered();
}

/// The minimum number of bytes to return for row length estimates. This is
/// mostly to guard against returning estimates of zero, which may or may not
/// actually happen in practice.
constexpr unsigned kMinEstimatedBytesPerRow = 8;

/// The maximum number of bytes to return for row length estimates. The
/// current value of this constant is set to cover a wide range of row sizes,
/// and should capture the most common row lengths in bytes. We place an
/// upper limit on the estimates since we have performed our calibration
/// within this range, and we would like to prevent cost estimates from
/// running away in case the underlying statistics are off in some instances.
/// In such cases we prefer capping the resulting estimate. As a reasonable
/// upper limit we use half the default InnoDB page size of 2^14 = 16384
/// bytes, i.e. 8192 bytes.
constexpr unsigned kMaxEstimatedBytesPerRow = 8 * 1024;

/**
  This struct represents the number of bytes we expect to read for a table
  row. Note that the split between b-tree and overflow pages is specific to
  InnoDB and may not be a good fit for other storage engines. Ideally we
  should calculate row size and IO-cost in the handler, in a way that is
  specific to each particular storage engine.
  When using the DYNAMIC row format, an InnoDB B-tree record cannot be bigger
  than about half a page. (The default page size is 16KB). Above that, the
  longest fields are stored in separate overflow pages.
*/
struct BytesPerTableRow {
  /**
    The number of bytes read from the B-tree record. This also includes those
    fields that were not in the projection.
  */
  int64_t record_bytes{0};

  /**
    The number of bytes read from overflow pages. This is the combined size
    of those long variable-sized fields that are in the projection but stored
    in overflow pages.
  */
  int64_t overflow_bytes{0};

  /**
    The probability of reading from an overflow page (i.e. an estimate
    of the probability that at least one of the columns in the
    projection overflows).
  */
  double overflow_probability{0.0};
};

/**
  Estimate the average number of bytes that we need to read from the
  storage engine when reading a row from 'table'. This is the size of
  the (b-tree) record and the overflow pages of any field that is
  part of the projection. This is similar to what EstimateBytesPerRowTable()
  does, but this function is intended to do a more accurate but also more
  expensive calculation for tables with potentially large rows (i.e. tables
  with BLOBs or large VARCHAR fields).

  Note that this function is tailored for InnoDB (and specifically for the
  DYNAMIC row format of InnoDB). At some point we may want to move
  this logic to the handler, so that we can customize it for other
  engines as well.

  @param table The target table
  @param max_row_size The row size if all variable-sized fields are full.
  @returns The estimated row size.
*/
BytesPerTableRow EstimateBytesPerRowWideTable(const TABLE *table,
                                              int64_t max_row_size);

/// We clamp the block size to lie in the interval between the min and max
/// allowed block size for InnoDB (2^12 to 2^16). Ideally we would have a
/// guarantee that stats.block_size has a reasonable value (across all storage
/// engines, types of tables, states of statistics), but in the absence of
/// such a guarantee we clamp to the values for the InnoDB storage engine,
/// since the cost model has been calibrated for these values.
inline unsigned ClampedBlockSize(const TABLE *table) {
  constexpr unsigned kMinEstimatedBlockSize = 4096;
  constexpr unsigned kMaxEstimatedBlockSize = 65536;
  return std::clamp(table->file->stats.block_size, kMinEstimatedBlockSize,
                    kMaxEstimatedBlockSize);
}

/**
  Estimates the number of bytes that MySQL must process when reading a row
  from a table, independently of the size of the read set.

  @param table The table to produce an estimate for.

  @returns The estimated row size.

  @note There are two different relevant concepts of bytes per row:

  1. The bytes per row on the storage engine side.
  2. The bytes per row on the server side.

  On the storage engine side, for InnoDB at least, we compute
  stats.mean_rec_length as the size of the data file divided by the number of
  rows in the table.

  On the server side we are interested in the size of the MySQL
  representation in bytes. This could be a more accurate statistic when
  determining the CPU cost of processing a row (i.e., it does not matter very
  much if InnoDB pages are only half-full). As an approximation to the size
  of a row in bytes on the server side we use the length of the record buffer
  for rows that should not exceed the maximal size of an InnoDB B-tree
  record. (Otherwise, we call EstimateBytesPerRowWideTable() to make the
  estimate.)

  Note that when the table fits in a single page stats.mean_rec_length
  will tend to overestimate the record length since it is computed as
  stats.data_file_length / stats.records and the data file length is
  at least a full page which defaults to 16384 bytes (for InnoDB at
  least). We may then get a better estimate from table->s->rec_buff_length.
*/
inline BytesPerTableRow EstimateBytesPerRowTable(const TABLE *table) {
  int64_t max_bytes{0};

  for (uint i = 0; i < table->s->fields; i++) {
    // field_length is the maximal size (in bytes) of this field.
    max_bytes += table->field[i]->field_length;
  }

  if (max_bytes < ClampedBlockSize(table) / 2) {
    // The row should fit in a b-tree record.
    return {.record_bytes =
                std::clamp(table->s->rec_buff_length, kMinEstimatedBytesPerRow,
                           kMaxEstimatedBytesPerRow),
            .overflow_bytes = 0,
            .overflow_probability = 0.0};
  }

  // Make a more sophisticated estimate for tables that may have very
  // large rows.
  return EstimateBytesPerRowWideTable(table, max_bytes);
}
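
// An illustrative trace of the branch above: with a (clamped) block size of
// 16384 bytes, a table whose fields have max_bytes = 300 in total takes the
// first branch (300 < 16384 / 2) and returns the clamped record buffer length
// with no overflow component, while a table with max_bytes = 20000 falls
// through to EstimateBytesPerRowWideTable().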

/**
  Estimates the number of bytes that MySQL must process when reading a row
  from a secondary index, independently of the size of the read set.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].

  @return The estimated number of bytes per row in the index.
*/
inline unsigned EstimateBytesPerRowIndex(const TABLE *table, unsigned key_idx) {
  // key_length should correspond to the length of the field(s) of the key in
  // bytes and ref_length is the length of the field(s) of the primary key in
  // bytes. Secondary indexes (in InnoDB) contain a copy of the value of the
  // primary key associated with a given row, in order to make it possible to
  // retrieve the corresponding row from the primary index in case we use a
  // non-covering index operation.
  unsigned estimate =
      table->key_info[key_idx].key_length + table->file->ref_length;
  return std::clamp(estimate, kMinEstimatedBytesPerRow,
                    kMaxEstimatedBytesPerRow);
}

/**
  Estimates the height of a B-tree index.

  We estimate the height of the index to be the smallest positive integer h
  such that table_records <= (1 + records_per_page)^h.

  This function supports both clustered primary indexes and secondary
  indexes. Secondary indexes will tend to have more records per page compared
  to primary clustered indexes, and as a consequence they will tend to be
  shorter.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].

  @return The estimated height of the index.
*/
inline int IndexHeight(const TABLE *table, unsigned key_idx) {
  unsigned block_size = ClampedBlockSize(table);
  unsigned bytes_per_row = IsClusteredPrimaryKey(table, key_idx)
                               ? EstimateBytesPerRowTable(table).record_bytes
                               : EstimateBytesPerRowIndex(table, key_idx);

  // Ideally we should always have that block_size >= bytes_per_row, but since
  // the storage engine and MySQL row formats differ, this is not always the
  // case. Therefore we manually ensure that records_per_page >= 1.0.
  double records_per_page =
      std::max(1.0, static_cast<double>(block_size) / bytes_per_row);

  // Computing the height using a while loop instead of using std::log turns
  // out to be about 5 times faster in microbenchmarks when the measurement is
  // made using a somewhat realistic and representative set of values for the
  // number of records per page and the number of records in the table. In the
  // worst case, if the B-tree contains only a single record per page, the
  // table would have to contain 2^30 pages (corresponding to more than 16
  // terabytes of data) for this loop to run 30 times. A B-tree with 1 billion
  // records and 100 records per page only uses 4 iterations of the loop (the
  // height is 5).
  int height = 1;
  double r = 1.0 + records_per_page;
  while (r < table->file->stats.records) {
    r = r * (1.0 + records_per_page);
    height += 1;
  }
  return height;
}
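
// A check of the example in the comment above: with records_per_page = 100
// and 10^9 records, (1 + 100)^4 ≈ 1.04 * 10^8 < 10^9 <= (1 + 100)^5
// ≈ 1.05 * 10^10, so the smallest h with
// table_records <= (1 + records_per_page)^h is indeed 5.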

/// Calculate the IO-cost of reading 'num_rows' rows from 'table'.
double TableAccessIOCost(const TABLE *table, double num_rows,
                         BytesPerTableRow row_size);

/// Calculate the IO-cost of doing a lookup on index 'key_idx' on 'table'
/// and then reading 'num_rows' rows.
double CoveringIndexAccessIOCost(const TABLE *table, unsigned key_idx,
                                 double num_rows);

/**
  Computes the expected cost of reading a number of rows. The cost model
  takes into account the number of fields being read from the row and the
  width of the row in bytes. Both RowReadCostTable() and RowReadCostIndex()
  call this function and thus use the same cost model.

  @param num_rows The (expected) number of rows to read.
  @param fields_read_per_row The number of fields to read per row.
  @param bytes_per_row The total length of the row to be processed (including
  fields that are not read) in bytes.

  @returns The expected cost of reading num_rows.

  @note It is important that this function be robust to fractional row
  estimates. For example, if we index nested loop join two primary key
  columns and the inner table is smaller than the outer table, we should see
  that num_rows for the inner index lookup is less than one. In this case it
  is important that we return the expected cost of the operation. For
  example, if we only expect to read 0.1 rows, the cost should be 0.1 of the
  cost of reading one row (we are working with a linear cost model, so we
  only have to take the expected number of rows into account, and not the
  complete distribution).
*/
inline double RowReadCost(double num_rows, double fields_read_per_row,
                          double bytes_per_row) {
  return (kReadOneRowCost + kReadOneFieldCost * fields_read_per_row +
          kReadOneByteCost * bytes_per_row) *
         num_rows;
}
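
// An illustrative application (symbolic, since the constants are defined in
// cost_constants.h): reading 100 rows with 5 fields in the read set and
// 64-byte rows costs
// 100 * (kReadOneRowCost + 5 * kReadOneFieldCost + 64 * kReadOneByteCost).
// A fractional estimate such as num_rows = 0.1 simply scales the same
// per-row cost by 0.1, as described in the note above.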

/**
  Computes the cost of reading a number of rows from a table.
  @see RowReadCost() for further details.

  @param table The table to read from.
  @param num_rows The (expected) number of rows to read.

  @returns The cost of reading num_rows.
*/
inline double RowReadCostTable(const TABLE *table, double num_rows) {
  double fields_read_per_row = bitmap_bits_set(table->read_set);
  BytesPerTableRow bytes_per_row = EstimateBytesPerRowTable(table);
  return RowReadCost(
             num_rows, fields_read_per_row,
             bytes_per_row.record_bytes + bytes_per_row.overflow_bytes) +
         TableAccessIOCost(table, num_rows, bytes_per_row);
}

/**
  Computes the cost of reading a number of rows from an index.
  @see RowReadCost() for further details.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].
  @param num_rows The (expected) number of rows to read.

  @returns The cost of reading num_rows.
*/
inline double RowReadCostIndex(const TABLE *table, unsigned key_idx,
                               double num_rows) {
  if (IsClusteredPrimaryKey(table, key_idx)) {
    return RowReadCostTable(table, num_rows);
  }
  // Assume we read two fields from the index record if it is not covering.
  // The exact assumption here is not very important, as the cost should be
  // dominated by the additional lookup into the primary index.
  constexpr double kDefaultFieldsReadFromCoveringIndex = 2;
  double fields_read_per_row = table->covering_keys.is_set(key_idx)
                                   ? bitmap_bits_set(table->read_set)
                                   : kDefaultFieldsReadFromCoveringIndex;

  double bytes_per_row = EstimateBytesPerRowIndex(table, key_idx);
  return RowReadCost(num_rows, fields_read_per_row, bytes_per_row) +
         CoveringIndexAccessIOCost(table, key_idx, num_rows);
}

/**
  Estimates the cost of a full table scan. Primarily used to assign a cost to
  the TABLE_SCAN access path.

  @param table The table to estimate cost for.

  @returns The cost of scanning the table.
*/
inline double EstimateTableScanCost(const TABLE *table) {
  return RowReadCostTable(table, table->file->stats.records);
}

/**
  Estimates the cost of an index lookup.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].

  @return The estimated cost of an index lookup.

  @note The model "cost ~ index_height" works well when the Adaptive Hash
  Index (AHI) is disabled. The AHI essentially works as a dynamic cache for
  the most frequently accessed index pages that sits on top of the B-tree.
  With AHI enabled the cost of random lookups does not appear to be
  predictable using standard explanatory variables such as index height or
  the logarithm of the number of rows in the index. The performance of AHI
  will also be dependent on the access pattern, so it is fundamentally
  difficult to create an accurate model. However, our calibration experiments
  reveal two things that hold true both with and without AHI enabled:

  1. Index lookups usually take 1-3 microseconds. The height of a B-tree
  grows very slowly (proportional to log(N)/log(R) for tables with N rows and
  R records per page), making it near-constant in the common case where
  tables have many records per page, even without AHI.

  2. Random lookups in large indexes tend to be slower. A random access
  pattern will cause more cache misses, both for regular hardware caching and
  AHI. In addition, for a larger B-tree only the top levels fit in cache, and
  we will invariably see a few cache misses with random accesses.

  We model the cost of index lookups by interpolating between a model with
  constant cost and a model that depends entirely on the height of the index.
  The constants are based on calibration experiments with and without AHI.

  Another factor that is important when calibrating the cost of index lookups
  is whether we are interested in the average cost when performing many
  lookups, such as when performing an index nested loop join or scanning
  along a secondary non-covering index and looking up into the primary index,
  or the cost of a single lookup such as a point select that uses an index.
  From our measurements we see that the average running time of an index
  lookup can easily be a factor ~5x faster when performing thousands of
  successive lookups compared to just one. This is likely due to hardware
  caching effects. Since getting the cost right in the case when we perform
  many lookups is typically more important, we have opted to calibrate costs
  based on operations that perform many lookups.

  For adding IO costs to this model (future work) we probably want to assume
  that we only fetch a single page when performing an index lookup, as
  everything but leaf pages will tend to be cached, at least when performing
  many index lookups in a query plan, which is exactly the case where it is
  important to get costs right.
*/
inline double IndexLookupCost(const TABLE *table, unsigned key_idx) {
  assert(key_idx < table->s->keys);
  double cost_with_ahi = kIndexLookupFixedCost;
  double cost_without_ahi = kIndexLookupPageCost * IndexHeight(table, key_idx);
  return 0.5 * (cost_with_ahi + cost_without_ahi);
}
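
// Example: for an index of height 3 the returned cost is
// 0.5 * (kIndexLookupFixedCost + 3 * kIndexLookupPageCost), i.e. the midpoint
// between the constant-cost (AHI) model and the purely height-based model.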

/// Type of range scan operation.
enum class RangeScanType : char {
  /// Using the MRR optimization.
  kMultiRange,

  /// Plain range scan, without the MRR optimization.
  kSingleRange,
};

/**
  Estimates the cost of an index range scan.

  The cost model for index range scans accounts for the index lookup cost as
  well as the cost of reading rows. Both index scans and ref accesses can be
  viewed as special cases of index range scans, so the cost functions for
  those operations call this function under the hood.

  @note In the future this function should be extended to account for IO
  cost.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].
  @param scan_type Whether this is an MRR or a regular index range scan.
  @param num_ranges The number of ranges.
  @param num_output_rows The estimated expected number of output rows.

  @returns The estimated cost of the index range scan operation.
*/
double EstimateIndexRangeScanCost(const TABLE *table, unsigned key_idx,
                                  RangeScanType scan_type, double num_ranges,
                                  double num_output_rows);

/**
  Estimates the cost of an index scan. An index scan scans all rows in the
  table along the supplied index.

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].

  @returns The estimated cost of the index scan.
*/
inline double EstimateIndexScanCost(const TABLE *table, unsigned key_idx) {
  return EstimateIndexRangeScanCost(table, key_idx,
                                    RangeScanType::kSingleRange, 1.0,
                                    table->file->stats.records);
}

/**
  Estimates the cost of an index lookup (ref access).

  @param table The table to which the index belongs.
  @param key_idx The position of the key in table->key_info[].
  @param num_output_rows The estimated number of output rows.

  @returns The estimated cost of the ref access.
*/
inline double EstimateRefAccessCost(const TABLE *table, unsigned key_idx,
                                    double num_output_rows) {
  // We want the optimizer to prefer ref accesses to range scans when they
  // both have the same cost. This is particularly important for the EQ_REF
  // access path (index lookups with at most one matching row), since the
  // EQ_REF iterator uses caching to improve performance.
  constexpr double kRefAccessCostDiscount = 0.05;
  return (1.0 - kRefAccessCostDiscount) *
         EstimateIndexRangeScanCost(table, key_idx, RangeScanType::kMultiRange,
                                    1.0, num_output_rows);
}

#endif  // SQL_JOIN_OPTIMIZER_COST_MODEL_H_