MySQL 8.1.0
Source Code Documentation
bka_iterator.h
Go to the documentation of this file.
1#ifndef SQL_ITERATORS_BKA_ITERATOR_H_
2#define SQL_ITERATORS_BKA_ITERATOR_H_
3
4/* Copyright (c) 2019, 2023, Oracle and/or its affiliates.
5
6 This program is free software; you can redistribute it and/or modify
7 it under the terms of the GNU General Public License, version 2.0,
8 as published by the Free Software Foundation.
9
10 This program is also distributed with certain software (including
11 but not limited to OpenSSL) that is licensed under separate terms,
12 as designated in a particular file or component or in included license
13 documentation. The authors of MySQL hereby grant you an additional
14 permission to link the program and your derivative works with the
15 separately licensed software that they have included with MySQL.
16
17 This program is distributed in the hope that it will be useful,
18 but WITHOUT ANY WARRANTY; without even the implied warranty of
19 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20 GNU General Public License, version 2.0, for more details.
21
22 You should have received a copy of the GNU General Public License
23 along with this program; if not, write to the Free Software
24 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
25
26/**
27 Batch key access (BKA) is a join strategy that uses multi-range read (MRR)
28 to get better read ordering on the table on the inner side. It reads
29 a block of rows from the outer side, picks up the join keys (refs) from
30 each row, and sends them all off in one big read request. The handler can
31 then order these and read them in whatever order it would prefer. This is
32 especially attractive when used with rotating media; the reads can then be
33 ordered such that it does not require constant seeking (disk-sweep MRR,
34 or DS-MRR).
35
36 BKA is implemented with two iterators working in concert. The BKAIterator
37 reads rows from the outer side into a buffer. When the buffer is full or we
38 are out of rows, it then sets up the key ranges and hand it over to the
39 MultiRangeRowIterator, which does the actual request, and reads rows from it
40 until there are none left. For each inner row returned, MultiRangeRowIterator
41 loads the appropriate outer row(s) from the buffer, doing the actual join.
42
43 The reason for this split is twofold. First, it allows us to accurately time
44 (for EXPLAIN ANALYZE) the actual table read. Second, and more importantly,
45 we can have other iterators between the BKAIterator and MultiRangeRowIterator,
46 in particular FilterIterator.
47 */
48
49#include <assert.h>
50#include <stddef.h>
51#include <sys/types.h>
52#include <iterator>
53#include <memory>
54
55#include "my_alloc.h"
56
57#include "my_inttypes.h"
58#include "my_table_map.h"
59#include "sql/handler.h"
62#include "sql/join_type.h"
63#include "sql/mem_root_array.h"
64#include "sql/pack_rows.h"
65#include "sql_string.h"
66#include "template_utils.h"
67
68class Item;
69class JOIN;
71class THD;
72struct Index_lookup;
73struct KEY_MULTI_RANGE;
74struct TABLE;
75
76/**
77 The BKA join iterator, with an arbitrary iterator tree on the outer side
78 and a MultiRangeRowIterator on the inner side (possibly with a filter or
79 similar in-between). See file comment for more details.
80 */
81class BKAIterator final : public RowIterator {
82 public:
83 /**
84 @param thd Thread handle.
85 @param outer_input The iterator to read the outer rows from.
86 @param outer_input_tables Each outer table involved.
87 Used to know which fields we are to read into our buffer.
88 @param inner_input The iterator to read the inner rows from.
89 Must end up in a MultiRangeRowIterator.
90 @param max_memory_available Number of bytes available for row buffers,
91 both outer rows and MRR buffers. Note that allocation is incremental,
92 so we can allocate less than this.
93 @param mrr_bytes_needed_for_single_inner_row Number of bytes MRR needs
94 space for in its buffer for holding a single row from the inner table.
95 @param expected_inner_rows_per_outer_row Number of inner rows we
96 statistically expect for each outer row. Used for dividing the buffer
97 space between inner rows and MRR row buffer (if we expect many inner
98 rows, we can't load as many outer rows).
99 @param store_rowids Whether we need to make sure all tables below us have
100 row IDs available, after Read() has been called. Used only if
101 we are below a weedout operation.
102 @param tables_to_get_rowid_for A map of which tables BKAIterator needs
103 to call position() for itself. tables that are in outer_input_tables
104 but not in this map, are expected to be handled by some other iterator.
105 tables that are in this map but not in outer_input_tables will be
106 ignored.
107 @param mrr_iterator Pointer to the MRR iterator at the bottom of
108 inner_input. Used to send row ranges and buffers.
109 @param join_type What kind of join we are executing.
110 */
112 const Prealloced_array<TABLE *, 4> &outer_input_tables,
114 size_t max_memory_available,
115 size_t mrr_bytes_needed_for_single_inner_row,
116 float expected_inner_rows_per_outer_row, bool store_rowids,
117 table_map tables_to_get_rowid_for,
119
120 bool Init() override;
121
122 int Read() override;
123
124 void SetNullRowFlag(bool is_null_row) override {
125 m_outer_input->SetNullRowFlag(is_null_row);
126 m_inner_input->SetNullRowFlag(is_null_row);
127 }
128
129 void UnlockRow() override {
130 // Since we don't know which condition that caused the row to be rejected,
131 // we can't know whether we could also unlock the outer row
132 // (it may still be used as parts of other joined rows).
134 m_inner_input->UnlockRow();
135 }
136 }
137
138 void EndPSIBatchModeIfStarted() override {
139 m_outer_input->EndPSIBatchModeIfStarted();
140 m_inner_input->EndPSIBatchModeIfStarted();
141 }
142
143 private:
144 /// Clear out the MEM_ROOT and prepare for reading rows anew.
145 void BeginNewBatch();
146
147 /// If there are more outer rows, begin the next batch. If not,
148 /// move to the EOF state.
149 void BatchFinished();
150
151 /// Find the next unmatched row, and load it for output as a NULL-complemented
152 /// row. (Assumes the NULL row flag has already been set on the inner table
153 /// iterator.) Returns 0 if a row was found, -1 if no row was found. (Errors
154 /// cannot happen.)
156
157 /// Read a batch of outer rows (BeginNewBatch() must have been called
158 /// earlier). Returns -1 for no outer rows found (sets state to END_OF_ROWS),
159 /// 0 for OK (sets state to RETURNING_JOINED_ROWS) or 1 for error.
160 int ReadOuterRows();
161
162 enum class State {
163 /**
164 We are about to start reading outer rows into our buffer.
165 A single Read() call will fill it up, so there is no
166 in-between “currently reading” state.
167 */
169
170 /**
171 We are returning rows from the MultiRangeRowIterator.
172 (For antijoins, we are looking up the rows, but don't actually
173 return them.)
174 */
176
177 /**
178 We are an outer join or antijoin, and we're returning NULL-complemented
179 rows for those outer rows that never had a matching inner row. Note that
180 this is done in the BKAIterator and not the MRR iterator for two reasons:
181 First, it gives more sensible EXPLAIN ANALYZE numbers. Second, the
182 NULL-complemented rows could be filtered inadvertently by a FilterIterator
183 before they reach the BKAIterator.
184 */
186
187 /**
188 Both the outer and inner side are out of rows.
189 */
191 };
192
194
197
198 /// The MEM_ROOT we are storing the outer rows on, and also allocating MRR
199 /// buffer from. In total, this should not go significantly over
200 /// m_max_memory_available bytes.
202
203 /// Buffered outer rows.
205
206 /// Tables and columns needed for each outer row. Rows/columns that are not
207 /// needed are filtered out in the constructor; the rest are read and stored
208 /// in m_rows.
210
211 /// Used for serializing the row we read from the outer table(s), before it
212 /// stored into the MEM_ROOT and put into m_rows. Should there not be room in
213 /// m_rows for the row, it will stay in this variable until we start reading
214 /// the next batch of outer rows.
215 ///
216 /// If there are no BLOB/TEXT column in the join, we calculate an upper bound
217 /// of the row size that is used to preallocate this buffer. In the case of
218 /// BLOB/TEXT columns, we cannot calculate a reasonable upper bound, and the
219 /// row size is calculated per row. The allocated memory is kept for the
220 /// duration of the iterator, so that we (most likely) avoid reallocations.
222
223 /// Whether we have a row in m_outer_row_buffer from the previous batch of
224 /// rows that we haven't stored in m_rows yet.
226
227 /// For each outer row, how many bytes we need in the MRR buffer (ie., the
228 /// number of bytes we expect to use on rows from the inner table).
229 /// This is the expected number of inner rows per key, multiplied by the
230 /// (fixed) size of each inner row. We use this information to stop scanning
231 /// before we've used up the entire RAM allowance on outer rows, so that
232 /// we have space remaining for the inner rows (in the MRR buffer), too.
234
235 /// Estimated number of bytes used on m_mem_root so far.
236 size_t m_bytes_used = 0;
237
238 /// Whether we've seen EOF from the outer iterator.
240
241 /// See max_memory_available in the constructor.
243
244 /// See max_memory_available in the constructor.
246
247 /// See mrr_iterator in the constructor.
249
250 /// The join type of the BKA join.
252
253 /// If we are synthesizing NULL-complemented rows (for an outer join or
254 /// antijoin), points to the next row within "m_rows" that we haven't
255 /// considered yet.
257};
258
259/**
260 The iterator actually doing the reads from the inner table during BKA.
261 See file comment.
262 */
264 public:
265 /**
266 @param thd Thread handle.
267 @param table The inner table to scan.
268 @param ref The index condition we are looking up on.
269 @param mrr_flags Flags passed on to MRR.
270 @param join_type
271 What kind of BKA join this MRR iterator is part of.
272 @param outer_input_tables
273 Which tables are on the left side of the BKA join (the MRR iterator
274 is always alone on the right side). This is needed so that it can
275 unpack the rows into the right tables, with the right format.
276 @param store_rowids Whether we need to keep row IDs.
277 @param tables_to_get_rowid_for
278 Tables we need to call table->file->position() for; if a table
279 is present in outer_input_tables but not this, some other iterator
280 will make sure that table has the correct row ID already present
281 after Read().
282 */
284 int mrr_flags, JoinType join_type,
285 const Prealloced_array<TABLE *, 4> &outer_input_tables,
286 bool store_rowids, table_map tables_to_get_rowid_for);
287
288 /**
289 Specify which outer rows to read inner rows for.
290 Must be called before Init(), and be valid until the last Read().
291 */
294 m_begin = begin;
295 m_end = end;
296 }
297
298 /**
299 Specify an unused chunk of memory MRR can use for the returned inner rows.
300 Must be called before Init(), and must be at least big enough to hold
301 one inner row.
302 */
303 void set_mrr_buffer(uchar *ptr, size_t size) {
304 m_mrr_buffer.buffer = ptr;
305 m_mrr_buffer.buffer_end = ptr + size;
306 }
307
308 /**
309 Specify an unused chunk of memory that we can use to mark which inner rows
310 have been read (by the parent BKA iterator) or not. This is used for outer
311 joins to know which rows need NULL-complemented versions, and for semijoins
312 and antijoins to avoid matching the same inner row more than once.
313
314 Must be called before Init() for semijoins, outer joins and antijoins, and
315 never called otherwise. There must be room at least for one bit per row
316 given in set_rows().
317 */
319
320 /**
321 Mark that the BKA iterator has seen the last row we returned from Read().
322 (It could have been discarded by a FilterIterator before it reached them.)
323 Will be a no-op for inner joins; see set_match_flag_buffer()..
324 */
326 if (m_match_flag_buffer != nullptr) {
327 size_t row_number = std::distance(m_begin, m_last_row_returned);
328 m_match_flag_buffer[row_number / 8] |= 1 << (row_number % 8);
329 }
330 }
331
332 /**
333 Check whether the given row has been marked as read
334 (using MarkLastRowAsRead()) or not. Used internally when doing semijoins,
335 and also by the BKAIterator when synthesizing NULL-complemented rows for
336 outer joins or antijoins.
337 */
339 assert(m_match_flag_buffer != nullptr);
340 size_t row_number = std::distance(m_begin, row);
341 return m_match_flag_buffer[row_number / 8] & (1 << (row_number % 8));
342 }
343
344 /**
345 Do the actual multi-range read with the rows given by set_rows() and using
346 the temporary buffer given in set_mrr_buffer().
347 */
348 bool Init() override;
349
350 /**
351 Read another inner row (if any) and load the appropriate outer row(s)
352 into the associated table buffers.
353 */
354 int Read() override;
355
356 private:
357 // Thunks from function pointers to the actual callbacks.
358 static range_seq_t MrrInitCallbackThunk(void *init_params, uint n_ranges,
359 uint flags) {
360 return (pointer_cast<MultiRangeRowIterator *>(init_params))
361 ->MrrInitCallback(n_ranges, flags);
362 }
363 static uint MrrNextCallbackThunk(void *init_params, KEY_MULTI_RANGE *range) {
364 return (pointer_cast<MultiRangeRowIterator *>(init_params))
365 ->MrrNextCallback(range);
366 }
367 static bool MrrSkipRecordCallbackThunk(range_seq_t seq, char *range_info,
368 uchar *) {
369 return (reinterpret_cast<MultiRangeRowIterator *>(seq))
370 ->MrrSkipRecord(range_info);
371 }
372
373 // Callbacks we get from the handler during the actual read.
374 range_seq_t MrrInitCallback(uint n_ranges, uint flags);
376 bool MrrSkipIndexTuple(char *range_info);
377 bool MrrSkipRecord(char *range_info);
378
379 /// Handler for the table we are reading from.
381
382 /// The index condition.
384
385 /// Flags passed on to MRR.
386 const int m_mrr_flags;
387
388 /// Current outer rows to read inner rows for. Set by set_rows().
391
392 /// Which row we are at in the [m_begin, m_end) range.
393 /// Used during the MRR callbacks.
395
396 /// What row we last returned from Read() (used for MarkLastRowAsRead()).
398
399 /// Temporary space for storing inner rows, used by MRR.
400 /// Set by set_mrr_buffer().
402
403 /// See set_match_flag_buffer().
405
406 /// Tables and columns needed for each outer row. Same as m_outer_input_tables
407 /// in the corresponding BKAIterator.
409
410 /// The join type of the BKA join we are part of. Same as m_join_type in the
411 /// corresponding BKAIterator.
413};
414
415#endif // SQL_ITERATORS_BKA_ITERATOR_H_
The BKA join iterator, with an arbitrary iterator tree on the outer side and a MultiRangeRowIterator ...
Definition: bka_iterator.h:81
const size_t m_mrr_bytes_needed_for_single_inner_row
See max_memory_available in the constructor.
Definition: bka_iterator.h:245
int MakeNullComplementedRow()
Find the next unmatched row, and load it for output as a NULL-complemented row.
Definition: bka_iterator.cc:249
BKAIterator(THD *thd, unique_ptr_destroy_only< RowIterator > outer_input, const Prealloced_array< TABLE *, 4 > &outer_input_tables, unique_ptr_destroy_only< RowIterator > inner_input, size_t max_memory_available, size_t mrr_bytes_needed_for_single_inner_row, float expected_inner_rows_per_outer_row, bool store_rowids, table_map tables_to_get_rowid_for, MultiRangeRowIterator *mrr_iterator, JoinType join_type)
Definition: bka_iterator.cc:69
bool m_has_row_from_previous_batch
Whether we have a row in m_outer_row_buffer from the previous batch of rows that we haven't stored in...
Definition: bka_iterator.h:225
MEM_ROOT m_mem_root
The MEM_ROOT we are storing the outer rows on, and also allocating MRR buffer from.
Definition: bka_iterator.h:201
int ReadOuterRows()
Read a batch of outer rows (BeginNewBatch() must have been called earlier).
Definition: bka_iterator.cc:126
void BatchFinished()
If there are more outer rows, begin the next batch.
Definition: bka_iterator.cc:238
const unique_ptr_destroy_only< RowIterator > m_inner_input
Definition: bka_iterator.h:196
void EndPSIBatchModeIfStarted() override
Ends performance schema batch mode, if started.
Definition: bka_iterator.h:138
const size_t m_max_memory_available
See max_memory_available in the constructor.
Definition: bka_iterator.h:242
MultiRangeRowIterator *const m_mrr_iterator
See mrr_iterator in the constructor.
Definition: bka_iterator.h:248
State
Definition: bka_iterator.h:162
@ RETURNING_JOINED_ROWS
We are returning rows from the MultiRangeRowIterator.
@ NEED_OUTER_ROWS
We are about to start reading outer rows into our buffer.
@ RETURNING_NULL_COMPLEMENTED_ROWS
We are an outer join or antijoin, and we're returning NULL-complemented rows for those outer rows tha...
@ END_OF_ROWS
Both the outer and inner side are out of rows.
String m_outer_row_buffer
Used for serializing the row we read from the outer table(s), before it stored into the MEM_ROOT and ...
Definition: bka_iterator.h:221
size_t m_mrr_bytes_needed_per_row
For each outer row, how many bytes we need in the MRR buffer (ie., the number of bytes we expect to u...
Definition: bka_iterator.h:233
pack_rows::TableCollection m_outer_input_tables
Tables and columns needed for each outer row.
Definition: bka_iterator.h:209
JoinType m_join_type
The join type of the BKA join.
Definition: bka_iterator.h:251
size_t m_bytes_used
Estimated number of bytes used on m_mem_root so far.
Definition: bka_iterator.h:236
hash_join_buffer::BufferRow * m_current_pos
If we are synthesizing NULL-complemented rows (for an outer join or antijoin), points to the next row...
Definition: bka_iterator.h:256
int Read() override
Read a single row.
Definition: bka_iterator.cc:269
bool m_end_of_outer_rows
Whether we've seen EOF from the outer iterator.
Definition: bka_iterator.h:239
Mem_root_array< hash_join_buffer::BufferRow > m_rows
Buffered outer rows.
Definition: bka_iterator.h:204
State m_state
Definition: bka_iterator.h:193
void BeginNewBatch()
Clear out the MEM_ROOT and prepare for reading rows anew.
Definition: bka_iterator.cc:119
void SetNullRowFlag(bool is_null_row) override
Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag...
Definition: bka_iterator.h:124
void UnlockRow() override
Definition: bka_iterator.h:129
bool Init() override
Initialize or reinitialize the iterator.
Definition: bka_iterator.cc:100
const unique_ptr_destroy_only< RowIterator > m_outer_input
Definition: bka_iterator.h:195
Base class that is used to represent any kind of expression in a relational query.
Definition: item.h:853
Definition: sql_optimizer.h:132
A typesafe replacement for DYNAMIC_ARRAY.
Definition: mem_root_array.h:425
The iterator actually doing the reads from the inner table during BKA.
Definition: bka_iterator.h:263
MultiRangeRowIterator(THD *thd, TABLE *table, Index_lookup *ref, int mrr_flags, JoinType join_type, const Prealloced_array< TABLE *, 4 > &outer_input_tables, bool store_rowids, table_map tables_to_get_rowid_for)
Definition: bka_iterator.cc:321
void MarkLastRowAsRead()
Mark that the BKA iterator has seen the last row we returned from Read().
Definition: bka_iterator.h:325
void set_match_flag_buffer(uchar *ptr)
Specify an unused chunk of memory that we can use to mark which inner rows have been read (by the par...
Definition: bka_iterator.h:318
const int m_mrr_flags
Flags passed on to MRR.
Definition: bka_iterator.h:386
int Read() override
Read another inner row (if any) and load the appropriate outer row(s) into the associated table buffe...
Definition: bka_iterator.cc:430
handler *const m_file
Handler for the table we are reading from.
Definition: bka_iterator.h:380
static bool MrrSkipRecordCallbackThunk(range_seq_t seq, char *range_info, uchar *)
Definition: bka_iterator.h:367
const JoinType m_join_type
The join type of the BKA join we are part of.
Definition: bka_iterator.h:412
pack_rows::TableCollection m_outer_input_tables
Tables and columns needed for each outer row.
Definition: bka_iterator.h:408
bool Init() override
Do the actual multi-range read with the rows given by set_rows() and using the temporary buffer given...
Definition: bka_iterator.cc:334
Index_lookup *const m_ref
The index condition.
Definition: bka_iterator.h:383
static uint MrrNextCallbackThunk(void *init_params, KEY_MULTI_RANGE *range)
Definition: bka_iterator.h:363
bool MrrSkipRecord(char *range_info)
Definition: bka_iterator.cc:425
HANDLER_BUFFER m_mrr_buffer
Temporary space for storing inner rows, used by MRR.
Definition: bka_iterator.h:401
bool MrrSkipIndexTuple(char *range_info)
uchar * m_match_flag_buffer
See set_match_flag_buffer().
Definition: bka_iterator.h:404
uint MrrNextCallback(KEY_MULTI_RANGE *range)
Definition: bka_iterator.cc:380
static range_seq_t MrrInitCallbackThunk(void *init_params, uint n_ranges, uint flags)
Definition: bka_iterator.h:358
range_seq_t MrrInitCallback(uint n_ranges, uint flags)
Definition: bka_iterator.cc:375
const hash_join_buffer::BufferRow * m_current_pos
Which row we are at in the [m_begin, m_end) range.
Definition: bka_iterator.h:394
const hash_join_buffer::BufferRow * m_begin
Current outer rows to read inner rows for. Set by set_rows().
Definition: bka_iterator.h:389
void set_rows(const hash_join_buffer::BufferRow *begin, const hash_join_buffer::BufferRow *end)
Specify which outer rows to read inner rows for.
Definition: bka_iterator.h:292
bool RowHasBeenRead(const hash_join_buffer::BufferRow *row) const
Check whether the given row has been marked as read (using MarkLastRowAsRead()) or not.
Definition: bka_iterator.h:338
void set_mrr_buffer(uchar *ptr, size_t size)
Specify an unused chunk of memory MRR can use for the returned inner rows.
Definition: bka_iterator.h:303
const hash_join_buffer::BufferRow * m_last_row_returned
What row we last returned from Read() (used for MarkLastRowAsRead()).
Definition: bka_iterator.h:397
const hash_join_buffer::BufferRow * m_end
Definition: bka_iterator.h:390
A typesafe replacement for DYNAMIC_ARRAY.
Definition: prealloced_array.h:70
A context for reading through a single table using a chosen access method: index read,...
Definition: row_iterator.h:81
THD * thd() const
Definition: row_iterator.h:227
Using this class is fraught with peril, and you need to be very careful when doing so.
Definition: sql_string.h:166
For each client connection we create a separate thread with THD serving as a thread/connection descri...
Definition: sql_lexer_thd.h:33
Definition: row_iterator.h:233
TABLE * table() const
Definition: row_iterator.h:245
The handler class is the interface for dynamically loadable storage engines.
Definition: handler.h:4460
A structure that contains a list of input tables for a hash join operation, BKA join operation or a s...
Definition: pack_rows.h:92
This file contains the HashJoinRowBuffer class and related functions/classes.
static int flags[50]
Definition: hp_test1.cc:39
JoinType
Definition: join_type.h:27
This file follows Google coding style, except for the name MEM_ROOT (which is kept for historical rea...
std::unique_ptr< T, Destroy_only< T > > unique_ptr_destroy_only
std::unique_ptr, but only destroying.
Definition: my_alloc.h:488
Some integer typedefs for easier portability.
unsigned char uchar
Definition: my_inttypes.h:51
uint64_t table_map
Definition: my_table_map.h:29
PT & ref(PT *tp)
Definition: tablespace_impl.cc:358
bool distance(const dd::Spatial_reference_system *srs, const Geometry *g1, const Geometry *g2, double *distance, bool *is_null) noexcept
Computes the distance between two geometries.
Definition: distance.cc:39
Key BufferRow
Definition: hash_join_buffer.h:129
Cursor end()
A past-the-end Cursor.
Definition: rules_table_service.cc:191
Generic routines for packing rows (possibly from multiple tables at the same time) into strings,...
void * range_seq_t
Definition: handler.h:3697
join_type
Definition: sql_opt_exec_shared.h:185
Our own string classes, used pervasively throughout the executor.
Definition: handler.h:3691
uchar * buffer_end
Definition: handler.h:3693
uchar * buffer
Definition: handler.h:3692
Structure used for index-based lookups.
Definition: sql_opt_exec_shared.h:66
Definition: my_base.h:1130
The MEM_ROOT is a simple arena, where allocations are carved out of larger blocks.
Definition: my_alloc.h:82
Definition: table.h:1394
Definition: gen_lex_token.cc:148