The BKA join iterator, with an arbitrary iterator tree on the outer side and a MultiRangeRowIterator on the inner side (possibly with a filter or similar in-between).

#include <bka_iterator.h>

Public Member Functions
	BKAIterator (THD *thd, unique_ptr_destroy_only< RowIterator > outer_input, const Prealloced_array< TABLE *, 4 > &outer_input_tables, unique_ptr_destroy_only< RowIterator > inner_input, size_t max_memory_available, size_t mrr_bytes_needed_for_single_inner_row, float expected_inner_rows_per_outer_row, bool store_rowids, table_map tables_to_get_rowid_for, MultiRangeRowIterator *mrr_iterator, JoinType join_type)

bool	Init () override
	Initialize or reinitialize the iterator.

int	Read () override
	Read a single row.

void	SetNullRowFlag (bool is_null_row) override
	Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row).

void	UnlockRow () override

void	EndPSIBatchModeIfStarted () override
	Ends performance schema batch mode, if started.

Public Member Functions inherited from RowIterator
	RowIterator (THD *thd)

virtual	~RowIterator ()=default

	RowIterator (const RowIterator &)=delete

	RowIterator (RowIterator &&)=default

virtual const IteratorProfiler *	GetProfiler () const
	Get profiling data for this iterator (for 'EXPLAIN ANALYZE').

virtual void	SetOverrideProfiler ([[maybe_unused]] const IteratorProfiler *profiler)

virtual void	StartPSIBatchMode ()
	Start performance schema batch mode, if supported (otherwise ignored).

virtual RowIterator *	real_iterator ()
	If this iterator is wrapping a different iterator (e.g. More...

virtual const RowIterator *	real_iterator () const

Private Types
enum class	State { NEED_OUTER_ROWS , RETURNING_JOINED_ROWS , RETURNING_NULL_COMPLEMENTED_ROWS , END_OF_ROWS }

Private Member Functions
void	BeginNewBatch ()
	Clear out the MEM_ROOT and prepare for reading rows anew.

void	BatchFinished ()
	If there are more outer rows, begin the next batch.

int	MakeNullComplementedRow ()
	Find the next unmatched row, and load it for output as a NULL-complemented row.

int	ReadOuterRows ()
	Read a batch of outer rows (BeginNewBatch() must have been called earlier).

Private Attributes
State	m_state

const unique_ptr_destroy_only< RowIterator >	m_outer_input

const unique_ptr_destroy_only< RowIterator >	m_inner_input

MEM_ROOT	m_mem_root
	The MEM_ROOT we are storing the outer rows on, and also allocating MRR buffer from.

Mem_root_array< hash_join_buffer::BufferRow >	m_rows
	Buffered outer rows.

pack_rows::TableCollection	m_outer_input_tables
	Tables and columns needed for each outer row.

String	m_outer_row_buffer
	Used for serializing the row we read from the outer table(s), before it is stored into the MEM_ROOT and put into m_rows.

bool	m_has_row_from_previous_batch = false
	Whether we have a row in m_outer_row_buffer from the previous batch of rows that we haven't stored in m_rows yet.

size_t	m_mrr_bytes_needed_per_row
	For each outer row, how many bytes we need in the MRR buffer (i.e., the number of bytes we expect to use on rows from the inner table).

size_t	m_bytes_used = 0
	Estimated number of bytes used on m_mem_root so far.

bool	m_end_of_outer_rows = false
	Whether we've seen EOF from the outer iterator.

const size_t	m_max_memory_available
	See max_memory_available in the constructor.

const size_t	m_mrr_bytes_needed_for_single_inner_row
	See mrr_bytes_needed_for_single_inner_row in the constructor.

MultiRangeRowIterator *const	m_mrr_iterator
	See mrr_iterator in the constructor.

JoinType	m_join_type
	The join type of the BKA join.

hash_join_buffer::BufferRow *	m_current_pos
	If we are synthesizing NULL-complemented rows (for an outer join or antijoin), points to the next row within "m_rows" that we haven't considered yet.

Additional Inherited Members
Protected Member Functions inherited from RowIterator
THD *	thd () const

Detailed Description

The BKA join iterator, with an arbitrary iterator tree on the outer side and a MultiRangeRowIterator on the inner side (possibly with a filter or similar in-between).

See file comment for more details.

Member Enumeration Documentation

◆ State

enum class BKAIterator::State

strong private

Enumerator
NEED_OUTER_ROWS	We are about to start reading outer rows into our buffer. A single Read() call will fill it up, so there is no in-between “currently reading” state.
RETURNING_JOINED_ROWS	We are returning rows from the MultiRangeRowIterator. (For antijoins, we are looking up the rows, but don't actually return them.)
RETURNING_NULL_COMPLEMENTED_ROWS	We are an outer join or antijoin, and we're returning NULL-complemented rows for those outer rows that never had a matching inner row. Note that this is done in the BKAIterator and not the MRR iterator for two reasons: First, it gives more sensible EXPLAIN ANALYZE numbers. Second, the NULL-complemented rows could be filtered inadvertently by a FilterIterator before they reach the BKAIterator.
END_OF_ROWS	Both the outer and inner side are out of rows.
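
The transitions between these states can be sketched as a small pure function. This is a minimal illustration and not the MySQL source: the helper `NextState` and its boolean parameters are hypothetical names, and the exact conditions are assumptions inferred from the enumerator descriptions above.

```cpp
#include <cassert>

// Hypothetical mirror of BKAIterator::State, for illustration only.
enum class State {
  NEED_OUTER_ROWS,
  RETURNING_JOINED_ROWS,
  RETURNING_NULL_COMPLEMENTED_ROWS,
  END_OF_ROWS
};

// Assumed transition taken once the current state has nothing more to return.
//   got_outer_rows:    the last attempt to fill the outer buffer found rows
//   needs_null_rows:   the join is an outer join or antijoin
//   end_of_outer_rows: the outer iterator has reported EOF
inline State NextState(State current, bool got_outer_rows,
                       bool needs_null_rows, bool end_of_outer_rows) {
  switch (current) {
    case State::NEED_OUTER_ROWS:
      return got_outer_rows ? State::RETURNING_JOINED_ROWS
                            : State::END_OF_ROWS;
    case State::RETURNING_JOINED_ROWS:
      if (needs_null_rows) return State::RETURNING_NULL_COMPLEMENTED_ROWS;
      return end_of_outer_rows ? State::END_OF_ROWS : State::NEED_OUTER_ROWS;
    case State::RETURNING_NULL_COMPLEMENTED_ROWS:
      return end_of_outer_rows ? State::END_OF_ROWS : State::NEED_OUTER_ROWS;
    case State::END_OF_ROWS:
      return State::END_OF_ROWS;
  }
  return State::END_OF_ROWS;  // Not reached; keeps compilers quiet.
}
```

In this sketch an inner join never enters RETURNING_NULL_COMPLEMENTED_ROWS, and END_OF_ROWS is terminal until the iterator is reinitialized.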

Constructor & Destructor Documentation

◆ BKAIterator()

BKAIterator::BKAIterator	(	THD *	thd,
		unique_ptr_destroy_only< RowIterator >	outer_input,
		const Prealloced_array< TABLE *, 4 > &	outer_input_tables,
		unique_ptr_destroy_only< RowIterator >	inner_input,
		size_t	max_memory_available,
		size_t	mrr_bytes_needed_for_single_inner_row,
		float	expected_inner_rows_per_outer_row,
		bool	store_rowids,
		table_map	tables_to_get_rowid_for,
		MultiRangeRowIterator *	mrr_iterator,
		JoinType	join_type
	)

Parameters

thd	Thread handle.
outer_input	The iterator to read the outer rows from.
outer_input_tables	Each outer table involved. Used to know which fields we are to read into our buffer.
inner_input	The iterator to read the inner rows from. Must end up in a MultiRangeRowIterator.
max_memory_available	Number of bytes available for row buffers, both outer rows and MRR buffers. Note that allocation is incremental, so we can allocate less than this.
mrr_bytes_needed_for_single_inner_row	Number of bytes MRR needs space for in its buffer for holding a single row from the inner table.
expected_inner_rows_per_outer_row	Number of inner rows we statistically expect for each outer row. Used for dividing the buffer space between outer rows and MRR row buffer (if we expect many inner rows, we can't load as many outer rows).
store_rowids	Whether we need to make sure all tables below us have row IDs available, after Read() has been called. Used only if we are below a weedout operation.
tables_to_get_rowid_for	A map of which tables BKAIterator needs to call position() for itself. Tables that are in outer_input_tables but not in this map are expected to be handled by some other iterator. Tables that are in this map but not in outer_input_tables will be ignored.
mrr_iterator	Pointer to the MRR iterator at the bottom of inner_input. Used to send row ranges and buffers.
join_type	What kind of join we are executing.

Member Function Documentation

◆ BatchFinished()

void BKAIterator::BatchFinished ( )

private

If there are more outer rows, begin the next batch.

If not, move to the EOF state.

◆ BeginNewBatch()

void BKAIterator::BeginNewBatch ( )

private

Clear out the MEM_ROOT and prepare for reading rows anew.

◆ EndPSIBatchModeIfStarted()

void BKAIterator::EndPSIBatchModeIfStarted ( )

inline override virtual

Ends performance schema batch mode, if started.

It's always safe to call this.

Iterators that have children (composite iterators) must forward the EndPSIBatchModeIfStarted() call to every iterator they could conceivably have called StartPSIBatchMode() on. This ensures that after such a call on the root iterator, all handlers are out of batch mode.
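
The forwarding rule can be sketched with a toy iterator tree. The classes `ToyNode`, `ToyLeaf` and `ToyJoin` below are hypothetical stand-ins and not MySQL types; only the two method names are taken from this page.

```cpp
#include <cassert>

// Hypothetical base class standing in for RowIterator.
class ToyNode {
 public:
  virtual ~ToyNode() = default;
  virtual void StartPSIBatchMode() {}
  virtual void EndPSIBatchModeIfStarted() {}
};

// A leaf that tracks whether it is currently in batch mode.
class ToyLeaf : public ToyNode {
 public:
  void StartPSIBatchMode() override { m_in_batch = true; }
  // Safe to call even if batch mode was never started.
  void EndPSIBatchModeIfStarted() override { m_in_batch = false; }
  bool in_batch() const { return m_in_batch; }

 private:
  bool m_in_batch = false;
};

// A composite that forwards the call to every child it may have started.
class ToyJoin : public ToyNode {
 public:
  ToyJoin(ToyNode *outer, ToyNode *inner) : m_outer(outer), m_inner(inner) {}
  void EndPSIBatchModeIfStarted() override {
    m_outer->EndPSIBatchModeIfStarted();
    m_inner->EndPSIBatchModeIfStarted();
  }

 private:
  ToyNode *m_outer;
  ToyNode *m_inner;
};
```

Because the leaf's implementation is idempotent, the composite does not need to remember which children it actually put into batch mode.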

Reimplemented from RowIterator.

◆ Init()

bool BKAIterator::Init ( )

override virtual

Initialize or reinitialize the iterator.

You must always call Init() before trying a Read() (but Init() does not imply Read()).

You can call Init() multiple times; subsequent calls will rewind the iterator (or reposition it, depending on whether the iterator takes in e.g. an Index_lookup) and allow you to read the records anew.

Implements RowIterator.

◆ MakeNullComplementedRow()

int BKAIterator::MakeNullComplementedRow ( )

private

Find the next unmatched row, and load it for output as a NULL-complemented row.

(Assumes the NULL row flag has already been set on the inner table iterator.) Returns 0 if a row was found, -1 if no row was found. (Errors cannot happen.)
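
A minimal sketch of the scan for unmatched rows, using the same 0 / -1 return convention. The `BufferedRow` struct with its `matched` flag and the helper `NextUnmatched` are hypothetical; how the real iterator records matches is not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical buffered outer row: a key plus a flag saying whether any
// inner row matched it during the current batch.
struct BufferedRow {
  int key;
  bool matched;
};

// Advances *pos past matched rows. Returns 0 and writes the key of the next
// unmatched row to *out_key, or returns -1 if no unmatched row is left.
inline int NextUnmatched(const std::vector<BufferedRow> &rows,
                         std::size_t *pos, int *out_key) {
  while (*pos < rows.size()) {
    const BufferedRow &row = rows[(*pos)++];
    if (!row.matched) {
      *out_key = row.key;
      return 0;
    }
  }
  return -1;
}
```

The position is kept by the caller between calls, which plays the role that m_current_pos plays in the class.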

◆ Read()

int BKAIterator::Read ( )

override virtual

Read a single row.

The row data is not actually returned from the function; it is put in the table's (or tables', in case of a join) record buffer, i.e., table->record[0].

Return values

0	OK
-1	End of records
1	Error

Implements RowIterator.
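
A caller typically drives an iterator with an Init() call followed by a Read() loop that checks the three return values. The sketch below uses a hypothetical `ToyIterator` over a vector rather than a real RowIterator; it also assumes that Init() returns true on error, which this page does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical iterator following the Init()/Read() protocol described above.
class ToyIterator {
 public:
  explicit ToyIterator(std::vector<int> rows) : m_rows(std::move(rows)) {}

  // Rewinds to the first row. Assumed convention: false means success.
  bool Init() {
    m_pos = 0;
    return false;
  }

  // Returns 0 and loads the next row, or -1 at end of records.
  int Read() {
    if (m_pos >= m_rows.size()) return -1;
    m_current = m_rows[m_pos++];
    return 0;
  }

  int current() const { return m_current; }

 private:
  std::vector<int> m_rows;
  std::size_t m_pos = 0;
  int m_current = 0;
};

// Counts the rows produced by the iterator; returns -1 on error.
inline int CountRows(ToyIterator *it) {
  if (it->Init()) return -1;
  int count = 0;
  for (;;) {
    const int err = it->Read();
    if (err == -1) return count;  // End of records.
    if (err != 0) return -1;      // Error.
    ++count;
  }
}
```

Calling `CountRows` twice on the same object gives the same count, since the second Init() rewinds the iterator.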

◆ ReadOuterRows()

int BKAIterator::ReadOuterRows ( )

private

Read a batch of outer rows (BeginNewBatch() must have been called earlier).

Returns -1 for no outer rows found (sets state to END_OF_ROWS), 0 for OK (sets state to RETURNING_JOINED_ROWS) or 1 for error.

◆ SetNullRowFlag()

void BKAIterator::SetNullRowFlag ( bool is_null_row )

inline override virtual

Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row).

This is used for outer joins, when an iterator hasn't produced any rows and we need to produce a NULL-complemented row. Init() or Read() won't necessarily reset this flag, so if you ever set it to true, make sure to also set it to false when needed.

Note that this can be called without Init() having been called first. For example, NestedLoopIterator can hit EOF immediately on the outer iterator, which means the inner iterator doesn't get an Init() call, but will still forward SetNullRowFlag to both inner and outer iterators.
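
The effect of the flag can be illustrated with a toy single-column buffer. `ToyRowBuffer` is a hypothetical class, not a MySQL type; it only shows that the flag overrides whatever the buffer holds and stays set until it is cleared explicitly.

```cpp
#include <cassert>

// Hypothetical single-column row buffer with a NULL row flag.
class ToyRowBuffer {
 public:
  void Store(int value) { m_value = value; }
  void SetNullRowFlag(bool is_null_row) { m_null_row = is_null_row; }

  // Returns true and writes *out if the column is non-NULL. Returns false if
  // the NULL row flag is set, regardless of the stored value.
  bool Get(int *out) const {
    if (m_null_row) return false;
    *out = m_value;
    return true;
  }

 private:
  int m_value = 0;
  bool m_null_row = false;
};
```

Storing a new value while the flag is set does not make the row visible again; the caller has to reset the flag.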

TODO: We shouldn't need this. See the comments on AggregateIterator for a bit more discussion on abstracting out a row interface.

Implements RowIterator.

◆ UnlockRow()

void BKAIterator::UnlockRow ( )

inline override virtual

Implements RowIterator.

Member Data Documentation

◆ m_bytes_used

size_t BKAIterator::m_bytes_used = 0

private

Estimated number of bytes used on m_mem_root so far.

◆ m_current_pos

hash_join_buffer::BufferRow* BKAIterator::m_current_pos

private

If we are synthesizing NULL-complemented rows (for an outer join or antijoin), points to the next row within "m_rows" that we haven't considered yet.

◆ m_end_of_outer_rows

bool BKAIterator::m_end_of_outer_rows = false

private

Whether we've seen EOF from the outer iterator.

◆ m_has_row_from_previous_batch

bool BKAIterator::m_has_row_from_previous_batch = false

private

Whether we have a row in m_outer_row_buffer from the previous batch of rows that we haven't stored in m_rows yet.

◆ m_inner_input

const unique_ptr_destroy_only<RowIterator> BKAIterator::m_inner_input

private

◆ m_join_type

JoinType BKAIterator::m_join_type

private

The join type of the BKA join.

◆ m_max_memory_available

const size_t BKAIterator::m_max_memory_available

private

See max_memory_available in the constructor.

◆ m_mem_root

MEM_ROOT BKAIterator::m_mem_root

private

The MEM_ROOT we are storing the outer rows on, and also allocating MRR buffer from.

In total, this should not go significantly over m_max_memory_available bytes.

◆ m_mrr_bytes_needed_for_single_inner_row

const size_t BKAIterator::m_mrr_bytes_needed_for_single_inner_row

private

See mrr_bytes_needed_for_single_inner_row in the constructor.

◆ m_mrr_bytes_needed_per_row

size_t BKAIterator::m_mrr_bytes_needed_per_row

private

For each outer row, how many bytes we need in the MRR buffer (i.e., the number of bytes we expect to use on rows from the inner table).

This is the expected number of inner rows per key, multiplied by the (fixed) size of each inner row. We use this information to stop scanning before we've used up the entire RAM allowance on outer rows, so that we have space remaining for the inner rows (in the MRR buffer), too.
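
The arithmetic described here can be sketched as follows. Both helper functions are hypothetical, and the rounding and the exact stopping condition are assumptions; the real implementation may round or clamp differently.

```cpp
#include <cassert>
#include <cstddef>

// Assumed formula: expected inner rows per outer row, multiplied by the
// fixed size of one inner row in the MRR buffer (truncated to whole bytes).
inline std::size_t MrrBytesPerOuterRow(float expected_inner_rows_per_outer_row,
                                       std::size_t bytes_per_inner_row) {
  return static_cast<std::size_t>(expected_inner_rows_per_outer_row *
                                  bytes_per_inner_row);
}

// Assumed admission check: another outer row fits only if both the row and
// its share of the MRR buffer stay within the memory allowance.
inline bool FitsInBudget(std::size_t bytes_used, std::size_t outer_row_bytes,
                         std::size_t mrr_bytes_per_row,
                         std::size_t max_memory_available) {
  return bytes_used + outer_row_bytes + mrr_bytes_per_row <=
         max_memory_available;
}
```

For example, with 2.5 expected inner rows of 100 bytes each, every buffered outer row reserves 250 bytes of MRR space in addition to its own size.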

◆ m_mrr_iterator

MultiRangeRowIterator* const BKAIterator::m_mrr_iterator

private

See mrr_iterator in the constructor.

◆ m_outer_input

const unique_ptr_destroy_only<RowIterator> BKAIterator::m_outer_input

private

◆ m_outer_input_tables

pack_rows::TableCollection BKAIterator::m_outer_input_tables

private

Tables and columns needed for each outer row.

Rows/columns that are not needed are filtered out in the constructor; the rest are read and stored in m_rows.

◆ m_outer_row_buffer

String BKAIterator::m_outer_row_buffer

private

Used for serializing the row we read from the outer table(s), before it is stored into the MEM_ROOT and put into m_rows.

Should there not be room in m_rows for the row, it will stay in this variable until we start reading the next batch of outer rows.

If there are no BLOB/TEXT columns in the join, we calculate an upper bound of the row size that is used to preallocate this buffer. In the case of BLOB/TEXT columns, we cannot calculate a reasonable upper bound, and the row size is calculated per row. The allocated memory is kept for the duration of the iterator, so that we (most likely) avoid reallocations.
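
The carry-over of a row that did not fit into the current batch can be sketched like this. `ToyBatcher` is a hypothetical class that batches strings under a byte budget; accepting at least one row per batch is an assumption made here so that every batch makes progress.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical batcher: reads serialized rows from a source and groups them
// into batches whose total size stays within a byte budget.
class ToyBatcher {
 public:
  ToyBatcher(std::vector<std::string> source, std::size_t budget)
      : m_source(std::move(source)), m_budget(budget) {}

  // Returns the next batch; an empty batch means the source is exhausted.
  std::vector<std::string> ReadBatch() {
    std::vector<std::string> batch;
    std::size_t used = 0;
    for (;;) {
      if (!m_has_pending) {
        if (m_next >= m_source.size()) break;  // Source is at EOF.
        m_pending = m_source[m_next++];
        m_has_pending = true;
      }
      // If the row does not fit, leave it pending for the next batch.
      if (!batch.empty() && used + m_pending.size() > m_budget) break;
      used += m_pending.size();
      batch.push_back(m_pending);
      m_has_pending = false;
    }
    return batch;
  }

 private:
  std::vector<std::string> m_source;
  std::size_t m_next = 0;
  std::size_t m_budget;
  std::string m_pending;       // Plays the role of m_outer_row_buffer.
  bool m_has_pending = false;  // Plays the role of m_has_row_from_previous_batch.
};
```

With a budget of 6 bytes and rows of 4, 4 and 2 bytes, the second row is read during the first batch but only delivered in the second one.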

◆ m_rows

Mem_root_array<hash_join_buffer::BufferRow> BKAIterator::m_rows

private

Buffered outer rows.

◆ m_state

State BKAIterator::m_state

private

The documentation for this class was generated from the following files:

sql/iterators/bka_iterator.h
sql/iterators/bka_iterator.cc
