Handles aggregation (typically used for GROUP BY) for the case where the rows are already properly grouped coming in, ie., all rows that are supposed to be part of the same group are adjacent in the input stream. More...

#include <composite_iterators.h>

Inheritance diagram for AggregateIterator:

[legend]

Public Member Functions
	AggregateIterator (THD thd, unique_ptr_destroy_only< RowIterator > source, JOIN join, pack_rows::TableCollection tables, std::span< AccessPath * > single_row_index_lookups, bool rollup)

void	SetNullRowFlag (bool is_null_row) override
	Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row). More...

void	StartPSIBatchMode () override
	Start performance schema batch mode, if supported (otherwise ignored). More...

void	EndPSIBatchModeIfStarted () override
	Ends performance schema batch mode, if started. More...

void	UnlockRow () override

Public Member Functions inherited from RowIterator
	RowIterator (THD *thd)

virtual	~RowIterator ()=default

	RowIterator (const RowIterator &)=delete

	RowIterator (RowIterator &&)=default

bool	Init ()
	Initialize or reinitialize the iterator. More...

int	Read ()
	Read a single row. More...

virtual const IteratorProfiler *	GetProfiler () const
	Get profiling data for this iterator (for 'EXPLAIN ANALYZE'). More...

virtual void	SetOverrideProfiler (const IteratorProfiler *profiler)

virtual RowIterator *	real_iterator ()
	If this iterator is wrapping a different iterator (e.g. More...

virtual const RowIterator *	real_iterator () const

uint64_t	num_init_calls () const
	Returns the number of times Init() has been called on this iterator. More...

uint64_t	num_rows () const
	Returns the number of times Read() has returned a row successfully from this iterator. More...

uint64_t	num_full_reads () const
	Returns the number of times the iterator has been fully read. More...

Private Types
enum	{ READING_FIRST_ROW , LAST_ROW_STARTED_NEW_GROUP , OUTPUTTING_ROLLUP_ROWS , DONE_OUTPUTTING_ROWS }

Private Member Functions
bool	DoInit () override

int	DoRead () override

void	SetRollupLevel (int level)

Private Attributes
enum AggregateIterator:: { ... }	m_state

unique_ptr_destroy_only< RowIterator >	m_source

JOIN *	m_join = nullptr
	The join we are part of. More...

bool	m_seen_eof
	Whether we have seen the last input row. More...

table_map	m_save_nullinfo
	Used to save NULL information in the specific case where we have zero input rows. More...

const bool	m_rollup
	Whether this is a rollup query. More...

int	m_last_unchanged_group_item_idx
	For rollup: The index of the first group item that did not change when we last switched groups. More...

int	m_current_rollup_position
	If we are in state OUTPUTTING_ROLLUP_ROWS, where we are in the iteration. More...

pack_rows::TableCollection	m_tables
	The list of tables we are reading from; they are the ones for which we need to save and restore rows. More...

String	m_first_row_this_group
	Packed version of the first row in the group we are currently processing. More...

String	m_first_row_next_group
	If applicable, packed version of the first row in the next group. More...

std::span< AccessPath * >	m_single_row_index_lookups
	All the single-row index lookups that provide rows to this iterator. More...

int	m_output_slice = -1
	The slice we're setting when returning rows. More...

Additional Inherited Members
Protected Member Functions inherited from RowIterator
THD *	thd () const

Detailed Description

Handles aggregation (typically used for GROUP BY) for the case where the rows are already properly grouped coming in, ie., all rows that are supposed to be part of the same group are adjacent in the input stream.

(This could be because they were sorted earlier, because we are scanning an index that already gives us the rows in a group-compatible order, or because there is no grouping.)

AggregateIterator needs to be able to save and restore rows; it doesn't know when a group ends until it's seen the first row that is part of the next group. When that happens, it needs to tuck away that next row, and then restore the previous row so that the output row gets the correct grouped values. A simple example, doing SELECT a, SUM(b) FROM t1 GROUP BY a:

t1.a t1.b SUM(b) 1 1 <– first row, save it 1 1 2 3 1 3 6 2 1 <– group changed, save row [1 1] <– restore first row, output 6 reset aggregate --> 0 [2 1] <– restore new row, process it 1 2 10 11 <– EOF, output 11

To save and restore rows like this, it uses the infrastructure from pack_rows.h to pack and unpack all relevant rows into record[0] of every input table. (Currently, there can only be one input table, but this may very well change in the future.) It would be nice to have a more abstract concept of sending a row around and taking copies of it if needed, as opposed to it implicitly staying in the table's buffer. (This would also solve some issues in EQRefIterator and when synthesizing NULL rows for outer joins.) However, that's a large refactoring.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum

private

Enumerator
READING_FIRST_ROW
LAST_ROW_STARTED_NEW_GROUP
OUTPUTTING_ROLLUP_ROWS
DONE_OUTPUTTING_ROWS

Constructor & Destructor Documentation

◆ AggregateIterator()

AggregateIterator::AggregateIterator	(	THD *	thd,
		unique_ptr_destroy_only< RowIterator >	source,
		JOIN *	join,
		pack_rows::TableCollection	tables,
		std::span< AccessPath * >	single_row_index_lookups,
		bool	rollup
	)

Member Function Documentation

◆ DoInit()

bool AggregateIterator::DoInit ( )

overrideprivatevirtual

Implements RowIterator.

◆ DoRead()

int AggregateIterator::DoRead ( )

overrideprivatevirtual

Implements RowIterator.

◆ EndPSIBatchModeIfStarted()

void AggregateIterator::EndPSIBatchModeIfStarted ( )

inlineoverridevirtual

Ends performance schema batch mode, if started.

It's always safe to call this.

Iterators that have children (composite iterators) must forward the EndPSIBatchModeIfStarted() call to every iterator they could conceivably have called StartPSIBatchMode() on. This ensures that after such a call to on the root iterator, all handlers are out of batch mode.

Reimplemented from RowIterator.

◆ SetNullRowFlag()

void AggregateIterator::SetNullRowFlag ( bool is_null_row )

inlineoverridevirtual

Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row).

This is used for outer joins, when an iterator hasn't produced any rows and we need to produce a NULL-complemented row. Init() or Read() won't necessarily reset this flag, so if you ever set is to true, make sure to also set it to false when needed.

Note that this can be called without Init() having been called first. For example, NestedLoopIterator can hit EOF immediately on the outer iterator, which means the inner iterator doesn't get an Init() call, but will still forward SetNullRowFlag to both inner and outer iterators.

TODO: We shouldn't need this. See the comments on AggregateIterator for a bit more discussion on abstracting out a row interface.

Implements RowIterator.

◆ SetRollupLevel()

void AggregateIterator::SetRollupLevel ( int level )

private

◆ StartPSIBatchMode()

void AggregateIterator::StartPSIBatchMode ( )

inlineoverridevirtual

Start performance schema batch mode, if supported (otherwise ignored).

PFS batch mode is a mitigation to reduce the overhead of performance schema, typically applied at the innermost table of the entire join. If you start it before scanning the table and then end it afterwards, the entire set of handler calls will be timed only once, as a group, and the costs will be distributed evenly out. This reduces timer overhead.

If you start PFS batch mode, you must also take care to end it at the end of the scan, one way or the other. Do note that this is true even if the query ends abruptly (LIMIT is reached, or an error happens). The easiest workaround for this is to simply call EndPSIBatchModeIfStarted() on the root iterator at the end of the scan. See the PFSBatchMode class for a useful helper.

The rules for starting batch and ending mode are:

If you are an iterator with exactly one child (FilterIterator etc.), forward any StartPSIBatchMode() calls to it.
If you drive an iterator (read rows from it using a for loop or similar), use PFSBatchMode as described above.
If you have multiple children, ignore the call and do your own handling of batch mode as appropriate. For materialization, #2 would typically apply. For joins, it depends on the join type (e.g., NestedLoopIterator applies batch mode only when scanning the innermost table).

The upshot of this is that when scanning a single table, batch mode will typically be activated for that table (since we call StartPSIBatchMode() on the root iterator, and it will trickle all the way down to the table iterator), but for a join, the call will be ignored and the join iterator will activate batch mode by itself as needed.

Reimplemented from RowIterator.

◆ UnlockRow()

void AggregateIterator::UnlockRow ( )

inlineoverridevirtual

Implements RowIterator.

Member Data Documentation

◆ m_current_rollup_position

int AggregateIterator::m_current_rollup_position

private

If we are in state OUTPUTTING_ROLLUP_ROWS, where we are in the iteration.

This value will start at the index of the last group expression and then count backwards down to and including m_last_unchanged_group_item_idx. It is used to communicate to the rollup group items whether to turn themselves into NULLs, and the sum items which of their sums to output.

◆ m_first_row_next_group

String AggregateIterator::m_first_row_next_group

private

If applicable, packed version of the first row in the next group.

This is used only in the LAST_ROW_STARTED_NEW_GROUP state; we just saw a row that didn't belong to the current group, so we saved it here and went to output a group. On the next Read() call, we need to process this deferred row first of all.

Even when not in use, this string contains a buffer that is large enough to pack a full row into, sans blobs. (If blobs are present, StoreFromTableBuffers() will automatically allocate more space if needed.)

◆ m_first_row_this_group

String AggregateIterator::m_first_row_this_group

private

Packed version of the first row in the group we are currently processing.

◆ m_join

JOIN* AggregateIterator::m_join = nullptr

private

The join we are part of.

It would be nicer not to rely on this, but we need a large number of members from there, like which aggregate functions we have, the THD, temporary table parameters and so on.

◆ m_last_unchanged_group_item_idx

int AggregateIterator::m_last_unchanged_group_item_idx

private

For rollup: The index of the first group item that did not change when we last switched groups.

E.g., if we have group fields A,B,C,D and then switch to group A,B,E,D, this value will become 1 (which means that we need to output rollup rows for 2 – A,B,E,NULL – and then 1 – A,B,NULL,NULL). m_current_rollup_position will count down from the end until it becomes less than this value.

If we do not have rollup, this value is perennially zero.

◆ m_output_slice

int AggregateIterator::m_output_slice = -1

private

The slice we're setting when returning rows.

See the comment in the constructor.

◆ m_rollup

const bool AggregateIterator::m_rollup

private

Whether this is a rollup query.

◆ m_save_nullinfo

table_map AggregateIterator::m_save_nullinfo

private

Used to save NULL information in the specific case where we have zero input rows.

◆ m_seen_eof

bool AggregateIterator::m_seen_eof

private

Whether we have seen the last input row.

◆ m_single_row_index_lookups

std::span<AccessPath *> AggregateIterator::m_single_row_index_lookups

private

All the single-row index lookups that provide rows to this iterator.

◆ m_source

unique_ptr_destroy_only<RowIterator> AggregateIterator::m_source

private

◆

enum { ... } AggregateIterator::m_state

◆ m_tables

pack_rows::TableCollection AggregateIterator::m_tables

private

The list of tables we are reading from; they are the ones for which we need to save and restore rows.

The documentation for this class was generated from the following files:

sql/iterators/composite_iterators.h
sql/iterators/composite_iterators.cc

Public Member Functions

Private Types

Private Member Functions

Private Attributes

Additional Inherited Members

Detailed Description

Member Enumeration Documentation

◆ anonymous enum

Constructor & Destructor Documentation

◆ AggregateIterator()

Member Function Documentation

◆ DoInit()

◆ DoRead()

◆ EndPSIBatchModeIfStarted()

◆ SetNullRowFlag()

◆ SetRollupLevel()

◆ StartPSIBatchMode()

◆ UnlockRow()

Member Data Documentation

◆ m_current_rollup_position

◆ m_first_row_next_group

◆ m_first_row_this_group

◆ m_join

◆ m_last_unchanged_group_item_idx

◆ m_output_slice

◆ m_rollup

◆ m_save_nullinfo

◆ m_seen_eof

◆ m_single_row_index_lookups

◆ m_source

◆

◆ m_tables