MySQL 8.4.2
Source Code Documentation
Like semijoin materialization, weedout works on the basic idea that a semijoin is just like an inner join as long as we can get rid of the duplicates somehow.
#include <composite_iterators.h>
Public Member Functions
WeedoutIterator (THD *thd, unique_ptr_destroy_only< RowIterator > source, SJ_TMP_TABLE *sj, table_map tables_to_get_rowid_for)
bool | Init () override
Initialize or reinitialize the iterator.
int | Read () override
Read a single row.
void | SetNullRowFlag (bool is_null_row) override
Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row).
void | EndPSIBatchModeIfStarted () override
Ends performance schema batch mode, if started.
void | UnlockRow () override
Public Member Functions inherited from RowIterator
RowIterator (THD *thd)
virtual | ~RowIterator ()=default
RowIterator (const RowIterator &)=delete
RowIterator (RowIterator &&)=default
virtual const IteratorProfiler * | GetProfiler () const
Get profiling data for this iterator (for 'EXPLAIN ANALYZE').
virtual void | SetOverrideProfiler ([[maybe_unused]] const IteratorProfiler *profiler)
virtual void | StartPSIBatchMode ()
Start performance schema batch mode, if supported (otherwise ignored).
virtual RowIterator * | real_iterator ()
If this iterator is wrapping a different iterator (e.g. ...
virtual const RowIterator * | real_iterator () const
Private Attributes
unique_ptr_destroy_only< RowIterator > | m_source
SJ_TMP_TABLE * | m_sj
const table_map | m_tables_to_get_rowid_for
Additional Inherited Members
Protected Member Functions inherited from RowIterator
THD * | thd () const
Like semijoin materialization, weedout works on the basic idea that a semijoin is just like an inner join as long as we can get rid of the duplicates somehow.
(This is advantageous, because inner joins can be reordered, whereas semijoins generally can't.) However, unlike semijoin materialization, weedout removes duplicates after the join, not before it. Consider a query such as:
SELECT * FROM t1 WHERE a IN ( SELECT b FROM t2 );
Semijoin materialization solves this by materializing t2, with deduplication, and then joining. Weedout joins t1 to t2 and then leaves only one output row per t1 row. The disadvantage is that this potentially needs to discard more rows; the (potential) advantage is that we deduplicate on t1 instead of t2.
Weedout, unlike materialization, works in a streaming fashion; rows are output (or discarded) as they come in, with a temporary table used for recording the row IDs we've seen before. (We need to deduplicate on t1's row IDs, not its contents.) See SJ_TMP_TABLE for details about the table format.
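The streaming deduplication described above can be illustrated with a small, self-contained sketch. It is not the server's implementation: `JoinedRow`, `ToyWeedout` and `WeedoutRowIds` are hypothetical names, and an in-memory `std::unordered_set` stands in for the temporary table (SJ_TMP_TABLE) that records the row IDs seen so far.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical joined row: the row ID of the t1 row plus its contents.
struct JoinedRow {
  std::uint64_t t1_rowid;  // stand-in for the row ID kept in the temp table
  std::string t1_value;    // row contents; deliberately NOT used for dedup
};

// Toy model of weedout: consumes the inner-join output in order and lets
// through only the first row seen for each t1 row ID.
class ToyWeedout {
 public:
  explicit ToyWeedout(std::vector<JoinedRow> source)
      : m_source(std::move(source)) {}

  // Returns true and fills *out if another not-yet-seen row exists.
  bool Next(JoinedRow *out) {
    assert(out != nullptr);
    while (m_pos < m_source.size()) {
      const JoinedRow &row = m_source[m_pos++];
      // insert().second is false when the row ID was recorded earlier,
      // in which case the row is a duplicate and is discarded.
      if (m_seen.insert(row.t1_rowid).second) {
        *out = row;
        return true;
      }
    }
    return false;
  }

 private:
  std::vector<JoinedRow> m_source;
  std::size_t m_pos = 0;
  std::unordered_set<std::uint64_t> m_seen;
};

// Convenience helper: drain the toy iterator and collect surviving row IDs.
std::vector<std::uint64_t> WeedoutRowIds(std::vector<JoinedRow> joined) {
  ToyWeedout weedout(std::move(joined));
  std::vector<std::uint64_t> ids;
  JoinedRow row;
  while (weedout.Next(&row)) ids.push_back(row.t1_rowid);
  return ids;
}
```

Note that two t1 rows with identical contents but different row IDs both survive in this sketch, which is why the deduplication key is the row ID rather than the row contents.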
WeedoutIterator::WeedoutIterator (THD *thd, unique_ptr_destroy_only< RowIterator > source, SJ_TMP_TABLE *sj, table_map tables_to_get_rowid_for)
void WeedoutIterator::EndPSIBatchModeIfStarted () [inline, override, virtual]
Ends performance schema batch mode, if started.
It's always safe to call this.
Iterators that have children (composite iterators) must forward the EndPSIBatchModeIfStarted() call to every iterator they could conceivably have called StartPSIBatchMode() on. This ensures that after such a call on the root iterator, all handlers are out of batch mode.
Reimplemented from RowIterator.
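The forwarding rule can be sketched with a toy hierarchy. The classes below (`ToyIterator`, `ToyLeaf`, `ToyWrapper`) are hypothetical and only mirror the two method names from this page; whether a real composite iterator also forwards StartPSIBatchMode() to its child is an assumption made for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal interface mirroring the two batch-mode calls.
class ToyIterator {
 public:
  virtual ~ToyIterator() = default;
  virtual void StartPSIBatchMode() {}
  virtual void EndPSIBatchModeIfStarted() {}
};

// Leaf iterator that tracks whether it is currently in batch mode.
class ToyLeaf : public ToyIterator {
 public:
  void StartPSIBatchMode() override { m_in_batch_mode = true; }
  // Safe to call at any time, even if batch mode was never started.
  void EndPSIBatchModeIfStarted() override { m_in_batch_mode = false; }
  bool in_batch_mode() const { return m_in_batch_mode; }

 private:
  bool m_in_batch_mode = false;
};

// Composite iterator with a single child: it forwards the end call to the
// child it may have started batch mode on.
class ToyWrapper : public ToyIterator {
 public:
  explicit ToyWrapper(std::unique_ptr<ToyIterator> source)
      : m_source(std::move(source)) {
    assert(m_source != nullptr);
  }
  void StartPSIBatchMode() override { m_source->StartPSIBatchMode(); }
  void EndPSIBatchModeIfStarted() override {
    m_source->EndPSIBatchModeIfStarted();
  }

 private:
  std::unique_ptr<ToyIterator> m_source;
};
```

With this shape, a single call on the root of a tree of wrappers reaches every leaf, so the caller does not need to know which iterators actually entered batch mode.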
bool WeedoutIterator::Init () [override, virtual]
Initialize or reinitialize the iterator.
You must always call Init() before trying a Read() (but Init() does not imply Read()).
You can call Init() multiple times; subsequent calls will rewind the iterator (or reposition it, depending on whether the iterator takes in e.g. an Index_lookup) and allow you to read the records anew.
Implements RowIterator.
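The Init()/Read() contract (always Init() first; a later Init() rewinds) can be illustrated with a toy iterator over an in-memory vector. `ToyVectorIterator` and `ReadAll` are hypothetical names; the convention that Init() returns true on error, and returning an error code from Read() when Init() was skipped, are assumptions made for this sketch rather than behavior documented on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical iterator over an in-memory list of integer "rows".
class ToyVectorIterator {
 public:
  explicit ToyVectorIterator(std::vector<int> rows)
      : m_rows(std::move(rows)) {}

  // Rewinds to the first row. Returns false on success (assumed convention).
  bool Init() {
    m_pos = 0;
    m_initialized = true;
    return false;
  }

  // Returns 0 and stores the row in *record, -1 at end of records,
  // or 1 (error) if called before Init().
  int Read(int *record) {
    assert(record != nullptr);
    if (!m_initialized) return 1;
    if (m_pos >= m_rows.size()) return -1;
    *record = m_rows[m_pos++];
    return 0;
  }

 private:
  std::vector<int> m_rows;
  std::size_t m_pos = 0;
  bool m_initialized = false;
};

// (Re)initializes the iterator and reads every row from the beginning.
std::vector<int> ReadAll(ToyVectorIterator *it) {
  assert(it != nullptr);
  std::vector<int> out;
  if (it->Init()) return out;  // initialization failed; nothing to read
  int record = 0;
  while (it->Read(&record) == 0) out.push_back(record);
  return out;
}
```

Calling `ReadAll` twice on the same iterator yields the same rows both times, because the second Init() rewinds the position.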
int WeedoutIterator::Read () [override, virtual]
Read a single row.
The row data is not actually returned from the function; it is put in the table's (or tables', in case of a join) record buffer, i.e., table->records[0].
Return values
0 | OK
-1 | End of records
1 | Error
Implements RowIterator.
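A caller-side loop over these return values might look like the following sketch. `ToyTable`, `ToyScan`, `kNoFailure` and `Drain` are hypothetical; the point is only that Read() reports a status code while the row itself lands in a separate record buffer that the caller inspects afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table with a single record buffer.
struct ToyTable {
  std::string record;
};

// Sentinel meaning "never simulate an error".
constexpr std::size_t kNoFailure = static_cast<std::size_t>(-1);

// Hypothetical scan that writes each row into the table's record buffer
// and can be told to fail at a given row position.
class ToyScan {
 public:
  ToyScan(ToyTable *table, std::vector<std::string> rows, std::size_t fail_at)
      : m_table(table), m_rows(std::move(rows)), m_fail_at(fail_at) {
    assert(m_table != nullptr);
  }

  // 0 = OK (row is in m_table->record), -1 = end of records, 1 = error.
  int Read() {
    if (m_pos == m_fail_at) return 1;
    if (m_pos >= m_rows.size()) return -1;
    m_table->record = m_rows[m_pos++];
    return 0;
  }

 private:
  ToyTable *m_table;
  std::vector<std::string> m_rows;
  std::size_t m_fail_at;
  std::size_t m_pos = 0;
};

// Reads until end of records. Returns false on success, true on error.
bool Drain(ToyScan *scan, const ToyTable &table,
           std::vector<std::string> *out) {
  assert(scan != nullptr && out != nullptr);
  for (;;) {
    const int err = scan->Read();
    if (err == -1) return false;   // clean end of records
    if (err != 0) return true;     // any other nonzero code is an error
    out->push_back(table.record);  // the row lives in the buffer
  }
}
```

Treating -1 and 1 separately matters: a loop that only tests for a nonzero result would stop correctly but could not tell a clean end of input from a failure.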
void WeedoutIterator::SetNullRowFlag (bool is_null_row) [inline, override, virtual]
Mark the current row buffer as containing a NULL row or not, so that if you read from it and the flag is true, you'll get only NULLs no matter what is actually in the buffer (typically some old leftover row).
This is used for outer joins, when an iterator hasn't produced any rows and we need to produce a NULL-complemented row. Init() or Read() won't necessarily reset this flag, so if you ever set it to true, make sure to also set it to false when needed.
Note that this can be called without Init() having been called first. For example, NestedLoopIterator can hit EOF immediately on the outer iterator, which means the inner iterator doesn't get an Init() call, but will still forward SetNullRowFlag to both inner and outer iterators.
TODO: We shouldn't need this. See the comments on AggregateIterator for a bit more discussion on abstracting out a row interface.
Implements RowIterator.
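The effect of the flag can be illustrated with a toy row buffer. `ToyRowBuffer` is a hypothetical type with a single integer column; it only models the rule that, while the flag is set, reads see NULL regardless of what is stored, and that storing a new row does not clear the flag.

```cpp
#include <cassert>

// Hypothetical single-column row buffer with a NULL-row flag.
class ToyRowBuffer {
 public:
  // Overwrites the buffer contents; deliberately does not touch the flag.
  void Store(int value) { m_value = value; }

  void SetNullRowFlag(bool is_null_row) { m_null_row = is_null_row; }

  // Returns false (and leaves *out untouched) if the row reads as NULL,
  // otherwise copies the stored value to *out and returns true.
  bool Get(int *out) const {
    assert(out != nullptr);
    if (m_null_row) return false;  // leftover contents are ignored
    *out = m_value;
    return true;
  }

 private:
  int m_value = 0;
  bool m_null_row = false;
};
```

In this sketch a caller that sets the flag for a NULL-complemented row has to clear it again before the next real row, mirroring the warning above that Init() or Read() won't necessarily reset it.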
void WeedoutIterator::UnlockRow () [inline, override, virtual]
Implements RowIterator.
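SJ_TMP_TABLE * WeedoutIterator::m_sj [private]
unique_ptr_destroy_only< RowIterator > WeedoutIterator::m_source [private]
const table_map WeedoutIterator::m_tables_to_get_rowid_for [private]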