MySQL 9.1.0
Source Code Documentation
Histogram_sampler Class Reference

#include <row0pread-histogram.h>

Public Member Functions

 Histogram_sampler (size_t max_threads, int sampling_seed, double sampling_percentage, enum_sampling_method sampling_method)
 Constructor.

 ~Histogram_sampler ()
 Destructor.

bool init (trx_t *trx, dict_index_t *index, row_prebuilt_t *prebuilt)
 Initialize the sampler context.

dberr_t buffer_next ()
 Buffer next row.

void buffer_end ()
 End parallel read in case the reader thread is still active and wait for its exit.

void set (byte *buf)
 Set the buffer.

dberr_t run ()
 Start the sampling process.

bool skip ()
 Check if the processing of the record needs to be skipped.
 

Private Member Functions

void wait_for_start_of_buffering ()
 Wait till there is a request to buffer the next row.

void wait_for_end_of_buffering ()
 Wait till the buffering of the row is complete.

void signal_start_of_buffering ()
 Signal that the next row needs to be buffered.

void signal_end_of_buffering ()
 Signal that the buffering of the row is complete.

void set_error_state (dberr_t err)
 Set the error state.

bool is_error_set () const

dberr_t start_callback (Parallel_reader::Thread_ctx *reader_thread_ctx)
 Each parallel reader thread's init function.

dberr_t finish_callback (Parallel_reader::Thread_ctx *reader_thread_ctx)
 Each parallel reader thread's end function.

dberr_t sample_rec (const Parallel_reader::Ctx *ctx, const rec_t *rec, ulint *offsets, const dict_index_t *index, row_prebuilt_t *prebuilt)
 Convert the row in InnoDB format to MySQL format and store in the buffer for server to use.

dberr_t process_non_leaf_rec (const Parallel_reader::Ctx *ctx, row_prebuilt_t *prebuilt)
 For each record in a non-leaf block at level 1 (if leaf level is 0) check if the child page needs to be sampled and if so sample all the rows in the child page.

dberr_t process_leaf_rec (const Parallel_reader::Ctx *ctx, row_prebuilt_t *prebuilt)
 Process the record in the leaf page.
 

Private Attributes

byte * m_buf {nullptr}
 Buffer to store the sampled row which is in the MySQL format.

os_event_t m_start_buffer_event
 Event to notify if the next row needs to be buffered.

os_event_t m_end_buffer_event
 Event to notify if the next row has been buffered.

dberr_t m_err {DB_SUCCESS}
 Error code when the row was buffered.

Parallel_reader m_parallel_reader
 The parallel reader.

std::mt19937 m_random_generator
 Random generator engine used to provide us random uniformly distributed values required to decide if the row in question needs to be sampled or not.

enum_sampling_method m_sampling_method {enum_sampling_method::NONE}
 Sampling method to be used for sampling.

double m_sampling_percentage {}
 Sampling percentage to be used for sampling.

int m_sampling_seed {}
 Sampling seed to be used for sampling.

std::atomic_size_t m_n_sampled
 Number of rows sampled.
 

Static Private Attributes

static std::uniform_real_distribution< double > m_distribution
 Uniform distribution used by the random generator.
 

Constructor & Destructor Documentation

◆ Histogram_sampler()

Histogram_sampler::Histogram_sampler ( size_t  max_threads,
int  sampling_seed,
double  sampling_percentage,
enum_sampling_method  sampling_method 
)
explicit

Constructor.

Parameters
[in]  max_threads  Maximum number of threads to use.
[in]  sampling_seed  seed to be used for sampling
[in]  sampling_percentage  percentage of sampling that needs to be done
[in]  sampling_method  sampling method to be used for sampling

◆ ~Histogram_sampler()

Histogram_sampler::~Histogram_sampler ( )

Destructor.

Member Function Documentation

◆ buffer_end()

void Histogram_sampler::buffer_end ( )

End parallel read in case the reader thread is still active and wait for its exit.

This can happen if we're ending sampling prematurely.

◆ buffer_next()

dberr_t Histogram_sampler::buffer_next ( )

Buffer next row.

Returns
error code
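
The start/end buffering events described on this page suggest a request/response handshake between the caller of buffer_next() and the reader thread. The following is a minimal sketch of such a handshake, not the InnoDB implementation: `Event_sketch`, `Row_handshake_sketch` and `run_handshake` are hypothetical names, the event is modelled with a standard mutex and condition variable rather than os_event_t, and a row is reduced to an `int`.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical auto-reset event; a stand-in for os_event_t, whose API differs.
class Event_sketch {
 public:
  void set() {
    {
      std::lock_guard<std::mutex> guard(m_mutex);
      m_is_set = true;
    }
    m_cond.notify_one();
  }

  void wait() {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] { return m_is_set; });
    m_is_set = false;  // Reset so that the next wait blocks again.
  }

 private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  bool m_is_set{false};
};

// Hypothetical model: the consumer asks for one row, the producer delivers one.
class Row_handshake_sketch {
 public:
  // Consumer side, in the role of buffer_next().
  int buffer_next() {
    m_start_buffer_event.set();  // Role of signal_start_of_buffering().
    m_end_buffer_event.wait();   // Role of wait_for_end_of_buffering().
    return m_buf;
  }

  // Producer side, in the role of the reader thread buffering one row.
  void buffer_row(int row) {
    m_start_buffer_event.wait();  // Role of wait_for_start_of_buffering().
    m_buf = row;
    m_end_buffer_event.set();     // Role of signal_end_of_buffering().
  }

 private:
  Event_sketch m_start_buffer_event;
  Event_sketch m_end_buffer_event;
  int m_buf{0};
};

// Drives n_rows handshakes and returns the rows in the order they arrived.
inline std::vector<int> run_handshake(int n_rows) {
  Row_handshake_sketch handshake;
  std::thread reader([&handshake, n_rows] {
    for (int i = 0; i < n_rows; ++i) handshake.buffer_row(i * 10);
  });
  std::vector<int> rows;
  for (int i = 0; i < n_rows; ++i) rows.push_back(handshake.buffer_next());
  reader.join();
  return rows;
}
```

In this sketch the producer never runs ahead of the consumer: it writes the buffer only after a request arrives, so the single shared buffer is never overwritten before it has been read.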

◆ finish_callback()

dberr_t Histogram_sampler::finish_callback ( Parallel_reader::Thread_ctx *  reader_thread_ctx)
private

Each parallel reader thread's end function.

Parameters
[in]  reader_thread_ctx  context information related to the thread
Returns
DB_SUCCESS or error code.

◆ init()

bool Histogram_sampler::init ( trx_t *  trx,
dict_index_t *  index,
row_prebuilt_t *  prebuilt 
)

Initialize the sampler context.

Parameters
[in]  trx  Transaction used for parallel read.
[in]  index  clustered index.
[in]  prebuilt  prebuilt info
Return values
true  on success.

◆ is_error_set()

bool Histogram_sampler::is_error_set ( ) const
inline private
Returns
true if in error state.

◆ process_leaf_rec()

dberr_t Histogram_sampler::process_leaf_rec ( const Parallel_reader::Ctx *  ctx,
row_prebuilt_t *  prebuilt 
)
private

Process the record in the leaf page.

This would happen only when the root page is the leaf page and in such a case we process the page regardless of the sampling percentage.

Parameters
[in]  ctx  Parallel read context.
[in]  prebuilt  Row meta-data cache.
Returns
error code

◆ process_non_leaf_rec()

dberr_t Histogram_sampler::process_non_leaf_rec ( const Parallel_reader::Ctx *  ctx,
row_prebuilt_t *  prebuilt 
)
private

For each record in a non-leaf block at level 1 (if leaf level is 0) check if the child page needs to be sampled and if so sample all the rows in the child page.

Parameters
[in]  ctx  Parallel read context.
[in]  prebuilt  Row meta-data cache.
Returns
error code
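
The description above amounts to page-level sampling: the decision is made once per child page, and a selected page contributes all of its rows. A rough sketch of that idea follows, under stated assumptions: `Page` and `sample_child_pages` are hypothetical names, rows are plain integers, and the comparison against the percentage is an illustrative choice, not taken from the InnoDB source.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical model of a leaf page: just the rows it holds.
using Page = std::vector<int>;

// Visits each child page once, the way a scan over level-1 records would,
// and either takes every row of the page or none of them.
inline std::vector<int> sample_child_pages(const std::vector<Page> &child_pages,
                                           double sampling_percentage,
                                           int sampling_seed) {
  assert(sampling_percentage >= 0.0 && sampling_percentage <= 100.0);

  std::mt19937 generator(
      static_cast<std::mt19937::result_type>(sampling_seed));
  // Values are drawn from [0, 100), so 100 selects every page and 0 none.
  std::uniform_real_distribution<double> distribution(0.0, 100.0);

  std::vector<int> sampled_rows;
  for (const Page &page : child_pages) {
    if (distribution(generator) >= sampling_percentage) {
      continue;  // Skip the whole child page.
    }
    sampled_rows.insert(sampled_rows.end(), page.begin(), page.end());
  }
  return sampled_rows;
}
```

Because the generator is seeded explicitly, two runs with the same seed, percentage and page list select the same pages in this sketch.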

◆ run()

dberr_t Histogram_sampler::run ( )

Start the sampling process.

Returns
DB_SUCCESS or error code.

◆ sample_rec()

dberr_t Histogram_sampler::sample_rec ( const Parallel_reader::Ctx *  ctx,
const rec_t *  rec,
ulint *  offsets,
const dict_index_t *  index,
row_prebuilt_t *  prebuilt 
)
private

Convert the row in InnoDB format to MySQL format and store in the buffer for server to use.

Parameters
[in]  ctx  Parallel read context.
[in]  rec  record that needs to be converted
[in]  offsets  offsets belonging to the record
[in]  index  index of the record
[in]  prebuilt  Row meta-data cache.
Returns
DB_SUCCESS or error code.

◆ set()

void Histogram_sampler::set ( byte *  buf)
inline

Set the buffer.

Parameters
[in]  buf  buffer to be used to store the row converted to MySQL format.

◆ set_error_state()

void Histogram_sampler::set_error_state ( dberr_t  err)
inline private

Set the error state.

Parameters
[in]  err  Error state to set to.

◆ signal_end_of_buffering()

void Histogram_sampler::signal_end_of_buffering ( )
private

Signal that the buffering of the row is complete.

◆ signal_start_of_buffering()

void Histogram_sampler::signal_start_of_buffering ( )
private

Signal that the next row needs to be buffered.

◆ skip()

bool Histogram_sampler::skip ( )

Check if the processing of the record needs to be skipped.

For a record in a non-leaf page, we decide whether the child page that the record points to needs to be skipped. For a record in a leaf page, we read the page regardless.

Returns
true if it needs to be skipped, else false.
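
As an illustration of how a seeded std::mt19937 and a uniform real distribution can drive such a decision, here is a small sketch. It is an assumption-laden model, not the actual member function: `Skip_decider_sketch` and `count_skipped` are hypothetical, the leaf/non-leaf distinction is passed in as a flag (the real skip() takes no arguments), and the exact comparison is a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical stand-in for the skip decision; not the InnoDB code.
class Skip_decider_sketch {
 public:
  Skip_decider_sketch(int sampling_seed, double sampling_percentage)
      : m_random_generator(
            static_cast<std::mt19937::result_type>(sampling_seed)),
        m_distribution(0.0, 100.0),
        m_sampling_percentage(sampling_percentage) {
    assert(sampling_percentage >= 0.0 && sampling_percentage <= 100.0);
  }

  // Returns true if the page should be skipped.
  bool skip(bool is_leaf_page) {
    if (is_leaf_page) {
      return false;  // Leaf pages are read regardless of the percentage.
    }
    // Draws from [0, 100): 100 percent never skips, 0 percent always skips.
    return m_distribution(m_random_generator) >= m_sampling_percentage;
  }

 private:
  std::mt19937 m_random_generator;
  std::uniform_real_distribution<double> m_distribution;
  double m_sampling_percentage;
};

// Helper for experimenting: how many of n_pages decisions were "skip"?
inline std::size_t count_skipped(int seed, double percentage,
                                 std::size_t n_pages, bool is_leaf_page) {
  Skip_decider_sketch decider(seed, percentage);
  std::size_t n_skipped = 0;
  for (std::size_t i = 0; i < n_pages; ++i) {
    if (decider.skip(is_leaf_page)) ++n_skipped;
  }
  return n_skipped;
}
```

With a fixed seed the sequence of decisions is repeatable, which is one plausible reason for exposing a sampling seed at all.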

◆ start_callback()

dberr_t Histogram_sampler::start_callback ( Parallel_reader::Thread_ctx *  reader_thread_ctx)
private

Each parallel reader thread's init function.

Parameters
[in]  reader_thread_ctx  context information related to the thread
Returns
DB_SUCCESS or error code.

Some data members of row_prebuilt_t, such as blob_heap, cannot be accessed safely in multi-threaded mode.

row_prebuilt_t is designed for single-threaded access, and sharing it among threads is not recommended unless "you know what you are doing". This is very fragile code as it stands.

To solve the blob heap issue in prebuilt, we ask each parallel reader thread to use its own blob heap, and we pass that heap to the function that converts a row from InnoDB format to MySQL format.
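
The per-thread heap idea can be pictured with a small standalone model. Everything here is hypothetical and greatly simplified: `Thread_ctx_sketch`, `convert_row` and `convert_in_parallel` are invented for illustration, a heap is a plain byte vector, and one thread is started per row only to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-thread context; the real Parallel_reader::Thread_ctx differs.
struct Thread_ctx_sketch {
  std::size_t thread_id{0};
  std::vector<char> blob_heap;  // Scratch memory owned by exactly one thread.
};

// Hypothetical conversion routine: it works in the heap handed in by the
// caller instead of reaching for scratch memory in a shared structure.
inline std::string convert_row(const std::string &row,
                               std::vector<char> &blob_heap) {
  blob_heap.assign(row.begin(), row.end());
  return std::string(blob_heap.begin(), blob_heap.end());
}

// Converts every row on its own thread, each with its own context and heap.
inline std::vector<std::string> convert_in_parallel(
    const std::vector<std::string> &rows) {
  std::vector<std::string> converted(rows.size());
  std::vector<Thread_ctx_sketch> contexts(rows.size());
  std::vector<std::thread> workers;

  for (std::size_t i = 0; i < rows.size(); ++i) {
    contexts[i].thread_id = i;
    // Each worker touches only rows[i], contexts[i] and converted[i].
    workers.emplace_back([&rows, &contexts, &converted, i] {
      converted[i] = convert_row(rows[i], contexts[i].blob_heap);
    });
  }
  for (std::thread &worker : workers) worker.join();
  return converted;
}
```

Since no two workers share a heap, the conversion needs no locking around its scratch memory in this model.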

◆ wait_for_end_of_buffering()

void Histogram_sampler::wait_for_end_of_buffering ( )
private

Wait till the buffering of the row is complete.

◆ wait_for_start_of_buffering()

void Histogram_sampler::wait_for_start_of_buffering ( )
private

Wait till there is a request to buffer the next row.

Member Data Documentation

◆ m_buf

byte* Histogram_sampler::m_buf {nullptr}
private

Buffer to store the sampled row which is in the MySQL format.

◆ m_distribution

std::uniform_real_distribution< double > Histogram_sampler::m_distribution
static private

Uniform distribution used by the random generator.

◆ m_end_buffer_event

os_event_t Histogram_sampler::m_end_buffer_event
private

Event to notify if the next row has been buffered.

◆ m_err

dberr_t Histogram_sampler::m_err {DB_SUCCESS}
private

Error code when the row was buffered.

◆ m_n_sampled

std::atomic_size_t Histogram_sampler::m_n_sampled
private

Number of rows sampled.

◆ m_parallel_reader

Parallel_reader Histogram_sampler::m_parallel_reader
private

The parallel reader.

◆ m_random_generator

std::mt19937 Histogram_sampler::m_random_generator
private

Random generator engine used to provide us random uniformly distributed values required to decide if the row in question needs to be sampled or not.

◆ m_sampling_method

enum_sampling_method Histogram_sampler::m_sampling_method {enum_sampling_method::NONE}
private

Sampling method to be used for sampling.

◆ m_sampling_percentage

double Histogram_sampler::m_sampling_percentage {}
private

Sampling percentage to be used for sampling.

◆ m_sampling_seed

int Histogram_sampler::m_sampling_seed {}
private

Sampling seed to be used for sampling.

◆ m_start_buffer_event

os_event_t Histogram_sampler::m_start_buffer_event
private

Event to notify if the next row needs to be buffered.


The documentation for this class was generated from the following files: