MySQL 8.0.40
Source Code Documentation
ddl::FTS::Parser Struct Reference

For parsing and sorting the documents.

Classes

struct  Handler
 Data structures for building an index.
 

Public Member Functions

 Parser (size_t id, Context &ctx, Dup *dup, bool doc_id_32_bit) noexcept
 Constructor.

 ~Parser () noexcept
 Destructor.

size_t id () const noexcept

dberr_t init (size_t n_threads) noexcept
 Initialize the data structures.

file_t release_file (size_t id) noexcept
 Releases ownership of the i'th file used.

dberr_t get_error () const noexcept

void set_error (dberr_t err) noexcept
 Set the error code.

dberr_t enqueue (FTS::Doc_item *doc_item) noexcept
 Enqueue a document to parse.

void parse (Builder *builder) noexcept
 Function performs parallel tokenization of the incoming doc strings.

void set_parent_state (Thread_state state) noexcept
 Set the parent thread state.
 

Public Attributes

Diagnostics_area da {false}
 

Private Types

using Docq = mpmc_bq< FTS::Doc_item * >
 
using Docq_ptr = std::unique_ptr< Docq, std::function< void(Docq *)> >
 
using Handler_ptr = std::unique_ptr< Handler, std::function< void(Handler *)> >
 
using Handlers = std::array< Handler_ptr, FTS_NUM_AUX_INDEX >
 

Private Member Functions

bool doc_tokenize (doc_id_t doc_id, fts_doc_t *doc, dtype_t *word_dtype, Tokenize_ctx *t_ctx) noexcept
 Tokenize incoming text data and add to the sort buffer.

void get_next_doc_item (FTS::Doc_item *&doc_item) noexcept
 Get next doc item from fts_doc_list.

void tokenize (fts_doc_t *doc, st_mysql_ftparser *parser, Tokenize_ctx *t_ctx) noexcept
 Tokenize by fts plugin parser.
 

Static Private Member Functions

static int add_word (MYSQL_FTPARSER_PARAM *param, char *word, int word_len, MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info) noexcept
 FTS plugin parser 'mysql_add_word' callback function for row merge.
 

Private Attributes

size_t m_id {}
 Parallel sort ID.

Dup* m_dup {}
 Descriptor of FTS index.

Context& m_ctx
 DDL context.

Handlers m_handlers {}
 Buffers etc.

bool m_doc_id_32_bit {}
 Whether to store the Doc ID as a 4-byte integer instead of an 8-byte integer during the sort. This applies when the Doc ID values are small enough that 8 bytes are not needed.

Docq_ptr m_docq {}
 Doc queue to process.

std::atomic_size_t m_memory_used {}
 Memory used by fts_doc_list.

Thread_state m_parent_state {Thread_state::UNKNOWN}
 Parent thread state.
 

Detailed Description

For parsing and sorting the documents.

Member Typedef Documentation

◆ Docq

using ddl::FTS::Parser::Docq = mpmc_bq<FTS::Doc_item *>
private

◆ Docq_ptr

using ddl::FTS::Parser::Docq_ptr = std::unique_ptr<Docq, std::function<void(Docq *)> >
private

◆ Handler_ptr

using ddl::FTS::Parser::Handler_ptr = std::unique_ptr<Handler, std::function<void(Handler *)> >
private
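
Both pointer aliases above pair `std::unique_ptr` with a type-erased `std::function` deleter, so the cleanup routine is chosen where the object is created rather than fixed in the type. The following is a minimal sketch of that pattern only. `Queue`, `Queue_ptr`, and `make_queue` are hypothetical stand-ins and are not part of the MySQL sources; how `Parser` actually constructs and destroys its queue and handlers is not shown on this page.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for a queue type; this is NOT mpmc_bq.
// It tracks the number of live instances through an external counter.
struct Queue {
  explicit Queue(int *live) : m_live(live) { ++*m_live; }
  ~Queue() { --*m_live; }
  int *m_live;
};

// Same shape as the Docq_ptr alias: the deleter is a std::function.
using Queue_ptr = std::unique_ptr<Queue, std::function<void(Queue *)>>;

// Create a queue whose deleter also records how often it was invoked.
Queue_ptr make_queue(int *live, int *deleter_calls) {
  assert(live != nullptr && deleter_calls != nullptr);
  return Queue_ptr(new Queue(live), [deleter_calls](Queue *q) {
    ++*deleter_calls;
    delete q;
  });
}
```

A default-constructed pointer of this kind holds an empty `std::function`. That is harmless while the pointer is null, because `std::unique_ptr` only calls the deleter for a non-null pointer.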

◆ Handlers

using ddl::FTS::Parser::Handlers = std::array<Handler_ptr, FTS_NUM_AUX_INDEX>
private

Constructor & Destructor Documentation

◆ Parser()

ddl::FTS::Parser::Parser ( size_t id,
Context &ctx,
Dup *dup,
bool doc_id_32_bit
)
noexcept

Constructor.

Parameters
[in]      id             Parser ID.
[in,out]  ctx            DDL context.
[in,out]  dup            Descriptor of FTS index being created.
[in]      doc_id_32_bit  Size of the doc ID column to use for sort.

◆ ~Parser()

ddl::FTS::Parser::~Parser ( )
noexcept

Destructor.

Member Function Documentation

◆ add_word()

int ddl::FTS::Parser::add_word ( MYSQL_FTPARSER_PARAM *param,
char *word,
int word_len,
MYSQL_FTPARSER_BOOLEAN_INFO *boolean_info
)
static private noexcept

FTS plugin parser 'mysql_add_word' callback function for row merge.

Refer to 'MYSQL_FTPARSER_PARAM' for more detail.

Parameters
[in]  param         Parser parameter.
[in]  word          Token word.
[in]  word_len      Word length.
[in]  boolean_info  Boolean info.
Returns
always 0 - plugin requirement.
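
A static member function can serve as a plain C-style callback because it has no implicit `this`; the object it should act on has to travel through an opaque pointer in the parameter block. The sketch below illustrates that general technique with simplified, hypothetical types. `Param`, `Add_word_fn`, `Collector`, and `toy_plugin_parse` are assumptions made for this example and do not reproduce `MYSQL_FTPARSER_PARAM` or the real plugin interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified parameter block (NOT MYSQL_FTPARSER_PARAM).
struct Param;
using Add_word_fn = int (*)(Param *param, char *word, int word_len);

struct Param {
  Add_word_fn add_word;  // Callback invoked by the plugin for each token.
  void *user_data;       // Opaque pointer handed back to the callback.
};

struct Collector {
  std::vector<std::string> m_words;

  // Static member: usable as a function pointer, recovers the object
  // from the opaque pointer stored in the parameter block.
  static int add_word(Param *param, char *word, int word_len) {
    assert(param != nullptr && word_len >= 0);
    auto *self = static_cast<Collector *>(param->user_data);
    self->m_words.emplace_back(word, static_cast<std::size_t>(word_len));
    return 0;  // This sketch always reports success.
  }
};

// Toy "plugin": splits on spaces and reports each token via the callback.
int toy_plugin_parse(Param *param, std::string text) {
  std::size_t pos = 0;
  while (pos < text.size()) {
    while (pos < text.size() && text[pos] == ' ') ++pos;  // Skip separators.
    const std::size_t start = pos;
    while (pos < text.size() && text[pos] != ' ') ++pos;  // Scan one token.
    if (pos > start) {
      const int rc =
          param->add_word(param, &text[start], static_cast<int>(pos - start));
      if (rc != 0) return rc;
    }
  }
  return 0;
}
```

The callback receives a pointer and a length rather than a terminated string, so the receiver copies exactly `word_len` characters.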

◆ doc_tokenize()

bool ddl::FTS::Parser::doc_tokenize ( doc_id_t doc_id,
fts_doc_t *doc,
dtype_t *word_dtype,
Tokenize_ctx *t_ctx
)
private noexcept

Tokenize incoming text data and add to the sort buffer.

Parameters
[in]      doc_id      Doc ID.
[in]      doc         Doc to be tokenized.
[in]      word_dtype  Data structure for word col.
[in,out]  t_ctx       Tokenize context.
Returns
true if the record passed, false if out of space

◆ enqueue()

dberr_t ddl::FTS::Parser::enqueue ( FTS::Doc_item *doc_item)
noexcept

Enqueue a document to parse.

Parameters
[in,out]  doc_item  Document to parse.
Returns
DB_SUCCESS or error code.
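
The hand-off between `enqueue()` on the producer side and `get_next_doc_item()` on the parsing side can be pictured as a bounded queue of document pointers. The sketch below is an illustration under stated assumptions, not the actual implementation: it uses a mutex-protected `std::deque` in place of `mpmc_bq`, and `Err`, `Doc`, `Doc_queue`, and `drain` are hypothetical names. The real error codes and queue behaviour are not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

// Hypothetical status codes; the real method returns dberr_t.
enum class Err { SUCCESS, QUEUE_FULL };

// Hypothetical document item.
struct Doc {
  std::string text;
};

class Doc_queue {
 public:
  explicit Doc_queue(std::size_t capacity) : m_capacity(capacity) {
    assert(capacity > 0);
  }

  // Producer side: hand over a document or report that the queue is full.
  Err enqueue(Doc *doc) {
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_items.size() >= m_capacity) return Err::QUEUE_FULL;
    m_items.push_back(doc);
    return Err::SUCCESS;
  }

  // Consumer side: fetch the next document through a reference to a
  // pointer; doc is set to nullptr when nothing is queued.
  void get_next(Doc *&doc) {
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_items.empty()) {
      doc = nullptr;
      return;
    }
    doc = m_items.front();
    m_items.pop_front();
  }

 private:
  std::size_t m_capacity;
  std::mutex m_mutex;
  std::deque<Doc *> m_items;
};

// Consumer loop: pull documents until the queue is empty, return the count.
std::size_t drain(Doc_queue &q) {
  std::size_t n = 0;
  Doc *doc = nullptr;
  for (q.get_next(doc); doc != nullptr; q.get_next(doc)) ++n;
  return n;
}
```

The queue stores raw pointers and never frees them, so in this sketch the caller keeps ownership of every `Doc`.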

◆ get_error()

dberr_t ddl::FTS::Parser::get_error ( ) const
inline noexcept
Returns
the parser error status.

◆ get_next_doc_item()

void ddl::FTS::Parser::get_next_doc_item ( FTS::Doc_item *&doc_item)
private noexcept

Get next doc item from fts_doc_list.

Parameters
[in,out]  doc_item  Doc item.

◆ id()

size_t ddl::FTS::Parser::id ( ) const
inline noexcept
Returns
the parser ID.

◆ init()

dberr_t ddl::FTS::Parser::init ( size_t n_threads)
noexcept

Initialize the data structures.

Parameters
[in]  n_threads  Number of parsing threads.
Returns
DB_SUCCESS or error code.

◆ parse()

void ddl::FTS::Parser::parse ( Builder *builder)
noexcept

Function performs parallel tokenization of the incoming doc strings.

Parameters
[in,out]  builder  Index builder instance.

◆ release_file()

file_t ddl::FTS::Parser::release_file ( size_t  id)
inline noexcept

Releases ownership of the i'th file used.

Returns
the i'th file.
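
Releasing ownership means the caller receives the file and the previous owner is left with an empty slot. The sketch below shows one common way to express that with `std::exchange`. `File`, `File_owner`, and the use of `-1` as the "no file" marker are assumptions for this example; the layout of `file_t` and of the handlers that hold the files is not documented on this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical file handle; -1 marks "no file" in this sketch only.
struct File {
  int fd{-1};
  bool is_open() const { return fd != -1; }
};

template <std::size_t N>
struct File_owner {
  std::array<File, N> m_files{};

  // Hand the i'th file to the caller and leave an empty slot behind.
  File release_file(std::size_t i) {
    assert(i < N);
    return std::exchange(m_files[i], File{});
  }
};
```

A second call for the same index returns an empty handle, which makes a double release detectable by the caller.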

◆ set_error()

void ddl::FTS::Parser::set_error ( dberr_t err)
inline noexcept

Set the error code.

Parameters
[in]  err  Error code to set.

◆ set_parent_state()

void ddl::FTS::Parser::set_parent_state ( Thread_state state)
inline noexcept

Set the parent thread state.

Parameters
[in]  state  The parent state.

◆ tokenize()

void ddl::FTS::Parser::tokenize ( fts_doc_t *doc,
st_mysql_ftparser *parser,
Tokenize_ctx *t_ctx
)
private noexcept

Tokenize by fts plugin parser.

Parameters
[in]      doc     Document to tokenize.
[in]      parser  Plugin parser instance.
[in,out]  t_ctx   Tokenize ctx instance.

Member Data Documentation

◆ da

Diagnostics_area ddl::FTS::Parser::da {false}

◆ m_ctx

Context& ddl::FTS::Parser::m_ctx
private

DDL context.

◆ m_doc_id_32_bit

bool ddl::FTS::Parser::m_doc_id_32_bit {}
private

Whether to store the Doc ID as a 4-byte integer instead of an 8-byte integer during the sort. This applies when the Doc ID values are small enough that 8 bytes are not needed.
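
The effect of such a flag on a sort record can be sketched as a choice of field width when the Doc ID is serialized. The example below is a hypothetical illustration: `fits_in_32_bits` and `write_doc_id` are invented names, and the big-endian byte order is an assumption chosen so that byte-wise comparison matches numeric order. How InnoDB decides the flag and lays out the field is not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// True when a Doc ID of this size can be stored in 4 bytes.
bool fits_in_32_bits(std::uint64_t max_doc_id) {
  return max_doc_id <= std::numeric_limits<std::uint32_t>::max();
}

// Append doc_id to buf using 4 or 8 bytes, most significant byte first
// (assumed order). Returns the number of bytes written.
std::size_t write_doc_id(std::vector<unsigned char> &buf,
                         std::uint64_t doc_id, bool doc_id_32_bit) {
  // A narrow field must never be asked to hold a value that needs 8 bytes.
  assert(!doc_id_32_bit || fits_in_32_bits(doc_id));
  const std::size_t width = doc_id_32_bit ? 4 : 8;
  for (std::size_t i = 0; i < width; ++i) {
    const std::size_t shift = 8 * (width - 1 - i);
    buf.push_back(static_cast<unsigned char>((doc_id >> shift) & 0xFF));
  }
  return width;
}
```

With the narrow encoding each record carries 4 fewer bytes for the Doc ID, which is the saving the flag is meant to capture.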

◆ m_docq

Docq_ptr ddl::FTS::Parser::m_docq {}
private

Doc queue to process.

◆ m_dup

Dup* ddl::FTS::Parser::m_dup {}
private

Descriptor of FTS index.

◆ m_handlers

Handlers ddl::FTS::Parser::m_handlers {}
private

Buffers etc.

◆ m_id

size_t ddl::FTS::Parser::m_id {}
private

Parallel sort ID.

◆ m_memory_used

std::atomic_size_t ddl::FTS::Parser::m_memory_used {}
private

Memory used by fts_doc_list.
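
An atomic counter lets several threads account for queued memory without a lock. The sketch below shows the basic add-on-enqueue, subtract-on-consume bookkeeping with `std::atomic_size_t`. `Memory_counter` and its methods are hypothetical, and the relaxed memory ordering is an assumption that is adequate only for a pure statistic; the page does not say how `m_memory_used` is updated or read.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical memory accounting helper.
struct Memory_counter {
  std::atomic_size_t m_used{};

  // Record n bytes handed to the queue.
  void add(std::size_t n) noexcept {
    m_used.fetch_add(n, std::memory_order_relaxed);
  }

  // Record n bytes released after a document was processed.
  void sub(std::size_t n) noexcept {
    const std::size_t before = m_used.fetch_sub(n, std::memory_order_relaxed);
    assert(before >= n);  // Releasing more than was added is a bug.
    (void)before;
  }

  // Current number of bytes accounted for.
  std::size_t used() const noexcept {
    return m_used.load(std::memory_order_relaxed);
  }
};
```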

◆ m_parent_state

Thread_state ddl::FTS::Parser::m_parent_state {Thread_state::UNKNOWN}
private

Parent thread state.


The documentation for this struct was generated from the following file: