MySQL 8.4.3
Source Code Documentation
regexp::Regexp_facade Class Reference

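This class handles character set conversion, re-compilation of non-constant patterns, NULL handling, and conversion between indexing conventions.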

#include <regexp_facade.h>

Public Member Functions

bool SetPattern (Item *pattern_expr, uint32_t flags)
 Sets the pattern if called for the first time or the pattern_expr is non-constant.
 
std::optional< bool > Matches (Item *subject_expr, int start, int occurrence)
 Tries to match the subject against the compiled regular expression.
 
std::optional< int > Find (Item *subject_expr, int start, int occurrence, bool after_match)
 Searches the subject for a match of the compiled regular expression and returns a position.
 
String * Replace (Item *subject_expr, Item *replacement_expr, int start, int occurrence, String *result)
 
String * Substr (Item *subject_expr, int start, int occurrence, String *result)
 
void cleanup ()
 Delete the "engine" data structure after execution.
 
bool EngineHasWarning () const
 Did any operation return a warning? For unit testing.
 

Private Member Functions

bool Reset (Item *subject_expr, int start=1)
 Resets the compiled regular expression with a new string.
 
bool SetupEngine (Item *pattern_expr, uint flags)
 Actually compiles the regular expression.
 
int ConvertCodePointToLibPosition (int position) const
 Converts a string position in m_current_subject.
 
int ConvertLibPositionToCodePoint (int position) const
 Converts a string position in m_current_subject.
 
String * AssignResult (const char *str, size_t length, String *result)
 Helper function for setting the result from SQL regular expression functions that return a string value.
 

Private Attributes

unique_ptr_destroy_only< Regexp_engine > m_engine
 Used for all the actual regular expression matching, search-and-replace, and positional and string information.
 
std::u16string m_current_subject
 ICU does not copy the subject string, so we keep the subject buffer here.
 

Detailed Description

This class handles:

  • Conversion to the regexp library's character set, and buffers the converted strings during matching.
  • Re-compilation of the regular expression in case the pattern is a field reference or otherwise non-constant.
  • NULL handling.
  • Conversion between indexing conventions. Clients of this class can use one-based indexing, while the classes used by this class use zero-based indexing.

Member Function Documentation

◆ AssignResult()

String * regexp::Regexp_facade::AssignResult ( const char *  str,
size_t  length,
String *  result 
)
private

Helper function for setting the result from SQL regular expression functions that return a string value.

Depending on character sets used by arguments and result, this function may copy, convert or just set the result. In particular, it handles the special case of the BINARY character set being interpreted as CP-1252.

Parameters
str  The result string from the regexp function.
length  Length in bytes.
[out] result  The result string.
Returns
A pointer to the same string as the argument, or nullptr in case of failure.

◆ cleanup()

void regexp::Regexp_facade::cleanup ( )
inline

Delete the "engine" data structure after execution.

◆ ConvertCodePointToLibPosition()

int regexp::Regexp_facade::ConvertCodePointToLibPosition ( int  position) const
private

Converts a string position in m_current_subject.

Parameters
position  One-based code point position.
Returns
Zero-based byte position.

◆ ConvertLibPositionToCodePoint()

int regexp::Regexp_facade::ConvertLibPositionToCodePoint ( int  position) const
private

Converts a string position in m_current_subject.

Parameters
position  Zero-based UTF-16 position.
Returns
Zero-based code point position.
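
The two conversions above can be illustrated with a minimal sketch. It is not the MySQL implementation: the helper names (IsHighSurrogate, IsLowSurrogate, UnitsAt, CodePointToUnitIndex, UnitIndexToCodePoint) are hypothetical, and it assumes that a library position is an index into the UTF-16 code units of a std::u16string such as m_current_subject, so that a code point outside the Basic Multilingual Plane occupies two units.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only; not part of Regexp_facade.
inline bool IsHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool IsLowSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Number of UTF-16 code units used by the code point that starts at index i.
inline std::size_t UnitsAt(const std::u16string &s, std::size_t i) {
  const bool is_pair = IsHighSurrogate(s[i]) && i + 1 < s.size() &&
                       IsLowSurrogate(s[i + 1]);
  return is_pair ? 2 : 1;
}

// One-based code point position -> zero-based UTF-16 code unit index.
// Assumption: positions past the end are clamped to s.size().
int CodePointToUnitIndex(const std::u16string &s, int position) {
  std::size_t i = 0;
  for (int cp = 1; cp < position && i < s.size(); ++cp) i += UnitsAt(s, i);
  return static_cast<int>(i);
}

// Zero-based UTF-16 code unit index -> zero-based code point index.
int UnitIndexToCodePoint(const std::u16string &s, int index) {
  if (index < 0) return 0;
  const std::size_t limit = static_cast<std::size_t>(index);
  int cp = 0;
  for (std::size_t i = 0; i < s.size() && i < limit; i += UnitsAt(s, i)) ++cp;
  return cp;
}
```

For the subject u"a\U0001F600b", the third code point ('b') starts at code unit index 3, because the emoji in the middle is stored as a surrogate pair.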

◆ EngineHasWarning()

bool regexp::Regexp_facade::EngineHasWarning ( ) const
inline

Did any operation return a warning? For unit testing.

◆ Find()

std::optional< int > regexp::Regexp_facade::Find ( Item *  subject_expr,
int  start,
int  occurrence,
bool  after_match 
)

Searches the subject for a match of the compiled regular expression and returns a position.

Parameters
subject_expr  The string to search.
start  Start position, 1-based.
occurrence  Which occurrence of the pattern should be searched for.
after_match  If true, the position following the end of the match is returned. If false, the position before the match is returned.
Returns
The position of the first character of the match, or of the character following the match if after_match is true. An empty value is returned if no match is found.
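
The interplay of start, occurrence and after_match can be sketched as follows. This is an illustration under stated assumptions, not the facade's code: FindSketch is a hypothetical function, std::regex stands in for the real engine, the subject is plain ASCII so that code points and bytes coincide, and the returned position is assumed to be one-based like the start argument.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical stand-in for Find(); std::regex replaces the real engine.
std::optional<int> FindSketch(const std::string &subject, const std::regex &re,
                              int start, int occurrence, bool after_match) {
  if (start < 1 || occurrence < 1 ||
      static_cast<std::size_t>(start) > subject.size() + 1) {
    return std::nullopt;
  }
  // Translate the one-based start into a zero-based offset for the library.
  const int offset = start - 1;
  auto it = std::sregex_iterator(subject.begin() + offset, subject.end(), re);
  const auto end = std::sregex_iterator();
  for (int n = 1; it != end; ++it, ++n) {
    if (n != occurrence) continue;
    // position() is relative to the search start, so add the offset back.
    int zero_based = offset + static_cast<int>(it->position());
    if (after_match) zero_based += static_cast<int>(it->length());
    return zero_based + 1;  // back to one-based for the caller (assumption)
  }
  return std::nullopt;  // fewer than `occurrence` matches
}
```

With the subject "dog cat dog" and the pattern "dog", the second occurrence is reported at position 9, and the first occurrence with after_match set is reported at position 4.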

◆ Matches()

std::optional< bool > regexp::Regexp_facade::Matches ( Item *  subject_expr,
int  start,
int  occurrence 
)

Tries to match the subject against the compiled regular expression.

Parameters
subject_expr  Is evaluated into a string to search.
start  Start position, 1-based.
occurrence  Which occurrence of the pattern should be searched for.
Return values
true  A match was found.
false  A match was not found.
nullptr  Either the engine was not compiled, or subject_expr evaluates to NULL. This is useful for the Item_func_regexp object, since it doesn't have to make a special case for when the regular expression is NULL. Instead, the case is handled here in the facade.
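
The three-valued result can be pictured with a small sketch. It assumes that the empty state of std::optional< bool > plays the role of SQL NULL; MatchesSketch and ToSqlText are hypothetical names, and std::regex and std::optional< std::string > stand in for the compiled engine and the evaluated subject_expr.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical stand-in for Matches(): an empty engine or an empty subject
// yields an empty optional, so the caller needs no special case for NULL.
std::optional<bool> MatchesSketch(const std::optional<std::regex> &engine,
                                  const std::optional<std::string> &subject) {
  if (!engine.has_value() || !subject.has_value()) return std::nullopt;
  return std::regex_search(*subject, *engine);
}

// Renders the tri-state result the way a SQL client would display it.
std::string ToSqlText(const std::optional<bool> &result) {
  if (!result.has_value()) return "NULL";
  return *result ? "1" : "0";
}
```

A caller can forward the optional unchanged: a missing pattern and a missing subject both surface as NULL without any extra branch on the caller's side.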

◆ Replace()

String * regexp::Regexp_facade::Replace ( Item *  subject_expr,
Item *  replacement_expr,
int  start,
int  occurrence,
String *  result 
)
Parameters
subject_expr  The string to search.
replacement_expr  The string to replace the match with.
start  Start position, 1-based.
occurrence  Which occurrence of the pattern should be searched for.
[in,out] result  Holds the buffer for writing the result.

◆ Reset()

bool regexp::Regexp_facade::Reset ( Item *  subject_expr,
int  start = 1 
)
private

Resets the compiled regular expression with a new string.

Parameters
subject_expr  The new string to search.
start  If present, start on this code point.
Return values
false  OK.
true  Either there is no compiled regular expression, or the expression evaluated to NULL.

◆ SetPattern()

bool regexp::Regexp_facade::SetPattern ( Item *  pattern_expr,
uint32_t  flags 
)

Sets the pattern if called for the first time or the pattern_expr is non-constant.

This function is meant to be called for every row in a command such as

SELECT regexp_like( column, 'a+' ) FROM table;

In this case, the client of this class may call SetPattern() for every row without paying any penalty, as this becomes a no-op for all subsequent calls. In cases such as

SELECT regexp_like( column, regexp_column ) FROM table;

the regexp_column expression is non-constant, and hence the regular expression has to be recompiled for each row.
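
The compile-once behavior can be sketched as follows. PatternCacheSketch is a hypothetical class, not the facade itself: std::regex stands in for the engine, the is_constant flag stands in for whatever the real code asks of pattern_expr, and the return convention (false for success, true for failure) is an assumption borrowed from Reset().

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical illustration of "compile once for constant patterns".
class PatternCacheSketch {
 public:
  // Returns false on success and true on failure (assumed convention).
  bool SetPattern(const std::string &pattern, bool is_constant) {
    // A constant pattern that is already compiled makes this call a no-op.
    if (m_engine.has_value() && is_constant) return false;
    try {
      m_engine.emplace(pattern);  // (re)compile
    } catch (const std::regex_error &) {
      m_engine.reset();
      return true;
    }
    ++m_compilations;
    return false;
  }

  // Number of successful compilations so far, for demonstration purposes.
  int compilations() const { return m_compilations; }

 private:
  std::optional<std::regex> m_engine;
  int m_compilations = 0;
};
```

Calling SetPattern() once per row with a constant pattern compiles a single time, while a per-row pattern taken from a column compiles once for every row.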

◆ SetupEngine()

bool regexp::Regexp_facade::SetupEngine ( Item *  pattern_expr,
uint  flags 
)
private

Actually compiles the regular expression.

◆ Substr()

String * regexp::Regexp_facade::Substr ( Item *  subject_expr,
int  start,
int  occurrence,
String *  result 
)

Member Data Documentation

◆ m_current_subject

std::u16string regexp::Regexp_facade::m_current_subject
private

ICU does not copy the subject string, so we keep the subject buffer here.

A call to Reset() causes it to be overwritten.

See also
Regexp_engine::reset()
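
The ownership arrangement described here can be sketched with standard types. Both classes are hypothetical: ViewEngineSketch models a library that keeps only a view of the subject, and FacadeSketch models an owner that stores the buffer next to it. The sketch assumes C++17 for std::u16string_view and is not how Regexp_engine is implemented.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical engine that, like the library described above, does not copy
// the subject; it only remembers where the characters live.
class ViewEngineSketch {
 public:
  void Reset(std::u16string_view subject) { m_subject = subject; }
  std::size_t SubjectLength() const { return m_subject.size(); }
  char16_t At(std::size_t i) const { return m_subject[i]; }

 private:
  std::u16string_view m_subject;
};

// Hypothetical owner: the buffer outlives every use the engine makes of it.
class FacadeSketch {
 public:
  void Reset(std::u16string subject) {
    m_current_subject = std::move(subject);  // overwrites the previous subject
    m_engine.Reset(m_current_subject);       // view the member, not the argument
  }
  const ViewEngineSketch &engine() const { return m_engine; }

 private:
  std::u16string m_current_subject;
  ViewEngineSketch m_engine;
};
```

Handing the engine a view of the function argument instead of the member would leave it pointing at destroyed storage as soon as Reset() returns, which is the situation the member buffer avoids.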

◆ m_engine

unique_ptr_destroy_only<Regexp_engine> regexp::Regexp_facade::m_engine
private

Used for all the actual regular expression matching, search-and-replace, and positional and string information.

If either the regular expression pattern or the subject is NULL, this pointer is empty.


The documentation for this class was generated from the following files: