WL#4772: Improve building of classes that represent DML statements
Status: Complete
DML statements (i.e. SELECT, UPDATE, DELETE, INSERT and similar statements) are represented by objects from the Item class hierarchy and by st_select_lex and st_select_lex_unit objects. This WL fixes a problem in how these object hierarchies are built: st_select_lex and st_select_lex_unit objects are currently built top-down rather than bottom-up. This is a major obstacle to simplifying our SQL parser, because we want the parser to use its embedded stack and build data structures bottom-up. The worklog also removes the base class st_select_lex_node.
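To illustrate what bottom-up construction on a parser stack means, here is a minimal C++ sketch. It is not server code: `Block`, `ValueStack`, `reduce_block()` and `build_nested_example()` are hypothetical names, and the WHERE clause is reduced to a plain string. The point is only that inner results are finished first and are handed to the parent's constructor, so the parent never exists in a half-built state.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a query block; not the server's class.
struct Block {
  std::string where;                               // WHERE text, empty if none
  std::vector<std::unique_ptr<Block>> subqueries;  // nested blocks

  Block(std::string w, std::vector<std::unique_ptr<Block>> subs)
      : where(std::move(w)), subqueries(std::move(subs)) {}
};

// Models the parser's value stack: finished sub-results wait here until the
// enclosing grammar rule is reduced.
using ValueStack = std::vector<std::unique_ptr<Block>>;

// "Reduce" a query block rule: pop the n_subqueries most recent results and
// pass them to the constructor, so the parent is complete once it exists.
void reduce_block(ValueStack &stack, std::string where,
                  std::size_t n_subqueries) {
  assert(stack.size() >= n_subqueries);
  std::vector<std::unique_ptr<Block>> subs;
  auto first = stack.end() - static_cast<std::ptrdiff_t>(n_subqueries);
  for (auto it = first; it != stack.end(); ++it) subs.push_back(std::move(*it));
  stack.erase(first, stack.end());
  stack.push_back(std::make_unique<Block>(std::move(where), std::move(subs)));
}

// Builds the shape of: SELECT ... WHERE a IN (SELECT ... WHERE b > 1)
std::unique_ptr<Block> build_nested_example() {
  ValueStack stack;
  reduce_block(stack, "b > 1", 0);            // innermost block first
  reduce_block(stack, "a IN (subquery)", 1);  // outer block consumes it
  std::unique_ptr<Block> top = std::move(stack.back());
  stack.pop_back();
  return top;
}
```

In a top-down scheme the outer block would be created first and the subquery attached to it afterwards, which requires mutable "current block" state outside the parser stack.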
1. The LEX class

The static st_select_lex and st_select_lex_unit objects contained in this class are replaced with pointers to such objects. Functions that generate select_lex objects and attach them to the LEX are added.

2. The st_select_lex class

This class can be used to represent an SQL query specification object (a complete SELECT clause), a VALUES clause or a TABLE clause. These constructs are commonly referred to as query blocks (they may be regarded as the building blocks of a query expression). The class always refers to the tables involved in the clause. For a SELECT clause, it also lists the selected expressions, the WHERE clause, the HAVING clause, the GROUP BY and ORDER BY column lists, as well as other relevant information. For a TABLE clause, it represents the associated table and a "SELECT *" operation against the table. The class is currently not equipped to support a VALUES clause.

3. The st_select_lex_unit class

This class represents an SQL query expression, i.e. one or more query blocks (usually SELECT clauses). If there is more than one query block, the blocks must be syntactically combined with UNION or UNION ALL. A SELECT statement always contains one query expression object (the top-level or outer-most query expression). In addition to this object, an SQL statement contains one st_select_lex_unit object per subquery.

Note for QA: This is a clean refactoring that affects the internal representation of DML statements. There is no new functionality and there are no result changes, possibly apart from some changed references inside plan descriptions. Each query has two more object allocations, so a tiny performance impact is expected; however, it should be well below 1% of overall performance. Simple queries (e.g. sysbench) should show the most visible performance impact (if any), and can be used to check for performance regressions.
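The relationship between query expressions and query blocks described above can be sketched as two mutually nested types. This is a simplified model under stated assumptions, not the server's definitions: `QueryExpression` stands in for st_select_lex_unit, `QueryBlock` for st_select_lex, and the helper functions are hypothetical. The example models `SELECT a FROM t1 WHERE a IN (SELECT b FROM t2) UNION ALL SELECT c FROM t3`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct QueryExpression;  // hypothetical model of st_select_lex_unit

// Hypothetical model of st_select_lex: one query block (e.g. a SELECT).
struct QueryBlock {
  // One nested query expression per subquery appearing in this block.
  std::vector<std::unique_ptr<QueryExpression>> subqueries;
};

// One or more query blocks, combined with UNION / UNION ALL if more than one.
struct QueryExpression {
  std::vector<std::unique_ptr<QueryBlock>> blocks;
};

std::size_t count_expressions(const QueryExpression &unit);

// Number of query expressions nested anywhere below one block.
std::size_t expressions_in_block(const QueryBlock &block) {
  std::size_t n = 0;
  for (const auto &sub : block.subqueries) n += count_expressions(*sub);
  return n;
}

// Number of query expressions in the statement: this one plus all nested ones.
std::size_t count_expressions(const QueryExpression &unit) {
  std::size_t n = 1;
  for (const auto &b : unit.blocks) n += expressions_in_block(*b);
  return n;
}

std::unique_ptr<QueryExpression> build_union_example() {
  auto sub = std::make_unique<QueryExpression>();
  sub->blocks.push_back(std::make_unique<QueryBlock>());  // SELECT b FROM t2

  auto left = std::make_unique<QueryBlock>();   // SELECT a FROM t1 WHERE ...
  left->subqueries.push_back(std::move(sub));
  auto right = std::make_unique<QueryBlock>();  // SELECT c FROM t3

  auto top = std::make_unique<QueryExpression>();
  top->blocks.push_back(std::move(left));
  top->blocks.push_back(std::move(right));
  return top;
}
```

For this statement the model holds one top-level query expression with two query blocks, plus one further query expression for the subquery.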
1. The LEX class

New functions:

new_top_level_query() - Creates the first st_select_lex_unit and st_select_lex objects and attaches them to the current LEX object. The change here is that these objects used to be a static part of the LEX object. The function is called in only one place, i.e. in the lex_start() function. Later on, this function should be called only when needed, i.e. for DML type statements and for CREATE TABLE AS and CREATE VIEW statements. The function is basically implemented by a call to new_query().

new_query() - Creates an st_select_lex_unit object and an st_select_lex object and links them below LEX::current_select. The function is used when a top-level query specification is encountered, when a subquery expression is encountered, and when the left-most query part of a UNION construct is encountered. It is also used to represent the top-level part of INSERT, UPDATE and DELETE statements.

new_union_query() - Creates an st_select_lex object for each query block in a UNION construct except the left-most one, and attaches it to an existing st_select_lex_unit object. Also creates a 'fake_select' object if it does not already exist.

new_static_query() - Creates a top-level query with objects allocated statically. Used in certain parser contexts where no additional subqueries are expected.

new_empty_query_block() - Creates an empty query block and associates it with this LEX object.

reset() - Resets the LEX object so that it is ready for a new execution.

Modified functions:

lex_start() - Creates the top-level st_select_lex_unit and st_select_lex objects.

2. The st_select_lex class

The class now inherits from Sql_alloc instead of st_select_lex_node.

Functions init_query() and init_select() have been replaced with a constructor that accepts pointers to the basic parts of a query specification object, i.e. a list of tables, a list of selected expressions, a WHERE clause, a HAVING clause, etc.

The function set_context() is added. It sets the name resolution context for this query specification based on the outer select_lex object and the placement of this subquery in the outer query block (i.e. whether it is located in the WHERE, HAVING or ON clause, and whether or not it is a derived table).

include_down(), include_neighbour(), include_standalone() and include_global() are re-implemented without reliance on a base class.

include_global_chain() is a new function that adds a chain of st_select_lex objects to the global list.

3. The st_select_lex_unit class

The class now inherits from Sql_alloc instead of st_select_lex_node.

Functions init_query() and init_select() have been replaced with a simple constructor.

include_down() is re-implemented without reliance on a base class.

include_chain() is a new function that adds a chain of st_select_lex_unit objects to an st_select_lex object.
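The linking performed by new_top_level_query(), new_query() and new_union_query() can be sketched roughly as follows. This is a simplified illustration, not the server implementation: the types `Lex`, `Unit` and `Block`, all member names, and the function signatures (for example passing the outer block as an argument instead of reading LEX::current_select) are assumptions made for the sketch. The 'fake_select' object, name resolution contexts and the global list are left out, and a pair of vectors stands in for the server's memory allocation.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Unit;

// Hypothetical stand-in for st_select_lex (a query block).
struct Block {
  Unit *master = nullptr;       // query expression this block belongs to
  Block *next = nullptr;        // next block in the same UNION
  Unit *first_inner = nullptr;  // first subquery expression below this block
};

// Hypothetical stand-in for st_select_lex_unit (a query expression).
struct Unit {
  Block *outer = nullptr;        // enclosing block, null at top level
  Block *first_block = nullptr;  // left-most query block
  Unit *next = nullptr;          // next sibling below the same outer block
};

struct Lex {
  Unit *unit = nullptr;      // top-level query expression: a pointer, not a member object
  Block *current = nullptr;  // block currently being processed

  // Simple ownership arena so the raw pointers above stay valid.
  std::vector<std::unique_ptr<Unit>> units;
  std::vector<std::unique_ptr<Block>> blocks;

  // Creates a unit/block pair and links it below 'outer' (null = top level).
  Block *new_query(Block *outer) {
    units.push_back(std::make_unique<Unit>());
    blocks.push_back(std::make_unique<Block>());
    Unit *u = units.back().get();
    Block *b = blocks.back().get();
    u->outer = outer;
    u->first_block = b;
    b->master = u;
    if (outer == nullptr) {
      unit = u;
    } else {
      u->next = outer->first_inner;  // prepend to the outer block's subqueries
      outer->first_inner = u;
    }
    current = b;
    return b;
  }

  Block *new_top_level_query() { return new_query(nullptr); }

  // Adds a further UNION block to the unit that 'prev' already belongs to.
  Block *new_union_query(Block *prev) {
    assert(prev != nullptr && prev->next == nullptr);
    blocks.push_back(std::make_unique<Block>());
    Block *b = blocks.back().get();
    b->master = prev->master;
    prev->next = b;
    current = b;
    return b;
  }
};
```

A statement with one subquery and one UNION would then call new_top_level_query() once, new_query() once for the subquery, and new_union_query() once for the right-hand block, giving two units and three blocks.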
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.