WL#8907: Parser refactoring: merge all SELECT rules into one.

Affects: Server-8.0   —   Status: Complete

We have 7 separate grammar rules for SELECT in different contexts:

  • top-level SELECT,
  • CREATE TABLE ... SELECT and INSERT/REPLACE ... SELECT,
  • CREATE VIEW ... AS SELECT,
  • inner (SELECT ...) inside a UNION,
  • derived table,
  • a special case of a derived table in a join list,
  • another special case for a UNION-based derived table.

Some of these rules are identical, but others differ slightly, so our syntax for SELECT and UNION differs between contexts. We also have to maintain similar but distinct parse tree node objects for each of these contexts.

The idea of this WL is to merge all SELECT grammar rules into one, both to deduplicate the grammar and code and to make the SELECT syntax uniform in every context. One benefit is that adding a WITH clause to the various forms of SELECT becomes a single code change in sql_yacc.yy.
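The deduplication argument can be illustrated with a toy recursive-descent recognizer. This is only a sketch under stated assumptions: the grammar below is a drastically simplified stand-in for sql_yacc.yy, and the names ToyParser, parses and operand are hypothetical. It shows one shared query_expression rule being reused by a top-level SELECT, INSERT ... SELECT and CREATE VIEW ... AS SELECT, so that a WITH clause added in that single rule is accepted in all three contexts.

```python
import re

KEYWORDS = {"WITH", "AS", "SELECT", "FROM", "UNION",
            "INSERT", "INTO", "CREATE", "VIEW"}


def tokenize(sql):
    # Upper-cased words/numbers and parentheses; everything else is ignored.
    return [t.upper() for t in re.findall(r"\w+|[()]", sql)]


class ToyParser:
    """Hypothetical toy grammar, NOT the real MySQL grammar."""

    def __init__(self, sql):
        self.tokens = tokenize(sql)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def accept(self, token):
        if self.peek() == token:
            self.pos += 1
            return True
        return False

    def expect(self, token):
        if not self.accept(token):
            raise SyntaxError(f"expected {token}, found {self.peek()}")

    def operand(self):
        # Toy simplification: any non-keyword word or number.
        token = self.peek()
        if token is None or token in KEYWORDS or token in ("(", ")"):
            raise SyntaxError(f"unexpected {token}")
        self.pos += 1

    def query_specification(self):
        self.expect("SELECT")
        self.operand()
        if self.accept("FROM"):
            self.operand()

    def query_expression(self):
        # The single shared rule: the WITH clause lives only here.
        if self.accept("WITH"):
            self.operand()
            self.expect("AS")
            self.expect("(")
            self.query_expression()
            self.expect(")")
        self.query_specification()
        while self.accept("UNION"):
            self.query_specification()

    def statement(self):
        # Each context only parses its own prefix, then delegates.
        if self.accept("INSERT"):
            self.expect("INTO")
            self.operand()
        elif self.accept("CREATE"):
            self.expect("VIEW")
            self.operand()
            self.expect("AS")
        self.query_expression()
        if self.peek() is not None:
            raise SyntaxError(f"trailing input at {self.peek()}")


def parses(sql):
    try:
        ToyParser(sql).statement()
        return True
    except SyntaxError:
        return False
```

With separate per-context rules, the WITH branch would have to be repeated in each of them. Here statement() only consumes the context-specific prefix and then calls the shared rule.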

Functional: N/A.

Non-functional: No performance changes.

Implementation plan

The implementation plan is:

  1. INSERT/REPLACE ... SELECT: use query_expression in the existing insert_query_expression
  2. CREATE TABLE ... SELECT
  3. CREATE/ALTER VIEW

Part I: INSERT/REPLACE ... SELECT

Old syntax rules

 (INSERT|REPLACE)... insert_query_expression

where

 insert_query_expression ::=
   create_select [opt_union_clause]
 | '(' create_select ')' [union_opt]
                          

New syntax rules

 insert_query_expression ::=
   query_expression | query_expression_paren

Visible and invisible side effects

  • Error messages: as in a regular SELECT, we now output ER_PARSE_ERROR instead of the "Incorrect usage of UNION and ORDER BY" error in unions. The same applies to the next two parts.
  • -3 shift/reduce conflicts, a result of unifying create_select with query_specification and select_part2.

Part II: CREATE TABLE ... SELECT

Old syntax

 CREATE TABLE ... create2

where

 create2 ::=
   '(' <fields> ')' [<table options>] [<partitioning>] [create3]
 | '(' [<partitioning>] create_select ')' union_opt
 | [<table options>] [<partitioning>] [create3]
 | ...
 create3 ::=
   [REPLACE|IGNORE] [AS] create_select [opt_union_clause]
 | [REPLACE|IGNORE] [AS] '(' create_select ')' [union_opt]
  • Note: The following syntax is undocumented and does not seem to be in use, so the WL proposes removing it:

 CREATE TABLE ... (PARTITION ... SELECT ...) ...

New syntax

 create2 ::=
   '(' [<fields>] ')' opt_create_table_options_etc
 | opt_create_table_options_etc
 opt_create_table_options_etc ::=
   create_table_options opt_create_partitioning_etc
 | opt_create_partitioning_etc
 opt_create_partitioning_etc ::=
   create_partitioning [opt_duplicate_as_qe] | [opt_duplicate_as_qe]
 opt_duplicate_as_qe ::=
   duplicate as_create_query_expression
 | as_create_query_expression
 as_create_query_expression ::=
   AS create_query_expression
 | create_query_expression
 create_query_expression ::= query_expression | query_expression_or_parens
  • Note: The rules are nested this way to avoid introducing reduce/reduce conflicts. The nesting also removes one shift/reduce conflict.
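To see what the nested rules generate, each layer can be expanded mechanically. The sketch below is an illustration only: it mirrors the rule names above as hypothetical Python functions and treats <fields>, <table options>, <partitioning>, <duplicate> and <query expression> as opaque placeholder tokens. Each layer either prepends its own element or passes the tail through unchanged, so every optional element keeps a fixed position in the statement.

```python
def as_create_query_expression():
    # AS create_query_expression | create_query_expression
    return [["AS", "<query expression>"], ["<query expression>"]]


def opt_duplicate_as_qe():
    # [ duplicate as_create_query_expression | as_create_query_expression ]
    forms = [[]]  # the whole tail may be absent
    for tail in as_create_query_expression():
        forms.append(["<duplicate>"] + tail)
        forms.append(tail)
    return forms


def opt_create_partitioning_etc():
    # create_partitioning [opt_duplicate_as_qe] | [opt_duplicate_as_qe]
    forms = []
    for tail in opt_duplicate_as_qe():
        forms.append(["<partitioning>"] + tail)
        forms.append(tail)
    return forms


def opt_create_table_options_etc():
    # create_table_options opt_create_partitioning_etc
    # | opt_create_partitioning_etc
    forms = []
    for tail in opt_create_partitioning_etc():
        forms.append(["<table options>"] + tail)
        forms.append(tail)
    return forms


def create2():
    # '(' <fields> ')' opt_create_table_options_etc
    # | opt_create_table_options_etc
    forms = []
    for tail in opt_create_table_options_etc():
        forms.append(["(", "<fields>", ")"] + tail)
        forms.append(tail)
    return forms


if __name__ == "__main__":
    for form in create2():
        print("CREATE TABLE t", " ".join(form))
```

In this expansion every sentence shape is produced by exactly one path through the layers, which is the property a conflict-free grammar needs. Whether Bison reports conflicts for the real rules still depends on the rest of sql_yacc.yy.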

Side effects

  • The CREATE TABLE ... (PARTITION ... SELECT ...) ... syntax is removed because it is undocumented and unused.
  • -1 shift/reduce conflict.

Part III: CREATE/ALTER VIEW

Old syntax

 (CREATE|ALTER) VIEW ... AS view_select ...

where

 view_select ::= 
   create_view_select [opt_union_clause]
 | '(' create_view_select_paren ')' [union_opt]

 create_view_select_paren ::=
   create_view_select
 | '(' create_view_select_paren ')'
 create_view_select ::= <sort of query_specification>

New syntax

 view_select ::= query_expression | query_expression_paren
 

Side effects

  • Optional, open for discussion: a new message, ER_WARN_UNSUPPORTED_HINT, has been added to warn about hinted SELECT keywords inside a CREATE/ALTER VIEW query expression.

Proposed changes in the Documentation

The current documentation for the CREATE TABLE statement is:

 CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
     [(create_definition,...)]
     [table_options]
     [partition_options]
     select_statement

...

 select_statement:
     [IGNORE | REPLACE] [AS] SELECT ...   (Some valid select statement)

It would be better to move "[IGNORE | REPLACE] [AS]" out of select_statement and to replace select_statement with query_expression:

 CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
     [(create_definition,...)]
     [table_options]
     [partition_options]
     [IGNORE | REPLACE]
     [AS] query_expression

...

 query_expression:
      SELECT ...   (Some valid select or union statement)

Part IV: NESTED UNIONS

Currently we don't support nested query expressions, i.e. "... UNION (...)" is a parse error.

Historically we have supported left-hand nesting of unions in subqueries: union expressions are left-associative, so "e1 UNION e2 UNION ..." and "(e1 UNION e2) UNION ..." are semantically the same. However, in top-level statements (i.e. not in subqueries) even these quasi-nested unions were not supported.

This WL adds support for left-hand nested query expressions at the top level:

 (SELECT 1 UNION SELECT 1) UNION SELECT 1; -- this is a valid query now.
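The left-associativity argument can be checked with a small recognizer. The sketch below is a toy model, not the server grammar: the function parse and its tuple-based tree are hypothetical, the only query specification it knows is "SELECT <number>", and it assumes that the right-hand operand of UNION may be a plain or parenthesized SELECT but not a parenthesized UNION. It builds a left-deep tree, so the parenthesized and unparenthesized left-hand forms yield the same result.

```python
import re


def tokenize(sql):
    # Upper-cased words/numbers and parentheses; ';' and spaces are ignored.
    return [t.upper() for t in re.findall(r"\w+|[()]", sql)]


def parse(sql):
    """Toy parser: returns a nested-tuple tree or raises SyntaxError."""
    tokens = tokenize(sql)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token}, found {peek()}")
        pos += 1

    def query_specification():
        nonlocal pos
        expect("SELECT")
        value = peek()
        if value is None or not value.isdigit():
            raise SyntaxError(f"expected a number, found {value}")
        pos += 1
        return ("select", int(value))

    def left_operand():
        # Left-hand side: a SELECT or a parenthesized query expression.
        if peek() == "(":
            expect("(")
            tree = query_expression()
            expect(")")
            return tree
        return query_specification()

    def right_operand():
        # Right-hand side (assumption): a SELECT, optionally in parentheses,
        # but never a parenthesized UNION.
        if peek() == "(":
            expect("(")
            tree = query_specification()
            expect(")")
            return tree
        return query_specification()

    def query_expression():
        nonlocal pos
        tree = left_operand()
        while peek() == "UNION":
            pos += 1
            tree = ("union", tree, right_operand())  # left-deep tree
        return tree

    tree = query_expression()
    if peek() is not None:
        raise SyntaxError(f"trailing input at {peek()}")
    return tree
```

In this model, parse("(SELECT 1 UNION SELECT 2) UNION SELECT 3") and parse("SELECT 1 UNION SELECT 2 UNION SELECT 3") return the same tree, while "SELECT 1 UNION (SELECT 2 UNION SELECT 3)" raises SyntaxError.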