WL#6242: Move "PROCEDURE ANALYSE" implementation to select_send-like class
Status: Complete
1. Move "PROCEDURE ANALYSE()" implementation outside of the Optimizer, encapsulate it into a select_result successor class, that replaces resulting data rows with "PROCEDURE ANALYSE()" statistics. 2. Remove the ability to create new "procedures" ("Procedure API"). 3. Docs: remove Documentation pages about the "Procedure API": * 22.4 Adding New Procedures to MySQL https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html * 22.4.2 Writing a Procedure https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html (see details in HLS 3.)
0. Contents =========== 1.Motivation 1.1. No need for generic "procedure" API 1.2. Unused grouping stuff makes Optimizer overcomplicated 2. Conclusion 3. Changes in the Documentation 3.1. Wrong syntax 3.2. Valid syntax 3.3. Obsolete documentation pages 4. External references 1. Motivation ============= 1.1. No need for generic "procedure" API ---------------------------------------- MySQL has a very special syntax for so-called PROCEDURE ANALYSE: SELECT ... FROM ... WHERE ... PROCEDURE ANALYSE([max_elements,[max_memory]]) (See https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html ) This is the only "PROCEDURE" that we support. However, in addition to the "PROCEDURE ANALYSE" functionality implementation, the Server has a generic framework class "Procedure" for such "procedures". 1.2. Unused grouping stuff makes Optimizer overcomplicated ---------------------------------------------------------- The Procedure framework class has fields "ORDER *group,*param_fields;" those are related to a grouping of a query result. The optimizer takes into account these fields like GROUP BY clauses. This makes Optimizer more complicated to support since we have to create temporary tables and indices for "procedures". OTOH "PROCEDURE ANALYSE" does nothing with either grouping or Procedure::group/param_fields variables. It seems like, Procedure::group and Procedure::param_fields variables are a legacy of earlier MySQL versions (pre-3.23): static struct st_procedure_def { const char *name; Procedure *(*init)(THD *thd,ORDER *param,select_result *result, List- &field_list); } sql_procs[] = { #ifdef USE_PROC_RANGE { "split_sum",proc_sum_range_init }, // Internal procedure at TCX { "split_count",proc_count_range_init }, // Internal procedure at TCX { "matris_ranges",proc_matris_range_init }, // Internal procedure at TCX #endif { "analyse",proc_analyse_init } // Analyse a result }; USE_PROC_RANGE conditional constant is never defined in MySQL. "Internal procedure at TCX" commentary says, that long time ago the TcX AB firm, where Monty developed early Server before MySQL AB firm, had additional 3 "procedures" for its internal needs: "split_sum", "split_count" and "matris_ranges". Currently MySQL code lack of these "procedures". So, nobody uses grouping stuff of the Procedure class, however we have to support it. There is no easy way to write regression tests for this unused stuff, so it makes further development and refactoring of the Optimizer overcomplicated. 2. Conclusion ============= We remove: a) the Procedure class, b) all references to Procedure API from the Optimizer. The we convert the "PROCEDURE ANALYSE()" implementation (the "analyse" class) to some select_result successor class. The object of this class will wrap the original select_result output stream and convert data into "PROCEDURE ANALYSE()" statistics. Pros: * The Optimizer doesn't rely on the Procedure stuff any more. * The syntax of SELECT ... PROCEDURE ANALYSE() is not fully documented: after the number of bugs the syntax of underlying SELECT has been significantly limited to enclose buggy places. This modification allows to remove most of SELECT ... PROCEDURE ANALYSE() syntax limitations. Contras: This select_result successor is a bit unique: we already have a) interceptors to catch internal data streams inside complex INSERT/UPDATE/DELETE commands and b) output transformer for EXPLAIN command, but none of them serve a regular SELECT query. However, the SELECT ... PROCEDURE ANALYSE() is a regular SELECT query, so it should work in all SELECT-enabled context, i.e. as a source query for materialized cursors. The interception of materialized cursor source stream is a bit unusual task. Risks: There are may be [not very complex] bugs. It would be nice to push the implementation before the WL#6071 "Inline tmp tables into nested loops algorithm" to make its code coverage testing easier. 3. Changes in the Documentation =============================== To Docs: 3.1 Wrong syntax ---------------- Please note in the Documentation, that PROCEDURE ANALYSE is applicable to *regular* *top-level* SELECT queries only, those return their result in a form of a regular *recordset* (not "SELECT INTO"). The following queries are syntax errors: * CREATE TABLE ... FROM SELECT ... PROCEDURE ANALYSE(), * CREATE VIEW ... AS SELECT ... PROCEDURE ANALYSE(), * INSERT INTO ... SELECT ... PROCEDURE ANALYSE(), * SELECT ... PROCEDURE ANALYSE() inside subquery, * SELECT ... INTO ... PROCEDURE ANALYSE(), * UNION of SELECT ... PROCEDURE ANALYSE(). 3.2. Valid syntax ----------------- You may want to document two valid complex syntaxes for sure (manual doesn't ban any of them): a) DECLARE ... CURSOR FOR SELECT ... PROCEDURE ANALYSE(); This query creates a cursor that returns the same result as underlying SELECT ... PROCEDURE ANALYSE(); This is the pre-WL behaviour and it is covered with old tests. b) EXPLAIN SELECT ... PROCEDURE ANALYSE(); Before the WL this construction was disabled as a syntax error. In the current implementation this is a valid syntax. This EXPLAIN shows a real Query Execution Plan like "EXPLAIN SELECT" without "PROCEDURE ANALYSE" does. Note: after this WL, both SELECTs with or without "PROCEDURE ANALYSE" have the same QEP. 3.3 Obsolete documentation pages -------------------------------- 1. Please remove these obsolete pages: * 22.4 Adding New Procedures to MySQL https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html * 22.4.2 Writing a Procedure https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html 2. Please move this page to a different node (without "adding new procedures" title). * 22.4 Adding New Procedures to MySQL :: 22.4.1 PROCEDURE ANALYSE https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html 4. External references ====================== MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures to MySQL https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures to MySQL :: 22.4.1 PROCEDURE ANALYSE https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures to MySQL :: 22.4.2 Writing a Procedure https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html How to write "procedure" extensions: https://forge.mysql.com/wiki/MySQL_Internals_Result_Postprocessing_Procedures MySQL Performance Blog, "PROCEDURE ANALYSE", March 23, 2009 By Peter Zaitsev: http://www.mysqlperformanceblog.com/2009/03/23/procedure-analyse/
1. Remove the Procedure class and all references to it from the Optimizer; also remove: * struct st_procedure_def, * class Item_proc_real (unused), * setup_procedure function, * add_proc_to_list function, * setup_new_fields function. 2. Rename the "analyse" class to "select_analyse" and inherit it from the select_send class. (select_interceptor class is not very well suited to be a parent for the select_analyse since it was designed to not output any data). 3. Wrap the original select_send output stream with the select_analyse class object in a couple of places: a) the regular SELECT handler (execute_sqlcom_select()) and b) the SELECT validator (needed for PS evaluation, mysql_test_select()). 4. Materialized cursors: take into account, that PROCEDURE ANALYSE installs a result filter that has a different set of input and output column Items. 5. Modify the Parser to accept strict SELECT ... PROCEDURE ANALYSE() syntax instead of generic "procedure" name/parameter processing. 6. Renew checks for bad syntaxes: * CREATE TABLE ... FROM SELECT ... PROCEDURE ANALYSE(), * CREATE VIEW ... AS SELECT ... PROCEDURE ANALYSE(), * INSERT INTO ... SELECT ... PROCEDURE ANALYSE(), * SELECT ... PROCEDURE ANALYSE() inside subquery, * SELECT ... INTO ... PROCEDURE ANALYSE(), * UNION of SELECT ... PROCEDURE ANALYSE().
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.