WL#6242: Move "PROCEDURE ANALYSE" implementation to select_send-like class

Status: Complete   —   Priority: Medium

1. Move "PROCEDURE ANALYSE()" implementation outside of the Optimizer,
encapsulate it into a select_result successor class, that replaces
resulting data rows with "PROCEDURE ANALYSE()" statistics.
2. Remove the ability to create new "procedures" ("Procedure API").
3. Docs: remove Documentation pages about the "Procedure API":
   * 22.4 Adding New Procedures to MySQL
     https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html
   * 22.4.2 Writing a Procedure
     https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html
 (see details in HLS 3.)

0. Contents
===========

1.Motivation
1.1. No need for generic "procedure" API
1.2. Unused grouping stuff makes Optimizer overcomplicated
2. Conclusion
3. Changes in the Documentation
3.1. Wrong syntax
3.2. Valid syntax
3.3. Obsolete documentation pages
4. External references

1. Motivation
=============

1.1. No need for generic "procedure" API
----------------------------------------

MySQL has a very special syntax for so-called PROCEDURE ANALYSE:

  SELECT ... FROM ... WHERE ... PROCEDURE ANALYSE([max_elements,[max_memory]])

(See https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html )

This is the only "PROCEDURE" that we support.

However, in addition to the "PROCEDURE ANALYSE" functionality implementation,
the Server has a generic framework class "Procedure" for such "procedures".

1.2. Unused grouping stuff makes Optimizer overcomplicated
----------------------------------------------------------

The Procedure framework class has fields "ORDER *group,*param_fields;"
those are related to a grouping of a query result.
The optimizer takes into account these fields like GROUP BY clauses.
This makes Optimizer more complicated to support since we have to create
temporary tables and indices for "procedures".

OTOH "PROCEDURE ANALYSE" does nothing with either grouping or
Procedure::group/param_fields variables.

It seems like, Procedure::group and Procedure::param_fields variables are a
legacy of earlier MySQL versions (pre-3.23):

static struct st_procedure_def {
  const char *name;
  Procedure *(*init)(THD *thd,ORDER *param,select_result *result,
                     List<Item> &field_list);
} sql_procs[] = {
#ifdef USE_PROC_RANGE
  { "split_sum",proc_sum_range_init },          // Internal procedure at TCX
  { "split_count",proc_count_range_init },      // Internal procedure at TCX
  { "matris_ranges",proc_matris_range_init },   // Internal procedure at TCX
#endif
  { "analyse",proc_analyse_init }               // Analyse a result
};

USE_PROC_RANGE conditional constant is never defined in MySQL.

"Internal procedure at TCX" commentary says, that long time ago the TcX AB firm,
where Monty developed early Server before MySQL AB firm, had additional 3
"procedures" for its internal needs: "split_sum", "split_count" and
"matris_ranges". Currently MySQL code lack of these "procedures".

So, nobody uses grouping stuff of the Procedure class, however we have to
support it.

There is no easy way to write regression tests for this unused stuff,
so it makes further development and refactoring of the Optimizer
overcomplicated.

2. Conclusion
=============

We remove:
  a) the Procedure class,
  b) all references to Procedure API from the Optimizer.

The we convert the "PROCEDURE ANALYSE()" implementation (the "analyse"
class) to some select_result successor class.

The object of this class will wrap the original select_result output stream
and convert data into "PROCEDURE ANALYSE()" statistics.

Pros:
  * The Optimizer doesn't rely on the Procedure stuff any more.
  * The syntax of SELECT ... PROCEDURE ANALYSE() is not fully documented:
    after the number of bugs the syntax of underlying SELECT has been
    significantly limited to enclose buggy places.
    This modification allows to remove most of SELECT ... PROCEDURE ANALYSE()
    syntax limitations.

Contras:
  This select_result successor is a bit unique: we already have
  a) interceptors to catch internal data streams inside complex
     INSERT/UPDATE/DELETE commands and
  b) output transformer for EXPLAIN command,
  but none of them serve a regular SELECT query.
  However, the SELECT ... PROCEDURE ANALYSE() is a regular SELECT query,
  so it should work in all SELECT-enabled context, i.e. as a source
  query for materialized cursors.
  The interception of materialized cursor source stream is a bit unusual
  task.

Risks:
  There are may be [not very complex] bugs.

It would be nice to push the implementation before the WL#6071 "Inline tmp
tables into nested loops algorithm" to make its code coverage testing
easier.

3. Changes in the Documentation
===============================

To Docs:

3.1 Wrong syntax
----------------

Please note in the Documentation, that PROCEDURE ANALYSE is applicable to
*regular* *top-level* SELECT queries only, those return their result in a form
of a regular *recordset* (not "SELECT INTO").

The following queries are syntax errors:
   * CREATE TABLE ... FROM SELECT ... PROCEDURE ANALYSE(),
   * CREATE VIEW ... AS SELECT ... PROCEDURE ANALYSE(),
   * INSERT INTO ... SELECT ... PROCEDURE ANALYSE(),
   * SELECT ... PROCEDURE ANALYSE() inside subquery,
   * SELECT ... INTO ... PROCEDURE ANALYSE(),
   * UNION of SELECT ... PROCEDURE ANALYSE().

3.2. Valid syntax
-----------------

You may want to document two valid complex syntaxes for sure (manual doesn't
ban any of them):
    
a) DECLARE ... CURSOR FOR SELECT ... PROCEDURE ANALYSE();

   This query creates a cursor that returns the same result as
   underlying SELECT ... PROCEDURE ANALYSE();
   This is the pre-WL behaviour and it is covered with old tests.

b) EXPLAIN SELECT ... PROCEDURE ANALYSE();

   Before the WL this construction was disabled as a syntax error.
   In the current implementation this is a valid syntax.
   This EXPLAIN shows a real Query Execution Plan like "EXPLAIN SELECT"
   without "PROCEDURE ANALYSE" does.
   Note: after this WL, both SELECTs with or without "PROCEDURE ANALYSE"
         have the same QEP.
   
3.3 Obsolete documentation pages
--------------------------------

1. Please remove these obsolete pages:

* 22.4 Adding New Procedures to MySQL
  https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html

* 22.4.2 Writing a Procedure
  https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html

2. Please move this page to a different node (without "adding new procedures"
title).

* 22.4 Adding New Procedures to MySQL :: 22.4.1 PROCEDURE ANALYSE
  https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html


4. External references
======================

MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures
to MySQL
  https://dev.mysql.com/doc/refman/5.6/en/adding-procedures.html

MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures
to MySQL :: 22.4.1 PROCEDURE ANALYSE
  https://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html

MySQL 5.6 Reference Manual :: 22 Extending MySQL :: 22.4 Adding New Procedures
to MySQL :: 22.4.2 Writing a Procedure
  https://dev.mysql.com/doc/refman/5.6/en/writing-a-procedure.html

How to write "procedure" extensions:
  https://forge.mysql.com/wiki/MySQL_Internals_Result_Postprocessing_Procedures

MySQL Performance Blog, "PROCEDURE ANALYSE", March 23, 2009 By Peter Zaitsev:
  http://www.mysqlperformanceblog.com/2009/03/23/procedure-analyse/
1. Remove the Procedure class and all references to it from the Optimizer;
   also remove:
   * struct st_procedure_def,
   * class Item_proc_real (unused),
   * setup_procedure function,
   * add_proc_to_list function,
   * setup_new_fields function.

2. Rename the "analyse" class to "select_analyse" and inherit it from the
   select_send class. (select_interceptor class is not very well suited to
   be a parent for the select_analyse since it was designed to not output
   any data).

3. Wrap the original select_send output stream with the select_analyse class
   object in a couple of places:
     a) the regular SELECT handler (execute_sqlcom_select()) and
     b) the SELECT validator (needed for PS evaluation, mysql_test_select()).

4. Materialized cursors: take into account, that PROCEDURE ANALYSE
   installs a result filter that has a different set of input and
   output column Items.

5. Modify the Parser to accept strict SELECT ... PROCEDURE ANALYSE() syntax
   instead of generic "procedure" name/parameter processing.

6. Renew checks for bad syntaxes:
   * CREATE TABLE ... FROM SELECT ... PROCEDURE ANALYSE(),
   * CREATE VIEW ... AS SELECT ... PROCEDURE ANALYSE(),
   * INSERT INTO ... SELECT ... PROCEDURE ANALYSE(),
   * SELECT ... PROCEDURE ANALYSE() inside subquery,
   * SELECT ... INTO ... PROCEDURE ANALYSE(),
   * UNION of SELECT ... PROCEDURE ANALYSE().