WL#3433: Stored Procedures: Dynamic Cursors

Affects: Server-7.1 — Status: Un-Assigned

Description
High Level Architecture

Well-supported dynamic cursors have four characteristics:

1. Result Set from a 'SELECT' in a variable, not a SELECT
that's hard-coded in the DECLARE CURSOR statement.

2. Metadata about the number of columns and their data types.

3. Variable definition of variables, that is, "DECLARE
variable data_type" statements (or something equivalent),
for as many variables as there are in the 'SELECT'
variable, with declared-variable data types the same as
data types of the columns of the 'SELECT' result set.

4. FETCH statements that bring in exactly the number of
columns in the 'SELECT' result set, instead of the fixed
number that MySQL has now with "FETCH c INTO v1,v2,v3;".

Here are four different proposals. All accomplish the same
thing, approximately. Feel free to add more.

Proposal #1: Do Nothing
-----------------------

Users can get the "dynamic cursor" effects with
temporary tables. To create the result set: use
a CREATE TEMPORARY TABLE ... AS SELECT. To view
the metadata: use information_schema with
table_name = temporary table name. To define
variables: DECLARE, in advance, as many variables
as could possibly be necessary for a SELECT,
with a generic data type like VARCHAR (just
depend on MySQL's automatic data type conversion
if the actual data type is not VARCHAR).

Example:

CREATE PROCEDURE p1 (select_statement VARCHAR(255))
BEGIN
  DECLARE v1,v2 VARCHAR(255);
  DECLARE c CURSOR FOR SELECT * FROM t;
  SET @v = CONCAT('create temporary table t as ',select_statement);
  PREPARE stmt1 FROM @v;
  EXECUTE stmt1;
  OPEN c;
  FETCH c INTO v1,v2;
  SELECT v1,v2;
END//

Actually this works well enough that we can propose
it as a workaround for people who complain about the
lack of support for dynamic cursors. And perhaps all
the other Proposals will merely be new syntaxes which
hide the use of a temporary table.

The disadvantages are:
- The temporary table is not local to the procedure, so
  two different procedures must use different names,
  and two invocations of the same procedure must avoid
  "table already exists" messages.
- The 'SELECT' variable (@v in the example) is not
  local to the procedure either, although there is an
  intent to fix that (WL#2793).
- The FETCH is still fixed, so one would have to add
  NULLs to the SELECT list so that the number of columns
  in the result set will equal the number to be FETCHed.
- not all SHOW statements have an equivalent INFORMATION_SCHEMA.
  Currently there is no way to retrieve their result
  set via a cursor.

Proposal #2: Rowtype
--------------------

We can copy the Oracle %ROWTYPE, which is
(I quote the Oracle manual here) for "a record type
that represents a row ... from a previously declared
cursor. Fields in the record and corresponding columns
in the row have the same names and datatypes."
WL#2954, in raw-idea bin, mentions %ROWTYPE.

Our stored-procedure syntax differs from Oracle's, so the
following is a hypothetical example using %ROWTYPE with
MySQL syntax, not Oracle syntax.

CREATE PROCEDURE p2 ()
BEGIN
  DECLARE cursor1 CURSOR FOR SELECT c1,c2 FROM t; 
  BEGIN
    DECLARE v cursor1.%ROWTYPE;
    OPEN cursor1;
    FETCH cursor1 INTO v;
    SELECT v.c2;
  END;
END//

Here, v is a row variable which has two parts,
c1 and c2. Their names and definitions come from
the names and definitions in the cursor's SELECT
list. In the example, v can be a qualifier, so
"SELECT v.c1" means "SELECT the value that was
just FETCHed for the c2 column in the cursor's
SELECT list".

The disadvantages are:
- %ROWTYPE is non-standard
- our %ROWTYPE isn't very Oracle-like either,
  although Oracle users should understand what
  we're doing immediately
- we don't have the metadata for v, probably
  it would be best to "extend" by allowing
  references to ordinal numbers, for example
  'v.[2]' is the same as 'v.c2'.

Proposal #3: Allocations
------------------------

We can do the cursors and variable declarations with
dynamic ALLOCATE statements, as opposed to DECLAREs.
This is vaguely like what happens with embedded SQL,
and is perhaps closest to the standard (SQL:2003
non-core feature B031 "Basic dynamic SQL" for module
language). The important new statements are ALLOCATE
CURSOR, ALLOCATE DESCRIPTOR, DESCRIBE, and GET.

Example:

CREATE PROCEDURE p3 ()
BEGIN
  DECLARE v1 INT;
  DECLARE v2 CHAR(64) CHARACTER SET UTF8;
  PREPARE stmt1 FROM 'SELECT 5 AS a';
  ALLOCATE c CURSOR FOR stmt1;
  ALLOCATE DESCRIPTOR d;
  DESCRIBE OUTPUT stmt1 USING DESCRIPTOR d;
  OPEN c;
  FETCH c USING d;
  GET DESCRIPTOR d v1 = CARDINALITY;
  GET DESCRIPTOR d VALUE 5 v2 = NAME;
  CLOSE c;
END//

ALLOCATE CURSOR will create a cursor.
It's not DECLARE so it doesn't have to be
right at the procedure start. Therefore it
can follow PREPARE, without requiring a
separate BEGIN ... END like the one in
Proposal#1.

ALLOCATE DESCRIPTOR will create a space for
the descriptor. That is: it's just a malloc.

DESCRIBE OUTPUT will fill the descriptor
with the metadata for the prepared statement.
I used "DESCRIBE OUTPUT stmt1" in the example
but I could just as easily have said "DESCRIBE
OUTPUT c", that is, I could use either the
prepared statement name or the cursor name.

FETCH will bring in the data, but not directly
into declared variables. Instead, think of the
descriptor as having pointers to the data as a
result of the FETCH.

Now the descriptor has both metadata and
pointers to data. There are many descriptor
fields, we can access them with GET statements.
"GET DESCRIPTOR ... CARDINALITY" will place the
"number of columns in the result set" into the
declared variable v1. "GET DESCRIPTOR ... VALUE
5 ... NAME" will place the "name of the fifth
column in the result set" into the declared
variable v2.

The disadvantages are:
- This can be complicated, which may not be obvious
  because I've left out all the details.
- The technique is more commonly used in
  embedded SQL than in stored procedures,
  and I think Monty dislikes things that
  look like embedded SQL.

Proposal #4: User-Defined Type
------------------------------

This is a variant of Proposal#2. We can play with
syntax that has %ROWTYPE's effect, but looks more
like standard syntax than Oracle syntax.

Example:

CREATE PROCEDURE p4 ()
BEGIN
  DECLARE c CURSOR FOR stmt1;
  BEGIN
    CREATE TYPE t FROM c;
    BEGIN
      DECLARE v t;
      OPEN c;
      FETCH c INTO v;
    END;
  END;
END//

Here, we're creating a new data type in an
executable statement ("CREATE TYPE ..."), and
it's a row data type, so we can use it just
like we used cursor1.%ROWTYPE for the 'v'
variable in Proposal#2.

Disadvantages:
- Although CREATE TYPE looks like standard SQL,
  this particular example isn't.
- People might actually prefer the %ROWTYPE syntax.

References
----------

Other worklog entries:
WL#2793 Prepared Statements: convenient SQL syntax for use in Dynamic SQL
WL#2954 Procedures - Add support for Oracle's PL/SQL "TYPE" matching in variable
declerations

Feature requests:
BUG#17153 Cursors impossible on dynamic SQL

Oracle, "Fundamentals of the PL/SQL Language"
http://www.stanford.edu/dept/itss/docs/oracle/10g/appdev.101/b10807/02_funds.htm#i27306