WL#4166: Prepared statements: automatic re-prepare

Affects: Server-5.1 — Status: Complete

Description
High Level Architecture
Low Level Design

After WL#4165 is implemented, execution of a prepared statement
may fail with a new error -- ER_NEED_REPREPARE.

This error is returned when any base table, temporary
table, or view used in the statement has been altered since
statement prepare. The purpose of validating metadata and
terminating statement execution upon a validation error is given
in WL#4165, but in short the parse tree of the prepared statement
may be no longer valid in the context of the new (altered) schema.
Indeed, a used table may be replaced with a view, or undergo
changes that abrogate some rule-based optimisations performed on
the parsed tree.

However, in cases when the parsed tree is invalid, but SQL
statement text itself is still meaningful in the new schema, the
server should not return the error to the user.

This is necessary to preserve backward compatibility with old 5.0
applications, which do not expect the new error, and also to
provide continuous operation of the server in 24x7 environments.
Generally, we should not complicate the job of client application
developers by introducing a new error, unless it is strictly
necessary.

Recovery from an error is implemented by means of re-prepare of
the terminated prepared statement.
Details of this procedure, as well as new introduced errors, 
variables and protocol flags are specified in this worklog.

A reprepare happens when execution of the prepared statement
failed with ER_NEED_REPREPARE error. 

A reprepare of a prepared statement is, in fact, a creation of a
new prepared statement from the original SQL statement text and 
in the original environment (current database, SQL mode).

Each attempt to reprepare a statement increments new 
session status variable - Com_stmt_reprepare. The global
counterpart of this variable contains the total number of
reprepare attempts done in all sessions.

If preparation of the copy fails, the whatever error that occurred
during preparation is returned to the user, and the copy is
destroyed.

Otherwise, if the new statement has been created successfully, the
parse tree of the original statement is exchanged with the parse
tree of the new one. Parameter types and values, which represent
the execution state of Item_param objects, are copied in the opposite
direction, from the original parsed tree to the new one. This produces a
prepared statement that has the new parse tree but the original
parameter data.

In addition to the status variable, the client is notified about
the automatic reprepare by means of a new result set status flag 
-- SERVER_STATUS_STATEMENT_REPREPARED.

In case the statement returns a result set, result set metadata
has been previously transferred to the client, and the number of
result set columns has changed since then, server status
flags also contain SERVER_STATUS_METADATA_CHANGED flag.
This flag indicates to the client application that it may need to
rebind its output buffers. The flag is necessary, since 4.1 and
5.0 versions of the client library could crash if the number of
columns changes between prepare and execute. 

In future additional checks could be implemented: 
 - for type
 - character set and collation
 - width and precision
 - pad space option
 - null/not null option

All MySQL connectors currently support this type of result set
metadata change by performing an automatic data
conversion whenever necessary.

Therefore, and since there is no request from any Connector to
implement the additional checks, they've been made subject of a
separate worklog entry, WL#4177 "Prepared statements: additonal
metadata checks after reprepare".

An attempt to execute a statement after reprepare may fail
again with ER_NEED_REPREPARE error: the server does not keep the
tables locked between prepare and execute, and thus a new DDL
statement may happen in between. In that case a reprepare is
executed again, but only until MAX_REPREPARE_ATTEMTS (a constant
in sql_prepare.cc) has been reached.
Since ER_NEED_REPREPARE is returned by the execution engine,
each failed reprepare attempt not only increases
Com_stmt_reprepare and Com_stmt_execute statistics counter, but
also the respective SQLCOM status counter, such as Com_insert or
Com_select.

Reprepare has a side effect on the query cache: if at the time
of reprepare the query cache is switched off, the new statement
will not use it in all subsequent executions, even if later the
cache is switched back on again.
For performance reasons, the decision to allow a prepared
statement work with the query cache is made at prepare 
time, and thus is triggered automatically at reprepare.  This
reevaluation could be suppressed, however, the whole issue is
considered too minor to justify the work necessary to preserve
full backward compatibility.

Note, that reprepare never leads to destruction of the original
statement, even if it fails. The operation is considered to be
fully automatic, and any error it generates -- intermittent. 

MySQL supports two kinds of prepared statements -- available through
extensions of the client/server API, and SQL prepared statements,
available inside stored procedures and ordinary SQL.

The differences between these two kinds of prepared statements are:
- SQL prepared statements do not return metadata to the client
at prepare. If prepare succeeded, OK status packet is returned,
similarly to INSERT or UPDATE SQL commands. On the contrary,
C API prepared statement compute and return result set metadata
(if any) at statement prepare.

- placeholders in SQL prepared statements ('?', also known as
parameter markers) can be assigned from user variables
(@variables) only. Value for placeholders of C API
parameter markers are retrieved from COM_STMT_EXECUTE
protocol packet, or, in case of blob data, assembled
from multiple COM_STMT_LONG_DATA packets

This difference in working with parameters and metadata
resulted in two entry points into statement execution subsystem in
the current server:

mysql_stmt_execute() (C API prepared statements)
mysql_sql_stmt_execute() (SQL prepared statements).

The job of the two functions above is to take care of the different
way of assignment of parameter values. 
The rest of the work is delegated to method
Prepared_statement::execute().

This structure of execution defines the shape of low level
design.
After introduction of metadata validation,
Prepared_statement::execute() can return with a new error
code. Therefore, mysql_stmt_execute() and mysql_sql_stmt_execute()
must be changed to handle the new return value.

The following new methods are introduced to support
reexecution:

Prepared_statement::execute_loop() -- try to execute
a statement, reprepare in case of validation error, and try
again until MAX_REPREPARE_ATTEMPTS has been reached.
mysql_{sql_}stmt_execute() are changed to invoke this
method instead of plain execute().

Prepared_statement::reprepare() -- reprepare a prepared
statement, called from execute_loop() in case of ER_NEED_REPREPARE
error.

Prepared_statement::validate_metadata() -- called from reprepare()
in case of reprepare success to check how much metadata of the
result set of the new prepared statement is different from the
metadata of the original prepared statement, and update the server
status in case of a significant change. As specified earlier, 
only the number of columns is currently checked.

Prepared_statement::swap_prepared_statement() -- called
from reprepare() in case of reprepare() success to 
exchange the parsed trees of the new and the old prepared
statements.

swap_parameter_array() -- called from reprepare()
in  case of success to swap the data and preserve types
of parameter markers, if any.

Use of "swapping", instead of "copying" is employed to
efficiently transfer the new parsed tree to the original
statement, while preserving the execution parameters.

Let's consider it in more detail.
Schematically, the parsed tree consists of the LEX and
its members, Query_arena instance and the parse memory root.
Among LEX members, there is a list of items to represent
prepared statement parameter markers (lex->param_list).
Parameter markers of the original statement are assigned
values, which are stored on the system heap.
"Swapping" of the parse trees exchanges the contents of the old
and the new prepared statement step by step, pointer by pointer.
New and old lex are exchanged, then new and old memory roots, and
so on. In the end of this process the original prepared statement
contains state of the new one and vice versa.
However, the original prepared statement is not ready for
execution before its parameters have been assigned.  Thus, before
destroying the new (now old) prepared statement, a similar
"swapping" is performed for parameter data, which ensures that
potentially large memory blocks for blob data are not copied.

Item_param::set_param_type_and_swap_value() -- called
from swap_parameter_array() to initialize/swap data
of an individual parameter marker.

Pseudocode of some of these methods follows below.

mysql_sql_stmt_execute()
{
  
  
  
  invoke Prepared_statement::execute_loop();
}

mysql_stmt_execute()
{
  
  
  
  invoke Prepared_statement::execute_loop();
}


Prepared_statement::execute_loop()
{
  int reprepare_attemt= 0;

reexecute:
  error= execute();

  if (error == ER_NEED_REPREPARE &&
      reprepare_attempt < MAX_REPREPARE_ATTEMPTS)
  {
    thd->clear_error();

    error= reprepare();

    if (! error)                                /* Success */
      goto reexecute;
  }

  return error;
}


Prepared_statement::reprepare()
{
  

  Prepared_statement *copy= new Prepared_statement;

  error= copy->prepare(query, query_length);

  

  if (! error)
  {
    swap_prepared_statement(this, copy);
    swap_parameter_array(param_array, copy->param_array, param_count);
    validate_metadata();
  }

  delete copy;
  return error;
}


Prepared_statement::validate_metadata(Prepared_statement *copy)
{
   if (this->param_count != copy->param_count)
      thd->server_status|= SERVER_STATUS_METADATA_CHANGED;
}


Prepared_statement::swap_prepared_statement()
{
  swap_variables(ulong, id, copy->id);
  swap_variables(MEM_ROOT *, mem_root, copy->mem_root);
  swap_variables(Item_param **, param_array, copy->param_array);
  swap_variables(LEX *, lex, copy->lex);
  ...
}


void
swap_parameter_array(Item_param **param_array1, Item_param **param_array2)
{
  Item_param *parameter1, *parameter2;
  for each (parameter1 in param_array1, parameter2 in param_array2)
     parameter1->set_type_and_swap_value(parameter2);
}


Annex: comparison to other RDBMS
--------------------------------

DB2
---

References:
http://publib.boulder.ibm.com/infocenter/db2luw/v8//topic/com.ibm.db2.udb.doc/admin/r0000948.htm

DB2 does implement a reprepare internally,
and the documentation lists all the statements that can trigger a reprepare.

This task has a moderate impact on server modules and files:
 - all prepared statement related changes reside in sql_prepare.cc, where 
   class Prepared_statement, mysql_stmt_execute(), mysql_sql_stmt_execute()
   and all introduced methods are implemented
 - Item_param class, which resides in item.{h,cc} is extended with
 method set_param_type_and_swap_value()
 - the client library (libmysql/libmysql.c) is fixed to not crash
   upon a changed number of result set columns
 - various tests are added to new ps_ddl.test in the main test suite and
   in mysql_client_test.c