WL#2634: Stored Procedures: provide debugger

Affects: Server-7.1 — Status: Assigned

Description
High Level Architecture

MySQL will provide a debugger for stored procedures.

Rationale:
-- From Josh Chamas, in an email dated May 16/2005
(thread: Dynamic SQL / Exec() for B1 ?)
> When putting together the Oracle Migration session 
> at the last UC, I found these to be a priority to support 
> in the future:
> 
> Stored Procedures Wish List
> ---------------------------
> Debugger, Better Compiler Errors
> Error Propagation (SIGNAL) to be fixed for 5.1
> Packages/Modules
> Dynamic SQL (EXECUTE IMMEDIATE)
> Scrollable & Updatable Cursors

The following is a work of fiction. Any resemblance to real
plans, living or dead, is purely coincidental.

General
-------

MySQL will allow remote debugging of SQL stored
procedures, functions, triggers and events. This will be
possible entirely with SQL statements. No changes
to CREATE statements or user processes are required.
You can take over a running task. Overhead in the debugged
routine is high.

Protocol
--------

Debugging is remote. That is, you are not debugging
your own stored procedure, you are debugging the
procedure on another connection. This makes it
possible to debug events, and should reduce
interference that would occur if the debugging
job is also executing SQL statements.

With remote debugging, there are two connections: the
debugger and the debugged (sometimes called the "debugee"
or "debuggee").

Every job, at the end of every statement, is checking:
does somebody want to debug me? If not, there is no
overhead other than that end-of-statement check.

The debugging job, when the statement DEBUG START
occurs, tells the potential debugged job: debug-start.

The potential debugged job sees this debug-start, and
decides whether to accept. There are various reasons
for rejecting; see below.

If the potential debugged job accepts, it becomes a
debugged job. It sets up a shared area for the debugger
to see. A debugged job will "halt" at the
end of every statement, that is, it will stop until
the debugger tells it to continue.

When a debugged halts, it fills the shared area with:
- the completion data (error number, error message)
  for the end of the statement
- everything that would appear with SHOW VARIABLES
- the name of the procedure that it is in (NULL if
  it's not in a procedure) (a string of procedure
  names if it's in a procedure that's being called
  from another procedure)
- the ordinal statement number within the procedure
- the name of every DECLAREd variable that is in
  scope in the procedure, and its data type, and
  its value (which might be truncated).
Now the debugged waits. SHOW PROCESSLIST will show
that its state is "debug/halt".

The debugger may examine the shared area with a
statement that looks much like SELECT. The debugger
may set a byte in the shared area meaning "continue"
with a statement that looks like STEP [OVER|INTO],
or a byte meaning "debugging ends" with a statement
that looks like STOP.

Thus a debugged is stopping after every statement in
the procedure and moving hundreds of items to the
shared-memory area. Not fast. On the other hand, any
non-debugged has no overhead except the end-of-statement
check.

When a debugged is halted, the only thing it knows
how to do is check whether the debugger wants it to
continue. The debugger, then, is the genius which does
all the complex thinking.

Syntax
------

All debugging statements begin with the word DEBUG.
They are legal inside routines!

DEBUG START connection-ID [characteristics];

  You can find a connection-ID with SHOW PROCESSLIST.

  DEBUG START 5 links the debugger with the job on
  connection number 5, which will become the debugged.

  Possible errors:
  "Connection is already being debugged"
  "Debuggers cannot be debuggeds"
  "You've already said DEBUG START"

  You can set some characteristics of the link at this time:
  - Maximum number of breakpoints
  - Whether it's okay to keep things in a locked state indefinitely
    (the debugged can't ignore KILL, but can ignore timeout)
  - Whether the debugged can ignore timeout
  - The size of the shared area, which is fixed at DEBUG START
    time, so if it's too small there will be truncation.

  Once a successful link occurs, the debugged is halted.

DEBUG BREAK [ ON | OFF ]
   { PROCEDURE | FUNCTION | TRIGGER | EVENT } name
   [ statement-number ]

  DEBUG BREAK ON sets a breakpoint. That is, it identifies a place
  where the debugger will not tell the debugged to continue.

  The [ statement-number ], if omitted, is the first
  statement in the procedure. Statement numbers are fairly
  easy to count, just look for semicolons in routine_description,
  except for a few (BEGIN/LOOP/WHILE/REPEAT/IF).
  Negative statement-numbers are counted backwards from routine end.

  There are other ways to identify a statement:
  - Line number. MySQL strips line breaks from the original
    CREATE statement, but it could put them back in later.
  - Statement type. For example "DEBUG BREAK ON INSERT"
    would put a breakpoint on every INSERT statement in
    the procedure.
  - Label. But labels can be ambiguous, they depend on scope.
  - Handler type. For example DEBUG BREAK ON SQLEXCEPTION
    would put a breakpoint on the first statement within any
    DECLARE CONTINUE|EXIT HANDLER FOR SQLEXCEPTION block.
  But statement-number is sufficient, other ways are redundant.

  It is possible to specify a statement which is not executable
  because it's purely declarative, it's the word BEGIN, it's
  been optimized away, or something like that. That doesn't matter,
  there's no message, there's only a non-functional breakpoint.

  Possible errors:
  "Name not found"
  "Too many breakpoints"

  To cancel a breakpoint, use DEBUG BREAK OFF.

  How it works: the debugger stores the breakpoint location.
  It tells the debugged: go forward one statement.
  The debugged goes forward one statement and halts,
  filling in the shared area with (among many other things)
  procedure name and statement number.
  The debugger reads the shared area to see whether the
  debugged has reached the breakpoint location.
  If not: loop: it tells the debugged: go forward one statement.
  The loop goes on until an error occurs or a breakpoint is hit.

DEBUG STEP [ INTO | OVER [n] ]

  STEP INTO means if the current statement is "CALL p", then enter
  the routine p -- or, if the current statement contains a function call,
  or causes a trigger, then step inside the function|trigger.
  STEP OVER means go to the next statement (the next statement that
  is to be executed, that is). STEP OVER n means skip the next n
  statements. STEP OVER would effectively mean DEBUG CONTINUE|RESTART
  i.e. just keep going till the end or till the next breakpoint.

  How it works: the debugger starts a counter.
  It tells the debugged: go forward one statement.
  When it sees that the debugged has halted,
  it decrements the counter and repeats the instruction.
  The loop goes on until an error occurs or a breakpoint is hit
  or the counter is 0.

DEBUG STOP

  This breaks the link between debugger and debugged.

SELECT ... FROM shared_area;

  The "shared_area" view is a one-row table containing
  all the variables that the debugged put in the shared area.

Example
-------

Suppose this stored procedure exists:
CREATE PROCEDURE p ()
BEGIN
  DECLARE v INT;
  SET v = 0;
  WHILE v < 2 DO
    INSERT INTO t VALUES (v);
    SET v = v + 1;
    END WHILE;
  END//

On connection#2, the debugger, enter:
DEBUG START 1;

Connection#1, the debugged, accepts and halts.

On connection#2, enter:
BREAK p;

Nothing happens. The name 'p' is stored on the debugger.

On connection#2, enter:
STEP OVER;

Now the debugged loops, executing one statement at a
time and halting, until its last statement was:
CALL p();

The debugger sees this and does not tell the debugged
to continue stepping. The loop ends.

The debugger says:
SELECT shared_area.v FROM shared_area;

The result is a NULL, because v is not declared yet.
(References to not-yet-declared variables are not errors.)

The debugger says:
CREATE PROCEDURE p2 ()
BEGIN
  WHILE shared_area.statement <> 'END' DO
    SELECT shared_area.statement AS a, shared_area.v AS v
    FROM shared_area;
    END WHILE;
  END//
CALL p2()

The result will be:
a                                 v
--------------------------------- ----
BEGIN                             NULL
DECLARE v INT;                    NULL
SET v = 0;                        0
WHILE v < 2 DO                    0
INSERT INTO t VALUES (v);         0
SET v = v + 1;                    1
END WHILE;                        1
WHILE v < 2 DO                    1
INSERT INTO t VALUES (v);         1
SET v = v + 1;                    2
END WHILE;                        2
END                               2

The example indicates that, by allowing DEBUG statements
inside procedures, MySQL has no need for watches.

Max_debugs
----------

There is a variable max_debugs, for the maximum number of debug links
that can be active at one time. The default is zero and zero means zero.

Any session may declare itself to be undebuggable by setting its
session max_debugs variable to zero. This requires no privileges.

Somewhere else, perhaps via SHOW STATUS, you can find
out how many debug sessions are currently going on.

Privileges
----------

Privileges must be tough, because the original procedure-writer had
no intention that the variable values and instructions would be
visible outside. Privileges must be tough, because the debugged
might be halting after locking tables.

You'll need at least EXECUTE privilege on every routine that you
can break on or step into. You'll also need SELECT privilege on
every table/column referenced in the routines, because column
values might be copied to, or compared with, variables. And
(let's just throw this in) you'll need PROCESS privilege.

Miscellaneous limitations
-------------------------

You can't debug for a connection on a different instance.

If a connection gets lost, the debugger goes off.

You cannot see, for example, what locks are held. The debugger
has to work with the monitor to see that. And there's no monitor.

There's nothing "graphic" about the debugging, and figuring
out locations (by counting statements) is tedious + tricky.
The intent is to be universal (work on any platform) and internal
(require no tool outside the server). Somebody from GUI
development, from Merlin, or from an outside tool maker should
be able to write a GUI debugger based on all of the
command-line debugging that the server supplies.

(By the way, the MySQL Reference Manual says:
"Additional GUI Tool features to ease development and
debugging of Stored Procedures, etc are under development.")

A user who is entering statements on the mysql command-line
interface may be irritated to discover that a debugger has
taken over the connection and is executing a statement at a
time, with long pauses. There will be no helpful error like:
"All incoming statements and actions (except control-C or
QUIT) will be ignored while connection N is debugging this."

You cannot set a breakpoint within a statement, or see values
within a statement (such as column values).

You cannot see what the debugged is outputting (back to the client,
or to a log).

You cannot change what the debugged does, or what the debugged sees.
For example, you cannot say that the debugged will now exit with an
error, and you cannot say that the debugged's @variable value is 77.
That would be tough, it would require real two-way communication.
And it would be a nightmare with respect to privileges and audits.

Timeouts still apply for the debugged. If the debugged is idle
(waiting for the debugged to tell it to continue) for 10 minutes,
and the timeout variable says ten minutes is maximum, then the
debugged goes away. And that kills the debugger/debugged link
because if the debugged starts later it has a different connection number.

The specific instructions are not the same as those used by other
implementations of SQL debuggers by other DBMS vendors.

You can do nothing with external routines. Other DBMSs can,
by using the host language's debugger.

One debugger can only control one debugged. There is no syntax
like DEBUG START ALL followed by connection-specific breakpoint
setting and selecting.

Abbreviations are illegal, even though various folk will want to
say "D S" instead of "DEBUG START".

Developments in early 2010
--------------------------

There has been a suggestion of debugging MySQL via Oracle SQL Developer.

Konstantin Osipov started a discussion on MySQL's bi thread here:
[ mysql intranet ] /secure/mailarchive/mail.php?folder=2&mail=9619

References
----------

MySQL Forge "Debugging Stored Procedures" (ingenious)
http://forge.mysql.com/wiki/DebuggingStoredProcedures

Oracle JDeveloper debugger
http://www.oracle.com/technology/obe/obe10gdb/develop/plsql/plsql.htm

Debugging PL/SQL with DBMS_DEBUG
http://www.adp-gmbh.ch/ora/plsql/debug.html

forums.mysql.com thread
http://forums.mysql.com/read.php?98,68379,68688

To debug with SQL Server 2005, you need Visual Studio now.
That has irritated some people,
http://www.eggheadcafe.com/forumarchives/SQLServerclients/Nov2005/post24682873.asp
... But if you intend to combine SQL programs with host programs, that's not so bad.

EnterpriseDB debugger (for PostgreSQL)
http://www.enterprisedb.com/products/enterprisedb_debugger.do

email discuss thread (Zinner, Kodinov, Karlsson, Zoratti, Milivojevic, Lischke)
"Re: MySQL Stored Procedures debugger - http://www.mydebugger.com/"
starting with
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=6&mail=14002