WL#4062: Online Backup: Metadata freeze

Affects: Server-6.0   —   Status: Complete

RATIONALE
---------

DDL and other metadata-changing statements must not execute during
backup/restore, because the backup/restore kernel is not built to handle them.

These statements therefore need to be blocked while a backup or restore is
running.


IMPLEMENTATION
--------------

Implementation of freeze_metadata()

(Each SQLCOM in sql_parse.cc that affects metadata should be blocked
with a mutex or similar mechanism, so that it cannot be executed while a
backup is ongoing.)

NOTE
----

Consider whether the mechanisms used to implement the metadata freeze can also
be used to ensure that only one BACKUP/RESTORE operation runs at a time.

See WL#3573 for work completed for capturing metadata for table CREATE, stored 
procedures and functions.

REQUIREMENT
-----------
The backup kernel should block OPTIMIZE|REPAIR|TRUNCATE TABLE. There are
DBUG_ASSERT checks in the MyISAM code to verify that they do not happen.

DDL Commands Supported
----------------------
The first release of this work shall support blocking of DDL commands that 
correspond to the objects supported by online backup. These commands are:

  * CREATE TABLE
  * ALTER TABLE
  * RENAME TABLE
  * REPAIR TABLE
  * OPTIMIZE TABLE
  * DROP TABLE
  * CREATE DATABASE 
  * DROP DATABASE
  * RENAME DATABASE
  * ALTER DATABASE 
  * TRUNCATE TABLE
  * CREATE INDEX
  * DROP INDEX

Implementation Concept
----------------------
The HLS for the metadata freeze solution, henceforth called the "DDL blocker,"
is as follows. Ideally, the freeze would affect only those tables that are in
the backup. However, this will not be the case for the first release. See the
limitations listed in the LLD below.

There will be several helper methods created to encapsulate the DDL blocker 
functionality. The following methods shall be defined in sql_parse.cc.

A method will be used to check a condition variable and block DDL statements if 
the DDL blocker is turned on. A call to this method will be inserted in the 
"big switch" at every location that handles one of the DDL commands listed 
above. 

A method will be used to block the backup from running while DDL operations are 
in progress. This method shall signal the backup when the DDL operations are 
done.

Requirements
------------
The DDL blocker shall ensure the following:

  * Any supported DDL command that is in progress shall block a
    backup operation from running until the command has completed. 
  * Once the backup operation has fired the DDL blocker, all supported
    DDL commands are blocked until the backup operation has completed.

Note: "backup operation" in this case refers to either backup or restore.

Historical Reference
--------------------
The following is the original HLS. It has been retained to show how the 
original concept compares with the implemented concept (as described above).

> To make it easy to integrate solution with the rest of the code
> let's encapsulate it in dedicated functions:
> 
> int freeze_metadata(...)
> int unfreeze_metadata(...)
> 
> What arguments are needed needs to be decided (right now looks 
> like none, but list of tables being backed-up might be needed). 
> 
> It is also possible that locking would require 2 stages: first
> to collect list of tables which need to be backed-up, second 
> to block DDL operations on these tables. Then we can add extra 
> function:
> 
> int freeze_prepare(...)
> 
> These functions will be used by backup kernel as follows:
> 
> freeze_prepare(...)
> 
> 
> 
> freeze_metadata();
> 
> ;
> 
> unfreeze_metadata();

Implementation Details
----------------------
The DDL blocker shall be implemented using a condition variable to block the 
supported DDL commands. A single method shall be used to implement this 
functionality.

The DDL blocker shall be implemented using a counter to keep track of the 
number of supported DDL operations in progress. The backup operation shall 
block until that counter reaches 0. A mutex shall be used to ensure that no two 
threads can change the counter at the same time. 

Thus, the DDL blocker has two parts: one to block DDL operations and another to 
block backup operations.

Implemented in sql_parse.cc
---------------------------
The following methods shall be implemented inside ./sql/sql_parse.cc and 
accessed from the CASE statements for the supported DDL commands:

static my_bool check_DDL_blocker(THD *thd) – This method checks whether a 
backup is running and, if so, waits on the condition variable until the backup 
signals. This method is called prior to executing the DDL command.

static void block_backup() – This method increments the counter to indicate 
that another DDL operation is in progress. This method is called inside 
check_DDL_blocker() but is implemented separately for clarity.

static void unblock_backup() – This method decrements the counter and, if it 
reaches 0, signals a waiting backup operation. This method is called once the 
DDL operation is complete.

Note: the block and unblock methods use the same mutex as the backup’s wait 
loop.
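
The DDL side described above can be sketched in standalone C++ as follows. 
This is a minimal illustration under stated assumptions, not the server code: 
it uses std::mutex and std::condition_variable in place of the server's own 
primitives, it drops the THD argument, and the names DDL_blocker_mutex, 
COND_DDL_unblocked and COND_backup_may_run are invented for the sketch. This 
worklog does not define what check_DDL_blocker() returns, so the sketch always 
returns true.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Shared state. DDL_blocks and DDL_blocked are named in the worklog; the
// mutex and condition-variable names are assumptions made for this sketch.
static std::mutex              DDL_blocker_mutex;
static std::condition_variable COND_DDL_unblocked;   // backup has finished
static std::condition_variable COND_backup_may_run;  // DDL_blocks reached 0
static int  DDL_blocks  = 0;      // supported DDL operations in progress
static bool DDL_blocked = false;  // true while a backup/restore is running

// Registers one more DDL operation in progress.
// The caller must already hold DDL_blocker_mutex.
static void block_backup()
{
  ++DDL_blocks;
}

// Called before a supported DDL command executes: waits while a backup is
// running, then registers the DDL operation so that a backup cannot start.
static bool check_DDL_blocker()
{
  std::unique_lock<std::mutex> lock(DDL_blocker_mutex);
  COND_DDL_unblocked.wait(lock, [] { return !DDL_blocked; });
  block_backup();
  return true;
}

// Called once the DDL command is complete: unregisters the operation and
// wakes a waiting backup when no DDL operation remains in progress.
static void unblock_backup()
{
  std::lock_guard<std::mutex> lock(DDL_blocker_mutex);
  assert(DDL_blocks > 0);
  if (--DDL_blocks == 0)
    COND_backup_may_run.notify_all();
}
```

In this sketch, a handler in the "big switch" would call check_DDL_blocker() 
before running the DDL command and unblock_backup() after it has finished.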

Implemented in sql_backup.cc
----------------------------
The following methods shall be implemented inside ./sql/backup/sql_backup.cc 
and accessed from the execute_backup_command() method.

my_bool block_DDL(THD *thd) – This method is used to block all DDL commands. It 
checks the counter DDL_blocks and, if it is greater than 0, blocks the backup 
until all DDL operations are complete and the condition variable has been 
signaled. The method also sets the boolean DDL_blocked to TRUE to tell the DDL 
operations that they must block until the backup operation is complete.

void unblock_DDL() – This method is used to unblock all DDL commands. It sets 
the boolean DDL_blocked to FALSE to tell the DDL operations that they can 
proceed.
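
The backup side can be sketched in the same way. Again this is a hedged, 
standalone C++ illustration rather than the code in sql_backup.cc: the THD 
argument is omitted, the mutex and condition-variable names are assumptions, 
and run_backup_with_freeze() is a hypothetical stand-in for the way 
execute_backup_command() might bracket its work. The sketch sets DDL_blocked 
after the wait, in the order the description gives; the worklog does not state 
whether the flag is set before or after waiting.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Shared state. DDL_blocks and DDL_blocked are named in the worklog; the
// mutex and condition-variable names are assumptions made for this sketch.
static std::mutex              DDL_blocker_mutex;
static std::condition_variable COND_backup_may_run;  // DDL_blocks reached 0
static std::condition_variable COND_DDL_unblocked;   // backup has finished
static int  DDL_blocks  = 0;      // supported DDL operations in progress
static bool DDL_blocked = false;  // true while a backup/restore is running

// Waits until no supported DDL operation is in progress, then raises the
// flag that makes new DDL operations wait.
static bool block_DDL()
{
  std::unique_lock<std::mutex> lock(DDL_blocker_mutex);
  COND_backup_may_run.wait(lock, [] { return DDL_blocks == 0; });
  DDL_blocked = true;
  return true;
}

// Lowers the flag and wakes every DDL operation that was waiting on it.
static void unblock_DDL()
{
  {
    std::lock_guard<std::mutex> lock(DDL_blocker_mutex);
    assert(DDL_blocked);
    DDL_blocked = false;
  }
  COND_DDL_unblocked.notify_all();
}

// Hypothetical helper: runs the backup or restore work with DDL frozen.
template <typename Work>
static void run_backup_with_freeze(Work backup_work)
{
  block_DDL();
  backup_work();
  unblock_DDL();
}
```

With this ordering, DDL operations that start while the backup is still waiting 
can delay it further. Raising the flag before the wait would avoid that, at the 
cost of blocking DDL slightly earlier.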

Testing Philosophy
------------------
Tests shall be written to verify that the following hold for every supported 
DDL command:

  * DDL operations already in progress are not blocked; the backup
    blocks until they complete
  * The backup blocks all supported DDL, including DDL on tables
    that are not part of the backup
  * DDL operations do not block each other
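
The three properties above can be exercised against a small stand-in for the 
blocker, sketched here in standalone C++. Everything in it is hypothetical: 
DdlBlockerSketch, its members and the three check functions are not part of 
the server or its test suite. The sketch has no notion of tables, so every DDL 
operation is treated alike. The waiters counter exists only so that a check 
can tell when another thread is parked in a wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the DDL blocker, just enough to run the checks.
struct DdlBlockerSketch {
  std::mutex m;
  std::condition_variable cv;
  int  ddl_in_progress = 0;
  bool ddl_blocked = false;
  int  waiters = 0;  // test-only: number of threads parked in a wait

  void start_ddl() {
    std::unique_lock<std::mutex> l(m);
    ++waiters;
    cv.wait(l, [this] { return !ddl_blocked; });
    --waiters;
    ++ddl_in_progress;
  }
  void end_ddl() {
    { std::lock_guard<std::mutex> l(m);
      assert(ddl_in_progress > 0);
      --ddl_in_progress; }
    cv.notify_all();
  }
  void start_backup() {
    std::unique_lock<std::mutex> l(m);
    ++waiters;
    cv.wait(l, [this] { return ddl_in_progress == 0; });
    --waiters;
    ddl_blocked = true;
  }
  void end_backup() {
    { std::lock_guard<std::mutex> l(m); ddl_blocked = false; }
    cv.notify_all();
  }
  // Spins until some thread is parked inside start_ddl() or start_backup().
  void wait_for_waiter() {
    for (;;) {
      { std::lock_guard<std::mutex> l(m); if (waiters > 0) return; }
      std::this_thread::yield();
    }
  }
};

// 1. A DDL already in progress keeps running; the backup waits for it.
bool ddl_in_progress_delays_backup() {
  DdlBlockerSketch b;
  b.start_ddl();
  std::thread backup([&] { b.start_backup(); });
  b.wait_for_waiter();               // the backup is now parked
  bool not_yet_frozen;
  { std::lock_guard<std::mutex> l(b.m); not_yet_frozen = !b.ddl_blocked; }
  b.end_ddl();                       // lets the backup proceed
  backup.join();
  bool frozen = b.ddl_blocked;
  b.end_backup();
  return not_yet_frozen && frozen;
}

// 2. While the backup holds the freeze, a new DDL waits until it ends.
bool backup_blocks_ddl() {
  DdlBlockerSketch b;
  b.start_backup();
  std::thread ddl([&] { b.start_ddl(); });
  b.wait_for_waiter();               // the DDL is now parked
  int started_during_backup;
  { std::lock_guard<std::mutex> l(b.m);
    started_during_backup = b.ddl_in_progress; }
  b.end_backup();                    // lets the DDL proceed
  ddl.join();
  bool started_after = (b.ddl_in_progress == 1);
  b.end_ddl();
  return started_during_backup == 0 && started_after;
}

// 3. Two DDL operations can be in progress at once.
bool ddl_ops_do_not_block_each_other() {
  DdlBlockerSketch b;
  b.start_ddl();
  b.start_ddl();                     // would hang here if DDL blocked DDL
  bool both_running = (b.ddl_in_progress == 2);
  b.end_ddl();
  b.end_ddl();
  return both_running;
}
```

Each check returns true when the property holds. The checks poll the waiters 
counter under the mutex rather than sleeping, so their outcome does not depend 
on timing.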

Limitations
-----------
  * The DDL blocker shall be limited to only the DDL
    operations on tables. This is because the online
    backup release is only concerned with tables and
    does not support views, events, etc. It is also
    in part because there is ongoing work concerning
    blocking DDL and as such this implementation will
    be temporary.  
  * All DDL operations for tables are blocked whether
    they are on the tables in the backup operation or
    not. This allows a simpler implementation and follows
    the same philosophy concerning the temporary use of
    this work.
  * In light of the above restriction, the restore operation
    must engage the DDL blocker only after it has restored
    the metadata for its tables. Although blocking DDL is
    the very next step, another thread can start a DDL
    operation on one of the tables being restored in the
    interval between the two. The implementation for this
    worklog shall attempt to overcome this design limitation
    if possible.