WL#4644: Online Backup: Fix the DDL blocker
Affects: Server-6.0
—
Status: Complete
RATIONALE --------- To fix BUG#32702. DETAILS ------- DDL blocker is used inside backup system to prevent any metadata changes during backup/restore process. It blocks DDL statements which could change the metadata. DDL blocker is implemented but it is incomplete - it does not block changes for all types of objects being backed-up (for example it does not block DROP VIEW and many others). BUG#32702 contains a list of what is and what is not blocked by the current implementation. ALTERNATIVES ------------ There are 2 alternative understandings of what a DDL blocker should do: A) Block all known DDL statements. B) Block these (and only these) SQL statements which need to be blocked to ensure BACKUP/RESTORE consistency. Ad A) Note that it is not completely clear what a DDL statement is. There are some statements which could be objected as being DDLs, such as TRUNCATE TABLE, OPTIMIZE/REPAIR TABLE, DROP/CREATE INDEX. Nevertheless, they must be blocked for BACKUP to work correctly. If DDL blocker does not block them, it becomes useless for the backup subsystem. Ad B) For the analysis what needs to be blocked, see HLS. If this semantics is implemented then a better name for the service could be picked. For example "Backup Metadata Lock", or BML for short. This worklog will implement alternative B. NOTE: The "DDL blocker" service is considered a temporary solution. In future versions it is planned to replace it by more appropriate server mechanisms such as fine-grained metadata locks.
WHAT STATEMENTS NEED TO BE BLOCKED ---------------------------------- Here we identify a *minimal* set of statements which should be blocked to ensure correct functioning of BACKUP/RESTORE commands. Backup image stores metadata for the following types of objects: - databases - tablespaces - privileges - tables - views - stored functions - stored procedures - events - triggers Hence, the following metadata changes should be frozen during backup operation: 1. Databases being backed-up should not disappear or be changed. 2. In case of "BACKUP DATABASE * ..." new databases should not appear. 3. List of objects inside each database should not change. 4. The objects in the databases should not change (their metadata). 5. The set of privileges for each database should not change. 6. Users for which privileges are stored should not disappear or change. 7. Tablespaces used by tables being backed-up should not disappear or change. The above can be achieved by blocking the following statements (they will block more than required, but I believe it is a minimal set for ensuring 1-7). DROP/ALTER DATABASE (for 1) CREATE DATABASE (for 2 [b]) CREATE/DROP per-db-object (for 3) ALTER/RENAME per-db-object (for 4) CREATE/DROP INDEX (for 4 [a]) GRANT/REVOKE (for 5) DROP/RENAME USER (for 6) DROP/ALTER TABLESPACE (for 7) per-db-object is one of: TABLE, VIEW, FUNCTION, PROCEDURE, EVENT, TRIGGER. [a] This is because CREATE/DROP INDEX alters the definition of the table. [b] This could be made optional, to be used only when "BACKUP DATABASE * ..." is executed. That can greatly improve user experience since in the typical case when specific database list is given for BACKUP, users will be allowed to create new databases. Additionally, these statements must be blocked for technical reasons: TRUNCATE TABLE - because it does DROP TABLE internally. OPTIMIZE TABLE - because it does ALTER TABLE internally. REPAIR TABLE - because it can remove/rename files storing physical table data (e.g. for MyISAM tables). The complete list of statements which need to be blocked by BML is as follows: DROP DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TRIGGER/INDEX DROP USER/TABLESPACE CREATE DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TRIGGER/INDEX ALTER DATABASE/TABLE/VIEW/FUNCTION/PROCEDURE/EVENT/TABLESPACE RENAME TABLE/USER GRANT/REVOKE TRUNCATE/OPTIMIZE/REPAIR TABLE
DDL blocker rename ================== To avoid confused objections of the type "why DDL blocker is not blocking this or that DDL statement" the service will be renamed to "Backup Metadata Lock", BML in short. Here is a list of proposed renames (only public members mentioned - for private members, names along the same lines will be chosen). ddl_blocker.{h,cc} -> bml.{h,cc} DDL_blocker_class -> BML_class DDL_blocker_class::check_DDL_blocer() -> BML_class::bml_enter() DDL_blocker_class::end_DDL() -> BML_class::bml_leave() DDL_blocker_class::block_DDL() -> BML_class::bml_get() DDL_blocker_class::unblock_DDL() -> BML_class::bml_release() In si_objects.h: obs::ddl_blocker_enable() -> obs::bml_get() obs::ddl_blocker_disable() -> obs::bml_release() obs::ddl_blocker_exception_on() -> obs::bml_exception_on() obs::ddl_blocker_exception_off() -> obs::bml_exception_off() Changes in sql_parse.cc ======================= The infrastructure ensuring blocking correct statements in sql_parse.cc will be changed. Currently methods bml_enter() and bml_leave() are called inside the big switch in the cases corresponding to the statements which should be blocked. Instead of this, the statements to be blocked will be marked with a new CF_BLOCKED_BY_BML flag: + + /* + Mark commands to be blocked by Backup Metadata Lock. See bml.cc. + */ + sql_command_flags[SQLCOM_DROP_USER]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_DB]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_TABLE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_VIEW]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_FUNCTION]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_PROCEDURE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_EVENT]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_TRIGGER]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_DROP_INDEX]|= CF_BLOCKED_BY_BML; + + sql_command_flags[SQLCOM_CREATE_DB]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_TABLE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_VIEW]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_SPFUNCTION]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_PROCEDURE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_EVENT]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_TRIGGER]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_CREATE_INDEX]|= CF_BLOCKED_BY_BML; + + sql_command_flags[SQLCOM_ALTER_TABLESPACE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_DB]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_DB_UPGRADE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_TABLE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_FUNCTION]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_PROCEDURE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_ALTER_EVENT]|= CF_BLOCKED_BY_BML; + + sql_command_flags[SQLCOM_RENAME_USER]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_RENAME_TABLE]|= CF_BLOCKED_BY_BML; + + sql_command_flags[SQLCOM_GRANT]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_REVOKE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_REVOKE_ALL]|= CF_BLOCKED_BY_BML; + + sql_command_flags[SQLCOM_REPAIR]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_OPTIMIZE]|= CF_BLOCKED_BY_BML; + sql_command_flags[SQLCOM_TRUNCATE]|= CF_BLOCKED_BY_BML; Then the calls to bml_enter() and bml_leave() for the marked statements will be done inside mysql_execute_command() before and after the big switch, respectively. @@ -1993,6 +2042,15 @@ mysql_execute_command(THD *thd) if (opt_implicit_commit(thd, CF_IMPLICT_COMMIT_BEGIN)) goto error; + if (is_bml_blocked(lex->sql_command) && + !(bml_registered= BML_instance->bml_enter(thd))) + goto error; + switch (lex->sql_command) { case SQLCOM_SHOW_EVENTS: ... @@ -4670,6 +4675,10 @@ create_sp_error: thd->row_count_func= -1; finish: + + if (bml_registered) + BML_instance->bml_leave(); + if (need_start_waiting) ... Testing strategy ================ After fixing, BML will block 38 statements. To test them all, the current backup_ddl_blocker test must be refactored as straightforward extension would make it unmaintainable. The test runs different DDL statements in parallel with BACKUP/RESTORE command and checks that the required blocking happens. The main part of the test, which is executing and synchronizing the statements in several parallel threads will be moved into a "subroutine" in the form of an include file to be sourced by the main test. The main test will setup a context for the "subroutine" and source it several times to test all affected statements. Few test will be added to check that statements not handled by BML will not be blocked.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.