WL#3726: DDL locking for all metadata objects
Affects: Server-5.5
—
Status: Complete
NOTE: This task is needed for implementation of foreign keys. Progress reports for this work can be found in WL#148. Implement DDL locking for all metadata objects. This is necessary to solve BUG#25144 "replication / binlog with view breaks".
Resolution of the architecture meeting in Stockholm on BUG#25144 and BUG#989: Extend open_cache to be: 1) transaction-wide 2) supports all objects types: * base tables * views * stored procedures and functions * time zone tables * databases 3) designed to contain only relevant data Version: 5.1 (changed to version 6.0 after discussion; this change is considered too risky for 5.1 at this late stage). This worklog only implements item 2) and 3). Please see WL#4284 for support of metadata locks across multiple statements in a transaction.
Table of contents ================= Introduction Part I -- The new MDL API. A. The new API A-1 Main data structures A-2 Initialization and destruction A-3 Acquirer's primitives to work with lock objects A-4 Acquisition of locks A-5 Releasing locks A-6 Utility functions B. Deployment of the API. C. Metadata lock on a view. Part II -- Transactional locks. Appendix A. Future work. Introduction ============ This worklog task consists of two parts. Part I introduces a new API for metadata locking. Its goal is to encapsulate locking functionality in a standalone and extensible module. This part also aligns implementation of auto-repair with the locking protocol supported by the system (for details see Deployment of the API). It generalizes the concept of a metadata lock and allows to lock any object type, including views and stored programs. Finally, it fixes BUG#25144, which root cause is in absence of metadata locking for views. Part II introduces transactional metadata locks, i.e. extends life span of a metadata lock from one statement to one transaction. It's primary goal is to fix BUG#989, which occurred because currently MySQL keeps no metadata locks for tables used in a transaction. Note: Part II is not yet ready and is not submitted for a review. Please refer to MySQL internals manual for the description of the current system and terminology: http://forge.mysql.com/w/images/0/0a/Mdl.pdf. Part I Metadata locking API. ============================ A. The new API. --------------- Implementation of metadata locks is moved from a table cache into a standalone module that has no dependencies on other parts of the system. The interface of the module is clearly defined: functions are provided to acquire shared and exclusive locks and release locks. The current locking protocol and algorithms, however, are kept mostly unchanged: improvement of locking algorithms is not the goal of this task. Since MySQL metadata locking relies on a number of external invariants to resolve deadlocks, the module communicates with the rest of the server by means of 2 identified external functions, used at particular moments of lock acquisition. Correct acquisition of metadata locks ensures consistency of the data dictionary, statement and row based replication. (See chapter "What metadata locks protect" in metadata locking documentation). The metadata locks are acquired for every operation on the database state. They are intended to be acquired first order during statement execution and thus the API requires nothing but object names to operate (compare with the current API which requires a TABLE_LIST instance to represent a lock request). A-1 Main data structures ------------------------ /** Context of the owner of metadata locks. I.e. each server connection has such a context. */ struct MDL_CONTEXT; /** A pending lock request or a granted metadata lock. A lock is requested or granted based on a fully qualified name and type. E.g. for a table the key consists of <0 (=table)>++ . Later in this document this triple will be referred to simply as "key" or "name". */ struct MDL_EL; /** The lock context. Created internally for an acquired lock. For a given name, there exists only one MDL_LOCK instance, and it exists only when the lock has been granted. Can be seen as an MDL subsystem's version of TABLE_SHARE. */ struct MDL_LOCK; A-2 Initialiaztion and destruction ---------------------------------- /** Initialize the metadata locking subsystem. This function is called at server startup. In particular, initializes the new global mutex and the associated condition variable: LOCK_mdl and COND_mdl. These locking primitives are implementation details of the MDL subsystem and are private to it. Note, that even though the new implementation adds acquisition of a new global mutex to the execution flow of almost every SQL statement, the design capitalizes on that to later save on look ups in the table definition cache. This leads to reduced contention overall and on LOCK_open in particular. Please see the description of mdl_acquire_shared_lock() for details. */ void mdl_init(); /** Release resources of metadata locking subsystem. Destroys the global mutex and the condition variable. Called at server shutdown. */ void mdl_destroy(); /** Initialize a metadata locking context. This is to be called when a new server connection is created. This is a no-cost operation to just reset to NULL the list of lock requests that this originator currently has. */ void mdl_context_init(MDL_CONTEXT *context); /** Destroy metadata locking context. Assumes and asserts that there are no active or pending locks associated with this context at the time of the destruction. Currently does nothing. Asserts that there are no pending or satisfied lock requests. The pending locks must be released prior to destruction. This is a new way to express the assertion that all tables are closed before a connection is destroyed. */ void mdl_context_destroy(MDL_CONTEXT *context); A-3 Acquirer's primitives to work with lock objects --------------------------------------------------- /** Initialize a lock request. This is to be used for every lock request. Note that initialization and allocation are split into two calls. This is to allow flexible memory management of lock requests. Normally a lock request is stored in statement memory (e.g. is a member of struct TABLE_LIST), but we would also like to allow allocation of lock requests in other memory roots, for example in the grant subsystem, to lock privilege tables. The MDL subsystem does not own or manage memory of lock requests. Instead it assumes that the life time of every lock request encloses calls to mdl_acquire_shared_lock() and mdl_release_locks(). @param mdl Pointer to an MDL_EL object to initialize @param key_buff Pointer to the buffer for key for the lock request (should be at least strlen(db) + strlen(name) + 2 bytes, or, if the lengths are not known, MAX_DBNAME_LENGTH) @param type Id of type of object to be locked @param db Name of database to which the object belongs @param name Name of of the object Stores the database name, object name and the type in the key buffer. Initializes mdl_el to point to the key. We can't simply initialize mdl_el with type, db and name by-pointer because of the underlying HASH implementation requires the key to be a contiguous buffer. Suggested lock types: TABLE - 0 PROCEDURE - 1 FUNCTION - 2 Note that tables and views have the same lock type, since they share the same name space in the SQL standard. */ void mdl_init_lock(MDL_EL *mdl_el, char *key_buff, int type, const char *db, const char *name); /** Allocate and initialize one lock request. Same as mdl_init_lock(), but allocates the lock and the key buffer on a memory root. Necessary to lock ad-hoc tables, e.g. mysql.* tables of grant and data dictionary subsystems. @param type Id of type of object to be locked @param db Name of database to which object belongs @param name Name of of object @param root MEM_ROOT on which object should be allocated @retval 0 Error @retval non-0 Pointer to an object representing a lock request */ MDL_EL *mdl_alloc_lock(int type, const char *db, const char *name, MEM_ROOT *root); /** Add a lock request to the list of lock requests of the context. The procedure to acquire metadata locks is: - allocate and initialize lock requests (mdl_alloc_lock()) - associate them with a context (mdl_add_lock()) - call mdl_acquire_shared_lock()/mdl_release_lock() (maybe repeatedly). Associates a lock request with the given context. Note, that since we continue using pre-locking and try-and-back-off techniques, this function asserts that there are no granted locks when a new lock request is added to the context. @param context The MDL context to associate the lock with. There should be no more than one context per connection, to avoid deadlocks. @param lock The lock request to be added. @param exclusive Type of the lock requested */ void mdl_add_lock(MDL_CONTEXT *context, MDL_EL *lock, bool exclusive); /** Clear all lock requests in the context (clear the context). Disassociates lock requests from the context. All granted locks must be released prior to calling this function. In other words, the expected procedure to release locks is: - mdl_release_locks(); - mdl_remove_all_locks(); We could possibly merge mdl_remove_all_locks() and mdl_release_locks(), but this function comes in handy when we need to back off: in that case we release all the locks acquired so-far but do not free them, since we know that the respective lock requests will be used again. @param context Context to be cleared. */ void mdl_remove_all_locks(MDL_CONTEXT *context); A-4 Acquisition of locks ------------------------ /** Try to acquire one shared lock. Unlike exclusive locks, shared locks are acquired one by one. This is interface is chosen to simplify introduction of the new locking API to the system. mdl_acquire_shared_lock() is currently used from open_table(), and there we have only one table to work with. In future we may consider allocating multiple shared locks at once. This function must be called after the lock is added to a context. There must be no prior acquired exclusive lock in the context. @param lock Lock request object for lock to be acquired @retval FALSE success @retval TRUE a conflicting lock exists. Another attempt should be made after releasing all current locks and waiting for conflicting lock go away (using mdl_wait_for_locks()). */ bool mdl_acquire_shared_lock(MDL_CONTEXT *context, MDL_EL *lock); Pseudocode: {
let mdl_lock = look up an MDL_LOCK in the HASH using object name if (mdl_lock is found) { if (there is an exclusive lock, granted or pending) { /* We have other locks granted on this name and must back off. */ return TRUE; } } else { insert a new MDL_LOCK into the HASH; } Append the lock to the list of granted locks in MDL_LOCK return FALSE; } /** A callback to the server internals that is used to address special cases of the locking protocol. Invoked when acquiring an exclusive lock, for each thread that has a conflicting shared metadata lock. This function: - aborts waiting of the thread on a data lock, to make it notice the pending exclusive lock and back off. - if the thread has an open HANDLER to the table in question, closes the cursor of the HANDLER - if the thread is an INSERT DELAYED thread, sends it a KILL signal to terminate it. @note This function does not wait for the thread to give away its locks. Waiting is done outside for all threads at once. @param in_use The thread to wake up @retval TRUE if the thread was woken up @retval FALSE otherwise (e.g. it was not waiting for a table-level lock). @note It is one of two places where border between MDL and the rest of the server is broken. */ extern bool notify_thread_having_shared_lock(THD *in_use) /** Acquire exclusive locks. The context must contain the list of locks to be acquired. There must be no granted locks in the context. This is a replacement of lock_table_names(). It is used in RENAME, DROP and other DDL SQL statements. @param context A context containing requests for exclusive locks The context may not have other lock requests. */ void mdl_acquire_exclusive_locks(MDL_CONTEXT *context); Pseudocode: { foreach (mdl_el: request for an exclusive lock from the context) { Assert: the request is for an exclusive lock. let mdl_lock = look up an MDL_LOCK in the HASH using object name if (mdl_lock is NULL) { add a new mdl_lock instance to the cache } append the lock request to the list of pending exclusive locks } while (there is some competing shared or exclusive lock) { foreach (mdl_el: request for an exclusive lock from the context) { let mdl_lock = mdl_el->lock; if (there are active readers in mdl_lock) { foreach (mdl_el_dep: reader in mdl_lock) notify_thread_having_shared_lock(mdl_el_dep->thd); } else if (there is a pending exclusive lock request ahead of us in the lsit or a granted exclusive lock) need_wait= TRUE; } if (need_wait) wait_on(LOCK_mdl, COND_mdl); } foreach (mdl_el: request for an exclusive lock) { move the lock from "pending list" to "granted list"; } } /** Upgrade a shared metadata lock to exclusive. Used in ALTER TABLE, when a copy of the table with the new definition has been constructed. @pre To avoid deadlocks, this must be the only lock in the context. @param context Context to which shared long belongs @param type Id of object type @param db Name of the database @param name Name of the object */ void mdl_upgrade_shared_lock_to_exclusive(MDL_CONTEXT *context, int type, const char *db, const char *name); Pseudocode: { let mdl_lock = look up the lock object by name add lock type to the list of pending exclusive locks remove lock type from the list of granted shared locks while (there is some competing shared or exclusive lock) { if (there are active readers in mdl_lock) { foreach (mdl_el_dep: reader in mdl_lock) notify_thread_having_shared_lock(mdl_el_dep->thd); } else if (there is a pending exclusive lock request ahead of us or a granted exclusive lock) need_wait= TRUE; if (need_wait) wait_on(LOCK_mdl, COND_mdl); } move the lock from "pending list" to "granted list"; } /** Try to acquire an exclusive lock on the object if there are no conflicting locks. Similar to the previous function, but returns immediately without any side effect if encounters a lock conflict. Otherwise takes the lock. This function is used in CREATE TABLE ... LIKE to acquire a lock on the table to be created. In this statement we don't want to block and wait for the lock if the table already exists. @param context The context containing the lock request @param lock The lock request @retval FALSE the lock was granted @retval TRUE there were conflicting locks. */ bool mdl_try_acquire_exclusive_lock(MDL_CONTEXT *context, MDL_EL *lock); /** Wait until there are no old versions of tables in the table definition cache for the locks that we try to acquire. This is the second point of interaction of the meta-data locking with the rest of the server. It is used by mdl_wait_for_locks(), which is called by the acquirer after a backs off. In other words, if mdl_acquire_shared_lock() returns TRUE the acquire must call: - mdl_release_locks() - mdl_wait_for_locks() and then try again. @param names Names of tables to be waited for */ extern void tdc_wait_for_old_versions(List *names); /** Wait until there will be no locks that conflict with lock requests in the context. This is a part of the locking protocol and must be used by the acquirer of shared locks after a back-off. Does not acquire the locks! @param context Context with which lock requests are associated. */ void mdl_wait_for_locks(MDL_CONTEXT *context); A-5 Releasing locks ------------------- /** Release all locks associated with the context, but leave them in the context as lock requests. This function is used to back off in case of a lock conflict. It is also used to release shared locks in the end of an SQL statement. @param context The context with which the locks to be released are associated. */ void mdl_release_locks(MDL_CONTEXT *context); Pseudocode: { foreach (mdl_el: lock in the context) { if (the lock is granted) move the lock from "granted list" to "pending list"; } broadcast(COND_mdl); } /** Release all exclusive locks associated with context. Removes the locks from the context. @param context Context with exclusive locks. @note Shared locks are left intact. */ void mdl_release_exclusive_locks(MDL_CONTEXT *context); /** Release a lock. Removes the lock from the context. @param context Context containing lock in question @param lock Lock to be released */ void mdl_release_lock(MDL_CONTEXT *context, MDL_EL *lock); /** Downgrade all exclusive locks in the context to shared. @param context A context with exclusive locks. */ void mdl_downgrade_exclusive_locks(MDL_CONTEXT *context); A-6 Utility functions -------------------- /** Associate pointer to an opaque object with a lock. @param l Lock request for the lock with which the object should be associated. @param cached_object Pointer to the object @param release_hook Cleanup function to be called when MDL subsystem decides to remove lock or associate another object. This is used to cache a pointer to TABLE_SHARE in the lock structure. Such caching can save one acquisition of LOCK_open and one table definition cache lookup for every table. Since the pointer may be stored only inside an acquired lock, the caching is only effective when there is more than one lock granted on a given table. This function has the following usage pattern: - try to acquire an MDL lock - when done, call for mdl_get_cached_object(). If it returns NULL, our thread has the only lock on this table. - look up TABLE_SHARE in the table definition cache - call mdl_set_cache_object() to assign the share to the opaque pointer. The release hook is invoked when the last shared metadata lock on this name is released. */ void mdl_set_cached_object(MDL_EL *l, void *cached_object, mdl_cached_object_release_hook release_hook); /** Get a pointer to an opaque object that associated with the lock. @param l Lock request for the lock with which the object is associated. @return Pointer to an opaque object associated with the lock. */ void *mdl_get_cached_object(MDL_EL *l); Part I. B. Deployment of the API ================================ The new API obsoletes the following functions: - lock_table_name(), lock_table_names(), unlock_table_names(), lock_table_names_exclusively(), lock_table_name_if_not_cached(), lock_and_wait_for_table_name(), table_cache_insert_placeholder(), is_table_name_exclusively_locked_by_this_thread(), mysql_wait_completed_table(), get_name_lock(), release_name_lock() - since the locking role of the table cache was moved to the MDL subsystem, it became possible to merge the table cache and the table definition cache. All open tables for a given name are attached to the respective TABLE_SHARE. A list of unused TABLE instances is still maintained in unused_list and is purged on FIFO basis. Used and unused TABLE elements are now attached to their respective TABLE_SHAREs, i.e. elements of the table definition cache. The following structures and functions were removed: open_cache, table_cache_key(), table_cache_init(), table_cache_destroy(), table_cache_free(), remove_db_from_cache() TABLE_SHARE elements do not have to be looked up in the table definition cache in all cases: they are attached to the metadata lock if there is one by means of mdl_{get,set}_cached_object() API. This addresses the performance bottleneck identified in BUG#33948. - auto-repair implementation is rewritten to not acquire an exclusive metadata lock from within open_table(). Since acquisition of an exclusive lock must wait for other threads to give away their locks, theoretically this could lead to deadlocks. Even though it's practically impossible to trigger this situation with auto-repair, the change was done to fullfill the assertions of the implemented locking protocol. Instead of starting auto-repair, open_table() now returns a special return code, that causes the thread close all currently open tables, repair the table in question and try to reopen the tables after that. Part I. C. A metadata lock on a view ==================================== Since views and base tables share the same name space according to the SQL standard, a metadata lock on a view has the same type (0) as one on a table. Acqusition and release primitives are identical as well. Part II. Transactional locks ============================ Appendix A: Future work ======================= Most todo from this list items will be addressed in separate tasks. They are collected here to provide a roadmap for metadata locking. 1. INFORMATION_SCHEMA and NDB cluster binlog invokes NDB auto-discovery frivolously: ndbcluster_find_files() is called while keeping other metadata locks. This may lead to deadlocks. Auto-discovery should be re-written to not be invoked while keeping other metaddata locks. Requires architecture work and prototyping. 2. Auto-discovery code invoked inside open_unireg_entry() (Data Dictionary, addition of a table to the table definition cache). The same disease as in item 1. DONE 3. TRUNCATE implementation. The current code first accesses the data dictionary to detect the storage engine of the subject table on hand, and only then acquires an exclusive metadata lock. This is an opportunity for a race condition and must be eliminated. A minor code change that can be done in a separate patch. 4. ALTER VIEW implementation. A similar gap is present between an access to the data dictionary to establish the table type (a view or a base table) and an exclusive metadata lock. A minor code change that can be done in a separate patch. 5. INFORMATION_SCHEMA should use MDL subsystem to acquire special metadata locks on used tables. A new lock type will be introduced for this purpose. A fairly substantial change but has been approved by the architecture board in Orlando. Requires specification and implementation. DONE 6. LOCK TABLE t1 WRITE should acquire an shared-for-upgrade lock type on the table. When this is done, FLUSH TABLE or DROP TABLE under LOCK TABLES can upgrade the lock type to EXCLUSIVE without having to abort transactions, simply by waiting for all shared locks to go away). A minor change that can be addressed in a separate patch. DONE 7. It should be possible to lock a table with LOCK TABLE statement that does not yet exist. At the same time, it should be impossible to use tables outside the set locked with LOCK TABLES, even for DDL. Example: LOCK TABLE t1 WRITE, t2 WRITE; DROP TABLE t1; ALTER TABLE t2 RENAME t1; -- should be allowed (currently fails during ALTER) LOCK TABLE t1; DROP TABLE t2; ALTER TABLE t1 RENAME t2; -- should be prohibited A minor change that can be addressed in a separate patch. Since is an incompatible change, must be done in an alpha. Should be done together with item 6. 8. Support for metadata locks for a database (schema), tablespace, user. In order to support RENAME DATABASE and proper operation with tablespaces, MySQL will need to implement metadata locks on schemas, tablespaces, and other compound objects. Two alternative solutions have been identified: * do not support hierarchies inside MDL subsystem. If a database is used in a statement, lock it in shared mode just as if it was any other object type (table, procedure, view). * represent the list of locks requested or granted as a hierarchy or a tree. Allow to lock parts of this tree. Requires prototyping and an approval of architects. Will be addressed in a separate worklog task.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.