WL#7304: Improve MDL performance and scalability by implementing "fast-path" for DML locks
Affects: Server-5.7 — Status: Complete — Priority: Medium
Since typical user workload/workload used in typical benchmarks consists of DML statements it makes sense to improve performance/scalability by optimizing MDL subsystem for such type of statements. One of possible approaches is to implemenent "fast-path" for metadata locks acquired by DML statements, which will convert acquisition/release of DML lock (S, SH, SW, SR locks) into counter increment/decrement (under protection of MDL_lock::m_rwlock) instead of more complex code involving list manipulation. (Indeed this means that MDL lock acquisition for DDL becomes more complex). Benchmarking of draft patch implementing this idea shown that it provides at least 10% performance improvement in single-table OLTP_RO/POINT_SELECT SysBench tests. User Documentation ================== http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-4.html
There are no functional requirements for this task as user visible functionality is not supposed to be changed by it. NF1) The only non-functional requirement is performance improvement for POINT_SELECT/OLTP_RO tests of 1-table SysBench/InnoDB tables/multi-core machine.
None needed. See LLD for design.
General idea ============ General idea is to split all lock types for each of MDL namespaces in two sets: A) "DML" lock types 1) Each type from this set should be compatible with all other types from the set (including itself). 2) These types should be common for DML operations Our goal is to optimize acquisition and release of locks of this type by avoiding complex checks and manipulations on m_waiting/m_granted sets/lists and replacing it with a check of and increment/decrement of integer counters. We will call the latter type of acquisition/release "fast path". Use of "fast path" should reduce the size of critical section associated with MDL_lock::m_rwlock lock in the common case and thus increases scalability. 2) "DDL" lock types 1) Granted or pending lock of those type is incompatible with some other lock (including itself). 2) Not common for DML operations These locks have to be always acquired in the old fashion - involving manipulations with m_waiting/m_granted sets/lists, i.e. using "slow path". Moreover in the presence of active/pending locks from "slow" set we have to acquire even locks of "DML" type using "slow path". "DML" and "DDL" lock sets =========================== 1) For GLOBAL/COMMIT and SCHEMA namespaces (i.e. namespaces with lock represented by MDL_scoped_lock class): "DML" locks set consists of IX lock type. "DDL" locks set consists of S and X lock types. Note that DML statements that change data acquire only IX locks in GLOBAL/COMMIT namespaces so the above makes perfect sense. 2) For all other namespaces (i.e. represented by MDL_object_lock class): "DML" locks set consists of S, SH, SR and SW locks. "DDL" locks set consists of SU, SNW, SNRW and X. Again normally DML statements (including queries to I_S and prepare phase for prepared statements) acquire locks only from "DML" set. Main code transformation ======================== MDL_lock object gets two new members: - MDL_lock::m_ddl_locks_granted_pending_count - number of granted or pending locks of "DDL" types. Necessary to quickly verify that we can grant "DML" locks without further checking. - MDL_lock::m_fast_path_granted_count - packed counter of number of granted locks of specific "DML" type which were granted using fast-path algorithm and not using "slow path". For namespaces using MDL_scoped_lock we can use 60 bits of 64-bit variable to count number of IX locks (we will use remaining 4 bits in future MDL work). For namespaces using MDL_object_lock we can use chunks of 20 bits for S, SR and SW locks. SH locks do not need separate counter as they differ from S only in case of presence of "DDL" locks, in which case fast path optimization can't be used. We could have used separate counters for each type but this would complicate future switch to lock-free MDL implementation. (Again 4-bits reserved for future use). I assume that overflow is not an issue as we are far away from handling 2^20 concurrent connections. The above two members are to be protected by MDL_lock::m_rwlock lock. In future we will switch to updating MDL_lock::m_fast_path_granted_count using atomic primitives (see WL#7306 and WL#7305, which is prerequisite for the former). Essentially we replace: 1) addition/removal of "DML" lock to MDL_lock::m_granted list during lock acquisition/release with and incrementing/decrementing of corresponding part of m_fast_path_granted_count. 2) check for granted or pending locks which conflict with "DML" lock is replaced with check on m_ddl_locks_granted_pending_count counter. We still allocate MDL_ticket objects for requests which are satisfied using fast path algoritm, but we mark them using MDL_ticket::m_is_fast_path member, so we know that such ticket can be released in simplified fashion without need to remove from MDL_lock::m_granted list. In some cases we will need to do so called "materialization" of fast path tickets which will clear this flag, add ticket to appropriate m_granted list and decrement corresponding m_fast_path_granted_count counter under protection of MDL_lock::m_rwlock. These cases are marked by (***) below. Let us analyze places which can be affected by the above transformations: 1) The fact that we no longer include "DML" lock in some cases into MDL_lock::m_granted affects: a) Naturally, MDL_context::try_acquire_lock_impl() and MDL_context::release_lock() where this logic is to be implemented. b) MDL_lock::is_empty() should take value of m_fast_path_granted_count into account. c) Manipulation with MDL_lock::reschedule_waiters() is not affected - since we only add locks to m_granted there. d) MDL_lock::can_grant_lock(): *) In addition to m_granted.bitmap() we should take into account contents of m_fast_path_granted_count when looking at granted locks. *) For code determining if incompatible lock belongs to same context to work properly we need to consider two cases: I) We are trying to acquire "DDL" lock. In this case for check to work we need to materialize "fast path" locks in advance. (e.g. consider acquiring X after S). (***) II) We are acquiring "DML" lock. Since such type of lock can only conflict with locks from "DDL" set and such locks will always present in "m_granted" explicitly, there is no issue here. (e.g. consider acquiring SW after SNW). e) MDL_lock::clone_ticket() needs to be adjusted to take into account "fast path" tickets. f) MDL_object_lock::notify_conflicting_locks(). Nowadays this method does something only for tickets belonging to contexts with MDL_context::m_needs_lock_thr_abort set. I.e. contexts with open HANDLERs. We can always use slow path/materialize tickets for such contexts. (***) g) MDL_scoped_lock::notify_conflicting_locks(). Can be removed after removal of INSERT DELAYED functionality. h) MDL_context::upgrade_shared_lock(). Needs to be adjusted to take into account both that new lock can be of "DML" type and that lock which is upgraded can be "fast path" lock. i) MDL_lock::visit_subgraph(). For deadlock detection to work properly we need to "materialize" all tickets belonging to context which is about to start waiting in MDL or other cases covered by our deadlock detector. (***) j) MDL_ticket::downgrade_lock(). We downgrade only "DDL" locks so we don't need to do anything about "m_granted" here. 2) The fact that we need to properly reflect number of active and pending "DDL" locks in m_ddl_locks_granted_pending_count might affect the following places where we modify MDL_lock::m_granted and MDL_lock::m_waiting: a) MDL_context::try_acquire_lock_impl() and MDL_context::release_lock() need to increment and decrement this counter when appropriate. b) MDL_lock::reschedule_waiters() is not affected as we only make pending lock granted there. c) MDL_lock::clone_ticket() should increment m_ddl_locks_granted_pending_count if "slow" type of lock ticket is cloned. d) MDL_context::upgrade_shared_lock() should be extended to handle m_ddl_locks_granted_pending_count when upgrading from "slow" ticket. e) MDL_ticket::downgrade_lock() should be extended to decrement m_ddl_locks_granted_pending_count if we are downgrading to "DML" type of lock. f) MDL_context::acquire_lock() should be extended to increment m_ddl_locks_granted_pending_count when we are adding or removing "DDL" type of lock to/from m_waiting list. The above items can be used as a draft plan for unit tests covering the suggested changes.
Copyright (c) 2000, 2016, Oracle Corporation and/or its affiliates. All rights reserved.