WL#7306: Improve MDL performance and scalability by implementing lock-free lock acquisition for DML
Affects: Server-5.7
Status: Complete
Since DML statements are the most common statements in normal workloads and
benchmarks, it makes sense to optimize the MDL subsystem for such
statements. One possible optimization is to make acquisition/release of
locks for DML statements (S, SH, SR and SW locks) lock-free, e.g. by
replacing the complex construction involving locking a mutex, manipulating
the object and unlocking the mutex with a construction involving a single
CAS operation which updates a single object member. Preliminary benchmarks
of a draft patch implementing this idea have shown good results. In some
cases performance was on par with a tree built without MDL.

User Documentation
==================

http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-4.html
There are no functional requirements for this task, as user-visible
functionality is not supposed to be changed by it.

NF1) The only non-functional requirement is a performance improvement in
the POINT_SELECT/OLTP_RO SysBench tests run against a single InnoDB table
on a multi-core machine.
None needed. See LLD.
General idea
============

After WL#7304 and WL#7305 are implemented, acquisition of an unobtrusive
metadata lock can be described by the following pseudocode:

retry:
  /* Look-up MDL_lock object in LF_HASH. */
  lock= lf_hash_search(pins, &m_locks, mdl_key);

  wrlock(&lock->m_rwlock);

  if (lock->m_is_destroyed)
  {
    unlock(&lock->m_rwlock);
    unpin(pins, lock);
    goto retry;
  }

  unpin(pins, lock);

  if (! lock->m_obtrusive_locks_granted_waiting_count)
  {
    /* Increment number of acquired fast path locks of this type. */
    lock->m_fast_path_granted_count+= unobtrusive_lock_increment;
    unlock(&lock->m_rwlock);
    ...
  }
  else
  {
    /* Slow path. */
    ...
  }

Similarly, release of a fast path lock looks like:

  wrlock(&lock->m_rwlock);

  /* Decrement number of acquired fast path locks of this type. */
  lock->m_fast_path_granted_count-= unobtrusive_lock_increment;

  if (lock->is_empty())
  {
    /*
      If it was the last ticket for this MDL_lock object, remove the
      object from MDL_map.
    */
    ...
  }
  else
  {
    /*
      If there are any lock requests waiting for the ticket being released
      to go away, try to reschedule them.
    */
    if (lock->m_obtrusive_locks_granted_waiting_count)
      lock->reschedule_waiters();
    unlock(&lock->m_rwlock);
  }

By representing both the MDL_lock::m_is_destroyed member and the value of
the (MDL_lock::m_obtrusive_locks_granted_waiting_count != 0) predicate as
flags in the MDL_lock::m_fast_path_granted_count packed counter, it becomes
possible to replace the above steps involving mutex acquisition, checks and
a packed counter increment with a single atomic compare-and-swap operation
on MDL_lock::m_fast_path_granted_count. Since after this change
MDL_lock::m_fast_path_granted_count will no longer contain only lock
counts, it makes sense to rename this member to MDL_lock::m_fast_path_state.

As a result, the pseudocode for acquisition of an unobtrusive lock is
transformed to:

retry:
  /* Look-up MDL_lock object in LF_HASH. */
  lock= lf_hash_search(pins, &m_locks, mdl_key);

  old_state= lock->m_fast_path_state;

  do
  {
    if (old_state & IS_DESTROYED)
    {
      unpin(pins, lock);
      goto retry;
    }
    if (old_state & HAS_OBTRUSIVE)
      goto slow_path;
    /*
      If the lock is not waiting to be destroyed and doesn't have active or
      pending obtrusive locks, increment the number of acquired fast path
      locks of this type using CAS.
    */
  }
  while (! compare_and_swap(&lock->m_fast_path_state, &old_state,
                            old_state + unobtrusive_lock_increment));

  /* We have acquired the lock. */
  unpin(pins, lock);
  ...
  return;

slow_path:
  /*
    Acquire MDL_lock::m_rwlock and try to acquire the lock using the slow
    path.
  */
  wrlock(&lock->m_rwlock);
  if (lock->m_fast_path_state & IS_DESTROYED)
  {
    unlock(&lock->m_rwlock);
    unpin(pins, lock);
    goto retry;
  }
  unpin(pins, lock);
  ...
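The packed-counter-plus-flags idea is easier to see in compilable form.
Below is a minimal C++ sketch of the acquisition fast path; the bit layout,
the fast_path_state_t type and the FastPathLock wrapper are illustrative
assumptions made for this sketch, not the actual layout used in sql/mdl.cc:

  #include <atomic>
  #include <cstdint>

  /*
    Hypothetical packed layout, for illustration only: two flag bits above
    the counter bits, so one CAS can test both flags and bump a counter.
  */
  typedef uint64_t fast_path_state_t;

  static const fast_path_state_t IS_DESTROYED=  1ULL << 62;
  static const fast_path_state_t HAS_OBTRUSIVE= 1ULL << 61;

  /* Stand-in for the relevant part of MDL_lock. */
  struct FastPathLock
  {
    std::atomic<fast_path_state_t> m_fast_path_state;
  };

  enum FastPathResult { FP_GRANTED, FP_SLOW_PATH, FP_RETRY };

  FastPathResult try_acquire_fast_path(FastPathLock *lock,
                                       fast_path_state_t unobtrusive_lock_increment)
  {
    fast_path_state_t old_state= lock->m_fast_path_state.load();

    do
    {
      if (old_state & IS_DESTROYED)
        return FP_RETRY;       /* Object is going away; redo hash lookup. */
      if (old_state & HAS_OBTRUSIVE)
        return FP_SLOW_PATH;   /* Obtrusive locks present; take m_rwlock. */
    }
    while (! lock->m_fast_path_state.compare_exchange_weak(
               old_state, old_state + unobtrusive_lock_increment));

    return FP_GRANTED;         /* Counter bumped; no mutex was touched. */
  }

Note that compare_exchange_weak refreshes old_state on failure, so the flag
checks are automatically re-evaluated against the latest value on each
retry.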
The pseudocode for release of a lock which was acquired using the fast path
is transformed similarly:

  old_state= lock->m_fast_path_state;

  /*
    We need to decrement the part of the counter which holds the number of
    acquired "fast path" locks of this type.
  */
  do
  {
    if (((old_state - unobtrusive_lock_increment) == 0) ||
        (old_state & MDL_lock::HAS_OBTRUSIVE))
    {
      /*
        - Either there is a high chance that this is the last ticket for
          the lock,
        - Or there are pending or active obtrusive locks.
        In both these cases we need to acquire MDL_lock::m_rwlock and:
        - Either try to remove the object from the hash and delete it,
        - Or try to reschedule pending locks.
      */
      wrlock(&lock->m_rwlock);
      /* Decrement number of acquired fast path locks of this type. */
      atomic_add(&lock->m_fast_path_state, -unobtrusive_lock_increment);
      if (lock->m_granted.is_empty() && lock->m_waiting.is_empty())
      {
        /*
          The above check corresponds to the part of MDL_lock::is_empty()
          in the old code which can be done without an atomic
          compare-and-swap. This might be the last ticket for the lock, so
          atomically check that there are no "fast path" locks accounted in
          MDL_lock::m_fast_path_state and set the MDL_lock::IS_DESTROYED
          flag. If this step succeeds, remove the object from LF_HASH.
        */
        ...
      }
      else
      {
        /*
          There are lock requests waiting for the ticket being released to
          go away. Since this is a "fast path" ticket, it represents an
          "unobtrusive" type of lock, so if there are any waiters for it
          there should be an "obtrusive" type of request among them.
        */
        if (lock->m_obtrusive_locks_granted_waiting_count)
          lock->reschedule_waiters();
        unlock(&lock->m_rwlock);
      }
      break;
    }
  }
  while (! compare_and_swap(&lock->m_fast_path_state, &old_state,
                            old_state - unobtrusive_lock_increment));

As a result, acquisition of an unobtrusive lock in the absence of obtrusive
locks becomes equivalent to a single CAS, which should be about half the
cost of acquiring the mutex, checking, incrementing and releasing the
mutex. Release of a lock acquired on the fast path (in the absence of
obtrusive locks, and assuming we are not releasing the last lock for this
MDL_lock object) also becomes equivalent to a CAS, which again should be
about half the cost of acquiring the mutex, decrementing, checking and
releasing the mutex.
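The corresponding compilable sketch of the release fast path, reusing the
illustrative FastPathLock layout from the previous example, looks as
follows; the slow-path helper is a hypothetical stub standing in for the
mutex-protected logic shown in the pseudocode above:

  #include <atomic>
  #include <cstdint>

  typedef uint64_t fast_path_state_t;
  static const fast_path_state_t HAS_OBTRUSIVE= 1ULL << 61; /* assumed layout */

  struct FastPathLock
  {
    std::atomic<fast_path_state_t> m_fast_path_state;
  };

  /*
    Hypothetical stub: the real logic takes MDL_lock::m_rwlock, decrements
    the counter atomically and then either tries to destroy the object or
    reschedules waiters.
  */
  static void release_slow_path(FastPathLock *, fast_path_state_t) { }

  void release_fast_path(FastPathLock *lock,
                         fast_path_state_t unobtrusive_lock_increment)
  {
    fast_path_state_t old_state= lock->m_fast_path_state.load();

    do
    {
      if ((old_state - unobtrusive_lock_increment) == 0 ||
          (old_state & HAS_OBTRUSIVE))
      {
        /* Possibly the last ticket, or obtrusive locks exist. */
        release_slow_path(lock, unobtrusive_lock_increment);
        return;
      }
    }
    while (! lock->m_fast_path_state.compare_exchange_weak(
               old_state, old_state - unobtrusive_lock_increment));

    /* Counter decremented with a single CAS; no mutex was taken. */
  }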
Detailed description
====================

What changes are necessary to implement the above idea? Essentially we
perform the following three transformations:

I) MDL_lock::m_fast_path_granted_count is replaced with an atomic
MDL_lock::m_fast_path_state member, which in the ideal case of "fast path"
acquisition/release is checked and changed using CAS without holding any
mutexes.

II) Since we would like to check in the same atomic CAS operation that the
MDL_lock object was not destroyed, its m_is_destroyed member is replaced by
an IS_DESTROYED bit flag in the m_fast_path_state packed counter.

III) Similarly, since we also would like to check in the same atomic CAS
that there are no granted or pending obtrusive locks, we have to add a
HAS_OBTRUSIVE bit flag to m_fast_path_state, while keeping
MDL_lock::m_obtrusive_locks_granted_waiting_count. This flag should be set
when we are about to try acquiring an obtrusive lock and cleared once the
last granted or pending obtrusive lock goes away. Note that this flag
should be set before we call MDL_lock::can_grant_lock() when trying to
acquire an obtrusive lock, as otherwise it would be hard to avoid races due
to concurrent changes to m_fast_path_state happening on the fast path. This
is a subtle but important difference from the current handling of
MDL_lock::m_obtrusive_locks_granted_waiting_count.

Let us take a detailed look at all places which currently use these members
and are going to be affected by these transformations:

1) The increment of m_fast_path_state on the fast path in
try_acquire_lock_impl() should be done as an atomic compare-and-swap
operation (which also checks if the IS_DESTROYED or HAS_OBTRUSIVE flags are
set) without taking MDL_lock::m_rwlock.

2) Reads of m_fast_path_state in MDL_lock::fast_path_granted_bitmap() can
happen in two cases. In the first one we call MDL_lock::can_grant_lock() to
figure out if we can grant an unobtrusive lock. Since unobtrusive locks are
by definition compatible with all fast path locks, we don't really care
about the m_fast_path_state value in this case, so the reads don't have to
be atomic. In the second one we call MDL_lock::can_grant_lock() to figure
out if we can grant an obtrusive lock. But this means that the
HAS_OBTRUSIVE flag is set, so all changes to m_fast_path_state happen under
the protection of MDL_lock::m_rwlock (this is an invariant which we need to
enforce; let us call it [INV1]). Since can_grant_lock() is called only when
MDL_lock::m_rwlock is held, it is safe to do an ordinary read of
m_fast_path_state in this case as well.

3) The decrement of m_fast_path_state in
MDL_context::materialize_fast_path_locks() should be done atomically and
under the protection of MDL_lock::m_rwlock. This is necessary in order to
ensure that the decrement and the addition of the corresponding ticket to
the m_granted list happen atomically, and to enforce [INV1].

4) The increment of m_fast_path_state in MDL_context::clone_ticket() needs
to happen atomically and under the protection of MDL_lock::m_rwlock, at
least when the HAS_OBTRUSIVE flag is set, in order to enforce [INV1].

5) The decrement of m_fast_path_state in MDL_context::upgrade_shared_lock()
needs to happen under the protection of m_rwlock in order to be atomic with
the changes to the ticket lists. Technically this decrement doesn't have to
be atomic, since we always upgrade to obtrusive locks, so the HAS_OBTRUSIVE
flag is set at the point where the decrement happens and thus, thanks to
[INV1], there can be no concurrent changes to m_fast_path_state.

6) The decrement of m_fast_path_state on the fast path in
MDL_context::release_lock() should happen atomically (without the
protection of MDL_lock::m_rwlock). The same compare-and-swap has to check
whether the HAS_OBTRUSIVE flag is set or whether we are about to release
the last fast path ticket. In these cases MDL_lock::m_rwlock has to be
acquired. In the former case this is necessary to make the decrement and
the rescheduling of waiting locks atomic; this also enforces [INV1]. In the
latter case acquiring m_rwlock is necessary as a prerequisite for possible
deletion of the MDL_lock (we need to ensure that only one thread tries to
delete the MDL_lock/set the IS_DESTROYED flag; let us call it [INV2]). This
also allows us to ensure that we can never have an outstanding reference to
an MDL_lock which is not pinned, has an unlocked m_rwlock and a zero
m_fast_path_state (including the IS_DESTROYED bit). Failure to ensure this
opens the door to a race in which the object whose m_fast_path_state was
just decremented to 0 is deleted immediately afterwards, before we even try
to destroy the object ourselves.

7) Since checking that m_fast_path_state is 0 and setting the IS_DESTROYED
flag need to happen atomically, places which use MDL_lock::is_empty() (i.e.
MDL_context::release_lock() and MDL_lock::remove_ticket()) need to be
changed to check directly that m_granted and m_waiting are empty, and then
call MDL_map::remove(), which will try to check that m_fast_path_state is 0
and set IS_DESTROYED in a single atomic compare-and-swap operation. Indeed,
this means that it makes sense to rename MDL_map::remove() to
remove_if_no_fast_path() (a sketch of this destroy step is given right
below).
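To make items 6) and 7) concrete, here is a minimal sketch of the atomic
destroy step which the renamed MDL_map::remove_if_no_fast_path() could
perform. The function name, wrapper type and bit layout are the same
illustrative assumptions as in the earlier sketches:

  #include <atomic>
  #include <cstdint>

  typedef uint64_t fast_path_state_t;
  static const fast_path_state_t IS_DESTROYED= 1ULL << 62; /* assumed layout */

  struct FastPathLock
  {
    std::atomic<fast_path_state_t> m_fast_path_state;
  };

  /*
    Called with MDL_lock::m_rwlock held, so only one thread can attempt
    destruction ([INV2]). Atomically checks that no "fast path" locks are
    accounted in m_fast_path_state (and that no flags are set) and marks
    the object destroyed in the same CAS. Returns true if the caller may
    go on to delete the object from the lock-free hash.
  */
  bool mark_destroyed_if_no_fast_path(FastPathLock *lock)
  {
    fast_path_state_t expected= 0;
    return lock->m_fast_path_state.compare_exchange_strong(expected,
                                                           IS_DESTROYED);
  }

compare_exchange_strong fits here because this is a one-shot attempt rather
than a retry loop; a spurious failure of the weak variant would needlessly
leave an empty object in the hash.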
1') The check of the MDL_lock::m_is_destroyed member which currently
happens in the MDL_map::find_or_insert() method needs to be replaced with a
check of the IS_DESTROYED flag in m_fast_path_state, and moved out of
find_or_insert() into MDL_context::try_acquire_lock_impl(). On the fast
path this needs to happen as part of the atomic compare-and-swap operation
which also increments the packed counter in m_fast_path_state. On the slow
path it can be done after we have acquired MDL_lock::m_rwlock. Thanks to
the fact that we always set IS_DESTROYED with m_rwlock held, we can do this
check using an ordinary non-atomic read (it can return outdated/incorrect
values for other parts of m_fast_path_state, but this is irrelevant in this
case). In both cases we need to carry out the check before unpinning the
MDL_lock object.

2') As mentioned above, MDL_map::remove() needs to be changed to set the
IS_DESTROYED flag as part of an atomic compare-and-swap operation which
also checks that m_fast_path_state is zero, and which happens under the
protection of MDL_lock::m_rwlock.

1'') Code on the fast path in MDL_context::try_acquire_lock_impl() which
checks the MDL_lock::m_obtrusive_locks_granted_waiting_count value needs to
be adjusted to check for the presence of the HAS_OBTRUSIVE flag instead.
This needs to happen as part of the same compare-and-swap construct which
increments the packed counter in m_fast_path_state.

2'') When we try to acquire an obtrusive lock using the slow path in
MDL_context::try_acquire_lock_impl(), we need to atomically set the
HAS_OBTRUSIVE flag in m_fast_path_state before we call the
MDL_lock::can_grant_lock() method (see the sketch after this list). This is
necessary to prevent concurrent fast path acquisitions from invalidating
the results of this method. Consequently it is more convenient to increment
the MDL_lock::m_obtrusive_locks_granted_waiting_count counter in the same
place. This means that we no longer need to increment this counter in the
MDL_context::acquire_lock() method; instead we need to decrement it and
clear the HAS_OBTRUSIVE flag in MDL_context::try_acquire_lock(). This also
means that there is a subtle change in the semantics of
m_obtrusive_locks_granted_waiting_count - it now also accounts for
obtrusive locks we are about to try to acquire. QQ: So perhaps its name
should be changed? OTOH this change in semantics is not strictly necessary,
so maybe we should keep the semantics as they are now?

3'') Accordingly, MDL_lock::remove_ticket() needs to be changed to
atomically clear the HAS_OBTRUSIVE flag when we are removing the last
ticket for an obtrusive lock.

4'') Code in MDL_context::clone_ticket() doesn't require adjustment, as in
this method we neither decrease the number of active or pending obtrusive
locks nor change this number from 0 to a non-0 value.

5'') Code in MDL_context::upgrade_shared_lock() can be kept as is, since we
always upgrade to obtrusive locks and therefore the HAS_OBTRUSIVE flag will
have been set by acquire_lock().

6'') Instead of checking the
MDL_lock::m_obtrusive_locks_granted_waiting_count value on the fast path in
MDL_context::release_lock(), we need to check for the presence of the
HAS_OBTRUSIVE flag, and acquire MDL_lock::m_rwlock to reschedule waiting
tickets and to enforce [INV1] if this flag is set. This check should be
done atomically, by the same compare-and-swap construct which decrements
the packed counter in MDL_lock::m_fast_path_state.

7'') MDL_ticket::downgrade_lock() should be adjusted to clear the
HAS_OBTRUSIVE flag when we have downgraded the ticket for the last granted
or pending obtrusive lock. This needs to be done atomically in order to
avoid lost/partial updates to MDL_lock::m_fast_path_state on 32-bit
platforms.
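As an illustration of 2'') and 3''), the following sketch shows how the
HAS_OBTRUSIVE flag could be set and cleared with atomic read-modify-write
operations. The helper names are hypothetical, both functions are assumed
to run under MDL_lock::m_rwlock, and in the real code this logic would be
folded into the existing acquire/remove paths:

  #include <atomic>
  #include <cstdint>

  typedef uint64_t fast_path_state_t;
  static const fast_path_state_t HAS_OBTRUSIVE= 1ULL << 61; /* assumed layout */

  struct FastPathLock
  {
    std::atomic<fast_path_state_t> m_fast_path_state;
    unsigned m_obtrusive_locks_granted_waiting_count; /* protected by m_rwlock */
  };

  /*
    Called, with m_rwlock held, before can_grant_lock() is evaluated for an
    obtrusive lock: once the flag is set, concurrent fast path acquisitions
    are diverted to the slow path and can no longer invalidate the result
    of can_grant_lock().
  */
  void begin_obtrusive_attempt(FastPathLock *lock)
  {
    if (lock->m_obtrusive_locks_granted_waiting_count++ == 0)
      lock->m_fast_path_state.fetch_or(HAS_OBTRUSIVE);
  }

  /* Called when the last granted or pending obtrusive lock goes away. */
  void end_obtrusive(FastPathLock *lock)
  {
    if (--lock->m_obtrusive_locks_granted_waiting_count == 0)
      lock->m_fast_path_state.fetch_and(~HAS_OBTRUSIVE);
  }

The counter itself stays protected by m_rwlock; only the flag updates are
atomic read-modify-writes, which also avoids lost/partial updates to the
packed state on 32-bit platforms.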
To sum up the above, we use three approaches for handling reads of and
changes to the MDL_lock::m_fast_path_state member:

A) Write and Read-Modify-Write operations are always carried out
atomically; this is necessary, among other things, to avoid lost updates on
32-bit platforms.

B) In some cases Reads can be done non-atomically because we don't really
care about the value they return (for example, when further down the line
there is a compare-and-swap operation which will validate this value and
provide the correct value if the validation fails).

C) In other cases Reads can be done non-atomically since they happen under
the protection of MDL_lock::m_rwlock and there is some invariant which
ensures that concurrent updates of the MDL_lock::m_fast_path_state member
can't happen while MDL_lock::m_rwlock is held.
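For reference, the three approaches map naturally onto C++11 std::atomic
operations. The server has its own atomic primitives, so this mapping is
only a sketch:

  #include <atomic>
  #include <cstdint>

  typedef uint64_t fast_path_state_t;

  std::atomic<fast_path_state_t> m_fast_path_state;

  void access_pattern_examples(fast_path_state_t increment)
  {
    /* A) Writes and Read-Modify-Writes are always atomic. */
    m_fast_path_state.fetch_add(increment);

    /*
      B) An unprotected read whose value is only a starting guess: a stale
      value is harmless because the subsequent compare-and-swap validates
      it, refreshing old_state and retrying if the guess was wrong.
    */
    fast_path_state_t old_state=
      m_fast_path_state.load(std::memory_order_relaxed);
    while (! m_fast_path_state.compare_exchange_weak(
               old_state, old_state + increment))
    { /* old_state was refreshed; the loop re-validates it. */ }

    /*
      C) With m_rwlock held and an invariant such as [INV1] forbidding
      concurrent updates, a plain read returns a stable value.
    */
    fast_path_state_t current=
      m_fast_path_state.load(std::memory_order_relaxed);
    (void) current;
  }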