WL#13574: Include MDL and ACL locks in MTS deadlock detection infra-structure

Affects: Server-8.0   —   Status: Complete

EXECUTIVE SUMMARY

This worklog integrates the thread serialization infra-structure, which a multi-threaded replica needs in order to keep the same commit order as the source, with the MDL and ACL access serialization infra-structures. The integration is needed to mitigate three-actor deadlocks that stem from combining waits on commit order with waits on MDL and/or ACL locks.

With the changes described in this WL, deadlocks that arise from executing client-issued statements involving global locks and/or ACL statements, on a replica that is actively processing a change-stream, should be mitigated and, eventually, broken. A broken deadlock here means that the server eventually reaches a consistent state, or a state from which it can be operated into a consistent state.

Other than retrying to apply the event up to slave_transaction_retries times, in the hope that one of the actors in the deadlock has exited the critical section, there is no deadlock resolution involving replica multi-threaded applier workers and commit order. This work is not intended to change that mechanism; the intent is that the mechanism is enforced in every identified deadlock, without exception.
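The retry behaviour described above can be pictured with a minimal C++ sketch. It is an illustration under stated assumptions, not the server's applier code: ApplyResult, apply_with_retries and the apply_event callback are hypothetical names, max_retries stands in for the value of slave_transaction_retries, and a failed attempt is assumed to have been rolled back before the next one starts.

```cpp
#include <functional>

// Hypothetical outcome of one attempt to apply an event; not a MySQL type.
enum class ApplyResult { Ok, Deadlock };

// Runs `apply_event` once and, after each deadlock, retries it up to
// `max_retries` more times (max_retries stands in for
// slave_transaction_retries). Returns true if some attempt succeeded and
// false if the retry budget was exhausted.
bool apply_with_retries(const std::function<ApplyResult()> &apply_event,
                        unsigned long max_retries) {
  for (unsigned long attempt = 0; attempt <= max_retries; ++attempt) {
    if (apply_event() == ApplyResult::Ok) return true;
    // Deadlock: the attempt is assumed to have been rolled back, so the
    // worker tries again, hoping another actor has left the critical section.
  }
  return false;
}
```

The loop does not inspect who holds which lock. It only gives the other actors time to release their resources, which is why every identified deadlock has to be surfaced to the worker as a retryable failure.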

USER STORIES

As a user, I wish to enable the multi-threaded applier and commit order preservation on a replica and execute any statement while that same replica is applying events, without the server hanging indefinitely on a deadlock.

GLOSSARY

Terms that may need a definition prior to reading further.

  • lock: in a computer system, a lock is a shared resource that manages the access to a critical execution block. Processes or threads can request to acquire a lock and, once the request is granted, safely proceed to executing the critical block.
  • deadlock: in a multi-process or multi-threaded computer system, a deadlock is a no-progress state where two or more processes or threads acquire ownership of a locking mechanism in such a way that they end up waiting on each other, indefinitely.
  • dependency: in a multi-process or multi-threaded computer system, a process or thread P is deemed dependent on a process or thread Q when Q has the ability to unilaterally determine if P can progress (or not) with its execution.
  • graph: a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense related. The objects correspond to abstractions called nodes and each of the related pairs of nodes is called an edge.
  • multi-threaded replica: a replica running mode where the SQL applier is composed of several worker threads, applying the transactions from the change-stream in parallel.
  • worker: one of the threads of a multi-threaded replica that apply, in parallel, the transactions coming in the change-stream.
  • commit order: the order in which transactions are committed on the source. This order is kept and logged so that it may be used to apply transactions in the exact same order on the replicas.
  • dependency tracking: mechanism by which dependencies between transactions are tracked in order to determine whether transactions can be applied in parallel on the replica.
  • MDL: metadata locking management infra-structure and API.
  • ACL: access control list.
  • queue: in a computer system, a queue is a FIFO data-structure.
  • lock-free: in a computer system, a lock-free data structure is one that doesn't need locks in order for multiple processes or threads to execute critical sections of code where such data structures may be accessed or changed, concurrently.
  • integral: a type is integral if it is bool, char, char8_t, char16_t, char32_t, wchar_t, short, int, long, long long, or any implementation-defined extended integer type, including any signed, unsigned, and cv-qualified variants.
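
The glossary terms dependency, graph and deadlock combine naturally into a wait-for graph: each thread is a node, an edge P -> Q means that P depends on Q, and a deadlock corresponds to a cycle. The C++ sketch below illustrates that idea only. WaitForGraph and its member names are hypothetical, and nothing here is taken from the server's MDL or commit-order code.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical wait-for graph: an edge P -> Q means "P depends on Q".
class WaitForGraph {
 public:
  // Records that `waiter` cannot progress until `holder` lets it.
  void add_dependency(const std::string &waiter, const std::string &holder) {
    edges_[waiter].insert(holder);
  }

  // True if a chain of dependencies leads from `start` back to `start`,
  // that is, if `start` takes part in a cycle.
  bool in_deadlock(const std::string &start) const {
    std::set<std::string> visited;
    return reaches(start, start, visited);
  }

 private:
  // Depth-first search for `target` along the outgoing edges of `from`.
  bool reaches(const std::string &from, const std::string &target,
               std::set<std::string> &visited) const {
    auto it = edges_.find(from);
    if (it == edges_.end()) return false;
    for (const std::string &next : it->second) {
      if (next == target) return true;
      if (visited.insert(next).second && reaches(next, target, visited))
        return true;
    }
    return false;
  }

  std::map<std::string, std::set<std::string>> edges_;
};
```

As an illustrative three-actor case, suppose worker_1 waits on a lock held by a client session, the client waits on a lock held by worker_2, and worker_2 waits for worker_1 to commit first. Adding those three edges makes in_deadlock return true for each of the three actors, whereas any two of the edges on their own form no cycle.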