WL#10793: InnoDB: Use CATS for scheduling lock release under high load
Affects: Server-8.0 — Status: Complete
Contention-Aware Transaction Scheduling (CATS) is described in https://arxiv.org/pdf/1602.01871.pdf. The patch was contributed by the researchers and Dimitri's tests confirm the gains claimed by the researchers. See BUG#84266 for the contribution.
NF1 - Should not regress for any load. NF2 - Improve QPS (Queries per second) for Pareto RW load
CATS helps in reducing the lock sys wait mutex contention by granting locks to transactions that have a higher wait in the dependency graph. Keeps track of how many transactions are waiting for locks that are already acquired by a transaction and, recursively, how many transaction are waiting for those waiting transactions in the wait for graph. The waits-for-edge is "weighted" and this weight is used to order the transactions when scheduling the lock release. The weight is a cumulative weight of the dependencies.
When using CATS, split the transactions that are waiting on a lock into two groups, one that is granted immediately and the second into those that may have to wait. Preserve the sequence number in the lock wait queue (the ordinal value). Order the transactions in the not granted list using the following rule: When comparing the weight/priority of two transactions waiting on a lock. LHS -> Left Hand Side and RHS -> Right Hand Side. Check if LHS has higher priority than RHS. 1. If neither of them is wait lock, the LHS one has higher priority. 2. If only one of them is a wait lock, it has lower priority. 3. If both are high priority transactions, the one with a lower sequence number number has higher priority. 4. High priority transaction has higher priority. 5. Otherwise, the one with an older transaction has higher priority. Iterate over the sorted waiting list and check if the transactions in the not yet granted queue/list have to wait for locks granted immediately above. If they don't have to wait then grant them the locks and add them to the granted queue. Track the transactions that were granted in the second pass above. We need to adjust the weight in the dependency graph later. For the transactions that were not granted the lock in the second pass we need to increase their weight/priority and that of the transactions that are in turn waiting for them. To avoid traversing the same sub-trees over and over again during the weight/priority update in the dependency graph we use a "visited" marker in trx_t. trx_t::age_updated The weight/priority is named trx_t::age. The global counter against which update requests are checked is in lock_sys_t::mark_age_updated. The global counter is incremented before the update and when trx_t::age is updated we set the trx_t::age_updated to trx_t::mark_age_updated. Lock scheduling switches to CATS when there are >= 32 threads suspended in the lock wait queue.
Copyright (c) 2000, 2023, Oracle Corporation and/or its affiliates. All rights reserved.