MySQL 8.0.40
Source Code Documentation
rpl_replica_commit_order_manager.h
1/* Copyright (c) 2014, 2024, Oracle and/or its affiliates.
2
3 This program is free software; you can redistribute it and/or modify
4 it under the terms of the GNU General Public License, version 2.0,
5 as published by the Free Software Foundation.
6
7 This program is designed to work with certain software (including
8 but not limited to OpenSSL) that is licensed under separate terms,
9 as designated in a particular file or component or in included license
10 documentation. The authors of MySQL hereby grant you an additional
11 permission to link the program and your derivative works with the
12 separately licensed software that they have either included with
13 the program or referenced in the documentation.
14
15 This program is distributed in the hope that it will be useful,
16 but WITHOUT ANY WARRANTY; without even the implied warranty of
17 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18 GNU General Public License, version 2.0, for more details.
19
20 You should have received a copy of the GNU General Public License
21 along with this program; if not, write to the Free Software
22 Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
23
24#ifndef RPL_REPLICA_COMMIT_ORDER_MANAGER
25#define RPL_REPLICA_COMMIT_ORDER_MANAGER
26#include <stddef.h>
27#include <memory>
28#include <vector>
29
30#include "my_dbug.h"
31#include "my_inttypes.h"
34#include "sql/changestreams/apply/commit_order_queue.h" // Commit_order_queue
35#include "sql/rpl_rli_pdb.h" // get_thd_worker
36
37class THD;
38class Commit_order_lock_graph;
39
40/**
41 On a replica and only on a replica, this class is responsible for
42 committing the applied transactions in the same order as was observed on
43 the source.
44
45 The key components of the commit order management are:
46 - This class, that wraps the commit order management, allowing for API
47 clients to schedule workers for committing, make workers wait for their
48 turn to commit, finish up a scheduled worker task and allow for others
49 to progress.
50 - A commit order queue of type `cs::apply::Commit_order_queue` that holds
51 the sequence by which worker threads should commit and the committing
52 order state for each of the scheduled workers.
53 - The MDL infrastructure, which allows one worker to wait for another
54 to finish when transactions need to be committed in order, and allows
55 detecting deadlocks involving workers waiting on each other for their
56 turn to commit and non-worker threads waiting on metadata locks held by
57 worker threads.
58
59 The worker thread progress stages relevant to the commit order management
60 are:
61 - REGISTERED: the worker thread has been added to the commit order queue
62 by the coordinator and is allowed to start applying the transaction.
63 - FINISHED APPLYING: the worker thread just finished applying the
64 transaction and checks if it needs to wait for a preceding worker to
65 finish committing.
66 - REQUESTED GRANT: the worker thread waits on the MDL graph for the
67 preceding worker to finish committing.
68 - WAITED: the worker thread finished waiting (it either is the first in
69 the commit order queue or has just been granted permission to continue).
70 - RELEASE NEXT: the worker thread removes itself from the commit order
71 queue, checks if there is any worker waiting on the commit order and
72 releases that worker if and only if it is the preceding worker for the
73 waiting worker.
74 - FINISHED: the worker marks itself as available to take on another
75 transaction to apply.
76
77 The progress of the worker within the stages:
78
79                    +-------------------------+
80                    |                         |
81                    v                         |
82              [REGISTERED]                    |
83                    |                         |
84                    v                         |
85           [FINISHED APPLYING]                |
86                    |                         |
87                Worker is                     |
88           first in the queue?                |
89                  /   \                       |
90             yes /     \ no                   |
91                /       v                     |
92                \  [REQUESTED GRANT]          |
93                 \     /                      |
94                  \   /                       |
95                   \ /                        |
96                    |                         |
97                    v                         |
98                [WAITED]                      |
99                    |                         |
100                   v                         |
101            [RELEASE NEXT]                   |
102                   |                         |
103                   v                         |
104              [FINISHED]                     |
105                   |                         |
106                   +-------------------------+
107
108 Lock-free structures and atomic access to variables are used to manage
109 the commit order queue and to keep the worker stage transitions. This
110 means that there is no atomicity with regard to changes performed in the
111 queue or in the MDL graph within a given stage. Hence, stages may be
112 skipped and sequentially scheduled worker threads may overlap in the
113 same stage.
114
115 In the context of the following tables, let W1 be a worker that is
116 scheduled to commit before some other worker W2.
117
118 The behavior of W2 (rows) towards W1 (columns) with regard to
119 thread synchronization, based on the stage of each thread:
120+------------+-----------------------------------------------------------------+
121| \ W1 | REGISTERED | FINISHED | REQUESTED | WAITED | RELEASE | FINISHED |
122| W2 \ | | APPLYING | GRANT | | NEXT | |
123+------------+------------+----------+-----------+--------+---------+----------+
124| REGISTERED | | | | | | |
125+------------+------------+----------+-----------+--------+---------+----------+
126| FIN. APPL. | | | | | | |
127+------------+------------+----------+-----------+--------+---------+----------+
128| REQ. GRANT | WAIT | WAIT | WAIT | WAIT | WAIT | |
129+------------+------------+----------+-----------+--------+---------+----------+
130| WAITED | | | | | | |
131+------------+------------+----------+-----------+--------+---------+----------+
132| REL. NEXT | | | | | WAIT | |
133+------------+------------+----------+-----------+--------+---------+----------+
134| FINISHED | | | | | | |
135+------------------------------------------------------------------------------+
136
137 The W2 wait when both worker threads are in the RELEASE NEXT stage
138 happens in the case W2 never entered the REQUESTED GRANT stage. This case
139 may happen if W1 being in RELEASE NEXT removes itself from the queue
140 before W2 enters FINISHED APPLYING and then W2 reaches the RELEASE NEXT
141 stage before W1 exits it:
142
143         [W1]                              [W2]
144
145  stage = RELEASE NEXT             stage = REGISTERED
146           |                                |
147           v                                |
148      queue.pop()                           v
149           |                    stage = FINISHED_APPLYING
150           |                                |
151           v                                v
152   next_worker.stage               queue.front() == W2
153  == FINISHED_APPLYING                      |
154           |                                |
155           |                                v
156           |                         stage = WAITED
157           |                                |
158           |                                v
159           |                      stage = RELEASE NEXT
160           |                                |
161           v                                v
162 next_worker.release()                 queue.pop()
163
164 The commit order queue includes mechanisms that block the popping until
165 the preceding worker finishes the releasing operation. This wait will
166 only be active for the time it takes W1 to change the values of the MDL
167 graph structures needed to release W2, which is a very small number of
168 cycles.
169
170 The behavior of W1 (rows) towards W2 (columns) with regard to thread
171 synchronization, based on the stage of each thread:
172+------------+-----------------------------------------------------------------+
173| \ W2 | REGISTERED | FINISHED | REQUESTED | WAITED | RELEASE | FINISHED |
174| W1 \ | | APPLYING | GRANT | | NEXT | |
175+------------+------------+----------+-----------+--------+---------+----------+
176| REGISTERED | | | | | | |
177+------------+------------+----------+-----------+--------+---------+----------+
178| FIN. APPL. | | | | | | |
179+------------+------------+----------+-----------+--------+---------+----------+
180| REQ. GRANT | | | | | | |
181+------------+------------+----------+-----------+--------+---------+----------+
182| WAITED | | | | | | |
183+------------+------------+----------+-----------+--------+---------+----------+
184| REL. NEXT | | GRANT | GRANT | | | |
185+------------+------------+----------+-----------+--------+---------+----------+
186| FINISHED | | | | | | |
187+------------------------------------------------------------------------------+
188
189 The W1 grant to W2 may happen when W2 is either in the FINISHED APPLYING
190 or REQUESTED GRANT stages. W1 must also signal the grant when W2 is in
191 FINISHED APPLYING because W1 has no way to determine if W2 has already
192 evaluated the first element of the queue or not, that is, W1 can't
193 determine if W2 will proceed to the REQUESTED GRANT or to the WAITED
194 stage. Therefore, W1 will signal in both cases.
195
196 */
197class Commit_order_manager {
198 public:
199 Commit_order_manager(uint32 worker_numbers);
200 // Copy logic is not available
201 Commit_order_manager(const Commit_order_manager &) = delete;
203
204 // Copy logic is not available
205 Commit_order_manager &operator=(const Commit_order_manager &) = delete;
206
207 /**
208 Initializes the MDL context for a given worker in the commit order queue.
209
210 @param worker The worker to initialize the context for
211 */
212 void init_worker_context(Slave_worker &worker);
213
214 /**
215 Register the worker into commit order queue when coordinator dispatches a
216 transaction to the worker.
217
218 @param[in] worker The worker which the transaction will be dispatched to.
219 */
220 void register_trx(Slave_worker *worker);
221
222 private:
223 /**
224 Determines if the worker passed as a parameter must wait on the MDL graph
225 for other workers to commit and, if it must, will wait for its turn to
226 commit.
227
228 @param worker The worker to determine the commit waiting status for.
229
230 @return false if the worker is ready to commit, true if not.
231 */
232 bool wait_on_graph(Slave_worker *worker);
233 /**
234 Wait for its turn to commit or unregister.
235
236 @param[in] worker The worker which is executing the transaction.
237
238 @retval false All previous transactions succeeded, so this transaction
239 can go ahead and commit.
240 @retval true One or more previous transactions rolled back, so this
241 transaction should roll back.
242 */
243 bool wait(Slave_worker *worker);
244
245 /**
246 Unregister the thread from the commit order queue and signal
247 the next thread to awake.
248
249 @param[in] worker The worker which is executing the transaction.
250 */
251 void finish_one(Slave_worker *worker);
252
253 /**
254 Unregister the transaction from the commit order queue and signal the next
255 one to go ahead.
256
257 @param[in] worker The worker which is executing the transaction.
258 */
259 void finish(Slave_worker *worker);
260
261 /**
262 Reset server_status value of the commit group.
263
264 @param[in] first_thd The first thread of the commit group that needs
265 server_status to be updated.
266 */
267 void reset_server_status(THD *first_thd);
268
269 /**
270 Get rollback status.
271
272 @retval true Transactions in the queue should rollback.
273 @retval false Transactions in the queue shouldn't rollback.
274 */
275 bool get_rollback_status();
276
277 /**
278 Set rollback status to true.
279 */
280 void set_rollback_status();
281
282 /**
283 Reset rollback status to false.
284 */
285 void unset_rollback_status();
286
287 void report_deadlock(Slave_worker *worker);
288
289 std::atomic<bool> m_rollback_trx;
290
291 /* Stores the commit order information of all workers. */
292 cs::apply::Commit_order_queue m_workers;
293
294 /**
295 Flush record of transactions for all the waiting threads and then
296 awake them from their wait. It also calls gtid_state->update_commit_group()
297 which updates both the THD and the Gtid_state for whole commit group to
298 reflect that the set of transactions has ended.
299
300 @param[in] worker The worker which is executing the transaction.
301 */
302 void flush_engine_and_signal_threads(Slave_worker *worker);
303
304 public:
305 /**
306 Determines if the worker holding the commit order wait ticket
307 `wait_for_commit` is in deadlock with the MDL context encapsulated in
308 the visitor parameter.
309
310 @param wait_for_commit The wait ticket being held by the worker thread.
311 @param gvisitor The MDL graph visitor to check for deadlocks against.
312
313 @return true if a deadlock has been found and false otherwise.
314 */
315 bool visit_lock_graph(Commit_order_lock_graph &wait_for_commit,
316 MDL_wait_for_graph_visitor &gvisitor);
317
318 /**
319 Check if order commit deadlock happens.
320
321   Worker1(trx1)                     Worker2(trx2)
322   =============                     =============
323   ...                               ...
324   Engine acquires lock A
325   ...                               Engine acquires lock A (waiting for
326                                     trx1 to release it).
327   COMMIT (waiting for
328   trx2 to commit first).
329
330 Currently, two corner cases can cause the deadlock.
331 - Case 1
332 CREATE TABLE t1(c1 INT PRIMARY KEY, c2 INT, INDEX(c2)) ENGINE = InnoDB;
333 INSERT INTO t1 VALUES(1, NULL),(2, 2), (3, NULL), (4, 4), (5, NULL), (6,
334 6)
335
336 INSERT INTO t1 VALUES(7, NULL);
337 DELETE FROM t1 WHERE c2 <= 3;
338
339 - Case 2
340 ANALYZE TABLE t1;
341 INSERT INTO t2 SELECT * FROM mysql.innodb_table_stats
342
343 Since this is not a real lock deadlock, it cannot be handled by the engine.
344 The replica needs to handle it separately.
345   Worker1(trx1)                     Worker2(trx2)
346   =============                     =============
347   ...                               ...
348   Engine acquires lock A
349   ...                               Engine acquires lock A.
350                                     1. found trx1 is holding the lock.
351                                     2. report the lock wait to server code by
352                                        calling thd_report_row_lock_wait().
353                                        Then this function is called to check
354                                        if it causes an order commit deadlock.
355                                        Report the deadlock to worker1.
356                                     3. waiting for trx1 to release it.
357   COMMIT (waiting for
358   trx2 to commit first).
359   Found the deadlock flag set
360   by worker2 and then
361   return with ER_LOCK_DEADLOCK.
362
363   Rollback the transaction
364                                     Get lock A and go ahead.
365                                     ...
366   Retry the transaction
367
368 To conclude, a transaction A that is waiting for transaction B to commit
369 while holding a lock required by transaction B will be rolled back and
370 retried later.
371
372 @param[in] thd_self The THD object of the current session, which is
373 acquiring a lock held by another session.
374 @param[in] thd_wait_for The THD object of a session which is holding
375 a lock being acquired by current session.
376 */
377 static void check_and_report_deadlock(THD *thd_self, THD *thd_wait_for);
378
379 /**
380 Wait for its turn to commit or unregister.
381
382 @param[in] thd The THD object of current thread.
383
384 @retval false All previous transactions succeeded, so this transaction can
385 go ahead and commit.
386 @retval true The transaction is marked to rollback.
387 */
388 static bool wait(THD *thd);
389
390 /**
391 Wait for its turn to unregister and signal the next one to go ahead. If an
392 error happens while processing the transaction, notify the following
393 transaction to roll back.
394
395 @param[in] thd The THD object of current thread.
396 @param[in] error True if the transaction execution failed.
397 */
398 static void wait_and_finish(THD *thd, bool error);
399
400 /**
401 Get transaction rollback status.
402
403 @param[in] thd The THD object of current thread.
404
405 @retval true Current transaction should rollback.
406 @retval false Current transaction shouldn't rollback.
407 */
408 static bool get_rollback_status(THD *thd);
409
410 /**
411 Unregister the thread from the commit order queue and signal
412 the next thread to awake.
413
414 @param[in] thd The THD object of current thread.
415 */
416 static void finish_one(THD *thd);
417
418 /**
419 Determines whether current thread needs to wait for its turn to commit and
420 unregister from the commit order queue. As an exception, ALTER TABLE, DROP
421 TABLE, DROP DB, OPTIMIZE TABLE, ANALYZE TABLE and REPAIR TABLE are allowed
422 to do so in MYSQL_BIN_LOG::ordered_commit(): these transactions have
423 multiple commits, so it cannot be determined whether a given call ends
424 the transaction.
425
426 @param[in] thd The THD object of current thread.
427
428 @retval true Allow thread to wait for its turn
429 @retval false Do not allow thread to wait for its turn
430 */
431 static bool wait_for_its_turn_before_flush_stage(THD *thd);
432};
433
434/**
435 MDL subgraph inspector class to be used as a ticket to wait on by worker
436 threads. Each worker will create its own instance of this class and will use
437 its own THD MDL_context to search for deadlocks.
438 */
439class Commit_order_lock_graph : public MDL_wait_for_subgraph {
440 public:
441 /**
442 Constructor for the class.
443
444 @param ctx The worker THD MDL context object.
445 @param mngr The Commit_order_manager instance associated with the current
446 channel's Relay_log_info object.
447 @param worker_id The identifier of the worker targeted by this object.
448
449 */
450 Commit_order_lock_graph(MDL_context &ctx, Commit_order_manager &mngr,
451 uint32 worker_id);
452 /**
453 Default destructor.
454 */
455 virtual ~Commit_order_lock_graph() override = default;
456
457 /**
458 Retrieves the MDL context object associated with the underlying worker.
459
460 @return A pointer to the MDL context associated with the underlying worker
461 thread.
462 */
463 MDL_context *get_ctx() const;
464 /**
465 Retrieves the identifier for the underlying worker thread.
466
467 @return The identifier for the underlying worker thread.
468 */
469 uint32 get_worker_id() const;
470 /**
471 Determines if the underlying worker is in deadlock with the MDL context
472 encapsulated in the visitor parameter.
473
474 @param dvisitor The MDL graph visitor to check for deadlocks against.
475
476 @return true if a deadlock was found and false otherwise.
477 */
478 bool accept_visitor(MDL_wait_for_graph_visitor *dvisitor) override;
479 /**
480 Retrieves the deadlock weight to be used to replace a visitor victim's, when
481 more than one deadlock is found.
482 */
483 uint get_deadlock_weight() const override;
484
485 private:
486 /** The MDL context object associated with the underlying worker. */
487 MDL_context &m_ctx;
488 /**
489 The Commit_order_manager instance associated with the underlying worker
490 channel's Relay_log_info object.
491 */
492 Commit_order_manager &m_mngr;
493 /** The identifier for the underlying worker thread. */
494 uint32 m_worker_id;
495};
496
497/**
498 Determines whether current thread shall run the procedure here
499 to check whether it waits for its turn (and when its turn comes
500 unregister from the commit order queue).
501
502 The sql commands ALTER TABLE, ANALYZE TABLE, DROP DB, DROP EVENT,
503 DROP FUNCTION, DROP PROCEDURE, DROP TRIGGER, DROP TABLE, DROP VIEW,
504 OPTIMIZE TABLE and REPAIR TABLE shall run this procedure here, as
505 an exception, because these transactions have multiple intermediate
506 commits. Therefore, it cannot be predetermined when the last commit is
507 done.
508
509 @param[in] thd The THD object of current thread.
510
511 @retval false Commit_order_manager object is not initialized
512 @retval true Commit_order_manager object is initialized
513*/
514bool has_commit_order_manager(const THD *thd);
515
516#endif /*RPL_REPLICA_COMMIT_ORDER_MANAGER*/
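
The ordering contract described in the Commit_order_manager class comment (register workers in dispatch order, let each worker wait until it is first in the queue, then release the next worker) can be illustrated with a minimal, self-contained sketch. This is not the MySQL implementation: `Toy_commit_order` and `run_toy_commit_order` are hypothetical names, and a plain mutex with a condition variable is swapped in for the lock-free `cs::apply::Commit_order_queue` and the MDL wait tickets.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy model of commit ordering; NOT the MySQL class.
class Toy_commit_order {
 public:
  // Coordinator side: record the commit position of a worker.
  void register_trx(unsigned worker_id) {
    std::lock_guard<std::mutex> guard(m_lock);
    m_queue.push_back(worker_id);
  }

  // Worker side: block until this worker is at the head of the queue.
  void wait(unsigned worker_id) {
    std::unique_lock<std::mutex> guard(m_lock);
    m_cond.wait(guard, [&] {
      return !m_queue.empty() && m_queue.front() == worker_id;
    });
  }

  // Worker side: leave the queue and wake up the workers still waiting.
  void finish(unsigned worker_id) {
    {
      std::lock_guard<std::mutex> guard(m_lock);
      assert(!m_queue.empty() && m_queue.front() == worker_id);
      m_queue.pop_front();
    }
    m_cond.notify_all();
  }

 private:
  std::mutex m_lock;
  std::condition_variable m_cond;
  std::deque<unsigned> m_queue;
};

// Registers n_workers transactions, starts the workers in reverse order and
// returns the order in which they "committed".
std::vector<unsigned> run_toy_commit_order(unsigned n_workers) {
  Toy_commit_order manager;
  std::vector<unsigned> commit_log;
  for (unsigned id = 0; id < n_workers; ++id) manager.register_trx(id);

  std::vector<std::thread> workers;
  for (unsigned i = n_workers; i > 0; --i) {
    const unsigned id = i - 1;
    workers.emplace_back([&manager, &commit_log, id] {
      manager.wait(id);          // wait for the turn to commit
      commit_log.push_back(id);  // only the queue head reaches this point
      manager.finish(id);        // release the next worker
    });
  }
  for (auto &worker : workers) worker.join();
  return commit_log;
}
```

Calling `run_toy_commit_order(4)` starts the workers in reverse order, yet the returned log lists them in registration order, because a worker only appends to the log while it is at the head of the queue.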
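
The rule stated in the check_and_report_deadlock comment (a transaction that holds a lock needed by a transaction that must commit earlier is the one flagged for rollback) can be sketched as a comparison of commit positions. This is an illustration under stated assumptions, not the actual function: `Toy_worker`, `commit_position` and `toy_check_and_report_deadlock` are hypothetical names, and the real code works on THD objects and reports through the manager.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a worker and its position in the commit order.
struct Toy_worker {
  std::uint64_t commit_position = 0;  // smaller value commits first
  bool deadlock_reported = false;     // set when the worker must roll back
};

// `self` is trying to acquire a lock held by `wait_for`. If `self` must
// commit before `wait_for`, then `wait_for` would later wait for `self` to
// commit while `self` waits for the lock: a commit order deadlock. In that
// case the lock holder is flagged so that it rolls back and retries.
inline bool toy_check_and_report_deadlock(const Toy_worker &self,
                                          Toy_worker &wait_for) {
  if (self.commit_position < wait_for.commit_position) {
    wait_for.deadlock_reported = true;
    return true;
  }
  return false;
}
```

With trx1 at commit position 2 holding the lock and trx2 at position 1 requesting it, the call `toy_check_and_report_deadlock(trx2, trx1)` flags trx1; the reverse situation is an ordinary lock wait and flags nothing.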