MySQL :: STOP SLAVE Improvements for Multi-Threaded Slaves

When using a multi-threaded slave, the STOP SLAVE command could take a long time to return because the slave waited for workers to catch up processing the queue. This blog post is about improvements that have been made in MySQL 5.6.26 and later to ensure that STOP SLAVE returns quickly when using a multi-threaded slave (MTS).

Background

Multi-threaded slave here refers to inter-transactional parallelization while applying transactions to the slave’s databases. This makes replication more scalable, especially on multi-core architectures. In MTS mode a receiver thread (IO thread) receives events/transactions from the master and queues them in the local relay log; an applier thread (SQL coordinator) reads these events and schedules them to workers for execution. The following image depicts how the worker’s queue looks in the case of two workers.

Stopping workers abruptly (like KILL will do) might generate gaps in the slave execution stream. If workers are at a state as shown in Image1 above and a KILL command is executed to KILL both coordinator and workers, then the transactions which are under execution will be completed and the slave will come to a stop. In this scenario transaction t5 will not be executed and this forms a gap in the MTS execution sequence. Filling of these gaps is efficiently handled by GTIDs. The auto positioning replication protocol will take care of retrieving events from the first transaction that MTS didn’t apply, but it will also retrieve all of the transactions after that, even those that have already been applied. Fortunately, when GTIDs are enabled there is no problem in applying a transaction twice (as the second attempt will not change anything, it will simply be skipped). If GTIDs are not enabled then transactions will be reapplied which will lead to data inconsistencies.

Executing the STOP SLAVE command in MTS always guarantees that the slave stops at a gap-less state. In order to achieve this state prior to MySQL 5.6.26, the STOP SLAVE command will wait untill all workers complete their assigned tasks. Let us consider the Image1 scenario. Executing the STOP SLAVE command in this scenario will result in a state as shown in Image3 below.

The STOP SLAVE command waited until transactions T8 and T9 are executed. The above process ensures a gap-less stopping point, but the time taken for the STOP SLAVE command to complete is very high. In the case of large worker queues the command can take even more time to return.

In MySQL 5.6.26 and later, however, the STOP SLAVE command does not wait for the worker’s entire queue to be consumed, instead it identifies the nearest gap-less state and stops at that point. The following image (Image4) depicts the state of worker’s queue after this improvement–where the STOP SLAVE command did not wait until the completion of transactions T8 and T9.

How the improved STOP SLAVE algorithm works

When the MTS coordinator thread receives a STOP command, it will notify all workers to STOP by setting their current running_status to STOP. Once this STOP command is received, all workers will try to identify a maximum group index which has been already executed (or is being executed). A group index is a counter that is incremented with each new group/transaction being assigned to a worker. If a worker has events which still need to be executed, then the current group index is used, and if the worker has an empty queue then the last executed group’s index is used for calculation of max_group_index.

Once the max_group_index value is determined, then the workers who have a group index value less than or equal to the maximum group index value will apply their groups. Workers who have group index values above the maximum group index value, or if they have an empty queue, will simply exit. Note that each worker will contribute only once to the max_group_index. With this process we can ensure that the slave will stop at the nearest gap-less state. The following example will demonstrate how the improved STOP SLAVE algorithm works.

An example

We have two databases — d1 and d2 — and the following tables in these databases. The transactions and their respective group index values (g_idx) on the slave:

--On connection to master
CREATE DATABASE d1; -- g_idx:1
CREATE DATABASE d2; -- g_idx:2
CREATE TABLE d1.t (a INT PRIMARY KEY, name text) ENGINE=INNODB; --g_idx:3
CREATE TABLE d2.t (a INT PRIMARY KEY, name text) ENGINE=INNODB; --g_idx:4
--sync slave with master here

--On connection to master

CREATE DATABASE d1; -- g_idx:1

CREATE DATABASE d2; -- g_idx:2

CREATE TABLE d1.t (a INT PRIMARY KEY, name text) ENGINE=INNODB; --g_idx:3

CREATE TABLE d2.t (a INT PRIMARY KEY, name text) ENGINE=INNODB; --g_idx:4

--sync slave with master here

On the slave we begin a transaction and directly insert two rows:

--On connection to slave
BEGIN;
INSERT INTO d2.t VALUES (2, 'Slave local'); # To hold T3 which will come from master
INSERT INTO d1.t VALUES (3, 'Slave local'); # To hold T6 which will come from master

--On connection to slave

BEGIN;

INSERT INTO d2.t VALUES (2, 'Slave local'); # To hold T3 which will come from master

INSERT INTO d1.t VALUES (3, 'Slave local'); # To hold T6 which will come from master

On the master we execute the following transactions, which then get replicated to the slave. The slave will then block transactions T3 and T6 because we have a primary key constraint on d1.t and d2.t.

--On connection to master
INSERT INTO d1.t VALUES (1, 'T1'); g_idx:5
INSERT INTO d2.t VALUES (1, 'T2'); g_idx:6
INSERT INTO d2.t VALUES (2, 'T3'); g_idx:7 # Will be blocked on Slave
INSERT INTO d2.t VALUES (3, 'T4'); g_idx:8
INSERT INTO d1.t VALUES (2, 'T5'); g_idx:9
INSERT INTO d1.t VALUES (3, 'T6'); g_idx:10 # Will be blocked on Slave #GAP LESS STATE --#1
INSERT INTO d2.t VALUES (4, 'T7'); g_idx:11 # This should not be executed after STOP SLAVE
INSERT INTO d2.t VALUES (5, 'T8'); g_idx:12 # This should not be executed after STOP SLAVE
INSERT INTO d1.t VALUES (4, 'T9'); g_idx:13 # This should not be executed after STOP SLAVE #GAP LESS STATE--#2

--On connection 1 to slave
--send STOP SLAVE, which will wait for workers to exit

--On connection 2 to slave
ROLLBACK #The above command will release the temporary hold on T3 and T6

--On connection to master

INSERT INTO d1.t VALUES (1, 'T1'); g_idx:5

INSERT INTO d2.t VALUES (1, 'T2'); g_idx:6

INSERT INTO d2.t VALUES (2, 'T3'); g_idx:7 # Will be blocked on Slave

INSERT INTO d2.t VALUES (3, 'T4'); g_idx:8

INSERT INTO d1.t VALUES (2, 'T5'); g_idx:9

INSERT INTO d1.t VALUES (3, 'T6'); g_idx:10 # Will be blocked on Slave #GAP LESS STATE --#1

INSERT INTO d2.t VALUES (4, 'T7'); g_idx:11 # This should not be executed after STOP SLAVE

INSERT INTO d2.t VALUES (5, 'T8'); g_idx:12 # This should not be executed after STOP SLAVE

INSERT INTO d1.t VALUES (4, 'T9'); g_idx:13 # This should not be executed after STOP SLAVE #GAP LESS STATE--#2

--On connection 1 to slave

--send STOP SLAVE, which will wait for workers to exit

--On connection 2 to slave

ROLLBACK #The above command will release the temporary hold on T3 and T6

Now let’s assume that we have executed the above sequence of operations on the master and a multi-threaded slave. We’ll then wait for some time, so that the coordinator on the slave populates all of the above transactions into its worker’s queues. Now we’ll execute the STOP SLAVE command and thus release the hold on T3 and T6. Workers W1 and W2 will proceed and participate in the calculation of the max_group_index value. W1 will provide 10 and W2 will provide 7 as W1 is currently executing transaction T6 (g_idx:10) and W2 is executing T3 (g_idx:7). From these values max_group_index is identified and is set to ’10’. Then the slave stops at the first gap-less state (group index 10) with this improved algorithm.

If we allow the workers to continue and participate once again, in the case where W2 is busy, if W1 after completing T6 picks T9 and contributes once again to max_group_index then the gap-less window will be extended up to T9. Hence the slave will not be able to guarantee that it stopped at the nearest gap-less point. This will once again make the STOP SLAVE command slow. So each worker should participate only once during the calculation of max_updated_index. The new algorithm ensures that the slave will stop at #1.

The above changes were done as part of Bug#75525 in MySQL 5.6.26.

A quick demonstration

Enable MTS with slave_parallel_workers=2.
Step 1: On the Master – Create two databases and two tables.

--On connection to master
mysql> CREATE DATABASE d1; CREATE TABLE d1.a(i int) ENGINE=INNODB;
Query OK, 1 row affected (0.00 sec)

Query OK, 0 rows affected (0.08 sec)

mysql> CREATE DATABASE d2; CREATE TABLE d2.a(i int) ENGINE=INNODB;
Query OK, 1 row affected (0.00 sec)

Query OK, 0 rows affected (0.09 sec)

--On connection to master

mysql> CREATE DATABASE d1; CREATE TABLE d1.a(i int) ENGINE=INNODB;

Query OK, 1 row affected (0.00 sec)

Query OK, 0 rows affected (0.08 sec)

mysql> CREATE DATABASE d2; CREATE TABLE d2.a(i int) ENGINE=INNODB;

Query OK, 1 row affected (0.00 sec)

Query OK, 0 rows affected (0.09 sec)

Step 2: On the Slave – Create a trigger with a sleep to introduce some delay on the slave for each INSERT.

--On connection to slave
mysql> CREATE TRIGGER d1.iai AFTER INSERT ON d1.a FOR EACH ROW DO SLEEP(1);
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TRIGGER d2.iai AFTER INSERT ON d2.a FOR EACH ROW DO SLEEP(1);
Query OK, 0 rows affected (0.02 sec)

--On connection to slave

mysql> CREATE TRIGGER d1.iai AFTER INSERT ON d1.a FOR EACH ROW DO SLEEP(1);

Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TRIGGER d2.iai AFTER INSERT ON d2.a FOR EACH ROW DO SLEEP(1);

Query OK, 0 rows affected (0.02 sec)

Step 3: On the Master: Execute a small procedure which INSERTs 1000 tuples in two databases: d1 and d2.

--On connection to master
mysql> USE d1;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>DELIMITER //
mysql>CREATE PROCEDURE simple_insert()
    -> BEGIN
    -> DECLARE id INT;
    -> SET id = 1;
    -> WHILE id  INSERT INTO d1.a (i) VALUES (id);
    -> INSERT INTO d2.a (i) VALUES (id);
    -> SET id =  id + 1;
    -> END WHILE;
    -> END//
Query OK, 0 rows affected (0.00 sec)

mysql> DELIMITER ;
mysql> CALL simple_insert();
Query OK, 1 row affected (11.68 sec)

--On connection to master

mysql> USE d1;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

Database changed

mysql>DELIMITER //

mysql>CREATE PROCEDURE simple_insert()

-> BEGIN

-> DECLARE id INT;

-> SET id = 1;

-> WHILE id INSERT INTO d1.a (i) VALUES (id);

-> INSERT INTO d2.a (i) VALUES (id);

-> SET id = id + 1;

-> END WHILE;

-> END//

Query OK, 0 rows affected (0.00 sec)

mysql> DELIMITER ;

mysql> CALL simple_insert();

Query OK, 1 row affected (11.68 sec)

Step 4: On the Slave – Immediately after the completion of step #3 on the master, execute STOP SLAVE on the Slave.

Before the fix:

--On connection to slave
mysql> STOP SLAVE;
Query OK, 0 rows affected (4 min 18.66 sec)

--On connection to slave

mysql> STOP SLAVE;

Query OK, 0 rows affected (4 min 18.66 sec)

After the fix:

--On connection to slave
mysql> STOP SLAVE;
Query OK, 0 rows affected (0.77 sec)

--On connection to slave

mysql> STOP SLAVE;

Query OK, 0 rows affected (0.77 sec)

Summary

Prior to the fix for Bug#75525, issuing a STOP SLAVE command on a multi-threaded slave would always wait for all workers to finish their queues and then STOP. This meant the STOP SLAVE command could take a long time to complete. Now after the Bug#75525 fix in MySQL 5.6.26, the STOP SLAVE command will identify the nearest gap-less state and it will stop at that point. Please try it out and let us know your feedback!