WL#344: Time-delayed Replication
Affects: Server-5.6
—
Status: Complete
Pushed to next-mr on 2010-06-26. Summary ======= Make replication slave to lag a specified amount of time behind the master. Purposes and use cases ====================== This feature can be used for several purposes: (P1) Protect against user mistakes on master. A DBA that makes a disaster on a master that has a delayed slave can roll back to the time just before the disaster. Example: assume there is one master and one slave, the slave is running with a delay, and the DBA accidentally drops all tables on the master. The mistake can be undone with minimal downtime, as follows: slave> STOP SLAVE Replace the master by the slave and shut down the old master. slave> CHANGE MASTER TO MASTER_DELAY = 0 slave> START SLAVE SQL_THREAD UNTILVariations of the above scheme can be used depending on the precise setup and the application's requirements. Cf. BUG#21639, BUG#22072 (P2) Test how the system behaves when there is a lag. For example, in a real application a lag might be caused by a big load on the slave. However, it can be difficult to generate a big load. This feature comes in handy to simulate the lag without having to simulate the load. (P3) Debug conditions related to a lagging slave. For example, in BUG#28760, the reporter had problems which they suspected were caused by a network lag. Debugging these problems was greatly simplified using an early version of this worklog. (P4) Inspect what the database looked like long ago, without having to reload a backup. For instance, if the delay is one week, and the DBA did development for a couple of days, and the DBA needs to go back and see what the database used to look like before the development, then the delayed slave can be inspected. Cf. BUG#22072 ("how to repeat")
Requirements ============ CHANGE MASTER: new option to set the delay ------------------------------------------ Introduce the following new syntax in CHANGE MASTER: CHANGE MASTER TO MASTER_DELAY =; is of type INTEGER and specifies the number of seconds to delay. If a non-integer value is specified, or if a negative value is specified, then an error shall be generated and the statement shall not execute. Semantics: how the option affects the slave threads --------------------------------------------------- The slave IO thread is not affected by the delay. The master is not affect by the delay either. After the slave SQL thread has read an event and before it has executed the event, the slave SQL thread shall read the event's timestamp (see "Chained replication and time differences" below for a precise definition of what timestamp to use). If the event's timestamp is greater than the current time minus the delay, then the SQL thread sleeps until the delay is reached before executing the event. Exception: the SQL thread shall not wait for the following event types: START_EVENT_V3 ROTATE_EVENT FORMAT_DESCRIPTION_EVENT There are two reasons why we should not wait for these events: (1) These events don't affect anything else than the internal state of the SQL thread; (2) ROTATE_EVENT sometimes has wrong timestamp. If the slave SQL thread is stopped or killed while it is sleeping, or if the slave server is killed, then: - the sleep shall be interrupted immediately (it shall not finish sleeping) - the slave SQL thread shall not execute the event it was trying to delay (nor shall it execute any other event). Chained replication and time differences ---------------------------------------- In a chained replication setup, the event's timestamp shall reflect the time that the event was executed on the immediate master, not the time it was executed on the master that first executed the event. For example, if the replication topology is master_1 -> master_2 -> master_3, then the timestamp of an event received by master_3 is equal to the time when the event was executed on master_2, not the time when the event was executed on master_1. If the master and slave have different system times, then the difference shall be compensated: the delay time shall be independent of any unsynchronized clocks and reflect the absolute time after the master executed the event. Note: due to BUG#52308, the timestamp of the immediate master is not known to the slave. Only the timestamp of the originating master is known. So before BUG#52308 is fixed, the timestamp will (incorrectly) be the timestamp of the originating master. RESET SLAVE: clear the delay ---------------------------- RESET SLAVE shall set the delay to 0. SHOW PROCESSLIST: new state for the SQL thread ---------------------------------------------- If the SQL thread has read an event that is newer than the current time minus the delay, then the SQL thread has the state: "Waiting until MASTER_DELAY seconds after master executed event" SHOW SLAVE STATUS: three new fields ----------------------------------- Introduce three new fields to the output from SHOW SLAVE STATUS: SQL_Delay The number of seconds that the slave must be behind the master: a nonnegative integer. Slave_SQL_Running_State The state of the SQL thread (similar to Slave_IO_State): a string. The value is identical to the "State" of the SQL thread as displayed by SHOW PROCESSLIST. The name Slave_SQL_Running_State was chosen instead of the name Slave_SQL_State (which would rhyme better with the existing Slave_IO_State) to avoid possible confusion with SQLSTATE. SQL_Remaining_Delay When Slave_SQL_Running_State is "Waiting until MASTER_DELAY seconds after master executed event", then this field contains the number of seconds left of the delay: an integer. In other cases, this field is NULL. Durability of option: relay-log.info ------------------------------------ When the server is restarted, the previous delay value shall be restored. This is achieved by saving the delay in relay-log.info. Note: the format of relay-log.info has to be modified in a non-trivial manner. The previous format of relay-log.info was as follows: rli->group_relay_log_name [\n-terminated ascii string] rli->group_relay_log_pos [\n-terminated ascii-encoded number] rli->group_master_log_name [\n-terminated ascii string] rli->group_master_log_pos [\n-terminated ascii-encoded number] possibly a string of "garbage" Note that the file may end with garbage. The reason is an optimization in writing the relay log: when a new relay log is written, the old relay log is overwritten in place. If the new relay log is shorter than the old relay log, the new relay log is not truncated, so the end of the relay log may contain parts of previous relay logs. This means that we cannot add the delay as just another \n-encoded ascii string at the end of relay-log.info, because there would be no way to distinguish a relay log in the old format that has garbage after the fourth field from a new relay log that has a delay as a fifth field. Suppose the DBA executes the following actions: (1) run a slave using a version before this worklog; (2) shut down the old slave server when there is garbage at the end of relay-log.info; (3) upgrade the old slave server to a version after this worklog; (4) start the new slave server. Then the new slave server would try to read the garbage as a delay. In the worst case it would use a previous very big binlog position as a delay, effectively preventing the slave SQL thread from running. So instead, we will use a new format for relay-log.info, as follows: number_of_fields_in_file [\n-terminated ascii-encoded integer] rli->group_relay_log_name [\n-terminated ascii string] rli->group_relay_log_pos [\n-terminated ascii-encoded integer] rli->group_master_log_name [\n-terminated ascii string] rli->group_master_log_pos [\n-terminated ascii-encoded integer] rli->delay [\n-terminated ascii-encoded integer] possibly a string of "garbage" Note: number_of_fields_in_file will be 5 after this worklog, but can be increased in the future if more fields are needed. When the new server reads the file, it shall first read just the first line. If the first line is an integer, then: (1) Check that the integer value is 5 or more; if not, generate an error. (2) Assume the file uses the new format and read it accordingly. If the first line is not a number, then read the old format file. This is analogous to the format of master.info. It is not possible to confuse the two formats, because rli->group_relay_log_name always contains a dot ('.'); the integer does not. CHANGES THAT NEED TO BE DOCUMENTED ================================== New syntax for CHANGE MASTER statement -------------------------------------- For http://dev.mysql.com/doc/refman/5.1/en/change-master-to.html CHANGE MASTER now supports this syntax: CHANGE MASTER TO MASTER_DELAY = N N is an integer, 0 <= N < 2^31. After a subsequent execution of START SLAVE [SQL_THREAD], the slave will guarantee that it does not execute statements until N seconds later than they were executed on the master. The default is 0. New restriction in BINLOG statement ----------------------------------- For http://dev.mysql.com/doc/refman/5.1/en/binlog.html Before this patch, the BINLOG statement could execute all types of events. After this patch, the BINLOG statement can only execute format_description_log_events and row events (including TABLE_MAP_EVENT, WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, DELETE_ROWS_EVENT, PRE_GA_WRITE_ROWS_EVENT, PRE_GA_UPDATE_ROWS_EVENT, PRE_GA_DELETE_ROWS_EVENT). It means that the output from mysqlbinlog --base64-output=always cannot be parsed (--base64-output=always is documented as "only for debugging and should not be used in a production system"). (Note [2011-01-11]: After WL#5404, BINLOG will also be able to execute ROWS_QUERY_LOG_EVENT. /Sven) New fields in the output from SHOW SLAVE STATUS ----------------------------------------------- For http://dev.mysql.com/doc/refman/5.1/en/show-slave-status.html Three new fields have been added to SHOW SLAVE STATUS. See above, header "SHOW SLAVE STATUS: three new fields" New functionality of RESET SLAVE -------------------------------- For http://dev.mysql.com/doc/refman/5.1/en/reset-slave.html RESET SLAVE now also does the equivalent of "CHANGE MASTER TO MASTER_DELAY = 0".
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.