WL#4621: Recovery process to synchronize the master.info, the relay-log.info and the relay-log after a failure.
Affects: Server-6.0
Status: Un-Assigned
SUMMARY
-------
Recovery process to synchronize the master.info, the relay-log.info and
the relay-log after a failure.

MOTIVATION
----------
The master.info and the relay log must be atomically updated to
guarantee that no event is missing in the slave and that no event is
processed twice. If for some reason they get out of sync, the slave may
diverge from the master. For example, a failure may leave corrupted
events in the relay log, and during initialization such events are
identified and removed from it (WL#5493). If during initialization the
master.info is not synchronized with the relay log, the non-corrupted
versions of the removed events may never be retrieved from the master,
thus making the slave diverge from the master.

RELATED WORKLOGS
----------------
WL#2775 System tables for master.info, relay_log.info.
WL#5493 Binlog crash-safe when master crashed.
WL#3584 Global Transaction Identifiers (GTIDs)
WL#7353 Relay log recovery should truncate partial transactions
        - May be a sub-worklog of the present worklog
BACKGROUND
----------

    ----------     -----------------------------------
    |        |     |              SLAVE              |
    |        |     |  -------------   -------------- |
    |        |---->|  | IO Thread |-->| SQL Thread | |
    | Master |     |  -------------   -------------- |
    |        |     |    |      |        |      |     |
    |        |     |    V      V        V      V     |
    |        |     |  -----  -----    -----  -----   |
    |        |     |  |(1)|  |(2)|    |(3)|  |(4)|   |
    |        |     |  -----  -----    -----  -----   |
    ----------     -----------------------------------

. The IO Thread is responsible for retrieving from the master the set
  of events that shall be applied at the slave:

  . An event is stored in the relay log (1) and information on the next
    event that shall be retrieved, i.e. the binary log and the offset
    within it, is updated and stored in the master.info (2).

  . If the information on the next event is not accurate in the
    master.info, the IO Thread may read events twice or skip events,
    thus making the slave diverge from the master.

. The SQL Thread is responsible for reading events from the relay log
  and applying them:

  . After applying an event (3), the relay-log.info (4) is updated with
    information on the next event that the SQL Thread will read from
    the relay log, i.e. the relay log and the offset within it.

  . If the information on the next event is not accurate in the
    relay-log.info, the SQL Thread may read events twice or skip
    events, thus making the slave diverge from the master.

We can draw the following conclusions:

. The SQL Thread must atomically update the relay-log.info and the
  database. To accomplish this, we have implemented WL#2775, where
  further details on how the SQL Thread handles failures can be found.

. The IO Thread must atomically update the master.info and the relay
  log. In this case, however, guaranteeing atomicity in the presence of
  failures requires two-phase commit (2-PC), which has a prohibitive
  cost for us.

PROPOSED SOLUTION
-----------------
We propose to use a loose synchronization approach where
inconsistencies are detected during startup. This idea is outlined
through the following steps:

. The master.info and the relay log will be asynchronously written to
  disk.

. In fact, we will deprecate the information on the next event in the
  master.info, i.e. the binary log and the offset within it.

. Create an information schema table to show the information on the
  next event.

. Invalid events, i.e. possibly corrupted events, will be removed from
  the relay log as described in WL#5493.

. After a normal shutdown or a failure, the master.info and the relay
  log will be synchronized through a recovery routine.

The recovery routine exploits the fact that the information on the next
event to be retrieved from the master, i.e. the binary log and the
offset within it, can be calculated from an event. This is possible
because an event carries information on the binary log which it came
from, its offset within it and its own size. So given an event (e), the
information on the next event (e') can be easily obtained as follows
(a sketch is given after the SIDE EFFECTS section):

. (e'.binary log, e'.offset) = (e.binary log, e.offset + e.size)

In the case that the next event is stored in a new binary log, the
master will receive a request to read past the end of the previous
binary log and will redirect it to the new one.

SIDE EFFECTS
------------
. "sync-master-info" becomes unnecessary as the information on the next
  event to be retrieved from the master can be easily recomputed from
  the relay log after recovery. In the future, we may deprecate it.

. "sync-relay-log" becomes unnecessary as, if there is a corrupted
  event, it will be removed from the relay log and the master.info and
  relay-log.info will be synced as aforementioned. In the future, we
  may deprecate it.

. "relay-log-recovery" becomes unnecessary as recovery shall be done
  automatically. In the future, we may deprecate it.
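The following is a minimal C++ sketch of the next-position computation
described above. The Event and MasterPos structures are simplified
illustrations, not the server's actual data structures; only the
(binary log, offset, size) fields required by the formula are modeled.

    #include <cstdint>
    #include <string>

    // Simplified view of a replicated event: each event carries the
    // name of the master binary log it came from, its start offset
    // within that log, and its own size in bytes.
    struct Event {
      std::string binary_log;
      uint64_t offset;
      uint64_t size;
    };

    // Position of the next event to be retrieved from the master,
    // i.e. the information that would otherwise be read from the
    // master.info.
    struct MasterPos {
      std::string binary_log;
      uint64_t offset;
    };

    // (e'.binary log, e'.offset) = (e.binary log, e.offset + e.size)
    // If e was the last event of its binary log, the computed offset
    // points past the end of that log; the master detects this on the
    // next request and redirects the slave to the new binary log.
    MasterPos next_master_pos(const Event &e) {
      return MasterPos{e.binary_log, e.offset + e.size};
    }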
RECOVERY ROUTINE
----------------
The key point of the proposed solution is a recovery routine that will:

. Truncate possibly corrupted events from the most recent relay log,
  similarly to what is implemented in WL#5493. See WL#5493 for further
  details on how corrupted events are detected.

. Get the most recent valid event and calculate the information on the
  next event to be retrieved from the master, i.e.
  (e'.binary log, e'.offset).

. If (e'.binary log, e'.offset) is lower than
  (e''.binary log, e''.offset), which represents the most recent event
  applied by the SQL Thread, the missing events are retrieved and
  (e''.binary log, e''.offset) is used.

The IO Thread and the SQL Thread share a buffer, i.e. an IO_CACHE. The
IO Thread reads events from the master, stores them in the buffer and
flushes the buffer to disk. So the SQL Thread may read events from the
buffer and process them before they are flushed to disk. After a
recovery, the SQL Thread might therefore have processed events that are
no longer in the relay log, either because they were never flushed to
disk or because they were flushed but got corrupted and were removed.
In such cases, the missing events are retrieved again and the SQL
Thread starts processing from the position where it stopped. A sketch
of this routine is given below.
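Below is a minimal sketch of the recovery routine, reusing the Event,
MasterPos and next_master_pos() definitions from the previous sketch.
The helpers last_valid_relay_log_event() and sql_thread_pos() are
hypothetical stand-ins for reading the truncated relay log and the SQL
Thread position; they are not existing server functions.

    #include <optional>
    #include <tuple>

    // Hypothetical helpers, assumed to be provided elsewhere.
    std::optional<Event> last_valid_relay_log_event();  // after truncation
    MasterPos sql_thread_pos();  // most recent event applied by SQL Thread

    // Order (binary log, offset) pairs: binary log names carry
    // increasing sequence numbers, so comparing names first and then
    // offsets matches the order in which events were generated.
    bool less_than(const MasterPos &a, const MasterPos &b) {
      return std::tie(a.binary_log, a.offset) <
             std::tie(b.binary_log, b.offset);
    }

    // Sketch of the recovery routine run during startup. Possibly
    // corrupted events are assumed to have already been truncated from
    // the most recent relay log (as in WL#5493).
    MasterPos recover_master_pos(const MasterPos &start_of_first_log) {
      // Position e' computed from the most recent valid event e; if the
      // relay log holds no valid event, fall back to the given start.
      std::optional<Event> e = last_valid_relay_log_event();
      MasterPos io_pos = e ? next_master_pos(*e) : start_of_first_log;

      // Position e'' of the most recent event applied by the SQL
      // Thread. It may be ahead of e' because the SQL Thread can
      // consume events from the shared IO_CACHE before they reach
      // disk, and truncation may have removed applied events.
      MasterPos sql_pos = sql_thread_pos();

      // Resume retrieval from the later of the two positions, so that
      // missing events are fetched again and none is applied twice.
      return less_than(io_pos, sql_pos) ? sql_pos : io_pos;
    }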