WL#4621: Recovery process to synchronize the master.info, the relay-log.info and the relay-log after a failure.

Affects: Server-6.0   —   Status: Un-Assigned   —   Priority: High

SUMMARY
-------
Recovery process to synchronize the master.info, the relay-log.info and the
relay-log after a failure.

MOTIVATION
----------
The master.info and the relay log must be atomically updated to guarantee that
no event is missing in the slave and no event is processed twice. If, for some
reason, they get out of sync, the slave may diverge from the master.

For example, a failure may lead to corrupted events in the relay log, and during
initialization such events are identified and removed from it (WL#5493). 

If during initialization the master.info is not synchronized with the relay log,
the non-corrupted versions of the removed events may not be retrieved from the
master, thus making the slave diverge from the master.

RELATED WORKLOGS
----------------
WL#2775 System tables for master.info, relay_log.info.
WL#5493 Binlog crash-safe when master crashed.
WL#3584 Global Transaction Identifiers (GTIDs)
WL#7353 Relay log recovery should truncate partial transactions
- May be a sub-worklog of the present worklog

BACKGROUND
----------
             ----------   -----------------------------------
             |        |   |              SLAVE              |
             |        |   | -------------    -------------- |  
             |        |-->| | IO Thread |--->| SQL Thread | |
             |        |   | -------------    -------------- |
             | Master |   |   |      |         |       |    |
             |        |   |   V      v         V       v    |
             |        |   | -----  -----     -----   -----  |
             |        |   | |(1)|  |(2)|     |(3)|   |(4)|  |
             |        |   | -----  -----     -----   -----  |
             ----------   -----------------------------------

. The IO Thread is responsible for retrieving from the master the set of
  events that shall be applied at the slave:

  . An event is stored at the relay log (1) and information on the next event 
    that shall be retrieved, i.e. binary log and offset within it, is updated 
    and stored in the master.info (2).

  . If the information on the next event is not accurate in the master.info,
    the IO Thread may read events twice or skip some events, thus making the
    slave diverge from the master.

. The SQL Thread is responsible for reading events from the relay log and
  applying them:

  . After applying (3) an event, the relay-log.info (4) is updated with
    information on the next event that the SQL Thread will read from the relay
    log, i.e. the relay log and offset within it.

  . If the information on the next event is not accurate in the relay-log.info,
    the SQL Thread may read events twice or skip some events, thus making the
    slave diverge from the master.
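
As a rough illustration of the two failure modes described above, the following
Python sketch models a binary log as a list of events and shows what ends up in
the relay log when the stored position is stale or too far ahead. BINLOG,
fetch_from() and the offsets are hypothetical and serve only this illustration;
they are not part of the server.

```python
# Hypothetical binary log: (offset, size, payload) per event, stored back to back.
BINLOG = [(4, 10, "e1"), (14, 20, "e2"), (34, 15, "e3"), (49, 5, "e4")]


def fetch_from(offset):
    """Return the payloads of all events that start at or after `offset`."""
    return [payload for (start, _size, payload) in BINLOG if start >= offset]


relay_log = ["e1", "e2"]  # events already stored by the IO Thread
correct_next = 34         # offset of e3, the next event to retrieve

print(relay_log + fetch_from(correct_next))  # accurate position: e1..e4, once each
print(relay_log + fetch_from(14))            # stale position: e2 is stored twice
print(relay_log + fetch_from(49))            # position too far ahead: e3 is lost
```

The same reasoning applies to the SQL Thread, with the relay log and the
relay-log.info in place of the binary log and the master.info.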

We can draw the following conclusions:

  . The SQL Thread must atomically update the relay-log.info and the database.
    To accomplish this, we have implemented WL#2775. Further details on how
    the SQL Thread handles failures can be found in WL#2775.

  . The IO Thread must atomically update the master.info and the relay log.
    In this case, however, guaranteeing atomicity in the presence of failures
    requires a two-phase commit (2PC), whose cost is prohibitive for us.

PROPOSED SOLUTION
-----------------
We propose to use a loose synchronization approach where inconsistencies are
detected during startup. This idea is outlined through the following steps:   

  . The master.info and the relay log will be asynchronously written to disk. 

    . In fact, we will deprecate the information on the next event in the
      master.info, i.e. binary log and offset within it.

    . Expose the information on the next event through the information schema.

  . Invalid events, i.e. possibly corrupted events, will be removed from the 
    relay log as described in WL#5493.

  . After a normal shutdown or a failure, the master.info and relay log will be 
    synchronized through a recovery routine.


The recovery routine exploits the fact that the information on the next event to
be retrieved from the master, i.e. binary log and offset within it, can be
calculated from an event.

This is possible because an event has information on the binary log which it
came from, the offset within it and its own size. So given an event (e), the
information on the next event (e') can be easily obtained as follows:

  . (e' . binary log, e' . offset) = (e . binary log, e . offset + e . size) 

In the case that the next event is stored in a new binary log, the master will
receive a request to read past the end of the previous binary log and will
redirect it to the new one.
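
A minimal Python sketch of this calculation, assuming an event exposes the three
fields named above. The Event class and next_position() are hypothetical names
used only for illustration. The switch to a new binary log is not modelled,
since the redirection is left to the master.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical view of the fields the recovery routine needs from an event."""
    binary_log: str  # name of the master binary log the event came from
    offset: int      # byte offset of the event within that binary log
    size: int        # size of the event in bytes


def next_position(e: Event) -> Tuple[str, int]:
    """Return (e' . binary log, e' . offset) for the event that follows e."""
    return (e.binary_log, e.offset + e.size)


last = Event(binary_log="master-bin.000003", offset=4096, size=120)
print(next_position(last))  # ('master-bin.000003', 4216)
```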

SIDE EFFECTS
------------
. "sync-master-info" becomes unnecessary as the information on the next event to
be retrieved from the master can be easily retrieved from the relay log after
recovery. In the future, we may deprecate it.
 
. "sync-relay-log" becomes unnecessary as if there is a corrupted event, it will
be removed from the relay log and the master.info and relay-log.info will be
synchronized as described above. In the future, we may deprecate it.

. "relay-log-recovery" becomes unnecessary and shall be automatically done. In
the future, we may deprecate it.

RECOVERY ROUTINE
----------------
The key point of the proposed solution is a recovery routine that will:

  . Truncate possibly corrupted events from the most recent relay log, similarly
    to what is implemented in WL#5493. See WL#5493 for further details on how
    corrupted events are detected.

  . Get the most recent valid event and calculate the information on the next 
    event to be retrieved from the master, i.e., (e' . binary log, e' . offset).

  . If (e' . binary log, e' . offset) is lower than (e'' . binary log,
    e'' . offset), which represents the most recent event applied by the SQL
    Thread, the missing events are retrieved and the SQL Thread resumes from
    (e'' . binary log, e'' . offset), as explained below.

The IO Thread and the SQL Thread share a buffer, i.e. the IO CACHE. The IO
Thread reads events from the master, stores them in the buffer and flushes the
buffer to disk. So the SQL Thread may read events from the buffer and process
them before they are flushed to disk.

After a recovery, the SQL Thread might have processed events that are not in the
relay log, either because they were never flushed to disk or because they were
flushed but were corrupted and then removed.

In such cases, the missing events are retrieved and the SQL Thread starts
processing from the position where it stopped.
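
The three steps of the recovery routine can be sketched in Python as follows.
This is only an illustration under several assumptions: RelayEvent and recover()
are hypothetical names, the `valid` flag stands in for the corruption check of
WL#5493, and positions are compared as (binary log, offset) tuples, which
presumes that binary log names share a base name and a fixed-width numeric
suffix. A relay log with no valid event is not covered by the text and is
rejected here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[str, int]  # (binary log name, offset within it)


@dataclass(frozen=True)
class RelayEvent:
    """Hypothetical relay log entry; `valid` stands in for the WL#5493 check."""
    binary_log: str
    offset: int
    size: int
    valid: bool = True


def recover(relay_log: List[RelayEvent], sql_applied: Position):
    """Return (kept events, IO Thread request position, SQL-Thread-is-ahead flag)."""
    # Step 1: keep events up to, but excluding, the first invalid one.
    kept = []
    for event in relay_log:
        if not event.valid:
            break
        kept.append(event)
    if not kept:
        raise ValueError("no valid event in the relay log; not covered by this sketch")

    # Step 2: position of the next event to retrieve, from the last valid event.
    last = kept[-1]
    io_next = (last.binary_log, last.offset + last.size)

    # Step 3: the SQL Thread is ahead if it applied events the relay log lacks.
    sql_ahead = io_next < sql_applied
    return kept, io_next, sql_ahead


relay = [
    RelayEvent("master-bin.000001", 4, 100),
    RelayEvent("master-bin.000001", 104, 50),
    RelayEvent("master-bin.000001", 154, 70, valid=False),  # corrupted tail
]
kept, io_next, sql_ahead = recover(relay, sql_applied=("master-bin.000001", 224))
print(len(kept), io_next, sql_ahead)  # 2 ('master-bin.000001', 154) True
```

In this example the SQL Thread had already applied the third event from the
shared buffer, so the IO Thread requests events from offset 154 again while the
SQL Thread resumes from offset 224.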