WL#6964: MTS: Support SLAVE_TRANSACTION_RETRIES in MTS mode

Affects: Server-5.7   —   Status: Complete   —   Priority: Medium

Summary
=======

This worklog enables multi-threaded slave (MTS) replication servers to
retry a transaction after a temporary failure.

Currently, only non-MTS replication slave servers attempt to retry a
transaction after a temporary failure.

References
==========

This WL addresses the feature request in:
 BUG#68465 MAKE SLAVE_TRANSACTION_RETRIES WORK WITH MTS

Functional Requirements
=======================

F-1: When a slave worker encounters a temporary error, it should roll back the
     transaction and try to apply the failed transaction again if
     SLAVE_TRANSACTION_RETRIES is greater than zero. The slave worker should
     retry the transaction at most SLAVE_TRANSACTION_RETRIES times.

     If the transaction still fails with a temporary error after
     SLAVE_TRANSACTION_RETRIES retries, the worker is stopped with an error.
     The coordinator and all other worker threads are stopped too.

F-2: Temporary errors are defined as in the single-threaded transaction retry
     feature: ER_LOCK_DEADLOCK and ER_LOCK_WAIT_TIMEOUT.


F-3: Two or more workers should be able to retry their transactions at the same 
     time.

Non-Functional Requirements
===========================
NF-1: All other workers should keep running while one or more workers are  
      retrying their transactions.

LIMITATION
==========
Users can change the relay log name by setting the relay-log option, so two
existing relay logs may have different base names. However, this design
assumes that all relay logs share the base name given by the
'relay_log_basename' variable.

Users should therefore make sure that the base names of all relay logs match
'relay_log_basename' before enabling this feature. Otherwise, the slave may
stop with an error while retrying a transaction. This situation is expected
to be rare.

High-Level Specification
========================
I-1: Remove the warning that SLAVE_TRANSACTION_RETRIES is not supported by MTS.

No new interface is introduced.

Basic Idea
==========
The coordinator dispatches events as it did before this worklog: each event is
dispatched as a Slave_job_item. The only difference is that the coordinator
now stores each event's start position (relay log number and offset) in the
Slave_job_item.

Workers apply events as transaction groups. When a worker starts to apply a
transaction group, it records the position of the first event (taken from its
Slave_job_item). The worker therefore knows where in the relay log to start
reading the transaction's events if a retry is needed.

When a temporary error happens, the transaction's events can be divided into
three parts:
1 - The events that have been applied successfully. They have also been
    released.
2 - The event that encountered the temporary error.
3 - The events not yet applied. They are still in the job queue.

Workers don't need to read the whole transaction from the relay log. They only
need to read the events in parts 1 and 2 and apply them, and then apply the
events in part 3 as normal.
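
The split can be illustrated with a small sketch. RetryPlan and plan_retry are
hypothetical names used only for illustration and are not part of the server
source. The sketch only counts how many events fall into each part, given the
zero-based index of the event that failed.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical illustration only: number of events in each part when the
// event at failed_index (zero-based) hits a temporary error.
struct RetryPlan {
  std::size_t already_applied;  // part 1: applied and released
  std::size_t failed;           // part 2: always exactly one event
  std::size_t not_yet_applied;  // part 3: applied later as normal

  // Parts 1 and 2 are the events that must be read back from the relay log.
  std::size_t reread_from_relay_log() const { return already_applied + failed; }
};

RetryPlan plan_retry(std::size_t total_events, std::size_t failed_index) {
  if (failed_index >= total_events)
    throw std::out_of_range("failed_index must refer to an event in the group");
  return RetryPlan{failed_index, 1, total_events - failed_index - 1};
}
```

For a transaction of five events whose third event fails, plan_retry(5, 2)
reports three events to read back from the relay log and two events left to
apply as normal.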

Storing Event's Start Position
==============================

* event_relay_log_number
  It records the suffix number of the current relay log file and is stored in
  each Slave_job_item. Storing the number uses less memory than storing the
  relay log name as a string.

  DEFINITION:
  uint Relay_log_info::event_relay_log_number

  Note: This assumes that all relay logs have the same base name, equal to the
        relay_log_basename variable. Otherwise, a worker may read the wrong
        relay log or look for a relay log that doesn't exist. This problem is
        not solved in this worklog.

* Slave_job_item
  When events are dispatched, each event is wrapped in a Slave_job_item
  structure. Two members are added to store the event's start position.

  DEFINITION:

  typedef struct slave_job_item
  {
    void *data;
+   uint relay_number;
+   my_off_t relay_pos;
  } Slave_job_item;
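
The following is a minimal sketch of how a relay log file name could be
rebuilt from the stored number. It assumes the usual "<basename>.NNNNNN"
naming with a zero-padded six-digit suffix. JobItem, suffix_number_of and
file_name_for are hypothetical stand-ins written for this example, and
my_off_t is modeled as unsigned long long.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Stand-in for Slave_job_item using plain standard types (assumption).
struct JobItem {
  void *data;
  unsigned int relay_number;     // numeric suffix of the relay log file
  unsigned long long relay_pos;  // offset of the event inside that file
};

// Hypothetical helper: "relay-bin.000012" -> 12. Returns 0 if the name has
// no '.' separator.
unsigned int suffix_number_of(const std::string &name) {
  std::string::size_type dot = name.rfind('.');
  if (dot == std::string::npos) return 0;
  return static_cast<unsigned int>(
      std::strtoul(name.c_str() + dot + 1, nullptr, 10));
}

// Hypothetical helper: rebuild the file name from the shared base name.
// The result is only correct if every relay log uses this base name.
std::string file_name_for(const std::string &base_name, unsigned int number) {
  char suffix[32];
  std::snprintf(suffix, sizeof(suffix), ".%06u", number);
  return base_name + suffix;
}
```

Because only the number is stored, a relay log created under a different base
name cannot be found again from its job item, which is the limitation noted
above.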


Storing Start Position of Transaction
=====================================
Each worker should recognize transaction boundaries and store the position of
the first event of each transaction when it starts to apply that transaction.

- slave_worker_exec_job_group()
  It applies the Slave_job_items of the first transaction in the jobs queue.

  DEFINITION:
  int slave_worker_exec_job_group(Slave_worker *worker, Relay_log_info *rli);

  LOGIC:
  job_item= pop_jobs_item(worker, job_item); // pop one Slave_job_item
  store the event's position. It is the first event of a transaction.

  while (1)
  {
    apply the event in job_item

    If the step above fails, call worker->retry_transaction() to handle the
    error. If the error is not a temporary error, or the retries are
    exhausted, retry_transaction() returns true, and this function reports
    the error generated by the step above and exits. Otherwise,
    retry_transaction() rolls back the transaction and re-applies its events
    up to and including this event. After retry_transaction() succeeds
    (returns false), the state is the same as if the event had been applied
    without any temporary error.

    break the loop if it is the last event of the transaction.

    job_item= pop_jobs_item(worker, job_item);
  }

  NOTE: Other error handling is omitted here.
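
The loop above can be sketched in a self-contained form as follows. This is
an illustration under assumptions, not the server code: Position, JobItem and
exec_job_group are hypothetical names, the job queue is a std::deque, and the
apply and retry steps are passed in as callables. The sketch also assumes that
the end position handed to the retry step is the start position of the event
that failed.

```cpp
#include <deque>
#include <functional>

struct Position {
  unsigned int relay_number;
  unsigned long long relay_pos;
};

struct JobItem {
  Position pos;     // start position stored by the coordinator
  bool ends_group;  // true for the last event of the transaction
};

// apply: returns true on error.
// retry: receives the transaction start and the failing event's position,
//        returns false if the retry succeeded.
// Returns 0 on success and 1 on error.
int exec_job_group(std::deque<JobItem> &jobs,
                   const std::function<bool(const JobItem &)> &apply,
                   const std::function<bool(Position, Position)> &retry) {
  if (jobs.empty()) return 1;
  JobItem item = jobs.front();
  jobs.pop_front();
  const Position start = item.pos;  // first event of the transaction

  while (true) {
    // Only call retry when apply failed; a failed retry ends the group.
    if (apply(item) && retry(start, item.pos)) return 1;
    if (item.ends_group) return 0;
    if (jobs.empty()) return 1;  // sketch only: a real worker would wait
    item = jobs.front();
    jobs.pop_front();
  }
}
```

The start position is captured once per transaction group, so every retry of
the same group re-reads from the same place regardless of which event failed.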


Retry Process
=============

* retry_transaction()
  Checks whether the transaction should be retried and retries it if so.

  DEFINITION:
  bool Slave_worker::retry_transaction(uint start_relay_number,
                                       my_off_t start_relay_pos,
                                       uint end_relay_number,
                                       my_off_t end_relay_pos);

  Return false if retry succeeds, otherwise return true.

  LOGIC:

  if (slave_trans_retries == 0)
    return true;

  do
  {
    return true if the error is not a temporary error or the transaction
    cannot be rolled back safely.

    report an error and return true if the transaction has already been
    retried slave_trans_retries times.

    increase the transaction's retry count. It is checked in the step above.

    roll back the transaction by calling cleanup_context(). A rollback is
    necessary before each transaction retry. The original code only rolls
    back when stopping the worker, so it is called here explicitly.

    sleep for as many seconds as the transaction's retry count. When the
    retry count is greater than MAX_SLAVE_RETRY_PAUSE, sleep for
    MAX_SLAVE_RETRY_PAUSE seconds instead. This is the same logic as in the
    single-threaded slave.

    call read_and_apply_events() to read and apply the events between
    start_relay_pos and end_relay_pos.
  } while (read_and_apply_events() encounters an error);
  return false;

  Note: MAX_SLAVE_RETRY_PAUSE is a constant in the C++ source code. Its value
        is 5.

* read_and_apply_events()
  Read and apply the events between given positions.

  DEFINITION:
  bool Slave_worker::read_and_apply_events(uint start_relay_number,
                                           my_off_t start_relay_pos,
                                           uint end_relay_number,
                                           my_off_t end_relay_pos);

  Return false if all events are read and applied successfully, otherwise
  return true.

  LOGIC:
  Generate relay log file name from start_relay_number.

  while (current position < end_relay_pos)
  {
    Initialize an IO_CACHE for the relay log if it is not opened.
    call read_log_event() to read an event.
    if (read_log_event() returns a valid event)
    {
      apply the event if it should not be skipped. Applying returns true if
      any error happens. Any error (including temporary errors) is returned
      to the caller directly through error environment variables. The caller
      decides whether to retry the transaction again.
    }
    else
    {
      close the IO_CACHE and find the next relay log from the relay log index
      if the end of the current relay log has been reached. Otherwise return
      true.
      Note: A transaction may be split across two or more relay logs.
    }
    }
  }
  return false.
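
The two functions above can be sketched together as follows. This is a
simplified model under several assumptions, not the server implementation:
relay logs are held in memory as a map from relay log number to events, the
next relay log is assumed to be the next number rather than being looked up
in the relay log index, the end position is taken to be the start of the
failing event (which is re-applied too), and rollback, sleeping and the
temporary-error check are injected as callables. Event, RelayLogs and
retry_pause_seconds are hypothetical names.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <vector>

const unsigned long MAX_SLAVE_RETRY_PAUSE = 5;  // value given in the note above

// Seconds to sleep before a retry, capped at MAX_SLAVE_RETRY_PAUSE.
unsigned long retry_pause_seconds(unsigned long retry_count) {
  return std::min(retry_count, MAX_SLAVE_RETRY_PAUSE);
}

struct Event { unsigned long long pos; int id; };  // hypothetical event
using RelayLogs = std::map<unsigned int, std::vector<Event>>;

// Re-applies events from (start_number, start_pos) up to and including the
// event that starts at (end_number, end_pos). Returns true on error.
bool read_and_apply_events(const RelayLogs &logs,
                           unsigned int start_number, unsigned long long start_pos,
                           unsigned int end_number, unsigned long long end_pos,
                           const std::function<bool(const Event &)> &apply) {
  for (unsigned int number = start_number; number <= end_number; ++number) {
    RelayLogs::const_iterator log = logs.find(number);
    if (log == logs.end()) return true;  // relay log not found
    for (const Event &ev : log->second) {
      if (number == start_number && ev.pos < start_pos) continue;
      if (number == end_number && ev.pos > end_pos) return false;
      if (apply(ev)) return true;  // caller decides whether to retry again
    }
  }
  return false;
}

// Returns false if a retry succeeded, true if the worker must stop.
bool retry_transaction(unsigned long slave_trans_retries,
                       const std::function<bool()> &is_temporary_error,
                       const std::function<void(unsigned long)> &sleep_seconds,
                       const std::function<bool()> &reapply) {
  if (slave_trans_retries == 0) return true;
  unsigned long retry_count = 0;
  do {
    if (!is_temporary_error()) return true;
    if (retry_count >= slave_trans_retries) return true;  // retries exhausted
    ++retry_count;
    // The worker would roll back here (cleanup_context() in the design).
    sleep_seconds(retry_pause_seconds(retry_count));
  } while (reapply());  // reapply returns true on error, so try again
  return false;
}
```

In this sketch reapply would wrap a call to read_and_apply_events() with the
positions recorded for the transaction. Injecting sleep_seconds keeps the
example free of real delays and makes the pause sequence (1, 2, 3, ... capped
at 5) easy to observe.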