WL#5675: Make a replication library for 5.6 that can be linked to 5.5

Affects: Server-Prototype Only   —   Status: On-Hold

This is a step to modularize replication.

The goal is to create an interface between the core server ("core") and the 
replication library ("rpl-lib") that is identical in 5.5 and 5.6, such that the 
5.6 rpl-lib can be linked to the 5.5 core. Hence, the interface has to be 
flexible enough that it can handle both the 5.5 rpl-lib and the 5.6 rpl-lib.

In this worklog, it suffices to make 5.6 rpl-lib link statically to 5.5 core. It 
is not a goal to make replication a plugin; that will be done in future worklogs.

It is an open question whether the 5.6 rpl-lib for 5.5 core should be compiled 
with 5.6 core headers or with 5.5 core headers. We will do whatever turns out to 
be easiest.

The interface will be partitioned into three logically separate pieces:

  Binlog
    Functionality used by core when it writes to the binary log, as well as
    SQL statements for maintenance of binary logs: FLUSH BINARY LOGS,
    SHOW BINARY LOGS, SHOW BINLOG EVENTS, PURGE BINARY LOGS, RESET MASTER.

  Master
    Functionality used by the master dump thread, as well as the related SQL
    statement: SHOW SLAVE HOSTS.

  Slave
    Functionality used by the slave threads, as well as the related SQL
    statements: START SLAVE, STOP SLAVE, CHANGE MASTER, SHOW SLAVE STATUS,
    FLUSH RELAY LOGS, SHOW RELAYLOG EVENTS, BINLOG; as well as the function
    MASTER_POS_WAIT().

Each of these pieces will be represented in C++ by a class. The core will not 
refer to any rpl-lib symbols other than members of these classes.
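
As a rough illustration of the three-class split, the following is a minimal
sketch, not the actual interface. It assumes abstract classes with virtual
functions, which is only one possible design. Apart from Binlog::log_statement,
all member names, the Rpl_lib holder, core_log_query and Counting_binlog are
hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstring>

class THD;  // opaque in this sketch; step (3) replaces THD* by MYSQL_THD

// Binlog piece: writing to the binary log and binlog maintenance statements.
class Binlog {
public:
  virtual ~Binlog() {}
  // Returns 0 on success, nonzero on error (convention assumed here).
  virtual int log_statement(THD *thd, const char *query,
                            std::size_t length) = 0;
  virtual int flush_logs(THD *thd) = 0;  // FLUSH BINARY LOGS (hypothetical)
};

// Master piece: dump thread and SHOW SLAVE HOSTS.
class Master {
public:
  virtual ~Master() {}
  virtual int show_slave_hosts(THD *thd) = 0;  // hypothetical name
};

// Slave piece: slave threads and START SLAVE, STOP SLAVE, and so on.
class Slave {
public:
  virtual ~Slave() {}
  virtual int start_slave(THD *thd) = 0;  // hypothetical name
  virtual int stop_slave(THD *thd) = 0;   // hypothetical name
};

// The only replication symbols that core would see (hypothetical holder).
struct Rpl_lib {
  Binlog *binlog;
  Master *master;
  Slave *slave;
};

// Example of a call site in core: it goes through the Binlog class only.
int core_log_query(const Rpl_lib &rpl, THD *thd, const char *query) {
  return rpl.binlog->log_statement(thd, query, std::strlen(query));
}

// Toy stand-in for an rpl-lib implementation; it only counts statements.
class Counting_binlog : public Binlog {
public:
  Counting_binlog() : statements(0) {}
  int log_statement(THD *, const char *, std::size_t) {
    ++statements;
    return 0;
  }
  int flush_logs(THD *) { return 0; }
  int statements;
};
```

With this shape, linking a different rpl-lib to the same core would amount to
supplying different implementations of the three classes.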

We will do the work in the following steps:

(1) Create the three classes and add the class functions that we need. To
    minimize the number of changes, the interface will initially include
    function arguments of unstable types like THD* and TABLE*. Later steps will
    replace such arguments with something more stable. This step has three
    sub-steps:
     (1.1) Binlog class [WL#5778]
     (1.2) Slave class [WL#5779]
     (1.3) Master class [WL#5789]

(2) Remove references to server variables and user variables (through the THD
    object) from the implementation of Binlog::log_statement. Instead, add
    arguments to Binlog::log_statement that list all server variables and user
    variables to be replicated. This step has two sub-steps:
     (2.1) server variables [WL#5790]
     (2.2) user variables [WL#5791]

(3) Replace THD* by MYSQL_THD in all functions that are part of the interface.
    Add thd_* functions to core through which we can get the relevant data.
    This step has four sub-steps:
     (3.1) Binlog::log_statement [WL#5792]
     (3.2) Other functions in Binlog class [WL#5793]
     (3.3) Slave class [WL#5794]
     (3.4) Master class [WL#5795]

(4) The function for writing a block of data loaded by LOAD DATA INFILE to the
    binary log currently has the prototype
    Binlog::log_loaded_block(IO_CACHE *io_cache)
    This is bad because: (a) it assumes that IO_CACHE is used to load data, but
    we may want to replace that; (b) it assumes that IO_CACHE is stable; (c) it
    relies on the generic datatype IO_CACHE carrying information specific to
    LOAD DATA INFILE statements, which it currently does, but only as a result
    of bad design. So we should change the prototype to:
    Binlog::log_loaded_block(MYSQL_THD thd, int length, char *data)
    [WL#5798]

(5) Move definitions of replication-specific server variables to the library.
    [WL#5796]

(6) Copy error codes from 5.6 to 5.5. [WL#5802]

(7) Refactor first rpl_master_has_bug and then Field::compatible_field_size.
    [WL#5805, WL#5815]

(8) Use Binlog::log_statement instead of other ways to do the same thing.
    [WL#5816]

(9) Make it possible for the plugin to dynamically add new clauses to CHANGE
    MASTER without modifying the parser. [WL#5755]

(10) Create a new sub-directory 'rpl' in the 'sql' directory in the source
     tree. Move rpl_* to the new sub-directory and remove the prefix rpl_ from
     the files. [WL#5797]

(11) Create interface for replication filters. [WL#5817]

(12) Create interface for writing LOAD DATA to binary log. [WL#5818]

(13) Remove miscellaneous cross-references between rpl-lib and core. [WL#5819]

(14) Separate replication plugin interfaces from core plugin interfaces.
     [WL#5820]

(99) Link replication as a library. [WL#5814]
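
Steps (2) and (3) in the list above can be sketched together as follows. This
is only an illustration under stated assumptions: MYSQL_THD is modeled as an
opaque void* handle, thd_get_server_id stands for one of the thd_* accessor
functions that core would add, and Toy_thd, Replicated_var and the in-memory
log are hypothetical names invented for the example. The variables to be
replicated are passed as two explicit lists instead of being read through THD.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Opaque session handle as seen by rpl-lib (stand-in for the real typedef).
typedef void *MYSQL_THD;

// Core-private session object; rpl-lib never sees this layout.
struct Toy_thd {
  unsigned long server_id;
};

// Hypothetical thd_* accessor exported by core (step 3).
unsigned long thd_get_server_id(MYSQL_THD thd) {
  return static_cast<Toy_thd *>(thd)->server_id;
}

// One variable to be replicated with a statement (step 2).
struct Replicated_var {
  std::string name;
  std::string value;
};
typedef std::vector<Replicated_var> Replicated_vars;

class Binlog {
public:
  // What this toy implementation records per statement.
  struct Logged_statement {
    unsigned long server_id;
    std::string query;
    Replicated_vars server_vars;
    Replicated_vars user_vars;
  };

  // All inputs arrive as arguments or through thd_* accessors; the
  // implementation does not dereference THD itself.
  int log_statement(MYSQL_THD thd, const std::string &query,
                    const Replicated_vars &server_vars,
                    const Replicated_vars &user_vars) {
    Logged_statement entry;
    entry.server_id = thd_get_server_id(thd);
    entry.query = query;
    entry.server_vars = server_vars;
    entry.user_vars = user_vars;
    log.push_back(entry);
    return 0;  // 0 means success in this sketch
  }

  std::vector<Logged_statement> log;  // stand-in for the binary log
};
```

Because the function no longer depends on the layout of THD, the same rpl-lib
code could in principle be compiled against either core version, as long as the
accessor functions keep their signatures.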

More tasks may need to be added here. We cannot determine all tasks in advance, 
because that would require inspecting the entire replication codebase. We will 
likely discover the tasks during the coding phase.
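
To make the intent of step (4) concrete, here is a small sketch of how core
might feed loaded data to the proposed
Binlog::log_loaded_block(MYSQL_THD thd, int length, char *data) without
exposing IO_CACHE. The prototype comes from step (4). Everything else is an
assumption made for the example: the in-memory input, the load_data_and_log
helper, the block size handling, and the toy Binlog that only stores blocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

typedef void *MYSQL_THD;  // opaque handle, stand-in for the real typedef

class Binlog {
public:
  // Proposed prototype from step (4): raw bytes only, no IO_CACHE.
  int log_loaded_block(MYSQL_THD thd, int length, char *data) {
    (void) thd;  // unused in this toy implementation
    blocks.push_back(std::string(data, static_cast<std::size_t>(length)));
    return 0;  // 0 means success in this sketch
  }
  std::vector<std::string> blocks;  // stand-in for events in the binary log
};

// Hypothetical core-side loop: core decides how data is read and buffered,
// and hands each block to rpl-lib as a plain (length, data) pair.
int load_data_and_log(MYSQL_THD thd, Binlog &binlog,
                      const std::string &input, std::size_t block_size) {
  if (block_size == 0)
    return 1;  // invalid block size
  std::vector<char> buffer(block_size);
  for (std::size_t offset = 0; offset < input.size(); offset += block_size) {
    const std::size_t n = std::min(block_size, input.size() - offset);
    std::memcpy(&buffer[0], input.data() + offset, n);
    const int error =
        binlog.log_loaded_block(thd, static_cast<int>(n), &buffer[0]);
    if (error)
      return error;
  }
  return 0;
}
```

The point of the design is that core could replace IO_CACHE with any other
reading mechanism without touching the interface.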

Note: some of the above work can be done in parallel. The only dependencies are:
 - (1.1) must be pushed before (2.1) and (2.2) can be pushed. However, (2.1)
   and (2.2) can be started before (1.1) is pushed, because we can create
   internal mechanisms to pass server and user variables to
   Binlog::log_statement without using the Binlog class.
 - (2.1) and (2.2) must be done before (3.1) and (3.2). (The easiest way to do
   (3.1) and (3.2) is to replace THD by MYSQL_THD in the function prototype,
   then compile and see what needs to be done.)
 - (1.2) must be pushed before (3.3) can be started.
 - (1.3) must be pushed before (3.4) can be started.

==== Open questions ====

The following is an unsorted list of miscellaneous things we need to do before 
core and rpl-lib are separated. Most of them should probably be moved into new 
worklogs.

 - What are MYSQL_BIN_LOG::start_union_events and friends? How do we expose
   this functionality in the interface?
   Preliminary decision: these functions seem misplaced. We should probably
   move them to THD, and the functionality they provide should be entirely in
   core. (It's not completely clear what they do, but they seem to be related
   to logging of SP invocations.)

 - THD::rli_fake is of type Relay_log_info, which is defined in rpl-lib. This
   makes core depend on a type that belongs to rpl-lib. To remove the
   dependency, we will need to add the following:
    - core should expose an interface to attach custom data to THD.
    - core should expose hooks that will be invoked when THD is created and
      destroyed.
    - rpl-lib should register a callback for the "THD::destroy" hook. The
      callback should free rli_fake.

 - The following functions need to be moved into rpl-lib:
     THD::binlog_write_row
     THD::binlog_update_row
     THD::binlog_delete_row
   In fact, Binlog::log_write_row is a wrapper around THD::binlog_write_row,
   so we can just move the body of THD::binlog_write_row into
   Binlog::log_write_row.

 - The following THD member functions are only used internally by rpl-lib and
   can just be moved into rpl-lib:
     THD::binlog_setup_trx_data
     THD::binlog_set_stmt_begin
     THD::binlog_get_pending_rows_event
     THD::binlog_set_pending_rows_event
     THD::binlog_prepare_pending_rows_event

 - The following THD member functions are used by both core and rpl-lib. We need
   to figure out the best interface for this.
     THD::flush_pending_rows_event
     THD::binlog_start_trans_and_stmt
     THD::binlog_write_table_map

 - ha_ndbcluster.cc uses active_mi to get binlog positions. This is not nice;
   it would be better if binlog positions were internal to rpl-lib (e.g.,
   because of WL#3584). We need to determine why ndb needs to know binlog
   positions. Then, either find a way for ndb to not rely on binlog positions,
   or expose binlog positions from rpl-lib to core via Binlog, and from core to
   ndb via an interface in core.

 - How can we get rid of declarations of replication-specific mutexes, condition
   variables and files from mysqld.cc?
   Examples: key_BINLOG_LOCK_index, key_BINLOG_COND_prep_xids, key_file_binlog
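
The rli_fake item in the list above (custom data attached to THD, plus a
destroy hook) could look roughly like the sketch below. It is a toy model, not
the server code: the THD and Relay_log_info classes here are simplified
stand-ins, and set_custom_data, get_custom_data, register_thd_destroy_hook,
rpl_on_thd_destroy and the key string are all hypothetical names. Only the
destroy hook is shown; a creation hook would follow the same pattern.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class THD;
typedef void (*thd_hook_fn)(THD *thd);

// ---- core side (toy model) ----
class THD {
public:
  ~THD();  // runs the registered destroy hooks
  // Attach opaque data under a key; core does not know what it points to.
  void set_custom_data(const std::string &key, void *data) {
    custom_data_[key] = data;
  }
  void *get_custom_data(const std::string &key) const {
    std::map<std::string, void *>::const_iterator it = custom_data_.find(key);
    if (it == custom_data_.end())
      return NULL;
    return it->second;
  }
private:
  std::map<std::string, void *> custom_data_;
};

static std::vector<thd_hook_fn> thd_destroy_hooks;

void register_thd_destroy_hook(thd_hook_fn hook) {
  thd_destroy_hooks.push_back(hook);
}

THD::~THD() {
  // Members are still alive here, so hooks may read the custom data.
  for (std::size_t i = 0; i < thd_destroy_hooks.size(); ++i)
    thd_destroy_hooks[i](this);
}

// ---- rpl-lib side (toy model) ----
struct Relay_log_info {
  static int live_instances;  // instrumentation for the example only
  Relay_log_info() { ++live_instances; }
  ~Relay_log_info() { --live_instances; }
};
int Relay_log_info::live_instances = 0;

static const char *const RLI_FAKE_KEY = "rpl.rli_fake";

// Callback registered by rpl-lib: frees the rli_fake attached to the THD.
void rpl_on_thd_destroy(THD *thd) {
  delete static_cast<Relay_log_info *>(thd->get_custom_data(RLI_FAKE_KEY));
  thd->set_custom_data(RLI_FAKE_KEY, NULL);
}
```

In this model core never mentions Relay_log_info; it only stores a void* and
calls whatever hooks have been registered.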