WL#5139: Test replication with an intermediate BLACKHOLE slave

Affects: Server-Prototype Only   —   Status: Complete

==== Background ====

The BLACKHOLE storage engine is intended to be used as an intermediate
replication slave in topologies like this:

  master -> blackhole_slave -> slave

The idea is that blackhole_slave can use, e.g., slave filtering rules, so that
the real slave sees it as if the rules were applied on the master. This is
useful when the data that is filtered out is confidential (so the slave should
not have it in its binlogs).

Details are in the manual:
http://dev.mysql.com/doc/refman/5.1/en/blackhole-storage-engine.html

==== Purpose ====

We currently do not test that replication works in the
master->blackhole_slave->slave topology. It would be great if there was an
option to run the entire rpl and rpl_ndb suites with an intermediate slave where
all tables use the blackhole storage engine.

==== Requirements ====
1. No (or minimal) changes to MTR, the MTR libraries, or mysqltest.
2. The blackhole slave should be transparent to test cases: a test case should
behave the same with and without the blackhole slave.
3. Skip tests that cannot run properly with a blackhole slave (or that would
need changes to their algorithm).
4. Using a test case for blackhole slave testing should require minimal changes
to the test code (or none at all).
5. Blackhole slave testing should be easy to add to PB2.

Use cases
=========

Coupled with the worklogs for supporting more (WL#3259) and fewer columns
(WL#3915) on the slave and the extension of Blackhole to support row-based
logging (BUG#38360) the Blackhole engine can now be used to filter the binary
log in various ways.

Apart from the standard tests used for testing more/fewer columns support for
other engines, these cases require special attention:

Filter out certain columns from the binary log for confidentiality reasons. This
means that the extra columns may not under any circumstances be written to the
binary log.

Filter out certain tables from the binary log for confidentiality reasons.
Again, this means that under no circumstances should data for the tables be
available in the binary log.

Adding an extra timestamp column to track the last change of that row. Note that
I do not think that we have any support for INSERT ... ON DUPLICATE KEY
UPDATE... for row-based replication.

Adding an extra column to provide a primary key for efficiency reasons, which
would then be an auto_increment primary key column for a table that does not
have a primary key.


Specification (28.05.10)
=================================

The idea of the patch (after discussion with Luis) is based on the following
items:

1. Create a new rpl suite ./mysql-test/suite/rpl_bhs by copying
./mysql-test/suite/rpl at test time ("on the fly").

2. Replace the default suite configuration (my.cnf) with a new one that adds
blackhole slave support (a 3rd server in the regular master-slave configuration).

3. Update the test cases to add blackhole slave support.

4. Create 'disabled.def' with the list of test cases that are difficult or
impossible to run with a blackhole slave: test cases with 3 or more servers,
with their own configuration file, etc.

5. Run 'mysql-test-run.pl --suite=rpl_bhs' with options as usual for regular
'rpl' suite

6. Enable the rpl_bhs run only for the rpl tree mysql-next-mr-rpl-merge, via the
collections files.
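
Items 1-4 above can be sketched as follows. This is only an illustration in
Python, not the Perl script from the patch. The function name
prepare_bhs_suite and its three directory parameters are hypothetical, and the
sketch assumes that test files live in the suite's t/ subdirectory and that the
only test-case update is swapping the master-slave.inc include.

```python
import shutil
from pathlib import Path

# The include swap described for test cases (standard -> blackhole-aware).
OLD_SOURCE = "source include/master-slave.inc"
NEW_SOURCE = "source extension/bhs/master-slave.inc"


def prepare_bhs_suite(rpl_dir: Path, ext_dir: Path, bhs_dir: Path) -> None:
    """Build the rpl_bhs suite from the rpl suite (hypothetical helper)."""
    # Item 1: copy the rpl suite into a fresh rpl_bhs suite.
    if bhs_dir.exists():
        shutil.rmtree(bhs_dir)
    shutil.copytree(rpl_dir, bhs_dir)

    # Item 2: overwrite suite configuration files with the 3-server versions.
    for cnf in ext_dir.glob("*.cnf"):
        shutil.copy(cnf, bhs_dir / cnf.name)

    # Item 3: point every test case at the blackhole-aware include file.
    # Assumption: test files are stored in <suite>/t/*.test.
    for test in (bhs_dir / "t").glob("*.test"):
        text = test.read_text()
        test.write_text(text.replace(OLD_SOURCE, NEW_SOURCE))

    # Item 4: append the extra disabled tests to the suite's disabled.def.
    extra = ext_dir / "disabled.def"
    if extra.exists():
        with open(bhs_dir / "disabled.def", "a") as out:
            out.write(extra.read_text())
```

The original rpl suite is never modified; everything happens in the copy, so a
regular 'rpl' run is unaffected.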

Items 1-4 above should be implemented as a standalone Perl script that runs in
PB2 before mysql-test-run.pl.

I wrote preliminary versions of include/master-slave.inc (based on
include/circular_rpl_for_4_hosts_init.inc):

  https://intranet.mysql.com/secure/paste/displaypaste.php?codeid=9175

and sync_slave_with_master (based on include/circular_rpl_for_4_hosts_sync.inc):

  https://intranet.mysql.com/secure/paste/displaypaste.php?codeid=9176

We could integrate this with mtr in the following manner:

 (1) If $RPL_BLACKHOLE is set, include/master-slave.inc sets up replication via
blackhole instead of normal replication.

 (2) If $RPL_BLACKHOLE is set, sync_slave_with_master syncs first the blackhole
slave and then the real slave. This can be done in three steps:

   (2a) Implement the built-in commands save_master_pos, sync_with_master, and
sync_slave_with_master in the test language (say, using the .inc files
include/sync_slave_with_master.inc, include/save_master_pos.inc,
include/sync_with_master.inc).

   (2b) Make mtr use the test files instead of the built-in commands.

   (2c) Make the new .inc files sync the blackhole slave if $RPL_BLACKHOLE is set.
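
The sync order in (2) can be illustrated with a toy model. This is a Python
sketch, not mysqltest code: the Server class and its integer 'pos' are
hypothetical stand-ins for a real server and its replication position, and the
function only mirrors the "save position upstream, wait downstream" pattern,
hop by hop.

```python
class Server:
    """Toy server: 'pos' counts how many events it has logged or applied."""

    def __init__(self, name):
        self.name = name
        self.pos = 0

    def sync_with_master(self, target_pos):
        # Stand-in for waiting until events up to target_pos are applied.
        self.pos = max(self.pos, target_pos)


def sync_slave_with_master(master, slave, blackhole=None):
    """Sync along the chain and return the names of the servers synced, in order.

    Without a blackhole server this is the usual single hop. With one, the
    blackhole slave is synced first and then the real slave.
    """
    chain = [master]
    if blackhole is not None:
        chain.append(blackhole)
    chain.append(slave)

    order = []
    for upstream, downstream in zip(chain, chain[1:]):
        saved = upstream.pos                 # save_master_pos on the upstream
        downstream.sync_with_master(saved)   # sync_with_master on the downstream
        order.append(downstream.name)
    return order
```

Syncing the real slave before the blackhole slave has caught up would not
guarantee anything, because the real slave can only see events that the
blackhole slave has already written to its own binlog.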


Implementation 28.05.2010
==========================

The patch is located in ./mysql-test/extension and contains the following items:

1. ./mysql-test/extension/bhs.pl - main script.

2. ./mysql-test/extension/bhs - directory with additional files:

  a) my.cnf, rpl_1slave_base.cnf - rpl_bhs suite configuration (the main script
copies them into the rpl_bhs directory, replacing existing files with the same
names)

  b) master-slave.inc - replaces the standard include in test cases:
    source include/master-slave.inc -> source extension/bhs/master-slave.inc

  c) disabled.def - its content is added to
./mysql-test/suite/rpl_bhs/disabled.def

  d) update_test_cases - the configuration that tells the main script how to
update test cases. The format is : where
 can be '*'. The main script reads the list of test cases and looks each one up
in that file. If a test case is found, it is updated. The line
'/path/to/*:....' must be placed at the end of the file; otherwise all items
after it will be skipped.

  e) *.rules - the update configuration for a test case. The format is as
follows:
  [command for replace]
  new command 1
  new command 2
  ....
  new command X
  
  The main script finds the command to replace in a test case and replaces it
with the commands from the block.

3. ./mysql-test/collections/mysql-next-mr-rpl-merge
This file tells PB2 to run the rpl_bhs suite only for the mysql-next-mr-rpl-merge
tree.
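
The lookup in update_test_cases and the replacement driven by a *.rules file
can be sketched as below. This is a Python illustration, not the code in
bhs.pl. It assumes that each update_test_cases line pairs a path pattern with a
rules file, separated by ':', that the first matching line wins, and that a
rules file is a series of '[command]' headers each followed by its replacement
lines. All three function names and the sample strings in the usage example
are hypothetical.

```python
from fnmatch import fnmatchcase


def find_rules_file(test_path, mapping_lines):
    """Return the rules file from the first line whose pattern matches."""
    for line in mapping_lines:
        line = line.strip()
        if not line:
            continue
        pattern, _, rules_file = line.partition(":")
        if fnmatchcase(test_path, pattern):
            return rules_file  # first match wins; later lines are ignored
    return None


def parse_rules(text):
    """Parse '[command]' blocks into {command: [replacement lines]}."""
    rules = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1]
            rules[current] = []
        elif stripped and current is not None:
            rules[current].append(stripped)
    return rules


def apply_rules(test_lines, rules):
    """Replace each line that equals a rule's command with that rule's block."""
    out = []
    for line in test_lines:
        out.extend(rules.get(line.strip(), [line]))
    return out
```

For example, with the mapping lines "t/rpl_a.test:a.rules", "t/*:default.rules"
and "t/rpl_b.test:b.rules", a lookup for t/rpl_b.test returns default.rules,
because the wildcard line is reached first. This is why the wildcard line has to
come last.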

Suggested patch
=======================
http://lists.mysql.com/commits/110387