WL#3368: Mixed format as default

Affects: Server-5.1   —   Status: Complete

SUMMARY
-------
Statement-based replication (SBR) has many bugs, and solving them
within SBR is very difficult.  Row-based replication (RBR) is often
the simple solution to these problems.  To remove all of these problems
for our customers once and for all, we propose to make MIXED the default
binlogging format.  This way, anything that cannot be replicated with
SBR will be replicated with RBR instead.
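
The fallback rule above can be modelled in a few lines. This is only an
illustrative sketch in Python, not server code: the names BinlogFormat and
effective_format are hypothetical, and the question of which statements
count as "safe" for SBR is left to the caller.

```python
from enum import Enum


class BinlogFormat(Enum):
    STATEMENT = "STATEMENT"
    ROW = "ROW"
    MIXED = "MIXED"


def effective_format(configured, statement_is_safe):
    """Return the format used to log one statement (hypothetical model).

    In MIXED mode a statement is logged statement-based when that is
    safe, and row-based otherwise.  STATEMENT and ROW are used as-is.
    """
    if configured is BinlogFormat.MIXED:
        if statement_is_safe:
            return BinlogFormat.STATEMENT
        return BinlogFormat.ROW
    return configured
```

For example, effective_format(BinlogFormat.MIXED, False) yields
BinlogFormat.ROW, while the same unsafe statement under
BinlogFormat.STATEMENT is still logged statement-based.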

DECISIONS REGARDING THIS WL
--------------------------------------
- 2006-07-04 On the development call with Zack and Elliot it was 
  decided that this WL will be part of MySQL 5.1.

  We also agreed that this change would be done in a much less
  time-consuming manner. That is, WL#3368 will be amended to no longer
  call for the entire test suite to be converted from testing SBR to 
  MIXED. Instead, the rpl team will work with Jeb to ensure that a few
  tests check the code which handles dynamic switching between replication
  formats, and the majority of the test suite will be left to test SBR and
  RBR as they do today.

- Mats suggested an enhancement where the original statement could be
  saved into a comment field in the binary log along with the row events.
  This change is not included in this WL.

TASK
----
1. Make MIXED the default format.
2. Change the test suite so that more testing is done in MIXED format.
3. If a 5.0 mysqlbinlog program is fed a 5.1 log, it needs to throw a
   good error message.

BACKGROUND FROM GUILHEM
------------------------------------
There are quite a few bugs in statement-based replication (I find
them just by looking at code); the last ones I found are:
BUG#20633 INSERT DELAYED RAND() or @user_var does not replicate statement-based
BUG#20339 stored procedure using LAST_INSERT_ID() does not
 replicate statement-based
BUG#19630 stored function inserting into two
 auto_increment breaks statement-based binlog
BUG#20188 REPLACE or ON DUPLICATE KEY UPDATE in
 auto_increment breaks binlog

Not counting the well-known problems of statement-based replication of
a stored function which has two RAND(), of SYSDATE()...

All this makes statement-based replication of routines a complicated
topic for our users, I'd bet.

BUG#19630: I'm fixing this by making such functions binlogged
row-based in the mixed mode (i.e. I'm not fixing statement-based mode,
too hard).
BUG#20633: I propose to fix it by saying that in the mixed mode, the
delayed thread does all its job in row-based mode (as INSERT DELAYED
is limited to INSERT VALUES, row-based should not eat more disk space
than statement-based). I have a 4-line patch which does it all fine.
A fix for statement-based, by contrast, would require transmitting the
user variables from mysql_insert() to the delayed insert thread, which
means more developer time.

Overall I have the impression that to decrease the exposure of our users
to statement-based bugs which we'll never fix, we have to consider
making the mixed mode the default in some upcoming branch (5.1??), and,
at the same time, extend the mixed mode to all known cases where
statement-based doesn't work.

Please think about it and decide.
It's not a "trivial" work item: the test suite currently has a
statement/row dichotomy, and making the mixed mode the default would
upset this; we would need to run the tests in statement-based,
mixed and row-based modes...

Please note that the work would not be done by me.


REFERENCES
----------------
See also testing of mixed format in WL#3308.
First note that the row-based binlog tests are unaffected. For testing
mixed mode exclusively there are two tests, rpl_rbr_to_sbr.test (the name is
ambiguous, imho) and rpl_switch_stm_row_mixed.test, both of which are of
interest for WL#3308.

Since mixed replaces statement as the default, we need to substitute
the statement requirement with the mixed one.
The former is expressed in *.test as
    -- source include/have_binlog_format_statement.inc
    where the included file has
              --require r/have_binlog_format_statement.require
The substitution in *.test is
    -- source include/have_binlog_format_mixed.inc
    where the new included file would have
             --require r/have_binlog_format_mixed.require

Since quite a lot of tests may need the substitution, I wonder if it would
be better done within the WL#3308 framework.

Considering the HLS TASKs, the items expand as follows.

1. Make MIXED the default format.
    1.1 Change the default for the command-line option --binlog-format.
    1.2 Alter the calculation of global_system_variables.binlog_format,
        based on the command-line --binlog-format parameter and
        its default.
2. Change the test suite so that more testing is done in MIXED format.
    2.1 Check whether there are test cases that require
        --binlog-format=statement via
        `source include/have_binlog_format_statement.inc' and would be
        affected by altering the latter to be "mixed".
    2.2 Check the content of such vulnerable cases to find out whether
        extending them to mixed leaves the results unmodified. If so,
        simply substitute the source arguments as explained.
    2.3 If a test requiring statement deals with features that trigger
        row-based binlogging in mixed mode, then if necessary we can
        switch explicitly to statement mode.

3. If a 5.0 mysqlbinlog program is fed a 5.1 log it needs to throw a
   good error message.
   
   This looks to be the guideline for implementation, though it remains
   to be checked.
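
The WL does not say how an old reader should detect a log it cannot handle.
One possible approach, sketched here in Python purely for illustration, is to
walk the event headers and stop with a clear message at the first event type
the reader does not know. The assumptions are: the 4-byte magic number and
the 19-byte v4 event header layout, and that 18 (EXECUTE_LOAD_QUERY_EVENT) is
the highest type code a 5.0 reader knows, so that row-based events such as
TABLE_MAP_EVENT (19) are rejected. These values should be verified against
log_event.h; the function and exception names are hypothetical.

```python
import struct

BINLOG_MAGIC = b"\xfebin"
# v4 header: timestamp(4) type(1) server_id(4) event_len(4) next_pos(4) flags(2)
HEADER_LEN = 19
# Assumed highest event type code known to a 5.0 reader.
MAX_KNOWN_TYPE_5_0 = 18


class UnsupportedBinlogError(Exception):
    """Raised when the log cannot be processed by this reader."""


def check_events(data, max_known_type=MAX_KNOWN_TYPE_5_0):
    """Walk the event headers in data and return the number of events.

    Raises UnsupportedBinlogError with a descriptive message on a bad
    magic number, a corrupt length, or an unknown event type.
    """
    if not data.startswith(BINLOG_MAGIC):
        raise UnsupportedBinlogError("not a binary log: bad magic number")
    pos = len(BINLOG_MAGIC)
    count = 0
    while pos + HEADER_LEN <= len(data):
        # Only the first 13 header bytes are needed here.
        _ts, type_code, _server_id, event_len = struct.unpack_from(
            "<IBII", data, pos
        )
        if type_code > max_known_type:
            raise UnsupportedBinlogError(
                "event type %d at offset %d is unknown to this reader; "
                "the log was probably written by a newer server"
                % (type_code, pos)
            )
        if event_len < HEADER_LEN:
            raise UnsupportedBinlogError("corrupt event at offset %d" % pos)
        pos += event_len
        count += 1
    return count
```

The point of the sketch is the error path: the message names the offending
event type and offset and suggests the likely cause, rather than failing
with a generic read error.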