WL#2516: Filtering semantics change

Affects: Server-7.1   —   Status: Un-Assigned

SUMMARY
-------
Fix the semantics of the filtering to consider 
*effective* database (that the statement effects) instead of 
*current* database (defined by USE statements).

Since we in 5.1 have a class that is used for filtering, the 
same new semantics will apply to both binlog and replication
slave filtering (and replication master filtering WL#2387, 
when that is implemented).

DECISIONS/OPINIONS
---------------------
- Lars thinks that it is unfortunate for RBR that DDL is filtered on
  current database and DML on effective database
- Guilhem is ok with this as long as it is documented in 
  update section of manual
- Mats thinks: "I was thinking of considering this a bug 
  when using RBR only. I see no problem with adding option 
  of using the same filtering for SBR, but the default 
  behaviour should be the same as before.

The problem with using this filtering for SBR is the same as always:
what to do with multi-update statements where one of the tables are
filtered and one is not. To have complicated rules for deciding this
under SBR is not a good thing; it's better to keep the existing (and
very simple) rule of always using the current database."

- This WL might be implemented together with scripting 
  support, WL#4008.  -- Lars, 2007-09-12

MOTIVATION
----------
1. There are bug reports (e.g. BUG#8791) on the semantics of 
   filtering and the semantics seems a bit tricky to grasp.
2. This is much simpler for user, since:
   1. There is only three parameters to change 
      (DO, IGNORE and REWRITE)
   2. Rules are simpler
   3. Uniform treatment between binlog and replication
3. It is unnatural to consider only the current database (Lars 
   thinks. Mats? Guilhem?) when doing filtering.
4. For RBR, the rows are already filtered on effective database

Guilhem (old) comments:
I think that this topic is not the most urgent and should not consume
too much energy, that there are now rather few users/customers
having problems with the current filtering rules, and that this change
will cause some of them some digging into their apps to find out why
it does not work anymore. Some big customers with whom I'm dealing
presently use the options you propose to remove.

And the drawback of "effective" is that it requires parsing of the
query, whereas "current" does not and is hence faster.

So, if we need to change things is still an open question to me; if we
do need, I agree that a new release branch is the place to change things.

Given the nature of affected customers, I am not sure I am ok, so I
would like Brian to explicitely give the ok on this topic.

OLD SYNTAX
----------
The options:

  --binlog-do-db
  --binlog-ignore-db
  --replicate-do-db
  --replicate-do-table
  --replicate-ignore-db=DB_NAME
  --replicate-ignore-table=DB_NAME.TBL_NAME
  --replicate-wild-do-table=DB_NAME.TBL_NAME
  --replicate-wild-ignore-table=DB_NAME.TBL_NAME
  --replicate-rewrite-db=FROM_NAME->TO_NAME

are all removed in this WL and cause an "unknown option" error.

SUGGESTED NEW SYNTAX
--------------------
Instead we have the following commands for replication:

  CHANGE MASTER 'foo' TO 
    DO='db1.%',
    IGNORE='%.table2',
    REWRITE=db1.%->db2.%;

The dot between database and table is mandatory.  If there
is a '%' as table or db, then this corresponds to the old 
options --binlog-do-db etc.  (For REWRITE, first 
implementation can allow simple regular expressions only.)

The following command can be implemented for binlog:

  CHANGE BINLOG 'bar' TO DO='db1.%', IGNORE='%_backup.table2'; 


NEW SEMANTICS
-------------
For each statement having an effect on a database and/or
a table, the following are the rules to see if it should 
be written or ignored:

1. Are there any rules defined?
     Yes: Go to next step
     No:  Write statement and exit

2. Is there any do expression that matches effective 
   database and table?  
     Yes: Write statement, and exit
     No:  Go to next step

3. Is there any ignore expression that matches effective 
   database and table?  
     Yes: Do not write statement, and exit
     No:  Write statement, and exit

NOTES
-----
- Q: What if customers use USE for binlog on/off?
  A: If there are many customers that use USE to turn on/off 
  binlogging, we should encourage them to instead use:
    SET SQL_LOG_BIN = {0|1}
- Q: Isn't is slower?
  A: To get the same speed for IGNORE=db.% as for 
  --replicate-ignore-db we should separately store all
  ignored databases so that there is no need to do 
  real matching each time it is checked.  This can be
  done in the rpl_filter object.
- Q: Thought which arose from a question of a big customer "I want to include
  all tables like "abc.%" except the "%.EFG"". This can't be done now. If we
  supported Perl regexps we could do it with this pattern: /^abc\.(?!EFG)/
  (I could not find an equivalent in the regex library MySQL uses).
   A: With the new semantics it is almost as powerful as having 
   real regexps with negation.

CURRENT SEMANTICS (4.1 and 5.0, but 4.0 does not completely follow it)
----------------------------------------------------------------------
(As copied from manual.)

Binlog filtering
----------------
  1. Are there `binlog-do-db' or `binlog-ignore-db' rules?
        * No: Write the statement to the binary log and exit.
        * Yes: Go to the next step.

  2. There are some rules (`binlog-do-db' or `binlog-ignore-db' or
     both). Is there a current database (has any database been selected
     by `USE'?)?
        * No: Do _not_ write the statement, and exit.
        * Yes: Go to the next step.

  3. There is a current database. Are there some `binlog-do-db' rules?
        * Yes: Does the current database match any of the `binlog-do-db'
          rules?
             * Yes: Write the statement and exit.
             * No: Do _not_ write the statement, and exit.
        * No: Go to the next step.

  4. There are some `binlog-ignore-db' rules.  Does the current
     database match any of the `binlog-ignore-db' rules?
        * Yes: Do not write the statement, and exit.
        * No: Write the query and exit.

Replication slave filtering
---------------------------
  1. Are there some `--replicate-do-db' or `--replicate-ignore-db'
     rules?

        * Yes: Test them as for `--binlog-do-db' and
          `--binlog-ignore-db' (*note Binary log::). What is the result
          of the test?

             - Ignore the statement: Ignore it and exit.

             - Execute the statement: Don't execute it immediately,
               defer the decision, go to the next step.

        * No: Go to the next step.

  2. Are there some `--replicate-*-table' rules?

        * No: Execute the query and exit.

        * Yes: Go to the next step. Only tables that are to be updated
          are compared to the rules (`INSERT INTO sales SELECT * FROM
          prices': only `sales' are compared to the rules). If several
          tables are to be updated (multiple-table statement), the
          first matching table (matching "do" or "ignore") wins.  That
          is, the first table is compared to the rules. Then, if no
          decision could be mad, the second table is compared to the
          rules, and so forth.

  3. Are there some `--replicate-do-table' rules?

        * Yes: Does the table match any of them?
             - Yes: Execute the query and exit.

             - No: Go to the next step.

        * No: Go to the next step.

  4. Are there some `--replicate-ignore-table' rules?

        * Yes: Does the table match any of them?
             - Yes: Ignore the query and exit.

             - No: Go to the next step.

        * No: Go to the next step.

  5. Are there some `--replicate-wild-do-table' rules?

        * Yes: Does the table match any of them?

             - Yes: Execute the query and exit.

             - No: Go to the next step.

        * No: Go to the next step.

  6. Are there some `--replicate-wild-ignore-table' rules?

        * Yes: Does the table match any of them?

             - Yes: Ignore the query and exit.

             - No: Go to the next step.

        * No: Go to the next step.

  7. No `--replicate-*-table' rule was matched.  Is there another table
     to test against these rules?

        * Yes: Loop.

        * No: We have tested all tables to be updated and could not
          match any rule.  Are there `--replicate-do-table' or
          `--replicate-wild-do-table' rules?

             - Yes: Ignore the query and exit.

             - No: Execute the query and exit.

DISCUSSION TO TAKE INTO ACCOUNT
-------------------------------------------------
On Tue, Jan 31, 2006 at 09:45:40AM +0100, Guilhem Bichot wrote:
> Hi Lars,
>
> On Mon, Jan 30, 2006 at 11:47:24PM +0100, Lars Thalmann wrote:
> > On Mon, Jan 30, 2006 at 04:38:30PM +0100, Mats Kindahl wrote:
> > > Stefan Hinz wrote:
> > >
> > > >Mats,
> > > >
> > > >>
> > > >>>        Row-based replication:
> > > >>>        All table row changes are filtered individually. For
> > > >>>        multi-table updates, each table is filtered separately
> > > >>>        according to the rules. Some may be changed and some not,
> > > >>>        depending on the rules (and the actual changes, of course).
> > > >>
> > > >>
> > > >>A minor thing, just as a clarification (this might need to be added
> > > >>elsewhere):
> > > >>
> > > >>   In addition, observe that the changes are filtered based on the
> > > >>actual database as well, not based on the currently selected database.
> > > >
> > > >
> > > >Good point. Let me see if I get this right. Assuming we replicate the
> > > >'foo' database only, and the current (or default) database on the
> > > >master is 'bar', this happens when I issue statements:
> > > >
> > > >- INSERT INTO foo.sometable VALUES (1)
> > > >  - SBR: not replicated
> > > >  - RBR: replicated
> > > >
> > > >- CREATE TABLE foo.sometable (i INT)
> > > >  - SBR: not replicated
> > > >  - RBR: replicated (sure ?)
> > >
> > >
> > > Ugh... got me there (I think). Actually, it's only the binrow events
> > > that are filtered based on the actual tables changed. The latter one,
> > > the create statement, is replicated using statement-based replication,
> > > so it is not replicated. Hmmm.... maybe we should consider this a bug to
> > > get consistent behaviour for all row-based replication...
> >
> > Yes, I think that is a good idea.  We need to support the old
> > filtering too and it would be nice if this
> > filter-by-actual-tables-changed would be possible also for SBR
> > (perhaps with an option).
> >
> > This work actually already has a WL, namely WL#2516.  Guilhem was a
> > bit skeptical before, but now with RBR I think this becomes even more
> > important.
>
> Well it becomes impossible to resist it, as the Rows_log_event don't
> contain the default (USE'd) database...
> As long as the big change is documented in the Upgrade section...