WL#3719: Replication table checksums

Affects: Server-6.x   —   Status: Assigned   —   Priority: Low

The user wants to verify that master and slave servers contain the same data.
The user wants to check this using an online algorithm.

- When executing CHECK REPLICATION FOR TABLE t, a checksum of the table
  is generated on the master and sent to the slave in a separate event,
  and the slave checks that the table contents are consistent with what
  the master had.

- This was discussed at the Stockholm server lead meeting; Brian and
  Monty really wanted it, both to ensure that row-based replication is
  well-tested and to make some customers happy.

- The master calls a checksum function for a certain table.
- There is a new event "TABLE CHECKSUM FOR TABLE t".
- When the slave gets the event, it verifies the checksum.
- If the checksum is wrong, give an error and stop replication.

- When the table has been verified on the slave, a note is printed
  in error log saying "Table t1 verified to be consistent with master".
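
The flow above can be sketched end-to-end. This is a toy model, not the
server's actual event API: `table_checksum`, `master_make_event`, and
`slave_apply_event` are illustrative names, and the checksum itself is a
simple order-independent XOR of per-row MD5 hashes.

```python
import hashlib

def table_checksum(rows) -> str:
    """Order-independent checksum: XOR of per-row MD5 hashes."""
    acc = 0
    for row in rows:
        acc ^= int.from_bytes(hashlib.md5(row).digest(), "big")
    return format(acc, "032x")

def master_make_event(table_name, rows):
    """Master side: checksum the table and wrap it in an event."""
    return {"type": "TABLE_CHECKSUM", "table": table_name,
            "checksum": table_checksum(rows)}

def slave_apply_event(event, slave_rows):
    """Slave side: recompute the checksum and compare with the master's."""
    if table_checksum(slave_rows) != event["checksum"]:
        # in the server this would stop the slave SQL thread with an error
        raise RuntimeError(f"Table {event['table']} differs from master")
    print(f"Table {event['table']} verified to be consistent with master")

rows = [b"1,alice", b"2,bob"]
slave_apply_event(master_make_event("t1", rows), list(rows))
```

Because the checksum is order-independent, master and slave may scan their
rows in different physical orders and still agree.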

Related community work
This community Perl script (MySQL Table Checksum) does much the same thing:

"MySQL Table Checksum 1.1.0 adds many improvements, but the most important is
a new way to ensure slaves have the same data as their master. Instead of
checksumming the slave and the master, it can now insert the checksum
results directly on the master via an INSERT.. SELECT statement. This
statement will replicate to the slave, where a simple query can find tables
that differ from the master. This makes a consistent, lock-free checksum
trivially easy."

Maintaining a hash incrementally

Hash functions

Let *Record* be the domain of all records for a table, and let *Hash* denote
the finite set of all record hashes. Let *compute :: Record -> Hash* be a
function from records to record hashes such that the probability of collision
is minimal.

Furthermore, let *Digest* be the domain of table hashes/digests (we use the
word *digest* to differentiate it from the record hash) and let *insert ::
(Digest, Hash) -> Digest* and *delete :: (Digest, Hash) -> Digest* be
functions such that for any table digest *digest* and record hash *hash* the
following properties hold:

    delete(insert(digest, hash), hash) = digest

    insert(delete(digest, hash), hash) = digest

In addition, we have a function *combine :: (Digest, Digest) -> Digest* such
that for any two functions *f* and *f'*, each being either *insert* or
*delete*:

    combine(f(digest, hash), f'(digest, hash')) = f'(f(digest, hash), hash')

We also assume that there is a unique element 0 in *Digest* (called the
*identity*) such that:

    insert(0,hash) = hash

It is assumed that *insert* and *delete* have a minimal risk of collision,
i.e., given a table digest *digest* and two distinct record hashes *hash* and
*hash'*, it should be improbable that:

    insert(digest, hash) = insert(digest, hash')
    delete(digest, hash) = delete(digest, hash')

Typical examples of the record hash function are MD5 or any of the SHA
functions. Finding a good pair of *insert* and *delete* functions requires
research.
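
As a minimal sketch (not part of the worklog), XOR over fixed-width record
hashes is one candidate pair: XOR is self-inverse, so *insert* and *delete*
are the same operation, and the algebraic properties above hold. Note that
XOR is illustrative only: a row inserted twice cancels to nothing, so it is
weak against multisets and not a recommendation for the real implementation.

```python
import hashlib

def compute(record: bytes) -> int:
    """Record -> Hash: hash a record to a 128-bit integer (MD5 here)."""
    return int.from_bytes(hashlib.md5(record).digest(), "big")

def insert(digest: int, h: int) -> int:
    """Fold one record hash into the digest."""
    return digest ^ h

def delete(digest: int, h: int) -> int:
    """Remove one record hash from the digest (XOR is its own inverse)."""
    return digest ^ h

# Check the required properties on sample values:
d = compute(b"some table state")
h, h2 = compute(b"row 1"), compute(b"row 2")
assert delete(insert(d, h), h) == d          # delete undoes insert
assert insert(delete(d, h), h) == d          # insert undoes delete
assert insert(0, h) == h                     # 0 is the identity
assert insert(delete(d, h), h2) == delete(insert(d, h2), h)  # order-free
```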

Data structures

In order to describe the functions below, we define the following structure:

  struct Table {
    Digest digest;   // committed digest for the whole table
    Digest trans;    // scratch digest for the transaction in progress
  };

Bootstrapping and Recovery

On starting the server, the digest is computed by iterating over the records
of each table that should be tracked, according to the following algorithm:

  function bootstrap(table)
    table.digest = 0
    for rec in table.records() do
      table.digest = insert(table.digest, compute(rec))
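
The bootstrap step can be sketched as follows (assuming the XOR-based
*insert* and MD5 record hash from the earlier example; the iterable of raw
rows stands in for `table.records()`):

```python
import hashlib

def compute(record: bytes) -> int:
    """Record -> Hash: MD5 of the row image as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(record).digest(), "big")

def insert(digest: int, h: int) -> int:
    """Fold one record hash into the digest (XOR scheme)."""
    return digest ^ h

def bootstrap(records) -> int:
    """Recompute a table digest from scratch by folding in every row."""
    digest = 0                      # start from the identity
    for rec in records:
        digest = insert(digest, compute(rec))
    return digest

# Row order does not matter, so a storage engine can scan in any order:
assert bootstrap([b"r1", b"r2"]) == bootstrap([b"r2", b"r1"])
```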

Maintaining the digest

In order to maintain the digest, it is necessary to implement the following
functionality for each of the handler functions **ha_write_row()**,
**ha_update_row()**, and **ha_delete_row()**, together with the transaction
hooks **ha_begin()**, **ha_commit()**, and **ha_rollback()**:

    function ha_begin(table)
      table.trans = table.digest

    function ha_commit(table)
      table.digest = combine(table.digest, table.trans)

    function ha_rollback(table)
      table.trans = 0

    function ha_delete_row(table, rec)
      table.trans = delete(table.trans, compute(rec))

    function ha_write_row(table, rec)
      table.trans = insert(table.trans, compute(rec))

    function ha_update_row(table, old_rec, new_rec)
      table.trans = delete(table.trans, compute(old_rec))
      table.trans = insert(table.trans, compute(new_rec))

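
A runnable sketch of the transactional maintenance, again using the assumed
XOR scheme and illustrative names. It simplifies the pseudocode above in one
respect: instead of snapshotting the digest into `trans` at begin, `trans`
accumulates the transaction's delta starting from 0, and commit folds the
delta into the committed digest with XOR (the role `combine` plays above).

```python
import hashlib

def compute(record: bytes) -> int:
    """Record -> Hash: MD5 of the row image as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(record).digest(), "big")

class Table:
    def __init__(self):
        self.digest = 0   # committed table digest
        self.trans = 0    # delta accumulated by the open transaction

    def ha_begin(self):
        self.trans = 0

    def ha_commit(self):
        self.digest ^= self.trans   # combine committed digest and delta
        self.trans = 0

    def ha_rollback(self):
        self.trans = 0              # discard the delta

    def ha_write_row(self, rec: bytes):
        self.trans ^= compute(rec)  # insert

    def ha_delete_row(self, rec: bytes):
        self.trans ^= compute(rec)  # delete (XOR is self-inverse)

    def ha_update_row(self, old: bytes, new: bytes):
        self.ha_delete_row(old)
        self.ha_write_row(new)

t = Table()
t.ha_begin(); t.ha_write_row(b"r1"); t.ha_write_row(b"r2"); t.ha_commit()
committed = t.digest
t.ha_begin(); t.ha_write_row(b"r3"); t.ha_rollback()
assert t.digest == committed        # rollback left the digest untouched
t.ha_begin(); t.ha_delete_row(b"r2"); t.ha_commit()
assert t.digest == compute(b"r1")   # only r1 remains
```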
In order to transfer the table checksum to the slave in such a manner that
older versions of the slave do not stop on receiving an unrecognized event,
the table checksum will be added to the variable header of the Table map log
event.
With the introduction of an explicit Begin_stmt_log_event (introduced in
WL#????), the position of the Table map log event is free and no longer
tightly connected to a statement. The table map log event can then serve to
transfer information about the table to the slave: for example, checksums.