WL#3719: Replication table checksums
Affects: Server-6.x
—
Status: Assigned
MOTIVATION
----------

User wants to know that master and slave servers have the same information.
User wants to check this using an online algorithm.

POSSIBLE IMPLEMENTATION
-----------------------

- When executing CHECK REPLICATION FOR TABLE t, a checksum of the table is
  generated on the master and sent over in a separate event, and the slave
  checks that the table contents are consistent with what the master had.

DECISION REGARDING THIS WL
--------------------------

- This was discussed at the Stockholm server lead meeting; Brian and Monty
  wanted it both to ensure that row-based replication is well tested and to
  satisfy customer demand.

IMPLEMENTATION
--------------

- The master calls a checksum function for a certain table.
- There is a new event "TABLE CHECKSUM FOR TABLE t".
- When the slave gets the event, it verifies the checksum.
- If the checksum is wrong, give an error and stop replication.

OPTIONAL
--------

- When the table has been verified on the slave, a note is printed in the
  error log saying "Table t1 verified to be consistent with master".

Related community work
----------------------

These Perl scripts do the same thing:

"MySQL Table Checksum 1.1.0 adds many improvements, but the most important
is a new way to ensure slaves have the same data as their master. Instead
of checksumming the slave and the master, it can now insert the checksum
results directly on the master via an INSERT .. SELECT statement. This
statement will replicate to the slave, where a simple query can find
tables that differ from the master. This makes a consistent, lock-free
checksum trivially easy:
http://www.xaprb.com/blog/2007/05/05/mysql-table-checksum-110-released/"
Maintaining a hash incrementally
================================

Hash functions
--------------

Let *Record* be the domain of all records for a table, and let *Hash*
denote a finite set of all record hashes. Let *compute :: Record -> Hash*
be a function from records to record hashes such that the probability of
collisions is minimal.

Furthermore, let *Digest* be the domain of table hashes/digests (we use
the word *digest* to differentiate it from the record hash) and let
*insert :: (Digest, Hash) -> Digest* and *delete :: (Digest, Hash) ->
Digest* be functions such that for any table digest *digest* and record
hash *hash* the following properties hold:

  delete(insert(digest, hash), hash) = digest
  insert(delete(digest, hash), hash) = digest

In addition, we have a function *combine :: (Digest, Digest) -> Digest*
such that for any two functions *f* and *f'*, each being either *insert*
or *delete*:

  combine(f(digest, hash), f'(digest, hash')) = f'(f(digest, hash), hash')

We also assume that there is a unique element 0 in *Digest* (called the
*identity*) such that:

  insert(0, hash) = hash

It is assumed that *delete* and *insert* have a minimal risk of
collision, i.e., given a table digest *digest* and two distinct record
hashes *hash* and *hash'*, it should be improbable that:

  insert(digest, hash) = insert(digest, hash')
  delete(digest, hash) = delete(digest, hash')

Typical examples of the record hash function are MD5 or any of the SHA
functions. Finding a good pair of *insert* and *delete* functions
requires research.
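One simple candidate for (*insert*, *delete*, *combine*) that satisfies the
properties above is XOR over fixed-width record hashes. This is only an
illustration of the algebra, not the scheme mandated by this WL: XOR is its
own inverse, so *insert* and *delete* coincide, operation order does not
matter, and the all-zero digest is the identity.

```python
# Sketch: XOR of MD5 record hashes as a (Digest, insert, delete, combine)
# instance. Illustration only -- not the scheme chosen by this worklog.
import hashlib

def compute(record: bytes) -> int:
    """Record -> Hash: hash a serialized record to a 128-bit integer."""
    return int.from_bytes(hashlib.md5(record).digest(), "big")

def insert(digest: int, hash_: int) -> int:
    return digest ^ hash_

def delete(digest: int, hash_: int) -> int:
    return digest ^ hash_          # XOR is self-inverse

def combine(digest: int, other: int) -> int:
    return digest ^ other

# The properties from the text hold:
h1, h2 = compute(b"row-1"), compute(b"row-2")
d = insert(insert(0, h1), h2)
assert delete(insert(d, h1), h1) == d      # delete . insert = identity
assert insert(delete(d, h1), h1) == d      # insert . delete = identity
assert insert(0, h1) == h1                 # 0 is the identity element
```

Note that XOR has a known weakness as a set digest (inserting the same
record twice cancels out), which is one reason finding a good pair of
functions "requires research".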
Data structures
---------------

In order to describe the functions below, we define the following
structure::

  struct Table {
    Digest digest;
    Digest trans;
  };

Bootstrapping and Recovery
--------------------------

On starting the server, a digest is computed by iterating over the
records of each table that should be tracked, according to the following
algorithm::

  function bootstrap(table)
    table.digest = 0
    for rec in table.records() do
      table.digest = insert(table.digest, compute(rec))
    end
  end

Maintaining the hash
--------------------

In order to maintain the digest, the transaction hooks and each of the
handler functions **ha_write_row()**, **ha_update_row()**, and
**ha_delete_row()** have to update the transaction delta *table.trans*,
which starts out as the identity and is folded into *table.digest* on
commit (a rollback simply discards the delta)::

  function ha_begin(table)
    table.trans = 0
  end

  function ha_commit(table)
    table.digest = combine(table.digest, table.trans)
  end

  function ha_rollback(table)
    table.trans = 0
  end

  function ha_write_row(table, rec)
    table.trans = insert(table.trans, compute(rec))
  end

  function ha_delete_row(table, rec)
    table.trans = delete(table.trans, compute(rec))
  end

  function ha_update_row(table, old_rec, new_rec)
    table.trans = delete(table.trans, compute(old_rec))
    table.trans = insert(table.trans, compute(new_rec))
  end
In order to transfer the table checksum to the slave in such a manner
that older versions of the slave do not stop because they receive an
unrecognized event, the table checksum will be added to the variable
header of the Table_map_log_event. With the introduction of an explicit
Begin_stmt_log_event (introduced in WL#????), the position of the Table
map log event is free and no longer tightly connected with a statement.
The table map log event can then serve to transfer per-table information
to the slave: for example, checksums.
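The backward-compatibility idea can be sketched as a type-length-value
field in the variable header: a reader skips any field type it does not
recognize by its declared length, so an old slave simply ignores the
checksum. The field type code (0x42) and layout below are hypothetical,
not the real binlog format.

```python
# Sketch: a skippable checksum field in a variable-length event header.
# The type code and layout are hypothetical, not the actual binlog format.
import struct

CHECKSUM_FIELD = 0x42   # hypothetical field type code

def pack_fields(fields):
    """fields: list of (type_code, payload_bytes) -> variable header."""
    out = b""
    for code, payload in fields:
        out += struct.pack("<BB", code, len(payload)) + payload
    return out

def read_checksum(header):
    """Scan the variable header; skip unknown fields by their length."""
    pos = 0
    while pos < len(header):
        code, length = struct.unpack_from("<BB", header, pos)
        pos += 2
        if code == CHECKSUM_FIELD:
            return header[pos:pos + length]
        pos += length              # unknown field: skip it
    return None                    # no checksum present (old master)

hdr = pack_fields([(0x01, b"\x00"), (CHECKSUM_FIELD, b"\xde\xad\xbe\xef")])
assert read_checksum(hdr) == b"\xde\xad\xbe\xef"
assert read_checksum(pack_fields([(0x01, b"\x00")])) is None
```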
Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.