WL#3956: Online Backup: Commit Blocker

Affects: Server-6.0   —   Status: Complete

SUMMARY
-------
Make sure that engines cannot commit during the synchronization
phase of the Online Backup algorithm. This ensures that backups
are consistent across engines.

Guilhem’s Instructions for Implementation Concerning Locking et al.
-------------------------------------------------------------------
1. Prevent new write locks on tables (lock_global_read_lock()).
2. Wait for existing locks to be released (close_cached_tables()).
3. Call make_global_read_lock_block_commit().
4. Read the binlog position and make the lock calls on the drivers.
5. Call unlock_global_read_lock(), which undoes all of the above locks.
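
The ordering of the five steps above can be illustrated with a small Python
model. This is only a sketch: ServerStub, DriverStub, read_binlog_position()
and run_commit_blocker() are hypothetical names that record the call order and
do not reflect the real server signatures. Releasing the lock in a
try/finally is an assumption of the sketch, not something the instructions
state.

```python
class ServerStub:
    """Hypothetical stand-in for the server; it only records the call order."""

    def __init__(self):
        self.calls = []

    def lock_global_read_lock(self):
        self.calls.append("lock_global_read_lock")

    def close_cached_tables(self):
        self.calls.append("close_cached_tables")

    def make_global_read_lock_block_commit(self):
        self.calls.append("make_global_read_lock_block_commit")

    def read_binlog_position(self):
        # Hypothetical helper; the returned value is a placeholder.
        self.calls.append("read_binlog_position")
        return ("placeholder-binlog-file", 0)

    def unlock_global_read_lock(self):
        self.calls.append("unlock_global_read_lock")


class DriverStub:
    """Hypothetical backup driver that logs its lock() call."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def lock(self):
        self.log.append(self.name + ".lock")


def run_commit_blocker(server, drivers):
    """Run steps 1-5 in order; once step 1 succeeds, step 5 always runs."""
    server.lock_global_read_lock()                     # step 1
    try:
        server.close_cached_tables()                   # step 2
        server.make_global_read_lock_block_commit()    # step 3
        position = server.read_binlog_position()       # step 4
        for driver in drivers:
            driver.lock()
        return position
    finally:
        server.unlock_global_read_lock()               # step 5
```

Passing the same list to the server stub and the driver stubs (for example,
DriverStub("d1", server.calls)) yields a single log in which the driver lock
calls appear between read_binlog_position and unlock_global_read_lock.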

Background Information:
* lock_global_read_lock() – applies to both non-transactional and transactional 
tables. Blocks all future calls to mysql_lock_table() that intend to take a 
write lock. (See note 1.)
* close_cached_tables() – waits for all tables to close. Signals that each 
table is to be closed when the statement currently executing on it ends, and 
waits for all active statements. Currently applies to all tables. (See notes 
1, 2, and 4.)
* make_global_read_lock_block_commit() – prevents new commits and waits for 
running commits to finish.

Notes:
1. This should be applied to non-transactional tables only. It should be 
modified to accept either a list of tables or a list of engines.
2. This should apply to write-locked tables only, and only to 
non-transactional engines.
3. For transactional engines backed up with the CS driver, we may need to 
treat them as non-transactional for the purposes of notes 1 and 2, depending 
on the capabilities of the storage engine.
4. Issue: while waiting for tables to unlock, we still need to poll the backup 
drivers for data.
5. See the SQL command FLUSH TABLES WITH READ LOCK (SQLCOM_FLUSH), handled in 
reload_acl_and_cache(). It performs the first three steps and can be used as 
an example of how to call them, but it does not cover the unlock on the 
drivers.

For additional information concerning the HD session where this discussion took 
place, see https://inside.mysql.com/wiki/HD_Agenda:_RepBackup_Team

For historical research, see the solution proposed in the HLS of WL#3576; as a 
side effect, it implements commit blocking.

Test Design
-----------

The goals of the test should be to ensure the following (note: the goals 
below apply to statements that change data):

* transactions in progress are not committed until after the backup
* transactions that have not started are allowed to start but do not commit
* transactions that are committing are allowed to commit
* non-transactional statements in progress are allowed to finish
* non-transactional statements that have not started are blocked
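
The goals above can be written down as a small lookup table that a test could
consult when checking restored data. The following Python sketch uses
hypothetical names (Kind, Phase, Outcome, expected_outcome,
expected_in_backup). The last function encodes one reading of the goals,
namely that only changes allowed to commit or finish end up in the backup
image; that reading is an assumption and should be confirmed against the
design.

```python
from enum import Enum


class Kind(Enum):
    TRANSACTIONAL = "transactional"
    NON_TRANSACTIONAL = "non-transactional"


class Phase(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"
    COMMITTING = "committing"


class Outcome(Enum):
    COMMIT_HELD = "runs, but commit waits until after the backup"
    COMMITS = "allowed to commit"
    FINISHES = "allowed to finish"
    BLOCKED = "blocked"


# One entry per goal listed above; other combinations are not specified.
EXPECTED = {
    (Kind.TRANSACTIONAL, Phase.IN_PROGRESS): Outcome.COMMIT_HELD,
    (Kind.TRANSACTIONAL, Phase.NOT_STARTED): Outcome.COMMIT_HELD,
    (Kind.TRANSACTIONAL, Phase.COMMITTING): Outcome.COMMITS,
    (Kind.NON_TRANSACTIONAL, Phase.IN_PROGRESS): Outcome.FINISHES,
    (Kind.NON_TRANSACTIONAL, Phase.NOT_STARTED): Outcome.BLOCKED,
}


def expected_outcome(kind, phase):
    """Return the stated goal, or raise ValueError if none is listed."""
    try:
        return EXPECTED[(kind, phase)]
    except KeyError:
        raise ValueError(
            f"no goal stated for {kind.value}, {phase.value}"
        ) from None


def expected_in_backup(kind, phase):
    """Assumption: only changes that commit or finish are in the backup."""
    return expected_outcome(kind, phase) in (Outcome.COMMITS, Outcome.FINISHES)
```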

The test shall be written to run three sets of data manipulation statements. 
Each shall comprise a complete test of the commit blocker for those statements. 
For each set, the data state shall be identified before and after the backup. 
The expected results of the commit blocker shall be checked by restoring the 
data that was backed up. The three sets are:

1) transactional statements only
2) non-transactional statements only
3) mix of both transactional and non-transactional statements

The test shall be controlled using the backup breakpoint mechanism 
(i.e., BACKUP_SYNC). These breakpoint statements shall be inserted in the code 
at strategic locations to ensure an accurate test of the system.

The breakpoints include the following (and perhaps others):

* backup_command - occurs at start of backup or restore command execution
* data_prepare - occurs at start of synchronization phase 
* commit_block_step_1 - occurs before call to global read lock
* commit_block_step_2 - occurs before call to close cached tables (skipped)
* commit_block_step_3 - occurs before call to make global read lock block commit
* commit_block_step_4 - occurs before call to lock()
* data_lock - occurs before first call to lock()
* data_unlock - occurs before first call to unlock()
* commit_block_step_5 - occurs before call to release global read lock
* data_finish - occurs after all data is read from all tables
* backup_end - occurs at end of backup or restore command execution

Note: other breakpoints may be needed to ensure a deterministic test.
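
A test harness could check that the breakpoints it observes arrive in a
sensible order. The Python sketch below assumes that the order in which the
breakpoints are listed above is the order in which they are reached; the
worklog does not state this explicitly, and the helper name in_expected_order
is hypothetical. Breakpoints may be absent from the observed sequence (for
example commit_block_step_2, which is marked as skipped).

```python
# Breakpoint names in the order they are listed above (assumed hit order).
BREAKPOINTS = [
    "backup_command",
    "data_prepare",
    "commit_block_step_1",
    "commit_block_step_2",
    "commit_block_step_3",
    "commit_block_step_4",
    "data_lock",
    "data_unlock",
    "commit_block_step_5",
    "data_finish",
    "backup_end",
]

# Map each name to its position for quick lookups.
POSITION = {name: index for index, name in enumerate(BREAKPOINTS)}


def in_expected_order(observed):
    """Return True if every observed name is known and positions increase."""
    last = -1
    for name in observed:
        index = POSITION.get(name)
        if index is None or index <= last:
            return False
        last = index
    return True
```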

The test shall require several client connections to issue the lock commands as 
well as the DML statements.

NOTICE: While there is a base test included with this work, it has been 
determined that testing this feature requires more design work. Thus the tests 
for this work shall be implemented in WL#4135: Online Backup: Commit Blocker 
Testing.

NOTICE: This patch requires the patches for BUG#31383 and WL#3324 
(http://lists.mysql.com/commits/36307 and http://lists.mysql.com/commits/36303) 
because the test relies on their modifications to SHOW PROCESSLIST, which show 
which lock the backup process has reached. This is needed in this test in 
order to know when the system is at the specific breakpoints listed above.
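
Waiting for the backup process to reach a given breakpoint amounts to polling
its reported state until it matches or a timeout expires. The Python sketch
below shows only that polling loop. The fetch_state callable is hypothetical:
it stands for whatever the test uses to read the backup thread's state from
the process list, and the default timeout and interval are arbitrary.

```python
import time


def wait_for_state(fetch_state, wanted, timeout=10.0, interval=0.1):
    """Poll fetch_state() until it returns `wanted` or the timeout expires.

    Returns True if the wanted state was seen, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch_state() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The state is checked at least once, even with a zero timeout, so a test that
is already at the breakpoint returns immediately.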