WL#4641: Replication Heartbeat testing
Affects: Server-6.0
—
Status: Complete
TESTED PROPERTIES
=================

P1. Each value of master_heartbeat_period should work as described in the manual.
P2. Heartbeats should be sent when heartbeat is enabled.
P3. The value of master_heartbeat_period should be kept across STOP SLAVE/START SLAVE.
P4. The master should send heartbeat events only if all events from the master binlog have been sent to the slave.
P5. Two slaves should be able to have different values of master_heartbeat_period while connected to the same master.
P6. Other areas that heartbeat may affect.
    - Seconds_Behind_Master in SHOW SLAVE STATUS must be reported as zero right after a heartbeat event has been received. (The slave's timestamp is not reset by the event itself, but by the EOF read from the relay log. We may change the reset to be governed by heartbeat.)

TEST METHOD
===========

1. New syntax of CHANGE MASTER TO ... master_heartbeat_period=
   1.1. Default value based on slave_net_timeout/2.
   1.2. Non-default value.
   1.3. Disabled heartbeat.
   1.4. Reset to default.
   1.5. Min/Max value.
   1.6. Unsupported values (negative, too big, too small, not numbers).
2. Test heartbeat implementation.
   2.1. Check the following variables on the slave side:
        SHOW STATUS like 'slave_heartbeat_period';
        SHOW STATUS like 'slave_received_heartbeats'
   2.2. Seconds_Behind_Master in SHOW SLAVE STATUS must be reported as zero right after a heartbeat event has been received. (The slave's timestamp is not reset by the event itself, but by the EOF read from the relay log. We may change the reset to be governed by heartbeat.)
   2.3. Basic testing (set heartbeat and make sure that it works).
   2.4. Run a long-running test (at least hours) with a small heartbeat value and check that heartbeat keeps working over that time (check for memory leaks, buffers, etc.).
3. Keep value.
   3.1. Stop/start slave should keep the value (effects of io_thread, sql_thread).
   3.2. Stop slave by error and then start.
   3.3. Reload the whole server.
4. A heartbeat event should be sent only if no other events are present in the binlog (master and slave are synced).
   4.1. Create events in the binlog, sync slave with master, wait until heartbeat.
5. Multiple servers.
   5.1. Connect two slaves to the master and set a separate heartbeat value for each.
   5.2. Connect two slaves; the first should be under heavily loaded replication. Check heartbeat for the second.
6. Master.
   6.1. Whether RESET MASTER affects heartbeat or not.
   6.2. Rotation/flush of the master binlog.
7. Others.
   7.1. Heartbeat and locking thread on slave.
   7.2. Circular replication.

For reference, see WL#342 Replication Heartbeat.
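The value classes in items 1.1 through 1.6 can be written down as a small Python model, which may help when picking boundary inputs for the tests. This is only a sketch: the bounds 0.001 and 4294967 are the min and max values named in the test scenarios, the function names and category labels are hypothetical, and the sketch deliberately says nothing about how the server reacts to each class (error, warning, or silent adjustment).

```python
from decimal import Decimal, InvalidOperation

# Bounds taken from the min/max values listed in the test scenarios.
MIN_PERIOD = Decimal("0.001")
MAX_PERIOD = Decimal("4294967")


def default_heartbeat_period(slave_net_timeout):
    """Default period as stated in item 1.1: slave_net_timeout / 2."""
    return Decimal(str(slave_net_timeout)) / 2


def classify_period(raw):
    """Sort a candidate master_heartbeat_period into a test category.

    The category names are illustrative labels, not server terminology.
    """
    try:
        value = Decimal(str(raw))
    except InvalidOperation:
        return "not_a_number"
    if not value.is_finite():
        return "not_a_number"
    if value < 0:
        return "negative"
    if value == 0:
        return "disabled"
    if value < MIN_PERIOD:
        return "too_small"
    if value > MAX_PERIOD:
        return "too_big"
    return "valid"


if __name__ == "__main__":
    for candidate in ["0.001", "4294967", "0.0009", "0", "4294968", "-1", "123abc"]:
        print(candidate, classify_period(candidate))
```

Decimal is used instead of float so that boundary values such as 0.001 and 0.0009 compare exactly.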
The following test scenarios (for MTR) are identified:

1. Default value.
   a) Check the condition master_heartbeat_period = slave_net_timeout/2.
   b) Change slave_net_timeout (as a dynamic variable) and check master_heartbeat_period (should be updated).
   c) Change master_heartbeat_period, change slave_net_timeout and check master_heartbeat_period (should not be updated after slave_net_timeout changes).
   d) Reset slave, check master_heartbeat_period.
   e) Change slave_net_timeout, reset slave, check master_heartbeat_period.
   f) Changing slave_net_timeout on the master should not affect the slave's heartbeat.
2. Non-default value.
   a) Change master_heartbeat_period, stop/start slave.
   b) Change master_heartbeat_period, stop/reset/start slave.
   c) Set a non-default value and reload the server.
3. Disabled heartbeat.
   a) Disable/enable heartbeat.
   b) Stop/start slave should not change the state.
   c) Stop/reset/start should change the state to default.
4. Min/Max/Unsupported values.
   a) master_heartbeat_period=0.001 (min)
   b) master_heartbeat_period=4294967 (max)
   c) master_heartbeat_period=0.0009
   d) master_heartbeat_period=0
   e) master_heartbeat_period=4294968
   f) master_heartbeat_period=2x4294968-1 (double max)
   g) master_heartbeat_period=-1
   h) master_heartbeat_period='123abc'
5. Basic testing.
   a) Set heartbeat and check that a heartbeat event is sent to the slave.
   b) Stop slave. Check that no heartbeat events arrive on the slave.
   c) Start slave. Check that a heartbeat event is sent to the slave.
   d) Stop/start io_thread. Check that no heartbeat event is sent to the slave while io_thread is stopped.
   e) Stop/start sql_thread. Check that a heartbeat event is sent to the slave while sql_thread is stopped.
   f) Stop slave by error and then start. Check that a heartbeat event is sent to the slave.
   g) Set master_heartbeat_period to 1 sec and generate events on the master every 0.1 sec. Check that no heartbeat event is sent, because heartbeats are sent only if no other events are present in the binlog (master and slave are synced).
   h) Create events in the binlog, start slave until position. No heartbeat events are expected because master and slave are not synced.
   i) Rotation of relay logs should not affect heartbeat events: rotate logs for longer than master_heartbeat_period (x3-x5) and check heartbeat events on the slave.
   j) Compressed protocol between master and slave: use --slave_compressed_protocol=1.
   k) SSL between master and slave via CHANGE MASTER TO ... MASTER_SSL_* options.
6. Extended testing: multiple servers.
   a) Connect two slaves to the master and set a separate heartbeat value for each. Check that the heartbeat values are updated properly.
   b) Connect two slaves; the first should be under heavily loaded replication (create an event that generates thousands of updates). Check heartbeat for the second.
   c) Circular replication. Set different values. Check that the heartbeat values are updated properly on both servers.
7. Extended testing: master.
   a) Whether RESET MASTER affects heartbeat or not.
   b) Reload the master server.
   c) Rotation/flush of the master binlog.
   d) Put the master under high load that does not affect the binlog (create an event that generates thousands of updates but is skipped for the binlog via --binlog-ignore-db) and check heartbeat on the slave.
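The expectation in scenario 5g (no heartbeats while events arrive faster than the period) can be reasoned about with a toy timing model. The Python sketch below is an assumption-laden illustration, not the server's implementation: it assumes an idle timer that restarts on every binlog event and on every heartbeat, and the function name heartbeat_times and its millisecond parameters are hypothetical. A test author could use it to predict roughly how many heartbeats to expect in a given window.

```python
def heartbeat_times(event_times_ms, period_ms, end_ms):
    """Predict heartbeat times (in ms) within the window [0, end_ms].

    Model assumption: a heartbeat fires whenever period_ms elapses with
    no binlog event sent; each event or heartbeat restarts the timer.
    A non-positive period means heartbeats are disabled.
    """
    if period_ms <= 0:
        return []
    beats = []
    last_activity = 0
    events = sorted(t for t in event_times_ms if t <= end_ms)
    # The end of the observation window acts as a final checkpoint.
    for checkpoint in events + [end_ms]:
        next_beat = last_activity + period_ms
        while next_beat < checkpoint:
            beats.append(next_beat)
            next_beat += period_ms
        last_activity = checkpoint
    return beats


if __name__ == "__main__":
    # Scenario 5g: period 1 s, one event every 0.1 s for 5 s.
    busy = list(range(100, 5001, 100))
    print(heartbeat_times(busy, 1000, 5000))
    # Idle master over 3.5 s with the same period.
    print(heartbeat_times([], 1000, 3500))
```

Times are integers in milliseconds to match the 0.001 s resolution of master_heartbeat_period and to avoid floating-point drift when stepping the timer.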
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.