WL#4641: Replication Heartbeat testing

Affects: Server-6.0   —   Status: Complete   —   Priority: Medium

TESTED PROPERTIES
==========================
P1. Each value for master_heartbeat_period should work according
to the manual.
P2. Heartbeats should be sent when heartbeat is enabled.
P3. The value of master_heartbeat_period should be kept across
STOP SLAVE/START SLAVE.
P4. Master should send heartbeat events only if all events from master
binlog have been sent to slave.
P5. Two slaves should be able to have different values of
master_heartbeat_period while connected to same master.
P6. Other areas that may affect heartbeat.



- Seconds_Behind_Master in SHOW SLAVE STATUS must be reported as zero
  right after a heartbeat event has been received.
  (The slave's timestamp is not reset by the event itself, though, but
  rather by EOF read from the relay log. We may change the resetting to
  be governed by heartbeat.)

TEST METHOD
===================
1. New syntax of CHANGE MASTER TO ... master_heartbeat_period=
 1.1. Default value based on slave_net_timeout/2.
 1.2. Non-default value.
 1.3. Disabled heartbeat.
 1.4. Reset to default.
 1.5. Min/Max value.
 1.6. Unsupported values (negative, too big, too small, not numbers).
2. Test heartbeat implementation.
 2.1. Check the following status variables on the slave side:
SHOW STATUS LIKE 'slave_heartbeat_period';
SHOW STATUS LIKE 'slave_received_heartbeats';
 2.2. Seconds_Behind_Master in SHOW SLAVE STATUS must be reported as zero right
after a heartbeat event has been received. (The slave's timestamp is not reset by
the event itself, though, but rather by EOF read from the relay log. We may change
the resetting to be governed by heartbeat.)
 2.3. Basic testing (set heartbeat and make sure that it works).
 2.4. Run a long-running test (hours at least) with a small heartbeat value and
check that heartbeat keeps working over time (check for memory leaks, buffers, etc.).
3. Keep value.
 3.1. Stop/start slave should keep the value (check the effect of io_thread and
sql_thread).
 3.2. Stop slave by error and then start.
 3.3. Reload whole server.
4. A heartbeat event should be sent only if no other events are present in the
binlog (master and slave are sync'ed).
 4.1. Create events in the binlog, sync the slave with the master, wait for a
heartbeat.
5. Multiple servers.
 5.1. Connect two slaves to the master and set a separate heartbeat value for each.
 5.2. Connect two slaves, put the first under heavy replication load, and check
heartbeat for the second.
6. Master.
 6.1. Whether RESET MASTER affects heartbeat.
 6.2. Rotation/flush of master binlog.
7. Others.
 7.1. Heartbeat and locking thread on slave.
 7.2. Circular replication.

For reference, see WL#342 Replication Heartbeat.
The following test scenarios (for MTR) are identified:

1. Default value
 a) check that master_heartbeat_period = slave_net_timeout/2.
 b) change slave_net_timeout (a dynamic variable) and check
master_heartbeat_period (should be updated).
 c) change master_heartbeat_period, then change slave_net_timeout and check
master_heartbeat_period (shouldn't be updated after slave_net_timeout changes).
 d) reset slave, check master_heartbeat_period.
 e) change slave_net_timeout, reset slave, check master_heartbeat_period.
 f) changing slave_net_timeout on the master shouldn't affect the slave's heartbeat.
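
The expectations in scenarios a) to e) can be summarised as a small model. The
following is only a sketch of what the test plan expects, not of how the server
implements it: the class and method names are hypothetical, and "reset slave
returns to the default" is taken from scenario 3 c) below.

```python
class SlaveHeartbeatModel:
    """Toy model of the expected master_heartbeat_period (hypothetical names)."""

    def __init__(self, slave_net_timeout):
        self.slave_net_timeout = slave_net_timeout
        self.explicit_period = None  # None means "use the default"

    def change_master(self, period):
        # Models CHANGE MASTER TO ... master_heartbeat_period=<period>
        self.explicit_period = period

    def reset_slave(self):
        # Models RESET SLAVE: the explicit value is expected to be dropped
        self.explicit_period = None

    @property
    def heartbeat_period(self):
        if self.explicit_period is not None:
            return self.explicit_period
        return self.slave_net_timeout / 2  # default: slave_net_timeout/2


model = SlaveHeartbeatModel(slave_net_timeout=3600)
print(model.heartbeat_period)   # scenario a): default is half of slave_net_timeout
model.slave_net_timeout = 100
print(model.heartbeat_period)   # scenario b): default follows slave_net_timeout
model.change_master(4)
model.slave_net_timeout = 20
print(model.heartbeat_period)   # scenario c): explicit value is kept
model.reset_slave()
print(model.heartbeat_period)   # scenarios d), e): back to the default
```

An MTR test would make the same comparisons against the real server instead of
the model.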

2. Non-default value.
 a) change master_heartbeat_period, stop/start slave.
 b) change master_heartbeat_period, stop/reset/start slave.
 c) Set non-default value and reload the server.

3. Disabled heartbeat.
 a) disable/enable heartbeat.
 b) stop/start slave shouldn't change state.
 c) stop/reset/start should change state to default.

4. Min/Max/Unsupported values.
 a) master_heartbeat_period=0.001 (min)
 b) master_heartbeat_period=4294967 (max)
 c) master_heartbeat_period=0.0009
 d) master_heartbeat_period=0
 e) master_heartbeat_period=4294968
 f) master_heartbeat_period=2x4294968-1 (double max)
 g) master_heartbeat_period=-1
 h) master_heartbeat_period='123abc'
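
The value classes above can be written down as a small table-driven check. This
is a sketch under assumptions: the bounds 0.001 and 4294967 come from items a)
and b), treating 0 as "heartbeat disabled" follows the MySQL manual, and the
function name and result strings are hypothetical. What the server actually
does with each class (error or warning) is exactly what the test has to verify.

```python
import math

MIN_PERIOD = 0.001    # seconds, smallest non-zero value (item a)
MAX_PERIOD = 4294967  # seconds, largest value (item b)


def classify_heartbeat_period(raw):
    """Classify a candidate master_heartbeat_period (hypothetical helper)."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return "not a number"
    if math.isnan(value):
        return "not a number"
    if value < 0 or value > MAX_PERIOD:
        return "out of range"
    if value == 0:
        return "disabled"
    if value < MIN_PERIOD:
        return "below minimum"
    return "valid"


# The cases a) to h) from the list above, in order.
cases = [0.001, 4294967, 0.0009, 0, 4294968, 2 * 4294968 - 1, -1, "123abc"]
for case in cases:
    print(repr(case), "->", classify_heartbeat_period(case))
```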

5. Basic testing.  
 a) set heartbeat and check that a heartbeat event is sent to the slave.
 b) stop slave. Check that no heartbeat events arrive on the slave.
 c) start slave. Check that a heartbeat event is sent to the slave.
 d) stop/start io_thread. Check that no heartbeat event is sent to the slave while
io_thread is stopped.
 e) stop/start sql_thread. Check that a heartbeat event is still sent to the slave
while sql_thread is stopped.
 f) stop slave by error and then start. Check that a heartbeat event is sent to
the slave.
 g) set master_heartbeat_period to 1 sec and generate events on the master every
0.1 sec. Check that no heartbeat event is sent, because heartbeats are sent only
when no other events are present in the binlog (master and slave are sync'ed).
 h) create events in the binlog, start slave until position. No heartbeat events
are expected because master and slave aren't sync'ed.
 i) rotation of relay logs shouldn't affect heartbeat events: rotate logs for
longer than master_heartbeat_period (x3-x5) and check heartbeat events on the slave.
 j) compression protocol between master and slave: use --slave_compressed_protocol=1
 k) ssl between master and slave via CHANGE MASTER TO ... MASTER_SSL_* options
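
Scenarios a) and g) rely on the rule that heartbeats fill only idle gaps. The
sketch below simulates that rule in a simplified form, assuming the master
sends a heartbeat whenever a full period has passed since the last thing it
sent. The function name is hypothetical and times are integer milliseconds to
avoid floating-point drift; it is an illustration of the expected timing, not
the server's algorithm.

```python
def heartbeat_times(event_times_ms, period_ms, end_time_ms):
    """Return the times at which an idle master would send heartbeats."""
    beats = []
    last_sent = 0  # time of the last event or heartbeat sent
    for t in sorted(event_times_ms) + [end_time_ms]:
        next_beat = last_sent + period_ms
        while next_beat < t:  # idle gap longer than the heartbeat period
            beats.append(next_beat)
            last_sent = next_beat
            next_beat = last_sent + period_ms
        last_sent = t
    return beats


# Scenario g): events every 0.1 sec, period 1 sec, so no heartbeat is expected.
busy = heartbeat_times(range(0, 5000, 100), 1000, 5000)
# Scenario a): one event, then idle, so heartbeats arrive once per period.
idle = heartbeat_times([0], 1000, 3500)
print(busy, idle)
```

In a real test the same check would be made by comparing
slave_received_heartbeats before and after the waiting interval.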


6. Extended testing: multiple servers.
 a) Connect two slaves to the master and set a separate heartbeat value for each.
Check that the heartbeat values are updated properly.
 b) Connect two slaves, put the first under heavy replication load (create an
event that generates thousands of updates), and check heartbeat for the second.
 c) Circular replication. Set different values. Check that the heartbeat values
are updated properly on both servers.

7. Extended testing: master
 a) Whether RESET MASTER affects heartbeat.
 b) Reload the master server.
 c) Rotation/flush of master binlog.
 d) Put the master under high load that doesn't reach the binlog (create an
event that generates thousands of updates but is skipped for the binlog via
--binlog-ignore-db) and check heartbeat on the slave.