RATIONALE
---------

1. To solve BUG#20435 - extra relay log rotation;
2. to detect failures more easily and precisely;
3. to have a better Seconds_Behind_Master value on the slave (BUG#29309).

DESIGN
------

1. New extension to CHANGE MASTER:

     CHANGE MASTER TO master_heartbeat_period= val;

   The value is stored in the master info file and in the mi object,
   which provides the value sent to the master.
   The value 'val' needs to be within some reasonable interval.
   As the cost of creating, sending and handling the event on the slave
   side is expected to be low, val can be as small as 1 second or even
   less.

2. The slave IO thread prepares and sends the query
   'SET SESSION @master_heartbeat_period= val' to the master.
   From this query the master's dump thread learns the slave's
   preferred heartbeat sending period.

3. The heartbeat event is created by the master's DUMP thread and sent
   each time @master_heartbeat_period elapses, which indicates that
   nothing has been added to the binlog during that period.

4. On the slave side the heartbeat event is handled exclusively by the
   IO thread; it is not recorded in the relay log and does not engage
   the slave SQL thread.
   Upon receipt, the coordinates of the last sent event carried by the
   heartbeat are compared against the ones the slave IO thread maintains
   and updates for each received event, except for some "phantom"
   events, which include the new Heartbeat.
   The heartbeat event does not update that local information.

5. The heartbeat period and the number of received heartbeat events
   should be monitorable via
   SHOW STATUS like 'slave_heartbeat_period' and
   SHOW STATUS like 'slave_received_heartbeats' respectively.

FEATURES
--------

A newer, heartbeat-aware slave will not receive an error from an old,
heartbeat-unaware master in response to the connection-time query
set @master_heartbeat_period=val; and naturally the old master will not
send heartbeats. In that case the slave shows its chosen heartbeat value
in the status, but no heartbeats actually arrive.

Using the user variable @master_heartbeat_period instead of a system
variable keeps the name out of the list of variables available to a
plain user session. The existence of the user variable
@master_heartbeat_period can be noticed only via the general query log.

User observable behaviour
-------------------------

1. Requesting, on the slave, that the master send heartbeats with a
   given period:

     CHANGE MASTER TO master_heartbeat_period= val

   where val is the period, a decimal value in the range
   [0.001, 4294967] seconds.
   Note that the master sends heartbeats only if there are no unsent
   events in the current binlog file for a period longer than
   master_heartbeat_period. Whenever the master's binlog is updated with
   an event, the wait for the heartbeat-sending condition is reset.
   If `val' is zero, no heartbeats are sent.
   Note that heartbeat is active by default, with the period
   slave_net_timeout/2.

2. SHOW STATUS like 'slave_heartbeat_period'

   Slave-side status variable that takes its value from CHANGE MASTER,
   from master.info, or implicitly as `slave_net_timeout/2'
   (the default).
   The denominator 2 provides a reasonable default period that
   guarantees the slave will not reconnect to an idle master when
   slave_net_timeout elapses.

3. SHOW STATUS like 'slave_received_heartbeats';

   The counter is initialized at slave init time, is incremented on
   every received heartbeat, and is reset to zero by CHANGE MASTER.
   The counter is stored as a ulonglong, i.e. normally 8 bytes.
   Overflowing it, even with the fastest heartbeat, is possible only on
   a cosmic time scale.

4. RESET SLAVE;

   resets the current heartbeat period to the default (see 2.).
   The valid-range check still applies after computing
   slave_net_timeout/2: the period is lowered to the maximum allowed
   value if the ratio would be greater.

5. SET @@global.slave_net_timeout=`value less than the current hb period`

   produces a warning, since such a setting would be irrational.

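The rules for the period in items 1, 2, 4 and 5 can be summarised in a
short sketch. It is written in Python purely to model the logic and is
not the server's code; the function and constant names are hypothetical,
and only the range [0.001, 4294967], the meaning of zero and the default
slave_net_timeout/2 are taken from the text above.

```python
# Illustrative model only; all names below are hypothetical.
HEARTBEAT_MIN = 0.001       # seconds, the 1 ms lower bound from the text
HEARTBEAT_MAX = 4294967.0   # seconds, the upper bound from the text


def default_heartbeat_period(slave_net_timeout):
    """Default period: slave_net_timeout/2, capped at the allowed maximum."""
    return min(slave_net_timeout / 2.0, HEARTBEAT_MAX)


def validate_heartbeat_period(val):
    """Return the accepted period; zero means heartbeats are disabled."""
    if val == 0:
        return 0.0
    if not (HEARTBEAT_MIN <= val <= HEARTBEAT_MAX):
        raise ValueError("heartbeat period out of range: %r" % (val,))
    return float(val)


def timeout_deserves_warning(slave_net_timeout, heartbeat_period):
    """True when slave_net_timeout is set below an active heartbeat period."""
    return heartbeat_period > 0 and slave_net_timeout < heartbeat_period
```

For example, with slave_net_timeout set to 3600 the default period in
this model is 1800 seconds, and lowering slave_net_timeout to 10 while
the period is 30 is the case item 5 warns about.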
OLD INFO:
---------

Suggestion from Jeremy Zawodny at OSCON, July 2001, as a discussion
idea:

Need active heartbeat detection mechanism for replication monitoring.
Writing an error to log file is insufficient notice of failure. Instead
need a way to trigger an alert message to DBA or a method for a watcher
program to immediately detect when a slave has failed.

Detailed design
---------------

A) Data structs and the new object

1. Heartbeat event class with fields:

   - master_log_file (the last binlog file name on master),
   - master_log_pos (the position of the last written event on master),
   - master_current_time (at the moment the heartbeat event is created).

   These are recorded into existing members of the parent Log_event
   class.

   Other reasons to inherit the heartbeat class from Log_event are as
   follows:

   - the cost of processing a single heartbeat event will be low enough
     in this case, even if heartbeats are sent several times a second.
   - Extendability. If we later want to add fields to the event,
     the log_event has dynamic headers...

2. The master_info struct on the slave is augmented with

   - heartbeat_period, of type float, to allow selecting a real number
     with precision up to 1 millisecond.
   - received_heartbeat, a counter of type ulonglong.

   Given 1 msec precision and 4 bytes of storage for the heartbeat
   value, the maximum value of the heartbeat period is bounded by the
   interval between 0 and ULONG_MAX/1000.
   I.e. the effective interval for the period is [0.001, 4294967];
   zero is excluded from it because it means that no heartbeats are to
   be sent.

   The value of master_info.heartbeat_period is initialized in one of
   three ways:

   - the syntactically extended CHANGE MASTER with
     MASTER_HEARTBEAT_PERIOD=val;
   - reading from master.info (float number format);
   - the default (slave_net_timeout/2).

   Note that there is no way to do this from server startup options,
   which is a consequence of the deprecation in BUG#21490.

3. Change the master dump thread to wait with a timeout in
   MYSQL_LOG::wait_for_update(), as in

     pthread_cond_timedwait(&update_cond, &LOCK_log, master_heartbeat_period);

   (schematic: the third argument of pthread_cond_timedwait is an
   absolute deadline computed from the period).
   If the wait returns with an ETIMEDOUT or ETIME error condition, a
   heartbeat event is to be sent.

4. The slave IO thread receives a heartbeat and handles it without
   recording it in the relay log.
   The slave-side wait for master activity is reset upon receiving
   anything from the socket, which produces the desired effect: no
   reconnection happens even though no real events are received, only
   heartbeats.
   The slave IO thread still instantiates the event in order to check
   its validity, as is done for most replication events.
   The event's members log_file_name and log_pos are compared against
   the slave's local knowledge, and the IO thread is stopped if log_pos
   does not match the value from the last non-heartbeat event the slave
   has received.
   The file names and the log positions must be equal, except when the
   slave starts with an empty master.info and thus does not know the
   last event received from the master; the slave will update its local
   mi->log_file_name upon receiving a Rotate event (normally this should
   happen within a fraction of a second after connecting).

5. The slave does not need to do anything special if a heartbeat does
   not come. The current logic for reconnecting when slave_net_timeout
   elapses does the job.

6. Monitoring facilities are added via the "standard" procedure: a new
   status variable and its display through a function.

Note that the rli struct remains intact and changes are made to
master_info only, as relay_log_info is supposed to deal with
relay-loggable features, which the heartbeat is not.
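
Items 3 and 4 of the detailed design can be illustrated with a small
model. It is a sketch in Python under stated assumptions, not the
server's C++ code: BinlogModel, next_packet and coordinates_match are
hypothetical names, the starting offset is illustrative, and a Python
condition variable stands in for update_cond and LOCK_log. The timed
wait plays the role of the pthread_cond_timedwait call, and a timeout
yields a heartbeat carrying the coordinates of the last written event.

```python
import threading
from collections import deque


class BinlogModel:
    """Toy stand-in for the master's binlog and its update condition."""

    def __init__(self, file_name):
        self.file_name = file_name
        self.last_pos = 4                  # illustrative starting offset
        self.unsent = deque()              # events not yet sent to the slave
        self.update_cond = threading.Condition()

    def append(self, event_len):
        """Write an event and wake up any waiting dump thread."""
        with self.update_cond:
            self.last_pos += event_len
            self.unsent.append(("event", self.file_name, self.last_pos))
            self.update_cond.notify_all()

    def next_packet(self, heartbeat_period):
        """Return the next event, or a heartbeat if the period elapses."""
        # A zero period disables heartbeats, so wait without a timeout.
        timeout = heartbeat_period if heartbeat_period > 0 else None
        with self.update_cond:
            has_event = self.update_cond.wait_for(
                lambda: bool(self.unsent), timeout=timeout)
            if has_event:
                return self.unsent.popleft()
            # Timed out: nothing was added to the binlog during the period.
            return ("heartbeat", self.file_name, self.last_pos)


def coordinates_match(known_file, known_pos, hb_file, hb_pos):
    """Slave-side check; False means the IO thread should stop."""
    if known_file is None:
        # Empty master.info: the last received event is not known yet.
        return True
    return hb_file == known_file and hb_pos == known_pos
```

In this model an idle binlog produces a heartbeat once the period
elapses, while appending an event makes the next call return that event
instead. The slave-side function accepts a heartbeat only when its
coordinates equal the ones recorded from the last non-heartbeat event.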