WL#7817: RPL Monitoring: Move status variables to replication P_S tables
Affects: Server-5.7
—
Status: Complete
This work is a requirement for Multi-source replication. To be able to monitor the following variables on each channel, we cannot have them as global variables but variables per channel. Show status like 'Slave_running'; Show status like 'Slave_retried_transactions'; Show status like 'Slave_last_heartbeat'; Show status like 'Slave_received_heartbeats'; show status like 'Slave_heartbeat_period'; This task aims at moving these variables to replication performance schema tables established in WL#3656 so that these can be reported per channel in these performance schema tables.
Functional requirements: ======================= F1: P_S tables must be read-only, available for SQL queries (SELECT ...) as for regular tables. F2: Before CHANGE MASTER statement P_S table should be empty. F3: After CHANGE MASTER statement P_S table should be filled by a data, as earlier. F4: START/STOP SLAVE statements and running slave should update P_S "status" tables. F5: Concurent access to P_S replication tables must not affect to replication (no locks, no errors, etc) Non-functional requirements: =========================== NF1: Adding these variables to P_S tables should not affect server performance. NF2: Disabed P_S for the server shouldn't affect to the behavior of replication.
- In this section we list all the status variables mentioned in HLD and discuss them one-by-one. 1) SLAVE_RUNNING: The documentation says- ============= This is ON if this server is a replication slave that is connected to a replication master, and both the I/O and SQL threads are running; otherwise, it is OFF. Decision: This will NOT be added to performance_schema replication tables. Justification: table performance_schema.replication_connection_status and table performance_schema.replication_execute_status each have a field that corresponds to the IO and SQL threads respectively. The variable Slave_running is same as (SELECT SERVICE_STATE FROM performance_schema.replication_connection_status AND SELECT SERVICE_STATE FROM performance_schema.replication_execute_status). Hence we drop this request and don't add this field to the P_S tables. 2) SLAVE_RETRIED_TRANSACTIONS: The documentation says- ========================== If a replication slave SQL thread fails to execute a transaction because of an InnoDB deadlock or because the transaction's execution time exceeded InnoDB's innodb_lock_wait_timeout or NDBCLUSTER's TransactionDeadlockDetectionTimeout or TransactionInactiveTimeout, it automatically retries slave_transaction_retries times before stopping with an error. Decision: This will be added to the performance_schema.replication_execute_status. 3) SLAVE_LAST_HEARTBEAT: The documentation says- ==================== Shows when the most recent heartbeat signal was received by a replication slave, as a TIMESTAMP value. Decision: Added to performance_schema.replication_connection_status 4) SLAVE_RECEIVED_HEARTBEATS: The documentation says: ========================= This counter increments with each replication heartbeat received by a replication slave since the last time that the slave was restarted or reset, or a CHANGE MASTER TO statement was issued. Decision: Added to performance_schema.replication_connection_status 5) SLAVE_HEARTBEAT_PERIOD: The documentation says: ====================== Shows the replication heartbeat interval (in seconds) on a replication slave. Decision: Added to performance_schema.replication_connection_configuration. - Here is how the names of global variables map to the names in the P_S tables: +----------------------------+-----------------------------+ | Variable_name | P_S name | +----------------------------+-----------------------------+ | SLAVE_HEARTBEAT_PERIOD | HEARTBEAT_INTERVAL | +----------------------------+-----------------------------+ | SLAVE_RECEIVED_HEARTBEATS | COUNT_RECEIVED_HEARTBEATS | +----------------------------+-----------------------------+ | SLAVE_LAST_HEARTBEAT | LAST_HEARTBEAT_TIMESTAMP | +----------------------------+-----------------------------+ | SLAVE_RETRIED_TRANSACTIONS | COUNT_TRANSACTIONS_RETRIES | +----------------------------+-----------------------------+
1. System tables script changes ============================ The script mysql_system_tables.sql is updated to include the new fields in the replication P_S tables. == modified file 'scripts/mysql_system_tables.sql' --- scripts/mysql_system_tables.sql revid:jorgen.loland@oracle.com- 20140321120934-464u92uocbh7pcvq +++ scripts/mysql_system_tables.sql 2014-04-15 09:35:11 +0000 @@ -2127,7 +2127,8 @@ "SSL_CRL_FILE VARCHAR(255) not null," "SSL_CRL_PATH VARCHAR(255) not null," "CONNECTION_RETRY_INTERVAL INTEGER not null," - "CONNECTION_RETRY_COUNT BIGINT unsigned NOT NULL" + "CONNECTION_RETRY_COUNT BIGINT unsigned not null," + "HEARTBEAT_INTERVAL FLOAT not null" ") ENGINE=PERFORMANCE_SCHEMA;"; SET @str = IF(@have_pfs = 1, @cmd, 'SET @dummy = 0'); @@ -2143,6 +2144,8 @@ "SOURCE_UUID CHAR(36) collate utf8_bin not null," "THREAD_ID BIGINT unsigned," "SERVICE_STATE ENUM('ON','OFF','CONNECTING') not null," + "COUNT_RECEIVED_HEARTBEATS BIGINT unsigned," + "LAST_HEARTBEAT_TIMESTAMP TIMESTAMP(0) not null," "RECEIVED_TRANSACTION_SET TEXT not null," "LAST_ERROR_NUMBER INTEGER not null," "LAST_ERROR_MESSAGE VARCHAR(1024) not null," 2. Add functions to set float data type in P_S ========================================== P_S doesnt have support for float fields. Add infrastructure for the same. === modified file 'storage/perfschema/pfs_engine_table.cc' --- storage/perfschema/pfs_engine_table.cc revid:jorgen.loland@oracle.com- 20140321120934-464u92uocbh7pcvq +++ storage/perfschema/pfs_engine_table.cc 2014-04-15 09:33:53 +0000 @@ -590,6 +590,13 @@ f2->store_timestamp(& tm); } +void PFS_engine_table::set_field_float(Field *f, float value) +{ + DBUG_ASSERT(f->real_type() == MYSQL_TYPE_FLOAT); + Field_float *f2= (Field_float*) f; + f2->store(value); +} + ulonglong PFS_engine_table::get_field_enum(Field *f) { DBUG_ASSERT(f->real_type() == MYSQL_TYPE_ENUM); === modified file 'storage/perfschema/pfs_engine_table.h' --- storage/perfschema/pfs_engine_table.h revid:jorgen.loland@oracle.com- 20140321120934-464u92uocbh7pcvq +++ storage/perfschema/pfs_engine_table.h 2014-04-15 09:32:52 +0000 @@ -138,6 +138,12 @@ */ static void set_field_timestamp(Field *f, ulonglong value); /** + Helper, assign a value to a float field. + @param f the field to set + @param value the value to assign + */ + static void set_field_float(Field *f, float value); + /** Helper, read a value from an enum field. @param f the field to read @return the field value 3. Add fields to the tables. ======================== 3.1 Add HEARTBEAT_INTERVAL in the table replication_connection_configuration. === modified file 'storage/perfschema/table_replication_connection_configuration.cc' --- storage/perfschema/table_replication_connection_configuration.cc revid:jorgen.loland@oracle.com-20140321120934-464u92uocbh7pcvq +++ storage/perfschema/table_replication_connection_configuration.cc 2014-04- 15 09:30:49 +0000 @@ -116,12 +116,17 @@ {C_STRING_WITH_LEN("CONNECTION_RETRY_COUNT")}, {C_STRING_WITH_LEN("bigint")}, {NULL, 0} - } + }, + { + {C_STRING_WITH_LEN("HEARTBEAT_INTERVAL")}, + {C_STRING_WITH_LEN("float")}, + {NULL, 0} + } }; TABLE_FIELD_DEF table_replication_connection_configuration::m_field_def= -{ 16, field_types }; +{ 17, field_types }; PFS_engine_table_share table_replication_connection_configuration::m_share= @@ -279,6 +284,8 @@ m_row.connection_retry_count= (ulong) active_mi->retry_count; + m_row.heartbeat_interval= active_mi->heartbeat_period; + mysql_mutex_unlock(&active_mi->rli->data_lock); mysql_mutex_unlock(&active_mi->data_lock); mysql_mutex_unlock(&LOCK_active_mi); @@ -358,6 +365,9 @@ case 15: /** connect_retry_count */ set_field_ulonglong(f, m_row.connection_retry_count); break; + case 16:/** number of seconds after which heartbeat will be sent */ + set_field_float(f, m_row.heartbeat_interval); + break; default: DBUG_ASSERT(false); } === modified file 'storage/perfschema/table_replication_connection_configuration.h' --- storage/perfschema/table_replication_connection_configuration.h revid:jorgen.loland@oracle.com-20140321120934-464u92uocbh7pcvq +++ storage/perfschema/table_replication_connection_configuration.h 2014-04- 15 09:39:25 +0000 @@ -79,6 +79,7 @@ uint ssl_crl_path_length; uint connection_retry_interval; ulong connection_retry_count; + float heartbeat_interval; }; 3.2 Add COUNT_RECEIVED_HEARTBEATS and LAST_HEARTBEAT_TIMESTAMP and in the table replication_connection_status. === modified file 'storage/perfschema/table_replication_connection_status.cc' --- storage/perfschema/table_replication_connection_status.cc revid:jorgen.loland@oracle.com-20140321120934-464u92uocbh7pcvq +++ storage/perfschema/table_replication_connection_status.cc 2014-04-15 10:13:08 +0000 @@ -54,6 +54,16 @@ {NULL, 0} }, { + {C_STRING_WITH_LEN("COUNT_RECEIVED_HEARTBEATS")}, + {C_STRING_WITH_LEN("bigint(20)")}, + {NULL, 0} + }, + { + {C_STRING_WITH_LEN("LAST_HEARTBEAT_TIMESTAMP")}, + {C_STRING_WITH_LEN("timestamp")}, + {NULL, 0} + }, + { {C_STRING_WITH_LEN("RECEIVED_TRANSACTION_SET")}, {C_STRING_WITH_LEN("text")}, {NULL, 0} @@ -77,7 +87,7 @@ TABLE_FIELD_DEF table_replication_connection_status::m_field_def= -{ 7, field_types }; +{ 9, field_types }; PFS_engine_table_share table_replication_connection_status::m_share= @@ -217,6 +227,13 @@ m_row.service_state= PS_RPL_CONNECT_SERVICE_STATE_NO; } + m_row.count_received_heartbeats= active_mi->received_heartbeats; + /* + Time in Milliseconds since epoch. active_mi->last_heartbeat contains + number of seconds so we multiply by 1000000. + */ + m_row.last_heartbeat_timestamp= (ulonglong)active_mi->last_heartbeat*1000000; + mysql_mutex_lock(&active_mi->err_lock); mysql_mutex_lock(&active_mi->rli->err_lock); @@ -249,7 +266,9 @@ memcpy(m_row.last_error_message, temp_store, m_row.last_error_message_length); - /** time in millisecond since epoch */ + /* + Time in millisecond since epoch. active_mi->last_error().skr contains + number of seconds so we multiply by 1000000. */ m_row.last_error_timestamp= (ulonglong)active_mi->last_error().skr*1000000; } @@ -294,18 +313,24 @@ case 2: /** service_state */ set_field_enum(f, m_row.service_state); break; - case 3: /** received_transaction_set */ + case 3: /** number of heartbeat events received **/ + set_field_ulonglong(f, m_row.count_received_heartbeats); + break; + case 4: /** time of receipt of last heartbeat event **/ + set_field_timestamp(f, m_row.last_heartbeat_timestamp); + break; + case 5: /** received_transaction_set */ set_field_longtext_utf8(f, m_row.received_transaction_set, m_row.received_transaction_set_length); break; - case 4: /*last_error_number*/ + case 6: /*last_error_number*/ set_field_ulong(f, m_row.last_error_number); break; - case 5: /*last_error_message*/ + case 7: /*last_error_message*/ set_field_varchar_utf8(f, m_row.last_error_message, m_row.last_error_message_length); break; - case 6: /*last_error_timestamp*/ + case 8: /*last_error_timestamp*/ set_field_timestamp(f, m_row.last_error_timestamp); break; default: === modified file 'storage/perfschema/table_replication_connection_status.h' --- storage/perfschema/table_replication_connection_status.h revid:jorgen.loland@oracle.com-20140321120934-464u92uocbh7pcvq +++ storage/perfschema/table_replication_connection_status.h 2014-04-13 18:38:53 +0000 @@ -57,6 +57,8 @@ ulonglong thread_id; bool thread_id_is_null; enum_rpl_connect_status_service_state service_state; + ulonglong count_received_heartbeats; + ulonglong last_heartbeat_timestamp; char* received_transaction_set; int received_transaction_set_length; uint last_error_number;
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.