The new release of MySQL is packed with exciting features that help detecting and analyzing replication lag. In this post, you will be able to learn all about the new replication timestamps, the new useful information that is now reported by performance schema tables, and how delayed replication was improved.
New Replication Timestamps
One of the most common ways to monitor replication lag is by relying on the Seconds_Behind_Master field in SHOW SLAVE STATUS. However, this metric may not be suitable when we start considering replication topologies and scenarios more complex than the simple master-slave setup.
In a chained replication setup, for instance, we are only aware of the delay relative to the original master and do not have any information about the lag relative to the nearest, immediate master.
Also, this lag is measured per-event, from the time the statement started to execute, while it would be more relevant to measure it per-transaction, from the time the transaction was actually committed on the master.
Finally, the timestamp used to measure the replication lag is very coarse-grained, with only second precision, and is relative to the original master’s timezone. As a result, if the transaction is replicated to a server in another timezone, the measured delay will be offset by the timezone difference between them.
In this release, we introduce two new timestamps associated with the GTID of each transaction (instead of each event), written to the binary log, which complement the Seconds_Behind_Master metric in circumventing these issues:
- original commit timestamp (OCT): the number of microseconds since epoch when the transaction was written to the binary log (committed) of the original master.
- immediate commit timestamp (ICT): the number of microseconds since epoch when the transaction was written to the binary log (committed) of the immediate master.
The output of mysqlbinlog displays the new timestamps in two formats, microseconds from epoch and also TIMESTAMP format in the user timezone for better readability, as the next snippet from a slave’s binary log exemplifies:
1
2
3
4
5
6
|
#170404 10:48:05 server id 1 end_log_pos 233 CRC32 0x016ce647 GTID last_committed=0 sequence_number=1 original_committed_timestamp=1491299285661130 immediate_commit_timestamp=1491299285843771 # original_commit_timestamp=1491299285661130 (2017-04-04 10:48:05.661130 WEST) # immediate_commit_timestamp=1491299285843771 (2017-04-04 10:48:05.843771 WEST) /*!80001 SET @@session.original_commit_timestamp=1491299285661130*//*!*/; SET @@SESSION.GTID_NEXT= 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1'/*!*/; # at 233 |
As a rule, the OCT is always the same in all replicas where the transaction is applied.
In master-slave replication, the OCT of a transaction in the (original) master’s binary log will be the same as its ICT. In the slave’s relay log, the OCT and ICT of the transaction are the same as in the master’s binary log; whereas in its own binary log, the transaction’s ICT will now correspond to when the slave committed the transaction.
In a group replication setup, when the original master is a member of a group, the OCT is generated when the transaction is ready to be committed, i.e., when it finished executing in the original master and its write set is ready to be sent to all members of the group for certification. Therefore, the same OCT is replicated to all servers (group member or slave replicating from a member) applying the transaction and each will store on its binary log the local commit time on ICT.
View change events, exclusive of Group Replication, are a special case. Transactions containing these events are generated by each server but share the same GTID (so, they are not first executed in a master and then replicated to the group, but all members of the group execute and apply the same transaction). Since there is no original master, these transactions have their OCT set to zero.
More Monitoring Information in Performance Schema
This new MySQL release also improves existing performance schema replication status tables with new fields. These new fields help detecting and diagnosing lag at several points in the replication stream, more specifically lag that may be due to:
- the connection thread, that retrieves transactions from the master and queues them in the relay log;
- when multi-threaded slave mode (MTS) is enabled, the coordinator thread that reads the transactions from the relay log and schedules them to a worker queue;
- the worker threads applying the transactions.
These threads and their activities can be directly mapped to three tables:
- replication_connection_status: current status of the connection to the master;
- replication_applier_status_by_coordinator: current status of the coordinator thread that only displays information when MTS is enabled;
- and replication_applier_status_by_worker: current status of the thread(s) applying transactions received from the master.
The tables were extended with nine additional fields that show information about the last transaction the corresponding thread processed and the transaction that thread is currently processing. This information comprises: i) the transaction’s GTID, ii) its OCT and ICT (retrieved from the slave’s relay log), iii) the time the thread started processing it and, in case of the last processed transaction, iv) the time the thread finished processing it.
The next paragraphs explore in more detail the new information in each table.
The replication_connection_status table was extended with information on the last and current transaction the connection thread queued into the relay log. This includes when it started queueing the transaction, i.e., when it read the first event from the master and queued it in the relay log, and, for LAST_QUEUED_TRANSACTION, when it finished queueing the last event of the transaction in the relay log.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
|
SELECT * from performance_schema.replication_connection_status; CHANNEL_NAME GROUP_NAME SOURCE_UUID ccaf5c89-191b-11e7-9bea-340286faf52d THREAD_ID 37 SERVICE_STATE ON COUNT_RECEIVED_HEARTBEATS 0 LAST_HEARTBEAT_TIMESTAMP 0000-00-00 00:00:00.000000 RECEIVED_TRANSACTION_SET aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1 LAST_ERROR_NUMBER 0 LAST_ERROR_MESSAGE LAST_ERROR_TIMESTAMP 0000-00-00 00:00:00.000000 LAST_QUEUED_TRANSACTION aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1 LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP 2017-04-04 12:48:05.674003 LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP 2017-04-04 12:48:05.697760 QUEUEING_TRANSACTION QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP 0000-00-00 00:00:00.000000 |
When MTS is enabled, the replication_applier_status_by_coordinator table also reports which was the last transaction buffered by the coordinator thread to a worker’s queue, as well as the transaction it is currently buffering. The start timestamp refers to when this thread read the first event of the transaction from the relay log to buffer it to a worker’s queue, while the end timestamp refers to when the last event finished buffering to the worker’s queue.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
SELECT * from performance_schema.replication_applier_status_by_coordinator; CHANNEL_NAME THREAD_ID 33 SERVICE_STATE ON LAST_ERROR_NUMBER 0 LAST_ERROR_MESSAGE LAST_ERROR_TIMESTAMP 0000-00-00 00:00:00.000000 LAST_PROCESSED_TRANSACTION aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1 LAST_PROCESSED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_PROCESSED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_PROCESSED_TRANSACTION_START_BUFFER_TIMESTAMP 2017-04-04 12:48:05.674139 LAST_PROCESSED_TRANSACTION_END_BUFFER_TIMESTAMP 2017-04-04 12:48:05.819167 PROCESSING_TRANSACTION PROCESSING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 PROCESSING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 PROCESSING_TRANSACTION_START_BUFFER_TIMESTAMP 0000-00-00 00:00:00.000000 |
Finally, the table replication_applier_status_by_worker now contains the details about the transactions applied by the applier thread, or by each worker, in case MTS is enabled. More precisely, the start timestamp refers to when the worker started applying the first event, and the end timestamp refers to when the last event of the transaction was applied.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
SELECT * from performance_schema.replication_applier_status_by_worker; CHANNEL_NAME WORKER_ID 1 THREAD_ID 34 SERVICE_STATE ON LAST_ERROR_NUMBER 0 LAST_ERROR_MESSAGE LAST_ERROR_TIMESTAMP 0000-00-00 00:00:00.000000 LAST_APPLIED_TRANSACTION aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1 LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 2017-04-04 12:48:05.661130 LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP 2017-04-04 12:48:05.822463 LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP 2017-04-04 12:48:05.948926 APPLYING_TRANSACTION APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP 0000-00-00 00:00:00.000000 APPLYING_TRANSACTION_START_APPLY_TIMESTAMP 0000-00-00 00:00:00.000000 |
Improved Delayed Replication
Delayed replication has also been improved in this new release. The server now leverages the newly introduced ICT to measure delay, with the following advantages:
- the delay is now checked per transaction instead of per event, so all events in the transaction either wait or are immediately applied without waiting the delay period between each of them;
- the delay is no longer relative to when the transaction (or the first event of the transaction) started, but to when it was committed on the immediate master;
- the delay is now independent of the timezone of the master because all delay computations are performed in UTC timezone.
There are no changes to the interface: the configuration is set with CHANGE MASTER TO MASTER_DELAY= XYZ;, and the monitoring is done via SHOW SLAVE STATUS (SQL_Delay and SQL_Remaining_Delay) and replication performance schema tables replication_execute_configuration (DESIRED_DELAY) and replication_execute_status (REMAINING_DELAY).
In a cross-version scenario, if either the immediate master or the slave do not support ICT, the old delayed replication implementation is used.
This sums up the new replication lag monitoring features in this release. We look forward to hearing your feedback!