WL#7374: Performance schema tables to monitor replication lags and queue

Affects: Server-8.0 — Status: Complete

MOTIVATION
----------

This worklog will add fields to the replication P_S tables to help inspect
internal execution of the server at runtime better.

OBJECTIVE
---------

The current concept of SECONDS BEHIND MASTER is buggy and not meeting the
present needs. This worklog introduces an alternative way to monitor
replication lags using a number of fields that will be added to the
replication P_S tables. These fields will help detect lags at various points
in the replication stream like lag owing to IO Thread, lag in the job of
scheduling by the coordinator thread and lag in execution by a worker thread.

GOALS
-----

The main objective of this task is to show replication stats in the form of P_S
tables.

Current approach of measuring replication delay, Seconds_Behind_Master(SBM)
field of SHOW SLAVE STATUS, is not matching the present needs and, in some
scenarios, is not working properly. This worklog aims at showing replication
delays in more reliable and well defined fields.

A new implementation to show replication delays will match the following
requirements:

  1. There should be a clear distinction between the server from which
     a transaction originates from and the path through it is received on
     slave.
     original master: server on which the transaction is created.
     immediate master (the one used on CHANGE MASTER): the connection
     through which transactions arrive, and it may or not be the
     originating master.

  2. For connectivity problems we only need to monitor how often transactions
     arrive to slave, monitor local delays between deliveries, and not
     rely on transactions' original timestamps.

  3. Complex stats like replication hop delay or replication topology
     finder should be provided by utilities.

Functional Requirements
-----------------------

F1. The table performance_schema.replication_connection_status shall present
    updated information on the last transaction queued in the relay log, as
    well as on the transaction currently being queued on the relay log.

F2. The table performance_schema.replication_applier_status_by_coordinator shall
    present updated information on the last transaction written to the buffer of
    a worker, as well as on the transaction currently being processed by the
    coordinator.

F3. The table performance_schema.replication_applier_status_by_worker shall
    present updated information on the last transaction applied by the worker,
    as well as on the transaction currently being applied.

F4. The information presented in each of the above tables regarding
    transactions shall comprise of: the transaction's GTID, original and
    immediate commit timestamps, the timestamp at which the transaction started
    the stage corresponding to the table (queueing, processing or applying) and,
    for the transaction that last completed the stage, the timestamp of the
    completion.

F5. If there is no transaction currently in the above mentioned stages, the row
    corresponding to the GTID shall not show any information and the rows
    reserved for the timestamps shall show "0000-00-00 00:00:00.000000".

F6. Information concerning transactions that were neither the last nor the one
    currently executing in any of these three stage shall be discarded.

F7. When the slave is in multithread mode and the sql thread is stopped, the
    monitoring information shall still be accessible but, when it is restarted,
    the information shall be reset.

F8. All timestamps in the affected tables shall have microseconds precision.

F9. In case of an error while applying a transaction, the information in the
    worker table regarding that applying transaction shall be kept (because it
    is useful for troubleshooting), but it shall not be displayed in the fields
    corresponding to the last_applied_transaction.

F10. When a partial transaction is present in the relay logs, upon restart,
     the server locates the last partial transaction to update the transaction
     boundary parser state. The information on this last GTID in the relay log 
     shall be available in table
     performance_schema.replication_connection_status even before the
     connection thread is started and START_QUEUE_TIME shall refer to when this
     GTID was found the relay log upon restart. 

Non-Functional Requirements
---------------------------

NF1. Even though the new fields added to the performance_schema tables do not 
     entail any computation, the worklog frequently fetches timestamps and 
     makes use of synchronization primitives to protect the monitoring data. 
     So, performance degradation should not be over 5%.

DEFINITION OF NEW FIELDS ADDED
==============================

 Information on at most two transactions will be provided:
  1) Information on the last transaction that was completely processed by the
     reporting thread (LAST_*_TRANSACTION).
     This information is updated when the reporting thread completes the
     processing of a new transaction.
  2) Information on the transaction currently processing by the reporting thread
     (*ING_TRANSACTION).
     This information is updated when the reporting thread starts processing a
     new transaction. When the processing of the currently transaction ends,
     these fields are cleared until the processing of a new transaction begins.

The fields added to the tables, as well as their corresponding data types, are
defined below.
The fields ORIGINAL_COMMIT_TIMESTAMP and IMMEDIATE_COMMIT_TIMESTAMP may not be
reported by the immediate master (MySQL 5.7 or less). In this case, these
fields will display NULL as their values.


1) Table: performance_schema.replication_connection_status

1.1 LAST_QUEUED_TRANSACTION (char(57))- GTID of the last transaction that was
                                        queued to the relay log.

1.2 LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction that was
                                        queued in the relay log was committed
                                        on the original master.

1.3 LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction that was
                                        queued in the relay log was committed
                                        on the immediate master.

1.4 LAST_QUEUED_TRANSACTION_START_QUEUE_TIME (timestamp)- The timestamp at
                                        which the last transaction was put in
                                        the queue in the relay log by this IO
                                        thread.

1.5 LAST_QUEUED_TRANSACTION_END_QUEUE_TIME (timestamp)- The timestamp at
                                        which the last transaction was
                                        completely queued to the relay log
                                        files.

1.6 QUEUEING_TRANSACTION (char(57))-    GTID of the currently queueing
                                        transaction in the relay log.

1.7 QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp when
                                        the currently queueing transaction was
                                        committed on the original master.

1.8 QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp when
                                        the currently queueing transaction was
                                        committed on the immediate master.

1.9 QUEUEING_TRANSACTION_START_QUEUE_TIME (timestamp)- The timestamp at which
                                        the first event of the currently
                                        queueing transaction was written to the
                                        relay log by this IO thread.


2) Table: performance_schema.replication_applier_status_by_coordinator

2.1 LAST_PROCESSED_TRANSACTION (char(57))- GTID of the last transaction
                                        processed by this coordinator.

2.2 LAST_PROCESSED_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction processed by
                                        this coordinator was committed on the
                                        original master.

2.3 LAST_PROCESSED_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction processed by
                                        this coordinator was committed on the
                                        immediate master.

2.4 LAST_PROCESSED_TRANSACTION_START_BUFFER_TIME (timestamp)- The timestamp at
                                        which last transaction started to be
                                        written into the buffer of a worker
                                        thread by this coordinator thread.

2.5 LAST_PROCESSED_TRANSACTION_END_BUFFER_TIME (timestamp)- The timestamp at
                                        which last transaction was written into
                                        the buffer of a worker thread by this
                                        coordinator thread.

2.6 PROCESSING_TRANSACTION (char(57))-  GTID of the transaction this coordinator
                                        thread is currently processing.

2.7 PROCESSING_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp when
                                        the currently processing transaction was
                                        committed on the original master.

2.8 PROCESSING_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp when
                                        the currently processing transaction was
                                        committed on the immediate master.

2.9 PROCESSING_TRANSACTION_START_BUFFER_TIME (timestamp)- The timestamp at which
                                         the currently processing transaction
                                         started being written into the buffer
                                         of a worker by this coordinator.


3) Table: performance_schema.replication_applier_status_by_worker

3.1 LAST_APPLIED_TRANSACTION (char(57))- GTID of the last transaction applied by
                                        this worker.

3.2 LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction applied by
                                        this worker was committed on the
                                        original master.

3.3 LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp
                                        when the last transaction applied by
                                        this worker was committed on the
                                        immediate master.

3.4 LAST_APPLIED_TRANSACTION_START_APPLY_TIME (timestamp)- The timestamp at
                                        which this worker started applying the
                                        last applied transaction.

3.5 LAST_APPLIED_TRANSACTION_FINISH_APPLY_TIME (timestamp)- The timestamp at
                                        which this worker finished applying the
                                        last applied transaction.

3.6 APPLYING_TRANSACTION (char(57))-    GTID of the transaction this worker is
                                        currently applying.
                                        Replaces LAST_SEEN_TRANSACTION.

3.7 APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIME (timestamp)- The timestamp when
                                        the transaction this worker is currently
                                        applying committed on the original
                                        master.

3.8 APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIME (timestamp)- The timestamp when
                                        the transaction this worker is currently
                                        applying committed on the immediate
                                        master.

3.9 APPLYING_TRANSACTION_START_APPLY_TIME (timestamp)- The timestamp at which
                                        this worker started applying the
                                        currently applying transaction.

1) Since this worklog only adds new rows to already existing performance schema
   tables, no new files will be added.

2) A new struct with the information used for monitoring transactions in
   replication will be added to rpl_gtid.h with the following definition:
   struct gtid_monitoring_info {
     Gtid gtid;
     uint64 original_commit_timestamp;
     uint64 immediate_commit_timestamp;
     uint64 start_time;
     uint64 end_time;
   }

3) All timestamps in gtid_monitoring_info are stored in microseconds and
   converted to the user timezone when displayed in the performance_schema
   tables.

4) Master_info will have two new class members:
   1) gtid_monitoring_info last_queued_trx
   2) gtid_monitoring_info queueing_trx

5) Relay_log_info will have two new class members:
   1) gtid_monitoring_info last_processed_trx
   2) gtid_monitoring_info processing_trx

6) Master_info will have two new functions to manipulate the new class members:
   1) void started_queueing(Gtid_event)
   2) void finished_queueing()

7) Relay_log_info will have two new functions to manipulate the new class
   members:
   1) void started_processing(Gtid_event)
   2) void finished_processing()

8) The started_*() functions set the GTID (or Anonymous for non-identified
   transactions), the original and immediate timestamps, and the start_time.
   The finished_*() functions set the end_time, copy the *ing_trx to last_*_trx
   and empty the *ing_trx.