WL#7364: RBR: Enhanced Applier Thread Progress Details

Affects: Server-8.0   —   Status: Complete

EXECUTIVE SUMMARY
=================

This worklog aims at improving/fixing the stats that are shown through
SHOW PROCESSLIST, INFORMATION_SCHEMA.PROCESSLIST,
PERFORMANCE_SCHEMA.THREADS and PERFORMANCE_SCHEMA.EVENTS_STAGES_* for
the SQL thread when it is applying row events. This is motivated by
the fact that users do not have a clue of what is happening when the
SQL thread is applying a ROW event. In fact, currently wrong
information can be reported through PROCESSLIST.  Therefore, users
should at least be able to see through PERFORMANCE_SCHEMA tables what
is the correct state of the SQL thread and what exactly it is doing at
the moment.  This minimal information is of great help to troubleshoot
Row-based Replication problems.

The work on this worklog should be later extended with further
PERFORMANCE_SCHEMA additions that export many more stats for RBR.

SOLUTION
========

- Fix the command field: set it to NULL (or QUERY(?))

- Put in the "Info" field the contents of Rows_query_log_event.

- Fix a proper THD_STAGE_INFO for when the thread is applying
  row events.

- Add RBR stages for reporting how far the SQL thread has processed
  each ROW event (lets call it bundle of RBR changes).

  This is reported through performance_schema stages tables.

FUNCTIONAL REQUIREMENTS
-----------------------

F1. SHOW PROCESSLIST SHALL have the following fields updated
    with accurate information when processing a Delete, Write,
    or Update_rows_log_event:

    - command: NULL / (QUERY?)
    - db --> the database the row event is changing
    - Info --> NULL or the contents of Rows_query_log_event if
      available or the statement itself (for BINLOG '' statements)

F2. SELECT command, db, Info FROM INFORMATION_SCHEMA.PROCESSLIST SHALL
    have the following fields updated with accurate information
    when processing a Delete, Write, or Update_rows_log_event:

    - command --> same as SHOW PROCESSLIST
    - db --> same as SHOW PROCESSLIST
    - Info --> same as SHOW PROCESSLIST

F3. SELECT PROCESSLIST_DB, PROCESSLIST_STATE, PROCESSLIST_INFO 
    FROM PERFORMANCE_SCHEMA.THREADS SHALL have the following fields
    updated with accurate information when processing a Delete,
    Write or Update_rows_log_event:

    - processlist_db --> same as SHOW PROCESSLIST
    - processlist_command --> same as SHOW PROCESSLIST
    - processlist_info --> same as SHOW PROCESSLIST

F4. There SHALL be three new Performance schema stages:
    - Applying batch of row changes (write)
    - Applying batch of row changes (update)
    - Applying batch of row changes (delete)

F5. These new stages SHALL report progress (WORK_COMPLETED and
    WORK_ESTIMATED on the PERFORMANCE_SCHEMA.EVENTS_STAGES_CURRENT
    table).

F6. The data presented SHALL be discarded once the worker thread
    stops.

F7. BINLOG statements SHALL comply with F1, F2, F3 and F4.

NON-FUNCTIONAL REQUIREMENTS
---------------------------

NFR1. There SHALL NOT be a significant performance degradation on the
      binlog applier. (No more than 3%)
- Three new performance schema stages are implemented:
  - "Applying batch of row changes (write)"
  - "Applying batch of row changes (update)"
  - "Applying batch of row changes (delete)"
    - These stages are set while the applier thread applies the rows.

  As such the regular performance schema stages operations apply.

- The following commands will report more accurate information than
  that that is exposed today (see requirements section):
 - SHOW PROCESSLIST
 - SELECT command, db, Info FROM INFORMATION_SCHEMA.PROCESSLIST
 - SELECT PROCESSLIST_DB, PROCESSLIST_STATE, PROCESSLIST_INFO 
   FROM PERFORMANCE_SCHEMA.THREADS

- The following command will report accurate information when instrumentation
  for the three stages mentioned above is enabled:
  - SELECT WORK_ESTIMATED, WORK_COMPLETED
    FROM performance_schema.events_stages_current 
    WHERE EVENT_NAME LIKE 'stage/sql/Applying%'
Stage Instrumentation
---------------------

  - Progress for each stage is to be accounted for in the member
    function Rows_log_event::do_post_row_operations .

  - Estimation of work to do is done after processing the first row
    change. This is needed since estimation is performed based on
    pointers to the data in the buffer: m_rows_end, m_rows_buf and
    m_curr_row . Therefore, we know the size of the buffer and the
    approximate size of each row change after we process the first row
    and set the m_curr_row correctly.

  - Context lifecycle is set as part of the Rows_log_event.