WL#10622: MySQL GCS: Instrument threads in GCS/XCom

Affects: Server-8.0   —   Status: Complete

Executive Summary
=================

This worklog will instrument the GCS and XCom threads and expose them
automatically in the Performance Schema (P_S) tables. It is also a
prerequisite for further instrumentation in XCom and GCS, such as mutexes,
condition variables, and memory usage.

Since XCom is currently single threaded, and executed under a GCS thread,
this worklog will focus on modifying the source code of GCS, using the
Server's Performance Schema Interface to instrument it and declaring and
registering the keys for the instrumented GCS threads.


User Stories
============

-As a MySQL DBA I want to know how many and which threads are being executed
in GCS, so that I am able to determine if there are any performance issues
due to the number of active threads.

-As a MySQL DBA I want to know which events have been executed by each
instrumented thread in GCS, so that I am able to determine if GCS is executing
correctly.

-As a MySQL GCS dev I want to have instrumented GCS threads, so I can
instrument other code elements executed by those threads in order to achieve
more detailed monitoring of GCS.


------------------------
Functional requirements:
------------------------
FR1. GCS and XCom must be able to be built with or without instrumentation.
FR2. The threads table of the Performance Schema database must contain an
entry for each instrumented GCS thread that is running, indicating if
monitoring is enabled.
FR3. An entry for the GCS Engine thread must be in the threads table during
the entire lifecycle of GCS.
FR4. An entry for the XCom thread must be in the threads table while the
node is in a group or trying to join one.
FR5. An entry for the logger consumer thread must be in the threads table
during the entire lifecycle of GCS, if the Gcs_ext_logger_impl is being used
as the logger.
FR6. An entry for the thread that kills suspect nodes must be in the threads
table while the node is in a group or trying to join one.
FR7. All entries for GCS instrumented threads must be removed from the
threads table when Group Replication is stopped.


----------------------------
Non-Functional requirements:
----------------------------
NFR1. The code modifications resulting from this WL must not degrade the
performance of GCS and XCom.
NFR2. Each instrumented code element must be uniquely identified so that
we can monitor and understand the behavior of each individual element as a
separate entity.


In order to enable thread instrumentation, the source code of GCS must be
modified as follows:
-the platform-specific code in GCS XPlatform is removed and replaced by
invocations of the corresponding instrumented mysys functions;
-the instrumentation keys are declared and registered;
-the instrumentation keys and flags are added as parameters to the
mysql_thread_create invocations that create the instrumented threads.
Since XCom is single-threaded and initialized inside GCS, it will not
be modified.

Using the Server's mysys library in GCS XPlatform allows the removal of the
platform-specific code currently included in XPlatform, and it also leverages
the Server's existing mechanism for turning instrumentation on or off at
build time.
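
The build-time switch described above can be pictured with a small,
self-contained sketch. It is not the mysys implementation: the flag
DEMO_HAVE_INSTRUMENTATION, the demo_thread_create macro and all other demo_
names are hypothetical stand-ins, and the "thread" is run synchronously to
keep the example short. It only illustrates how one call site can compile to
either an instrumented or a plain call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical build flag; comment it out to compile without instrumentation.
#define DEMO_HAVE_INSTRUMENTATION 1

typedef unsigned int demo_key_t;            // stand-in for an instrumentation key
typedef int (*demo_start_routine)(void *);  // stand-in for a thread start routine

// Records the keys seen by the instrumented path (demo only).
inline std::vector<demo_key_t> &demo_seen_keys() {
  static std::vector<demo_key_t> keys;
  return keys;
}

// Plain, uninstrumented call: runs the routine synchronously for simplicity.
inline int demo_plain_create(demo_start_routine func, void *arg) {
  return func(arg);
}

// Instrumented call: notes the key, then delegates to the plain call.
inline int demo_instrumented_create(demo_key_t key, demo_start_routine func,
                                    void *arg) {
  demo_seen_keys().push_back(key);
  return demo_plain_create(func, arg);
}

// The call site always uses demo_thread_create; the flag picks the expansion.
#ifdef DEMO_HAVE_INSTRUMENTATION
#define demo_thread_create(K, F, A) demo_instrumented_create(K, F, A)
#else
#define demo_thread_create(K, F, A) demo_plain_create(F, A)
#endif

// Example start routine: increments the counter it is given.
inline int demo_routine(void *arg) {
  int *counter = static_cast<int *>(arg);
  ++(*counter);
  return 0;
}
```

With the flag defined, a call such as demo_thread_create(7u, demo_routine,
&counter) records key 7 before running the routine; without it, the key
argument is simply dropped by the preprocessor and the caller's code does not
change.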


Interface changes
=================

This WL introduces some interface changes, illustrated by the following examples:

-Thread creation, which requires an additional parameter, the thread
instrumentation key.

gcs_xcom_control_interface.cc
-----------------------------
...
  /* Spawn XCom's main loop thread. */
  if (local_port != 0)
  {
    m_xcom_thread.create(key_GCS_THD_Gcs_xcom_control_m_xcom_thread,
                         NULL, xcom_taskmain_startup, (void *) this);
  }
...
-----------------------------


-Creating a detached thread can now be done using the
My_xp_thread::create_detached method, which wraps the procedure of setting
the attributes of the thread to be created in a detached state and invoking
My_xp_thread::create afterwards.

-----------------------------
  My_xp_thread_impl a_thread;
  a_thread.create_detached(key, thread_procedure, (void *) args);
-----------------------------


To enable the instrumentation of threads in GCS, we rely on the corresponding
functions provided by the Server's mysys library. Besides allowing us to
remove the platform-specific code contained in XPlatform, this also leverages
the Server's existing mechanism for turning instrumentation on or off at
build time.

========================================
1. Changes to the build process of GCS
========================================

On Windows, GCS will be compiled with the MYSQL_DYNAMIC_PLUGIN directive,
since it is built independently from Group Replication.
This directive is required so that GCS links correctly against the libraries
that provide the Performance Schema thread service.

Hence, the following lines will be added to the CMakeLists.txt file of
GCS:

ADD_COMPILE_FLAGS(${XCOM_SOURCES} COMPILE_FLAGS "-DMYSQL_DYNAMIC_PLUGIN")
ADD_COMPILE_FLAGS(${GCS_SOURCES} COMPILE_FLAGS "-DMYSQL_DYNAMIC_PLUGIN")


======================================
2. Changes to the source code of GCS
======================================

2.1.Instrumentation keys
==========================

GCS will use 4 thread instrumentation keys:

1-For the logger consumer thread, stored in the m_consumer member of the
Gcs_ext_logger_impl class, the key_GCS_THD_Gcs_ext_logger_impl_m_consumer
variable will be used, associated with the name
THD_Gcs_ext_logger_impl::m_consumer;

2-For the GCS engine thread, stored in the m_engine_thread member of the
Gcs_xcom_engine class, the key_GCS_THD_Gcs_xcom_engine_m_engine_thread variable
will be used, associated with the name THD_Gcs_xcom_engine::m_engine_thread;

3-For the XCom thread, stored in the m_xcom_thread member of the
Gcs_xcom_control class, the key_GCS_THD_Gcs_xcom_control_m_xcom_thread variable
will be used, associated with the name THD_Gcs_xcom_control::m_xcom_thread;

4-For the suspect expelling thread, stored in the m_suspicions_thread variable,
used inside the xcom_receive_global_view method of the Gcs_xcom_control
class, the key_GCS_THD_Gcs_xcom_control_m_suspicions_processing_thread
variable will be used, associated with the name
THD_Gcs_xcom_control::m_suspicions_processing_thread.


The thread instrumentation keys will be declared in the file
src/interface/gcs_psi.cc, and as extern in the corresponding header file
gcs_psi.h to allow their usage throughout the code. The gcs_psi.cc file
also comprises the implementation of the register_gcs_thread_keys function,
which invokes the mysql_thread_register function for their registration. The
register_gcs_thread_keys function will be invoked during the execution of
the initialize method of the Gcs_xcom_interface class. All the keys will
be registered under the "group_rpl" category, as it is the server's plugin
that includes GCS.

See the following code regarding the declaration and registration of all
the keys for instrumented threads.


gcs_psi.h
---------
extern PSI_thread_key key_GCS_THD_Gcs_ext_logger_impl_m_consumer,
                      key_GCS_THD_Gcs_xcom_engine_m_engine_thread,
                      key_GCS_THD_Gcs_xcom_control_m_xcom_thread,
                      key_GCS_THD_Gcs_xcom_control_m_suspicions_processing_thread;

---------


gcs_psi.cc
----------
PSI_thread_key key_GCS_THD_Gcs_ext_logger_impl_m_consumer,
               key_GCS_THD_Gcs_xcom_engine_m_engine_thread,
               key_GCS_THD_Gcs_xcom_control_m_xcom_thread,
               key_GCS_THD_Gcs_xcom_control_m_suspicions_processing_thread;


static PSI_thread_info all_gcs_psi_thread_keys_info[]=
{
  {&key_GCS_THD_Gcs_ext_logger_impl_m_consumer,
   "THD_Gcs_ext_logger_impl::m_consumer", PSI_FLAG_GLOBAL},
  {&key_GCS_THD_Gcs_xcom_engine_m_engine_thread,
   "THD_Gcs_xcom_engine::m_engine_thread", PSI_FLAG_GLOBAL},
  {&key_GCS_THD_Gcs_xcom_control_m_xcom_thread,
   "THD_Gcs_xcom_control::m_xcom_thread", PSI_FLAG_GLOBAL},
  {&key_GCS_THD_Gcs_xcom_control_m_suspicions_processing_thread,
   "THD_Gcs_xcom_control::m_suspicions_processing_thread", PSI_FLAG_GLOBAL}
};


void register_gcs_thread_keys()
{
  const char *category = "group_rpl";
  int count= static_cast<int>(array_elements(all_gcs_psi_thread_keys_info));

  mysql_thread_register(category, all_gcs_psi_thread_keys_info, count);
}
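
The declare-then-register flow above can be illustrated with a self-contained
mock. This is only a sketch: demo_thread_register, Demo_thread_info and the
other demo_ names are hypothetical stand-ins for the Performance Schema
interface, and the "thread/<category>/<name>" composition of the full
instrument name is an assumption about how the registered names surface in
the threads table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned int demo_thread_key;  // stand-in for PSI_thread_key

struct Demo_thread_info {
  demo_thread_key *m_key;  // where the assigned key is written
  const char *m_name;      // instrument name without its prefix
};

// Demo registry: full instrument names, indexed by (key - 1).
inline std::vector<std::string> &demo_registry() {
  static std::vector<std::string> names;
  return names;
}

// Assigns a non-zero key to each entry and stores "thread/<category>/<name>".
inline void demo_thread_register(const char *category, Demo_thread_info *info,
                                 int count) {
  for (int i = 0; i < count; ++i) {
    demo_registry().push_back(std::string("thread/") + category + "/" +
                              info[i].m_name);
    *info[i].m_key = static_cast<demo_thread_key>(demo_registry().size());
  }
}

// Looks up the full name recorded for a key; throws if the key is unknown.
inline const std::string &demo_name_of(demo_thread_key key) {
  return demo_registry().at(key - 1);
}

// Keys start at 0 (unregistered) and are filled in by registration.
static demo_thread_key demo_key_engine = 0;
static demo_thread_key demo_key_xcom = 0;

static Demo_thread_info demo_all_keys[] = {
  {&demo_key_engine, "THD_Gcs_xcom_engine::m_engine_thread"},
  {&demo_key_xcom, "THD_Gcs_xcom_control::m_xcom_thread"}
};

// Mirrors the shape of register_gcs_thread_keys for two of the four keys.
inline void demo_register_gcs_thread_keys() {
  const int count =
      static_cast<int>(sizeof(demo_all_keys) / sizeof(demo_all_keys[0]));
  demo_thread_register("group_rpl", demo_all_keys, count);
}
```

After demo_register_gcs_thread_keys() runs, each key variable holds a distinct
non-zero value, which is why registration has to happen before any thread is
created with that key.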


2.2 XPlatform
=============
The My_xp_thread_util, My_xp_thread_pthread and My_xp_thread_win classes
contain code inspired by the Server's mysys library. This code will be
replaced by invocations of the corresponding mysys functions in the
My_xp_thread_util class and in the My_xp_thread_impl class, which extends the
My_xp_thread abstract class. This removes duplicate code from my_xp_thread.cc
and all the code dealing with multiple platforms, such as the
My_xp_thread_pthread and My_xp_thread_win classes, since mysys already
provides multi-platform support.
XPlatform itself will be kept: its existing encapsulation limits the number of
changes this WL makes to the GCS codebase and will ease future changes, for
instance if an alternative to the mysys functions is ever required.

The my_xp_thread.h file will be modified as follows:
-The inclusion of the errno.h header file will be replaced by the inclusion
of the mysql/psi/mysql_thread.h file.
-All multi-platform related code, declarations and defines, mostly enclosed in
pre-processor directives such as #ifdef _WIN32, will be removed.
-Some missing functions will be added as static methods to the
My_xp_thread_util class.
-The detach method will be removed as there is no corresponding function
in mysys. To create a detached thread, the detached state of the thread
attributes must be set to NATIVE_THREAD_CREATE_DETACHED, using the
My_xp_thread_util::attr_setdetachstate method, and then the pointer to these
attributes used as a parameter of the create method. The create_detached
method was introduced to wrap these operations.
-The once method will be removed since there is no corresponding function
in mysys.
-The create method signature will be modified to receive the instrumentation
key as a parameter for the invocation of mysql_thread_create, which is the
instrumented version of the my_thread_create function. The signatures of the
remaining My_xp_thread methods do not require any change.


All these changes will result in the following my_xp_thread.h file:

my_xp_thread.h
--------------
#ifndef MY_XP_THREAD_INCLUDED
#define MY_XP_THREAD_INCLUDED

#ifndef XCOM_STANDALONE

#include 
#include 

typedef my_thread_t        native_thread_t;
typedef my_thread_handle   native_thread_handle;
typedef my_thread_attr_t   native_thread_attr_t;
typedef my_start_routine   native_start_routine;

#define NATIVE_THREAD_CREATE_DETACHED MY_THREAD_CREATE_DETACHED
#define NATIVE_THREAD_CREATE_JOINABLE MY_THREAD_CREATE_JOINABLE
#endif

#include "mysql/gcs/xplatform/my_xp_cond.h"

/**
  @class My_xp_thread

  Abstract class used to wrap threads for various platforms.

  A typical use case is:

  @code{.cpp}

  My_xp_thread *thread= new My_xp_thread_impl();
  thread->create(key, NULL, &function, &args);

  void *result;
  thread->join(&result);

  @endcode
*/
class My_xp_thread
{
public:
  /**
    Creates thread.

    @param key thread instrumentation key
    @param attr thread attributes
    @param func routine function
    @param arg function parameters
    @return success status
  */

  virtual int create(PSI_thread_key key, const native_thread_attr_t *attr,
                     native_start_routine func, void *arg)= 0;


  /**
    Creates a detached thread.

    @param key thread instrumentation key
    @param func routine function
    @param arg function parameters
    @return success status
  */

  virtual int create_detached(PSI_thread_key key,
                              native_start_routine func,
                              void *arg)= 0;


  /**
    Suspend invoking thread until this thread terminates.

    @param value_ptr pointer for a placeholder for the terminating thread status
    @return success status
  */

  virtual int join(void **value_ptr)= 0;


  /**
    Cancel this thread.

    @return success status
  */

  virtual int cancel()= 0;


  /**
    Retrieves native thread reference

    @return native thread pointer
  */

  virtual native_thread_t *get_native_thread()= 0;

  virtual ~My_xp_thread() {}
};


#ifndef XCOM_STANDALONE
class My_xp_thread_server : public My_xp_thread
{
public:
  explicit My_xp_thread_server();
  virtual ~My_xp_thread_server();

  int create(PSI_thread_key key, const native_thread_attr_t *attr,
             native_start_routine func, void *arg);
  int create_detached(PSI_thread_key key,
                      native_start_routine func, void *arg);
  int join(void **value_ptr);
  int cancel();
  native_thread_t *get_native_thread();

protected:
  native_thread_handle *m_thread_handle;
};
#endif


#ifndef XCOM_STANDALONE
class My_xp_thread_impl : public My_xp_thread_server
#endif
{
public:
  explicit My_xp_thread_impl() {}
  ~My_xp_thread_impl() {}
};


class My_xp_thread_util
{
public:
  /**
    Terminate invoking thread.

    @param value_ptr thread exit value pointer
  */

  static void exit(void *value_ptr);


  /**
    Initialize thread attributes object.

    @param attr thread attributes
    @return success status
  */

  static int attr_init(native_thread_attr_t *attr);


  /**
    Destroy thread attributes object.

    @param attr thread attributes
    @return success status
  */

  static int attr_destroy(native_thread_attr_t *attr);


  /**
    Retrieve current thread id.

    @return current thread id
  */

  static native_thread_t self();


  /**
    Compares two thread identifiers.

    @param t1 identifier of one thread
    @param t2 identifier of another thread
    @retval 0 if ids are different
    @retval some other value if ids are equal
  */

  static int equal(native_thread_t t1, native_thread_t t2);


  /**
    Sets the stack size attribute of the thread attributes object referred
    to by attr to the value specified in stacksize.

    @param attr thread attributes
    @param stacksize new attribute stack size
    @retval 0 on success
    @retval nonzero error number, on error
  */

  static int attr_setstacksize(native_thread_attr_t *attr, size_t stacksize);


  /**
    Returns the stack size attribute of the thread attributes object referred
    to by attr in the buffer pointed to by stacksize.

    @param attr thread attributes
    @param stacksize pointer to attribute stack size returning placeholder
    @retval 0 on success
    @retval nonzero error number, on error
  */

  static int attr_getstacksize(native_thread_attr_t *attr, size_t *stacksize);


  /**
    Sets the detach state attribute of the thread attributes object referred
    to by attr to the value specified in detachstate.

    @param attr thread attributes
    @param detachstate determines if the thread is to be created in a joinable
    (MY_THREAD_CREATE_JOINABLE) or a detached state (MY_THREAD_CREATE_DETACHED)
    @retval 0 on success
    @retval nonzero error number, on error
  */

  static int attr_setdetachstate(native_thread_attr_t *attr, int detachstate);


  /**
    Causes the calling thread to relinquish the CPU, and to be moved to the
    end of the queue for its static priority and another thread gets to run.
  */

  static void yield();

};

#endif // MY_XP_THREAD_INCLUDED
--------------


As a consequence of these changes, the my_xp_thread.cc file will be modified
accordingly:

-The constructor and destructor of My_xp_thread_server will be implemented
as follows:

--------------
My_xp_thread_server::My_xp_thread_server()
  :m_thread_handle(
     static_cast<native_thread_handle *>(malloc(sizeof(native_thread_handle))))
{}


My_xp_thread_server::~My_xp_thread_server()
{
  free(m_thread_handle);
}
--------------

-The native thread getter will return the thread field of the
native_thread_handle struct:

--------------
native_thread_t *My_xp_thread_server::get_native_thread()
{
  return &m_thread_handle->thread;
}
--------------

-The remaining methods will invoke the corresponding mysys thread function:

--------------
int My_xp_thread_server::create(PSI_thread_key key,
                                const native_thread_attr_t *attr,
                                native_start_routine func,
                                void *arg)
{
  return mysql_thread_create(key, m_thread_handle, attr, func, arg);
}


int My_xp_thread_server::create_detached(PSI_thread_key key,
                                         native_start_routine func,
                                         void *arg)
{
  native_thread_attr_t my_attr;

  My_xp_thread_util::attr_init(&my_attr);
  My_xp_thread_util::attr_setdetachstate(&my_attr,
                                         NATIVE_THREAD_CREATE_DETACHED);

  int ret_status = create(key, &my_attr, func, arg);

  My_xp_thread_util::attr_destroy(&my_attr);

  return ret_status;
}


int My_xp_thread_server::join(void **value_ptr)
{
  return my_thread_join(m_thread_handle, value_ptr);
}


int My_xp_thread_server::cancel()
{
  return my_thread_cancel(m_thread_handle);
}
--------------
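
To make the create / create_detached split concrete outside the server build,
here is a minimal sketch written directly against POSIX threads. It is an
illustration under assumptions, not the GCS code: Demo_thread, demo_psi_key
and the helper functions are hypothetical names, the key is only stored rather
than passed to any Performance Schema layer, and, unlike the listing above,
the sketch checks the return codes of the attribute calls.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

typedef unsigned int demo_psi_key;  // stand-in for PSI_thread_key

// Minimal POSIX-only wrapper mirroring create, create_detached and join.
class Demo_thread {
 public:
  Demo_thread() : m_last_key(0) {}

  int create(demo_psi_key key, const pthread_attr_t *attr,
             void *(*func)(void *), void *arg) {
    m_last_key = key;  // an instrumented call would hand the key on instead
    return pthread_create(&m_thread, attr, func, arg);
  }

  // Sets the detach state on a temporary attribute object, then creates.
  int create_detached(demo_psi_key key, void *(*func)(void *), void *arg) {
    pthread_attr_t attr;
    int ret = pthread_attr_init(&attr);
    if (ret != 0) return ret;
    ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (ret == 0) ret = create(key, &attr, func, arg);
    pthread_attr_destroy(&attr);
    return ret;
  }

  int join(void **value_ptr) { return pthread_join(m_thread, value_ptr); }

  demo_psi_key last_key() const { return m_last_key; }

 private:
  pthread_t m_thread;
  demo_psi_key m_last_key;
};

// Start routine: sets the atomic flag it is given and exits.
inline void *demo_set_flag(void *arg) {
  static_cast<std::atomic<bool> *>(arg)->store(true);
  return nullptr;
}

// Polls a flag for up to roughly two seconds; returns its final value.
inline bool demo_wait_for(const std::atomic<bool> &flag) {
  for (int i = 0; i < 2000 && !flag.load(); ++i)
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  return flag.load();
}
```

A detached thread cannot be joined, so a caller has to observe its effect some
other way, here by polling a flag with demo_wait_for. Any data passed to a
detached thread must also outlive the function that created it.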

-When GCS is built with the server, the My_xp_thread_util static methods
will invoke the corresponding mysys functions directly, as follows:

--------------

void My_xp_thread_util::exit(void *value_ptr)
{
  my_thread_exit(value_ptr);
}


int My_xp_thread_util::attr_init(native_thread_attr_t *attr)
{
  return my_thread_attr_init(attr);
}


int My_xp_thread_util::attr_destroy(native_thread_attr_t *attr)
{
  return my_thread_attr_destroy(attr);
}


native_thread_t My_xp_thread_util::self()
{
  return my_thread_self();
}


int My_xp_thread_util::equal(native_thread_t t1, native_thread_t t2)
{
  return my_thread_equal(t1, t2);
}


int My_xp_thread_util::attr_setstacksize(native_thread_attr_t *attr,
                                         size_t stacksize)
{
  return my_thread_attr_setstacksize(attr, stacksize);
}


int My_xp_thread_util::attr_setdetachstate(native_thread_attr_t *attr,
                                           int detachstate)
{
  return my_thread_attr_setdetachstate(attr, detachstate);
}


int My_xp_thread_util::attr_getstacksize(native_thread_attr_t *attr,
                                         size_t *stacksize)
{
  return my_thread_attr_getstacksize(attr, stacksize);
}


void My_xp_thread_util::yield()
{
  my_thread_yield();
}

--------------

2.3. Usage examples
===================

-Here is an example of the usage of the modified create method, since it is
the only method that requires changes to enable thread instrumentation.
The invocation to initialize XCom's main loop thread will have an additional
parameter, which identifies the thread's instrumentation key that was declared
and registered before, preceding the currently existing parameters.


gcs_xcom_control_interface.cc
-----------------------------
...
  /* Spawn XCom's main loop thread. */
  if (local_port != 0)
  {
    m_xcom_thread.create(key_GCS_THD_Gcs_xcom_control_m_xcom_thread,
                         NULL, xcom_taskmain_startup, (void *) this);
  }
...
-----------------------------

-The My_xp_thread::create_detached method can now be used to create a detached
thread instead of setting the attributes of the thread to be created in a
detached state and invoking My_xp_thread::create afterwards, which is still
possible to do.

-----------------------------
  My_xp_thread_impl a_thread;
  a_thread.create_detached(key, thread_procedure, (void *) args);
-----------------------------


Using the thread attributes, this code would look like this:

-----------------------------
  My_xp_thread_impl a_thread;

  native_thread_attr_t thread_attr;

  My_xp_thread_util::attr_init(&thread_attr);
  My_xp_thread_util::attr_setdetachstate(&thread_attr,
                                         NATIVE_THREAD_CREATE_DETACHED);

  a_thread.create(key_GCS_THD_Gcs_xcom_control_m_suspicions_processing_thread,
                  &thread_attr,
                  thread_procedure,
                  (void *) args);
  My_xp_thread_util::attr_destroy(&thread_attr);
-----------------------------


3. Instrumentation data
=======================

When code instrumentation is enabled, the instrumentation data is inserted
and/or updated in the appropriate Performance Schema tables at runtime. In
this case, the threads table will include an entry per thread stating whether
it is monitored or not, among other data.


3.1 threads table
=================

The threads table lists all running server threads, where each row contains
information about a thread and indicates whether monitoring and historical
event logging are enabled for it. The most relevant columns on this table
for this worklog are:
-THREAD_ID - Unique thread identifier.
-NAME - Name associated with the thread instrumentation code in the server.
-TYPE - The thread type, either FOREGROUND or BACKGROUND.
-PARENT_THREAD_ID - If this thread is a subthread (spawned by another thread),
this is the THREAD_ID value of the spawning thread.
-INSTRUMENTED - Defines whether events executed by the thread are instrumented.
-HISTORY - Defines whether historical events for the thread should be logged.

Since all GCS threads are background threads, the last two columns will be
YES by default and can be altered during the thread's lifetime.
To monitor the events of an instrumented thread, the thread_instrumentation
consumer must be enabled in the setup_consumers table, and the relevant event
instruments must be enabled in the setup_instruments table.
For historical event logging to occur for an instrumented thread, its HISTORY
column must be YES, the instrumented code elements must be enabled in the
setup_instruments table, and the appropriate history-related consumers must
be enabled in the setup_consumers table.
For example, logging wait events in the events_waits_history and
events_waits_history_long tables requires the corresponding consumers to be
set to YES in the setup_consumers table.

For further detail on the threads table see:
https://dev.mysql.com/doc/refman/8.0/en/threads-table.html


3.2 Summary tables
==================

The summary tables contain aggregated information for completed events. The
following two tables are the most relevant for GCS and XCom since they
aggregate monitoring information by thread and event name:
-The events_waits_summary_by_thread_by_event_name table aggregates
information for recent and current wait events, such as the usage count and,
if the event is timed, waiting time statistics.
-The memory_summary_by_thread_by_event_name table aggregates memory usage
statistics, such as operation counts and used memory sizes.

For further detail on the summary tables see:
https://dev.mysql.com/doc/refman/8.0/en/performance-schema-summary-tables.html