WL#4259: Debug Sync Facility

Affects: Server-5.1 — Status: Complete

The Debug Sync Facility allows placing synchronization points in the
code:

    open_tables(...)

    DEBUG_SYNC(thd, after_open_tables);

    lock_tables(...)

When activated, a sync point can

  - Send a signal and/or
  - Wait for a signal

Nomenclature:

- signal:             A value of a global variable that persists
                      until overwritten by a new signal. The global
                      variable can also be seen as a "signal post"
                      or "flag mast". Then the signal is what is
                      attached to the "signal post" or "flag mast".

- send a signal:      Assign the value (the signal) to the global
                      variable ("set a flag") and broadcast a
                      global condition to wake those waiting for
                      a signal.

- wait for a signal:  Loop over waiting for the global condition until
                      the global value matches the wait-for signal.

By default, all sync points are inactive. They do nothing (except for
burning a couple of CPU cycles to check whether they are active).

A sync point becomes active when an action is requested for it.

For every sync point there can be one action per thread only. Every
thread can request multiple actions, but only one per sync point. In
other words, a thread can activate multiple sync points.


The proposal includes three ways to request the action for a sync
point. Only one of them will be implemented:

  - User Variables
  - SQL Syntax Extension
  - System Variable


==== User Variables (Syntax Proposal #1) ====

By setting a user variable with the same name as the sync point, you
specify a string with the requested action:

    SET @after_open_tables= 'SIGNAL opened WAIT_FOR flushed';

This method has the downside that it pollutes the name space of user
variables.

Another downside is the performance. A sync point needs to search for
"its" user variable every time the execution runs through it. [I call
this a "sync point hit" or simply a "hit".] User variables are kept in a
hash. This is optimized for user variable handling, but not for sync
points. Since good sync points are probably at frequently executed
places, a hash lookup might have a measurable performance cost.

When a sync point finds its action, it needs to parse the user variable
string, which is also bad for performance.

Because of these performance problems, the sync points must not exist in
production servers if this method is implemented. Test case design
precautions are to be taken as mentioned in the "Dbug Sleep" section.

A prototype implementation is here: http://lists.mysql.com/commits/39192
(Warning: The action string syntax is very different in that patch.)

This proposal resulted from the initial idea. The downsides are so grave
that no one supports it any more.


==== SQL Syntax Extension (Syntax Proposal #2) ====

By a special SQL statement you request an action for a sync point:

    DEBUG SYNC after_open_tables SIGNAL opened WAIT_FOR flushed;

The main advantage is the excellent readability.

In contrast to the user variables the parsing happens when the 'DEBUG'
statement is parsed. The sync point does not need to do this.

The parsed action can be put into a new container (or say data
structure) that can be optimized for sync point performance. More about
this later.

The performance of this method may be good enough to have it even in a
production server.

A possible downside is the enlargement of the parser, which might make
it slightly slower.

Another possible downside might be that the new keywords (here 'DEBUG',
'SYNC', 'SIGNAL', 'WAIT_FOR') require quoting when statements like "SHOW
CREATE TABLE" are executed under SET sql_quote_show_create=0 and table
elements are named like them.

A prototype implementation is here: http://lists.mysql.com/commits/40450

This proposal is from Monty. He favors it.


==== System Variable (Syntax Proposal #3) ====

By setting a system variable you request an action for a sync point:

    SET DEBUG_SYNC= 'after_open_tables SIGNAL opened WAIT_FOR flushed';

There is no name space pollution. System variable names are under full
control of MySQL developers. No clash with user defined names can
happen. No new syntax keywords are defined, so there is no pollution on
that side either.

Like with the SQL syntax extension, the parsing happens when the
'DEBUG_SYNC' variable is set. The sync point does not need to do this.

Like with the SQL syntax extension, the parsed action can be put into a
new container (or say data structure) that can be optimized for sync
point performance. More about this later.

There is no parser extension, so no performance hit on this side.

The performance of this method may be good enough to have it even in a
production server.

A small performance regression might be possible due to the fact that
there is one more system variable. Whenever a system variable is set,
the search for the variable in the list of variables might take some CPU
cycles longer. However, setting a system variable is not a frequent
action.

The readability of the synchronization requests is worse than with the
syntax extension. It is also a bit odd that assigning this variable does
more than set the variable value itself. But we already have similar
behavior with the 'debug' variable. Setting 'key_buffer_size' also does
a lot more than just change the variable value.

This method requires an extra parser for the string, though an extremely
trivial one. There are no loops and there is a fixed order of elements.
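
To illustrate how small such a parser can be, here is a minimal sketch
in C++. It is not the prototype's code: 'ParsedAction', 'parse_number'
and 'parse_action' are hypothetical names, the keyword order (SIGNAL,
WAIT_FOR, TIMEOUT, EXECUTE, HIT_LIMIT) is inferred from the examples in
this worklog, and the CLEAR, RESET and TEST forms are left out. The only
loop splits the string into tokens; the grammar itself is checked in a
fixed order.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type; the field names only mirror the keywords.
struct ParsedAction {
  std::string sync_point, signal, wait_for;
  unsigned long timeout = 0, execute = 0, hit_limit = 0;
  bool has_timeout = false;
};

// Parse a decimal number token; reject empty or partly numeric tokens.
bool parse_number(const std::string &token, unsigned long &out) {
  if (token.empty() || token[0] < '0' || token[0] > '9') return false;
  char *end = nullptr;
  out = std::strtoul(token.c_str(), &end, 10);
  return *end == '\0';
}

// Returns true and fills 'out' if 'text' matches
//   name [SIGNAL sig] [WAIT_FOR sig [TIMEOUT n]] [EXECUTE n] [HIT_LIMIT n]
bool parse_action(const std::string &text, ParsedAction &out) {
  std::istringstream in(text);
  std::vector<std::string> tok;
  for (std::string t; in >> t;) tok.push_back(t);
  if (tok.empty()) return false;

  out = ParsedAction();
  std::size_t i = 0;
  out.sync_point = tok[i++];

  // True if the next token is 'kw' and an argument follows it.
  auto at = [&](const char *kw) {
    return i + 1 < tok.size() && tok[i] == kw;
  };

  if (at("SIGNAL")) { out.signal = tok[i + 1]; i += 2; }
  if (at("WAIT_FOR")) {
    out.wait_for = tok[i + 1]; i += 2;
    if (at("TIMEOUT")) {
      if (!parse_number(tok[i + 1], out.timeout)) return false;
      out.has_timeout = true; i += 2;
    }
  }
  if (at("EXECUTE")) {
    if (!parse_number(tok[i + 1], out.execute)) return false;
    i += 2;
  }
  if (at("HIT_LIMIT")) {
    if (!parse_number(tok[i + 1], out.hit_limit)) return false;
    i += 2;
  }
  // Leftover tokens mean wrong order or garbage; a bare name is no action.
  return i == tok.size() && i > 1;
}
```

Because every keyword is tried exactly once and in one order, a string
such as 'name EXECUTE 2 SIGNAL sig' is rejected by the final length
check instead of needing a dedicated error path.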

A prototype implementation is here: http://lists.mysql.com/commits/40767

This proposal is favored by Ingo and, if understood correctly, by
Konstantin, Magnus, Rafal, Sergei. I haven't heard other opinions so far.


==== Synchronization Point Action Details ====

Here is an example of how to activate and use the sync points, using the
syntax extension proposal (syntax proposal #2):

    --connection conn1
    DEBUG SYNC after_open_tables SIGNAL opened WAIT_FOR flushed;
    send INSERT INTO t1 VALUES(1);
        --connection conn2
        DEBUG SYNC now WAIT_FOR opened;
        DEBUG SYNC after_abort_locks SIGNAL flushed;
        FLUSH TABLE t1;

When conn1 runs through the INSERT statement, it hits the sync point
'after_open_tables'. It notices that it is active and executes its
action. It sends the signal 'opened' and waits for another thread to
send the signal 'flushed'.

conn2 waits immediately at the special sync point 'now' for another
thread to send the 'opened' signal.

A signal remains in effect until it is overwritten. If conn1 signals
'opened' before conn2 reaches 'now', it will still find the 'opened'
signal. It does not wait in this case.

When conn2 reaches 'after_abort_locks', it signals 'flushed', which
wakes up conn1.

Normally the activation of a sync point is cleared when it has been
executed. Sometimes it is necessary to keep the sync point active for
another execution. You can add an execute count to the action:

    DEBUG SYNC name SIGNAL sig EXECUTE 3;

This will set an activation counter to 3. Each execution decrements the
counter. After the third execution the sync point becomes inactive in
this example.

One of the primary goals of this facility is to eliminate sleeps from
the test suite. In most cases it should be possible to rewrite test
cases so that they do not need to sleep. (But this facility cannot
synchronize multiple processes.) However, to support the development of
tests, and as a last resort, waiting at a sync point times out. There is
a default timeout, but it can be overridden:

    DEBUG SYNC name WAIT_FOR sig TIMEOUT 10 EXECUTE 2;

TIMEOUT 0 is special: If the signal is not present, the wait times out
immediately.

When a wait times out (even with TIMEOUT 0), a warning is generated so
that it shows up in the test result.

You can throw an error message and kill the query when a synchronization
point is hit a certain number of times:

    DEBUG SYNC name HIT_LIMIT 3;

Or combine it with signal and/or wait:

    DEBUG SYNC name SIGNAL sig EXECUTE 2 HIT_LIMIT 3;

Here the first two hits send the signal, the third hit returns the error
message and kills the query.

For cases where you are not sure that an action is taken, and thus
cleared, in every case, you can explicitly clear (deactivate) a sync
point:

    DEBUG SYNC name CLEAR;

If you want to clear all actions and clear the global signal, use:

    DEBUG SYNC RESET;

This is the only way to reset the global signal to an empty string.

For testing of the facility itself you can test a sync point:

    DEBUG SYNC name TEST;


==== Implementation ====

The facility shall be an optional part of the MySQL server. This
requires a switch to the configure tool.

===== Enabling/Disabling of the whole Facility =====

If the Debug Sync Facility is compiled into a production
server, it must be disabled by default. It is to be enabled by a mysqld
command line option:

    --debug-sync

This could also be combined with setting a default wait timeout:

    --debug-sync-timeout[=default_wait_timeout_value_in_seconds]

'default_wait_timeout_value_in_seconds' is the default timeout for the
WAIT_FOR action. If set to zero, the facility stays disabled.

The facility is enabled by default in the test suite, but can be
disabled with:

    mysql-test-run.pl ... --debug-sync-timeout=0 ...

Likewise the default wait timeout can be set:

    mysql-test-run.pl ... --debug-sync-timeout=10 ...

Pseudo code for a sync point:

    #define DEBUG_SYNC(thd, sync_point_identifier)                  \
              do {                                                  \
                if (unlikely(opt_debug_sync))                       \
                  debug_sync(thd, sync_point_identifier);           \
              } while (0)

Alternatively, one could use a timeout value of -1 for disabling the
facility. Then a default timeout of zero would be possible.

===== Enum sync points (implementation proposal #1) =====

Monty suggested having an enum with synchronization point identifiers
and an array of actions per sync point referenced from THD.

    void debug_sync(thd, sync_point_identifier)
    {
      if (!thd->debug_sync_action[sync_point_identifier].activation_count)
        return;
      execute sync point action
    }

The DEBUG SYNC statement searches for the array slot of the
sync point by using a typelib. It then modifies the array slot.

The advantage is that inactive sync points have little overhead. However
to add a new sync point one must:

    1. Edit the enum in debug_sync.h
    2. Edit the string array in debug_sync.cc
    3. Edit the file where the sync point is to be put.

Then make recompiles almost all of the server. Even if this takes just
two minutes, in my experience it takes a couple of attempts until a sync
point is at a good place. Even when moving a sync point within the same
file, one should rename it so that it reflects its position. Thus it is
easy to burn 15-20 minutes.

I would like to be able to use the facility for quick experiments ("see
what happens ..."), even if I do not aim for a new test case. I would
have to add sync points and remove them later. That is not much fun with
that much editing.

This proposal is from Monty. He favors it.


====== Alternative for enum sync points ======

One could split the enum off into an extra header file and include it
only in those files that need it. But then there is another step:

    4. Verify that the file with the new sync point includes debug_sync.h

I feel that these 3-4 steps are likely to discourage developers from
using this facility widely.


===== String type sync points (implementation proposal #2) =====

Use strings as sync point identifiers:

    DEBUG_SYNC(thd, "after_open_tables");

No enum and no string array need to be edited.

The macro would look like:

    #define DEBUG_SYNC(thd, sync_point_name)                            \
              do {                                                      \
                if (unlikely(opt_debug_sync))                           \
                  debug_sync(thd, STRING_WITH_LEN(sync_point_name));    \
              } while (0)

The DEBUG SYNC statement adds the action to a dynamic array
and keeps it sorted by string length and string. In my prototype
implementation only 2 of 12 sync point names have the same length.
String comparison will rarely be done for searching, but just for the
final verification.

The debug_sync() function would do a binary search over the array.
In my prototype implementation for merge.test, I don't have more than 3
actions active at a time. Even a very complicated case with 4-7 active
actions would have a sync point find its action in three steps.

When the debug sync facility is disabled, it has the same
overhead as the other approach. The command line option is checked first
in any case.

When there are no active actions in a thread, this is detected
immediately after the command line option check. The search function is
not even entered.


===== Enum + String sync points (implementation proposal #3) =====

One could implement both approaches. The developer can first play with
the string identifiers. When his sync points prove useful and he wants
to keep them, he could turn them into "static" sync points.

This would mean that the DEBUG SYNC statement would first try
to find the sync point in the typelib and then in the dynamic array.

The developer would have to add

    DEBUG_SYNC_STR(thd, "after_update");

And change it later to

    DEBUG_SYNC(thd, AFTER_UPDATE);

plus editing the enum and the string array.


===== Spare Enum sync points (implementation proposal #4) =====

If we decide against string type sync points, we could have a couple
of predefined, not location bound sync point identifiers that can be
used during test case development, and have to be turned into location
bound identifiers before commit/push. For example:

    #ifndef DBUG_OFF
      /* WARNING: use these only for test case development ... */
      DEBUG_SYNC_POINT_1, DEBUG_SYNC_POINT_2, ...
      "debug_sync_point_1", "debug_sync_point_2", ...
    #endif

These would have to be renamed to "after_insert" or the like, and the
corresponding enums and strings added, before commit/push.


===== Self organizing string sync points (implementation proposal #5) =====

The trick is that the DEBUG_SYNC macro provides static storage for
the sync point index, which replaces the enum value. The index is
dynamically selected when the sync point is hit for the first time.

We have a global dynamic array of sync point names and a thread specific
array of sync point actions:

    int  debug_sync_point_count= 0;           /* global */
    char **debug_sync_point_name= NULL;       /* global */

    int                  THD::debug_sync_point_count;
    st_debug_sync_action *THD::debug_sync_point_action;

The global array is empty when the server starts. Every time a sync
point is hit for the first time, a new array element is established for
it in the global array. Its array index is stored in static storage.
This index is then used to address the thread specific array. On a later
hit, the index is present already. Only the second step is taken then.

Pseudo code:

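    #define DEBUG_SYNC(thd, sync_point_name)                              \
              do {                                                        \
                if (unlikely(opt_debug_sync))                             \
                {                                                         \
                  static uint sync_point_index= 0;                        \
                  debug_sync(thd, &sync_point_index, sync_point_name);    \
                }                                                         \
              } while (0)

    void debug_sync(THD *thd, uint *sync_point_index,
                    const char *sync_point_name)
    {
      if (!*sync_point_index)
        *sync_point_index= debug_sync_point_new(thd, sync_point_name);
      /*
        sync_point_index 0 is reserved for an always inactive action.
        An index beyond the thread's array means that the sync point is
        inactive in this thread.
      */
      if (*sync_point_index >= thd->debug_sync_point_count ||
          !thd->debug_sync_point_action[*sync_point_index].activation_count)
        return;
      /* execute sync point action */
    }

    uint debug_sync_point_new(THD *thd, const char *sync_point_name)
    {
      /* Slot 0 is reserved, so the first sync point gets index 1. */
      uint  sync_point_index= debug_sync_point_count ?
                              debug_sync_point_count : 1;
      char  **new_names= (char**) realloc(debug_sync_point_name,
                                          (sync_point_index + 1) *
                                          sizeof(char*));
      /* If malloc error, return special sync_point_index 0 */
      if (!new_names)
        return(0);
      if (!debug_sync_point_count)
        new_names[0]= NULL;                 /* the reserved slot */
      debug_sync_point_name= new_names;
      debug_sync_point_name[sync_point_index]= (char*) sync_point_name;
      debug_sync_point_count= sync_point_index + 1;
      return(sync_point_index);
    }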

When a thread starts, it allocates an action array of the current size
of the global sync point name array. When a "new" sync point is hit in a
thread, its index is beyond the thread's array. This means that the sync
point is inactive in this thread. When an action is requested for a
"new" sync point, the thread array is reallocated to the current size of
the global array.

This is a complicated algorithm, but now we find the action for a sync
point relatively quickly like so (beginning from the second hit of a
sync point):

    if (unlikely(opt_debug_sync) &&
        sync_point_index &&
        sync_point_index < thd->debug_sync_point_count &&
        thd->debug_sync_point_action[sync_point_index].activation_count)

These are just two more conditions than with enum sync points
(sync_point_index && sync_point_index < thd->debug_sync_point_count).

A downside of this proposal is that a sync point must be run through at
least once before an action can be requested for it. Otherwise it would
not appear in the sync point list and no index can be retrieved. While
this should be easy to do in the tests, it is still somewhat
uncomfortable, if not annoying.

A possible way to get around this could be to "queue" the request if the
sync point is not registered. A sync point that is hit for the first
time registers itself and scans the queue.


=== Further reading ===

For a discussion of other methods to synchronize threads see
http://forge.mysql.com/wiki/MySQL_Internals_Test_Synchronization


Comments:

Implementation proposal #5:
To limit run-time impact, in C++ we can use constructors to register points on
load. In GCC and ICC we could use __attribute__((constructor)).
Also, mallocs could go away.

Syntax proposal #3:
The variable name should be debug_synchronize (I'd personally prefer debug_sync,
even). It's not SQL and doesn't deal with data - it works on and refers to
the server's source code, just as the "debug" variable does. It's cleaner to have
a dedicated namespace for this stuff. It will help in the future too, when
we add more similar features.

The Debug Sync Facility allows placing synchronization points in the
code:

    open_tables(...)

    DEBUG_SYNC(thd, "after_open_tables");

    lock_tables(...)

When activated, a sync point can

  - Send a signal and/or
  - Wait for a signal

Nomenclature:

- signal:             A value of a global variable that persists
                      until overwritten by a new signal. The global
                      variable can also be seen as a "signal post"
                      or "flag mast". Then the signal is what is
                      attached to the "signal post" or "flag mast".

- send a signal:      Assign the value (the signal) to the global
                      variable ("set a flag") and broadcast a
                      global condition to wake those waiting for
                      a signal.

- wait for a signal:  Loop over waiting for the global condition until
                      the global value matches the wait-for signal.

By default, all sync points are inactive. They do nothing (except for
burning a couple of CPU cycles to check whether they are active).

A sync point becomes active when an action is requested for it:

    SET DEBUG_SYNC= 'after_open_tables SIGNAL opened WAIT_FOR flushed';

This activates the sync point 'after_open_tables'. It requests it to
send the signal 'opened' and wait for another thread to send the signal
'flushed' when the thread's execution runs through the sync point.

For every sync point there can be one action per thread only. Every
thread can request multiple actions, but only one per sync point. In
other words, a thread can activate multiple sync points.

Here is an example of how to activate and use the sync points:

    --connection conn1
    SET DEBUG_SYNC= 'after_open_tables SIGNAL opened WAIT_FOR flushed';
    send INSERT INTO t1 VALUES(1);
        --connection conn2
        SET DEBUG_SYNC= 'now WAIT_FOR opened';
        SET DEBUG_SYNC= 'after_abort_locks SIGNAL flushed';
        FLUSH TABLE t1;

When conn1 runs through the INSERT statement, it hits the sync point
'after_open_tables'. It notices that it is active and executes its
action. It sends the signal 'opened' and waits for another thread to
send the signal 'flushed'.

conn2 waits immediately at the special sync point 'now' for another
thread to send the 'opened' signal.

A signal remains in effect until it is overwritten. If conn1 signals
'opened' before conn2 reaches 'now', it will still find the 'opened'
signal. It does not wait in this case.

When conn2 reaches 'after_abort_locks', it signals 'flushed', which
wakes up conn1.

Normally the activation of a sync point is cleared when it has been
executed. Sometimes it is necessary to keep the sync point active for
another execution. You can add an execute count to the action:

    SET DEBUG_SYNC= 'name SIGNAL sig EXECUTE 3';

This will set an activation counter to 3. Each execution decrements the
counter. After the third execution the sync point becomes inactive in
this example.

One of the primary goals of this facility is to eliminate sleeps from
the test suite. In most cases it should be possible to rewrite test
cases so that they do not need to sleep. (But this facility cannot
synchronize multiple processes.) However, to support the development of
tests, and as a last resort, waiting at a sync point times out. There is
a default timeout, but it can be overridden:

    SET DEBUG_SYNC= 'name WAIT_FOR sig TIMEOUT 10 EXECUTE 2';

TIMEOUT 0 is special: If the signal is not present, the wait times out
immediately.

When a wait times out (even with TIMEOUT 0), a warning is generated so
that it shows up in the test result.

You can throw an error message and kill the query when a synchronization
point is hit a certain number of times:

    SET DEBUG_SYNC= 'name HIT_LIMIT 3';

Or combine it with signal and/or wait:

    SET DEBUG_SYNC= 'name SIGNAL sig EXECUTE 2 HIT_LIMIT 3';

Here the first two hits send the signal, the third hit returns the error
message and kills the query.

For cases where you are not sure that an action is taken, and thus
cleared, in every case, you can explicitly clear (deactivate) a sync
point:

    SET DEBUG_SYNC= 'name CLEAR';

If you want to clear all actions and clear the global signal, use:

    SET DEBUG_SYNC= 'RESET';

This is the only way to reset the global signal to an empty string.

For testing of the facility itself you can test a sync point:

    SET DEBUG_SYNC= 'name TEST';


==== Activation/Deactivation ====

The facility is an optional part of the MySQL server.

    ./configure --enable-debug-sync

The Debug Sync Facility, when compiled in, is disabled by default. It
can be enabled by a mysqld command line option:

    --debug-sync-timeout[=default_wait_timeout_value_in_seconds]

'default_wait_timeout_value_in_seconds' is the default timeout for the
WAIT_FOR action. If set to zero, the facility stays disabled.

The facility is enabled by default in the test suite, but can be
disabled with:

    mysql-test-run.pl ... --debug-sync-timeout=0 ...

Likewise the default wait timeout can be set:

    mysql-test-run.pl ... --debug-sync-timeout=10 ...


==== Implementation ====

Pseudo code for a sync point:

    #define DEBUG_SYNC(thd, sync_point_name)                            \
              do {                                                      \
                if (unlikely(opt_debug_sync))                           \
                  debug_sync(thd, STRING_WITH_LEN(sync_point_name));    \
              } while (0)

The sync point performs a binary search in a sorted array of actions
for this thread.

The SET DEBUG_SYNC statement adds a requested action to the array or
overwrites an existing action for the same sync point. When it adds a
new action, the array is sorted again.
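
The lookup described above can be sketched as follows. This is only an
illustration under assumptions: 'Action' is a reduced, hypothetical
record, std::vector stands in for the per-thread array, and
'name_less', 'sort_actions' and 'find_action' are made-up names. The
ordering by length first and by bytes second is the one this worklog
describes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Reduced, hypothetical action record: just a name and a counter.
struct Action {
  std::string sync_point;
  unsigned long activation_count;
};

// Order by length first, then by bytes, so that most comparisons are
// decided by the cheap length check alone.
bool name_less(const char *a, std::size_t alen,
               const char *b, std::size_t blen) {
  if (alen != blen) return alen < blen;
  return std::memcmp(a, b, alen) < 0;
}

// Re-sort after a new action has been added.
void sort_actions(std::vector<Action> &actions) {
  std::sort(actions.begin(), actions.end(),
            [](const Action &x, const Action &y) {
              return name_less(x.sync_point.data(), x.sync_point.size(),
                               y.sync_point.data(), y.sync_point.size());
            });
}

// Binary search; returns nullptr if no action exists for the sync point.
Action *find_action(std::vector<Action> &actions,
                    const char *name, std::size_t len) {
  typedef std::pair<const char *, std::size_t> Key;
  auto it = std::lower_bound(
      actions.begin(), actions.end(), Key(name, len),
      [](const Action &x, const Key &key) {
        return name_less(x.sync_point.data(), x.sync_point.size(),
                         key.first, key.second);
      });
  if (it == actions.end() || it->sync_point.size() != len ||
      std::memcmp(it->sync_point.data(), name, len) != 0)
    return nullptr;
  return &*it;
}
```

With three actions named 'now', 'after_open_tables' and
'after_abort_locks', the sorted order is 'now', 'after_abort_locks',
'after_open_tables': the two 17-character names are only told apart by
the byte comparison.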

Autoconf extension for the facility:

  # Debug Sync Facility. NOTE: depends on 'with_debug'. Must be behind it.
  AC_MSG_CHECKING(if Debug Sync Facility should be enabled.)
  AC_ARG_ENABLE(debug_sync,
                AS_HELP_STRING([--enable-debug-sync],
                               [Build a version with Debug Sync Facility]),
                [ enable_debug_sync=$enableval ],
                [ enable_debug_sync=$with_debug ])
  if test "$enable_debug_sync" != "no"
  then
    AC_DEFINE([ENABLED_DEBUG_SYNC], [1],
              [If Debug Sync Facility should be enabled])
    AC_MSG_RESULT([yes]) 
  else
    AC_MSG_RESULT([no])
  fi


Use the autoconf defined symbols in the code like so:

  #if defined(ENABLED_DEBUG_SYNC)
  #endif /* defined(ENABLED_DEBUG_SYNC) */


Mysqld option for enabling the facility:

  #if defined(ENABLED_DEBUG_SYNC)
    {"debug-sync-timeout", OPT_DEBUG_SYNC_TIMEOUT,
     "Enable the debug sync facility "
     "and optionally specify a default wait timeout in seconds. "
     "A zero value keeps the facility disabled.",
     (uchar**) &opt_debug_sync_timeout, 0,
     0, GET_UINT, OPT_ARG, 0, 0, UINT_MAX, 0, 0, 0},
  #endif /* defined(ENABLED_DEBUG_SYNC) */


The system variable:

  #if defined(ENABLED_DEBUG_SYNC)
  /* Debug Sync Facility. Implemented in debug_sync.cc. */
  class sys_var_debug_sync :public sys_var_thd
  {
  public:
    sys_var_debug_sync(sys_var_chain *chain, const char *name_arg)
      :sys_var_thd(name_arg)
    { chain_sys_var(chain); }
    bool check(THD *thd, set_var *var);
    bool update(THD *thd, set_var *var);
    SHOW_TYPE show_type() { return SHOW_CHAR; }
    bool check_update_type(Item_result type) { return type != STRING_RESULT; }
    uchar *value_ptr(THD *thd, enum_var_type type, LEX_STRING *base);
  };
  sys_var_debug_sync sys_debug_sync(&vars, "debug_sync");
  #endif /* defined(ENABLED_DEBUG_SYNC) */

Example sync point:

  DEBUG_SYNC(thd, "before_lock_tables_takes_lock");


Macro to be put in the code at synchronization points:

  #if defined(ENABLED_DEBUG_SYNC)
  #define DEBUG_SYNC(_thd_, _sync_point_name_)                            \
            do { if (unlikely(opt_debug_sync_timeout))                    \
                 debug_sync(_thd_, STRING_WITH_LEN(_sync_point_name_));   \
               } while (0)
  #else /* defined(ENABLED_DEBUG_SYNC) */
  #define DEBUG_SYNC(_thd_, _sync_point_name_)    /* disabled DEBUG_SYNC */
  #endif /* defined(ENABLED_DEBUG_SYNC) */
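
The following self-contained sketch shows why the macro is wrapped in
do { } while (0) and how STRING_WITH_LEN turns one literal into a
pointer and a length. Everything except the macro shape is a stand-in
for illustration: the 'THD' struct, the recording 'debug_sync' stub and
'open_and_lock' are hypothetical, and unlikely() is omitted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same idea as MySQL's STRING_WITH_LEN: a literal plus its length,
// computed at compile time, passed as two arguments.
#define STRING_WITH_LEN(X) (X), ((std::size_t) (sizeof(X) - 1))

// Stand-ins for illustration only; the real THD is far larger.
struct THD { std::vector<std::string> hits; };
unsigned opt_debug_sync_timeout = 0;  // 0 keeps the facility disabled

// Stub that only records which sync point was reached.
void debug_sync(THD *thd, const char *name, std::size_t name_len) {
  thd->hits.push_back(std::string(name, name_len));
}

#define DEBUG_SYNC(_thd_, _sync_point_name_)                            \
  do { if (opt_debug_sync_timeout)                                      \
         debug_sync(_thd_, STRING_WITH_LEN(_sync_point_name_));         \
     } while (0)

// The macro behaves like a single statement, so it can sit in an
// unbraced if/else without capturing the else branch.
void open_and_lock(THD *thd) {
  if (thd)
    DEBUG_SYNC(thd, "after_open_tables");
  else
    return;
}
```

While the option is zero, the sync point costs one test of a global and
the stub is never entered.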


Global data for the facility:

  uint opt_debug_sync_timeout; // Defined in mysqld.cc


Global static data in debug_sync.cc:

  /**
    Definitions for the debug sync facility.
    1. Global string variable to hold a "signal" ("signal post", "flag mast").
    2. Global condition variable for signalling and waiting.
    3. Global mutex to synchronize access to the above.
  */
  struct st_debug_sync_globals
  {
    String                ds_signal;              /* signal variable */
    pthread_cond_t        ds_cond;                /* condition variable */
    pthread_mutex_t       ds_mutex;               /* mutex variable */
    ulonglong             dsp_hits;               /* statistics */
    ulonglong             dsp_executed;           /* statistics */
    ulonglong             dsp_max_active;         /* statistics */
  };
  static st_debug_sync_globals debug_sync_global; /* All globals in one obj */
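
How the three globals might cooperate for "send a signal" and "wait for
a signal" can be sketched like this. It is an assumption-laden
illustration, not the code of debug_sync.cc: it uses std::mutex and
std::condition_variable instead of the pthread types, and 'SignalPost',
'send' and 'wait_for' are hypothetical names.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <string>

// Hypothetical counterpart of the signal, condition and mutex globals.
struct SignalPost {
  std::mutex mutex;
  std::condition_variable cond;
  std::string signal;  // persists until overwritten by a new signal

  // Send: set the flag, then wake everybody who is waiting.
  void send(const std::string &sig) {
    {
      std::lock_guard<std::mutex> guard(mutex);
      signal = sig;
    }
    cond.notify_all();
  }

  // Wait: loop on the condition until the flag matches or the timeout
  // expires. Returns false on timeout. A timeout of zero returns at
  // once if the signal is not present.
  bool wait_for(const std::string &sig, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex);
    return cond.wait_for(lock, timeout, [&] { return signal == sig; });
  }
};
```

Because the flag is checked before blocking, a waiter that arrives after
the sender still finds the signal and does not wait.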


Thread specific data:

  struct st_debug_sync_control
  {
    st_debug_sync_action  *ds_action;             /* array of actions */
    uint                  ds_active;              /* # active actions */
    uint                  ds_allocated;           /* # allocated actions */
    ulonglong             dsp_hits;               /* statistics */
    ulonglong             dsp_executed;           /* statistics */
    ulonglong             dsp_max_active;         /* statistics */
  };


Extension of THD:

  #if defined(ENABLED_DEBUG_SYNC)
    struct st_debug_sync_control *debug_sync_control;
  #endif /* defined(ENABLED_DEBUG_SYNC) */


Definition of an action:

  struct st_debug_sync_action
  {
    ulong         activation_count;       /* max(hit_limit, execute) */
    ulong         hit_limit;              /* hits before kill query */
    ulong         execute;                /* executes before self-clear */
    ulong         timeout;                /* wait_for timeout */
    String        signal;                 /* signal to send */
    String        wait_for;               /* signal to wait for */
    String        sync_point;             /* sync point name */
    bool          need_sort;              /* if new action, array needs sort */
  };
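
The interplay of the three counters can be sketched as below. The field
names and their meaning come from the struct above; the bookkeeping per
hit is an assumption made for illustration, and 'Counters',
'HitResult', 'make_counters' and 'hit' are hypothetical names.

```cpp
#include <algorithm>

// Only the counters of st_debug_sync_action are modeled here.
struct Counters {
  unsigned long activation_count;  // max(hit_limit, execute)
  unsigned long hit_limit;         // hits before kill query
  unsigned long execute;           // executes before self-clear
};

// What a single hit of the sync point did.
struct HitResult {
  bool executed;       // signal sent and/or wait performed
  bool limit_reached;  // error raised, query killed
};

Counters make_counters(unsigned long execute, unsigned long hit_limit) {
  Counters c;
  c.activation_count = std::max(hit_limit, execute);
  c.hit_limit = hit_limit;
  c.execute = execute;
  return c;
}

// Assumed bookkeeping for one hit; an inactive sync point does nothing.
HitResult hit(Counters &c) {
  HitResult r = {false, false};
  if (c.activation_count == 0) return r;
  if (c.execute > 0) { --c.execute; r.executed = true; }
  if (c.hit_limit > 0 && --c.hit_limit == 0) r.limit_reached = true;
  --c.activation_count;
  return r;
}
```

For 'name SIGNAL sig EXECUTE 2 HIT_LIMIT 3' this gives two executing
hits, a third hit that only reports the limit, and an inactive sync
point afterwards.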


Global functions:

  /**
    Initialize the debug sync facility at server start.

    @return status
      @retval     0       ok
      @retval     != 0    error
  */
  int debug_sync_init(void);

  /**
    End the debug sync facility.

    @description
      This is called at server shutdown or after a thread initialization error.
  */
  void debug_sync_end(void);

  /**
    Initialize the debug sync facility at thread start.

    @param[in]    thd             thread handle
  */
  void debug_sync_init_thread(THD *thd);

  /**
    End the debug sync facility at thread end.

    @param[in]    thd             thread handle
  */
  void debug_sync_end_thread(THD *thd);

  /**
    Execute requested action at a synchronization point.

    @param[in]     thd                thread handle
    @param[in]     sync_point_name    name of synchronization point
    @param[in]     name_len           length of sync point name
  */
  void debug_sync(THD *thd, const char *sync_point_name, size_t name_len);


Important static functions in debug_sync.cc:

  /**
    Execute requested action at a synchronization point.

    @param[in]    thd                 thread handle
    @param[in]    action              action to be executed

    @note
      This is to be called only if activation count > 0.
  */
  static void debug_sync_execute(THD *thd, st_debug_sync_action *action);

  /**
    Evaluate a debug sync action string.

    @param[in]        thd             thread handle
    @param[in,out]    action_str      action string to receive '\0' terminators

    @return           status
      @retval         FALSE           ok
      @retval         TRUE            error

    @description
      This is called when the DEBUG_SYNC system variable is set.
      Parse action string, build a debug sync action, activate it.

    @note
      The input string needs to be ASCII NUL ('\0') terminated. We split
      nul-terminated tokens in it without copy.
  */
  static bool debug_sync_eval_action(THD *thd, char *action_str);

  /**
    Set a debug sync action.

    @param[in]    thd             thread handle
    @param[in]    action          synchronization action

    @return       status
      @retval     FALSE           ok
      @retval     TRUE            error

    @description
      This is called from the debug sync parser.
  */
  static bool debug_sync_set_action(THD *thd, st_debug_sync_action *action);

  /**
    Get a debug sync action.

    @param[in]    thd             thread handle
    @param[in]    dsp_name        debug sync point name
    @param[in]    name_len        length of sync point name

    @return       action
      @retval     != NULL         ok
      @retval     NULL            error

    @description
      Find the debug sync action for a debug sync point or make a new one.
  */
  static st_debug_sync_action *debug_sync_get_action(THD *thd,
                                                     const char *dsp_name,
                                                     uint name_len);

  /**
    Remove a debug sync action.

    @param[in]    ds_control      control object
    @param[in]    action          action to be removed

    @description
      Removing an action mainly means decrementing the ds_active counter.
      But if the action is between other active actions in the array, then
      the array needs to be shrunk. The active actions above the one to
      be removed have to be moved down by one slot.
  */
  static void debug_sync_remove_action(st_debug_sync_control *ds_control,
                                       st_debug_sync_action *action);

  /**
    Reset the debug sync facility.

    @param[in]    thd             thread handle

    @description
      Remove all actions of this thread.
      Clear the global signal.
  */
  static void debug_sync_reset(THD *thd);

  /**
    Find a debug sync action.

    @param[in]    actionarr       array of debug sync actions
    @param[in]    quantity        number of actions in array
    @param[in]    dsp_name        name of debug sync point to find
    @param[in]    name_len        length of name of debug sync point

    @return       action
      @retval     != NULL         found sync point in array
      @retval     NULL            not found

    @description
      Binary search. Array needs to be sorted by length, sync point name.
  */
  static st_debug_sync_action *debug_sync_find(st_debug_sync_action *actionarr,
                                               int quantity,
                                               const char* dsp_name,
                                               uint name_len);
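
The slot shuffling described for debug_sync_remove_action() above can
be sketched as follows. This is a simplified illustration under
assumptions: 'Action' and 'Control' are reduced stand-ins for
st_debug_sync_action and st_debug_sync_control, std::vector replaces
the allocated array, and 'remove_action' is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Reduced stand-in for st_debug_sync_action.
struct Action { std::string sync_point; };

// Reduced stand-in for st_debug_sync_control.
struct Control {
  std::vector<Action> ds_action;  // allocated slots
  unsigned ds_active;             // # active actions, kept at the front
};

// 'action' must point at one of the active slots of 'control'.
void remove_action(Control &control, Action *action) {
  std::size_t idx = action - control.ds_action.data();
  // Move every active action above the removed one down by one slot.
  for (std::size_t i = idx; i + 1 < control.ds_active; ++i)
    control.ds_action[i] = std::move(control.ds_action[i + 1]);
  --control.ds_active;
}
```

Moving the tail down keeps the remaining actions in their sorted order,
so the array does not need to be sorted again after a removal.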


Non-C++ files need hooks. Example: thr_lock.c

In global space:

  #if defined(ENABLED_DEBUG_SYNC)
  /**
    Global pointer to be set if callback function is defined
    (e.g. in mysqld). See debug_sync.cc.
  */
  void (*debug_sync_wait_for_lock_callback_ptr)(void);
  #endif /* defined(ENABLED_DEBUG_SYNC) */

And in wait_for_lock():

  #if defined(ENABLED_DEBUG_SYNC)
    /*
      One can use this to signal when a thread is going to wait for a lock.
      See debug_sync.cc.
    */
    if (debug_sync_wait_for_lock_callback_ptr)
      (*debug_sync_wait_for_lock_callback_ptr)();
  #endif /* defined(ENABLED_DEBUG_SYNC) */


New error messages:

  ER_DEBUG_SYNC_TIMEOUT
    eng "debug sync point wait timed out"
  ER_DEBUG_SYNC_HIT_LIMIT
    eng "debug sync point hit limit reached"


New test include include/have_debug_sync.inc:

  --require r/have_debug_sync.require
  disable_query_log;
  let $value= query_get_value(SHOW VARIABLES LIKE 'debug_sync', Value, 1);
  eval SELECT ('$value' LIKE 'ON %') AS debug_sync;
  enable_query_log;

Required result:

  debug_sync
  1

This detects a facility that is not compiled in as well as one that is
disabled.