WL#7152: PERFORMANCE SCHEMA, EXTRACT DIGEST

Affects: Server-Prototype Only   —   Status: Complete

The performance schema currently computes a statement DIGEST.

This task is to allow the SQL layer to access a statement digest
once parsing is done.


User Documentation
==================

Code refactoring. No user documentation required.
Pure refactoring: no functional requirements.
This task alone is only about refactoring,
to move the digest-related code:
- from storage/perfschema (the performance schema implementation)
- to sql/ (the SQL layer of the server)

As a result of refactoring, the following APIs are available in the SQL layer:
- compute_digest_md5(), to compute a query digest hash
- compute_digest_text(), to compute a query digest text

Overview of code moved:
=======================

Moved code
- from file 'storage/perfschema/gen_pfs_lex_token.cc' (deleted)
- to file 'sql/gen_lex_token.cc' (new file)

Moved code
- from storage/perfschema/pfs_digest.h
- from storage/perfschema/pfs_digest.cc
- to file 'sql/sql_digest.h' (new file)
- to file 'sql/sql_digest_stream.h' (new file)
- to file 'sql/sql_digest.cc' (new file)

Renaming of code:
=================

#define PSI_MAX_DIGEST_STORAGE_SIZE --> #define MAX_DIGEST_STORAGE_SIZE
#define PFS_MD5_SIZE --> #define MD5_HASH_SIZE
#define PFS_SIZE_OF_A_TOKEN --> #define SIZE_OF_A_TOKEN

macro MYSQL_ADD_TOKEN() --> method Lex_input_stream::add_digest_token()

struct PSI_digest_storage --> struct sql_digest_storage
function digest_reset() --> method sql_digest_storage::reset()
function digest_copy() --> method sql_digest_storage::copy()

struct PSI_digest_locker_state --> struct sql_digest_state

function get_digest_text() --> function compute_digest_text()

Changes to the server:
======================

1. class THD

New attributes:
- THD::m_digest_state
- THD::m_digest

The digest of a statement has a life cycle longer than just a call to the
parser, because the server may need to read a statement digest *after* parsing
is completed.
To achieve that, the memory used to represent a digest is stored in THD.

2. Invoking the parser

A digest computation is a by-product of parsing.
A digest can be used independently by the performance schema itself,
and by the calling code.
Not every parser invocation needs to compute a digest, as the parser is used in
many different contexts.

2.1 When a digest is never required:

Before invoking the parser:
  sql_digest_state *parent_digest;
  parent_digest= thd->m_digest;
  thd->m_digest= NULL;

After invoking the parser:
  thd->m_digest= parent_digest;

2.2 When a digest might be required by the performance schema:

Before invoking the parser:
  sql_digest_state *parent_digest;
  parent_digest= thd->m_digest;
  thd->m_digest= & thd->m_digest_state; /* or any valid storage */

After invoking the parser:
  thd->m_digest= parent_digest;

The decision to compute the digest or not depends on the runtime
configuration used in the performance schema.
See MYSQL_DIGEST_START()
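
The save and restore steps of 2.1 and 2.2 can also be written as a scope guard.
The following is only a sketch: Digest_context_guard is a hypothetical helper
that is not part of this task, and THD and sql_digest_state are reduced to
stand-in types so that the example compiles on its own.

```cpp
#include <cassert>

// Stand-in types: simplified placeholders, not the real server definitions.
struct sql_digest_state { };

struct THD
{
  sql_digest_state m_digest_state;        // digest memory owned by the session
  sql_digest_state *m_digest = nullptr;   // where the parser writes, or nullptr
};

// Hypothetical helper (an assumption, not part of the server code):
// saves thd->m_digest, installs a new value, and restores the saved
// value when the guard goes out of scope.
class Digest_context_guard
{
public:
  Digest_context_guard(THD *thd, sql_digest_state *new_digest)
    : m_thd(thd), m_parent_digest(thd->m_digest)
  {
    m_thd->m_digest = new_digest;
  }

  ~Digest_context_guard() { m_thd->m_digest = m_parent_digest; }

  Digest_context_guard(const Digest_context_guard &) = delete;
  Digest_context_guard &operator=(const Digest_context_guard &) = delete;

private:
  THD *m_thd;
  sql_digest_state *m_parent_digest;
};

// Case 2.1: a digest is never required.
bool parse_without_digest(THD *thd)
{
  Digest_context_guard guard(thd, nullptr);
  /* the parser would be invoked here */
  return thd->m_digest == nullptr;
}

// Case 2.2: a digest might be required by the performance schema.
bool parse_with_optional_digest(THD *thd)
{
  Digest_context_guard guard(thd, &thd->m_digest_state);
  /* the parser would be invoked here */
  return thd->m_digest == &thd->m_digest_state;
}
```

With a guard, the parent digest is restored on every exit path, including
early returns on parse errors.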

2.3 When a digest is required by the server code:

Before invoking the parser:
  Parser_state parser_state;
  sql_digest_state *parent_digest;
  parent_digest= thd->m_digest;
  thd->m_digest= & thd->m_digest_state; /* or any valid storage */
  parser_state.m_input.m_compute_digest= true;

After invoking the parser:
  thd->m_digest->m_digest_storage contains the resulting digest.
  Functions compute_digest_text() and compute_digest_md5() can be used
  to read the result.
  thd->m_digest= parent_digest; /* restore the context */

Explicitly setting the compute digest flag in the parser input
forces the parser to compute a digest, even when the performance schema
instrumentation is disabled.

The Parser_input structure, part of Parser_state, groups all the parameters
that affect the parser behavior, for clarity.
Currently, it consists of only one flag (whether to compute a digest or not).
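
The interaction between the m_compute_digest flag and the performance schema
configuration can be sketched as follows. This is an illustration under
assumptions: all types are stand-ins, mock_parse() is a hypothetical
replacement for the real parser, and the pfs_digest_enabled parameter stands
for whatever the performance schema runtime configuration decides.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in types: simplified placeholders, not the real server definitions.
struct sql_digest_storage
{
  size_t m_byte_count = 0;
  void reset() { m_byte_count = 0; }
};

struct sql_digest_state { sql_digest_storage m_digest_storage; };
struct Parser_input { bool m_compute_digest = false; };
struct Parser_state { Parser_input m_input; };

struct THD
{
  sql_digest_state m_digest_state;
  sql_digest_state *m_digest = nullptr;
};

// Hypothetical mock of the parser (an assumption for illustration):
// a digest is collected only when storage is provided, and either the
// caller forces it or the performance schema asks for it.
void mock_parse(THD *thd, const Parser_state &parser_state,
                bool pfs_digest_enabled)
{
  if (thd->m_digest == nullptr)
    return;
  if (parser_state.m_input.m_compute_digest || pfs_digest_enabled)
    thd->m_digest->m_digest_storage.m_byte_count = 4; // pretend token bytes
}

// Case 2.3, with the flag made a parameter to show its effect.
size_t parse_and_get_digest_size(THD *thd, bool pfs_digest_enabled,
                                 bool force_digest)
{
  Parser_state parser_state;
  sql_digest_state *parent_digest = thd->m_digest;

  thd->m_digest = &thd->m_digest_state;
  thd->m_digest->m_digest_storage.reset();
  parser_state.m_input.m_compute_digest = force_digest;

  mock_parse(thd, parser_state, pfs_digest_enabled);

  // Read the result before restoring the context.
  size_t byte_count = thd->m_digest->m_digest_storage.m_byte_count;
  thd->m_digest = parent_digest;
  return byte_count;
}
```

In this sketch, a forced digest is produced even when pfs_digest_enabled is
false, which mirrors the behavior described for case 2.3.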

New API exposed:
================

After parsing is done, the digest computed by the parser can be obtained using
the following functions.

Function compute_digest_md5()
-----------------------------

/**
  Compute a digest hash.
  @param digest_storage The digest
  @param [out] md5 The computed digest hash. This parameter is a buffer
  of size @c MD5_HASH_SIZE.
*/
void compute_digest_md5(
  const sql_digest_storage *digest_storage,
  unsigned char *md5);

Function compute_digest_text()
------------------------------

/**
  Compute a digest text.
  A 'digest text' is a textual representation of a query,
  where:
  - comments are removed,
  - insignificant spaces are removed,
  - literal values are replaced with a special '?' marker,
  - lists of values are collapsed using a shorter notation
  @param digest_storage The digest
  @param [out] digest_text
  @param digest_text_length Size of @c digest_text.
  @param [out] truncated true if the text representation was truncated
*/
void compute_digest_text(const sql_digest_storage *digest_storage,
                         char *digest_text, size_t digest_text_length,
                         bool *truncated);
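
Both functions write into buffers provided by the caller. The sketch below
shows the calling convention only. The two function bodies are placeholder
stubs so that the example compiles on its own (the real implementations are in
sql/sql_digest.cc), sql_digest_storage is a stand-in type, and the truncation
and NUL termination behavior of the stub is an assumption. Digest_result and
read_digest() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

#define MD5_HASH_SIZE 16 /* an MD5 hash is 16 bytes */

// Stand-in type: holds a ready-made text instead of a token stream.
struct sql_digest_storage { std::string m_fake_text; };

// Placeholder stub, NOT the real implementation: fills the hash with zeros.
void compute_digest_md5(const sql_digest_storage *, unsigned char *md5)
{
  std::memset(md5, 0, MD5_HASH_SIZE);
}

// Placeholder stub, NOT the real implementation: copies the fake text,
// truncating to the buffer size (assumed behavior, NUL terminated).
void compute_digest_text(const sql_digest_storage *digest_storage,
                         char *digest_text, size_t digest_text_length,
                         bool *truncated)
{
  assert(digest_text_length > 0);
  const std::string &source = digest_storage->m_fake_text;
  size_t length = source.size();
  *truncated = (length >= digest_text_length);
  if (*truncated)
    length = digest_text_length - 1;
  std::memcpy(digest_text, source.data(), length);
  digest_text[length] = '\0';
}

// Hypothetical caller-side helper.
struct Digest_result
{
  std::string text;
  bool truncated;
  unsigned char md5[MD5_HASH_SIZE];
};

Digest_result read_digest(const sql_digest_storage *storage)
{
  Digest_result result;
  char text_buffer[32]; // deliberately small, to show truncation

  compute_digest_text(storage, text_buffer, sizeof(text_buffer),
                      &result.truncated);
  result.text = text_buffer;

  compute_digest_md5(storage, result.md5); // buffer of MD5_HASH_SIZE bytes
  return result;
}
```

In server code, the storage argument would be & thd->m_digest->m_digest_storage,
read before thd->m_digest is restored to the parent digest.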