WL#8058: PERFORMANCE_SCHEMA, INSTRUMENT SERVER ERRORS

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Instrument server errors and warnings,
provide performance schema statistics by thread/account/user/host/global.
-----------
Background:
-----------
There are different sources of Errors in MySQL described here:
http://dev.mysql.com/doc/refman/5.7/en/error-sources.html

-------------------
What to Instrument:
-------------------
This worklog scope is limited to expose MySQL Server Errors Instrumentation
listed here:
http://dev.mysql.com/doc/refman/5.7/en/error-messages-server.html
i.e. it would not be focusing on error handling/generation for Clients/Storage
Engine specific errors (like InnoDB).

---------------
New Instrument:
---------------
Following new instrument added to Performance Schema which could be seen in
setup_instruments table.

           errors

           - If ENABLED='YES', errors on MySQL Server would be instrumented and 
             statistical information would be collected for them.
           - TIMED='YES'/'NO', is not applicable for 'error' instrument.

Default : By Default 'error' instrument would be Enabled.

--------------------
New global variable:
--------------------
A new variable, performance_schema_error_size, introduced to control size
allocated for errors statistics. 

Variable name: performance_schema_error_size
Variable scope: global
Dynamic variable: No
Type: integer
Default: the number of errors used in the server, currently 1027
Min value: 0
Max value: 1048576

Let's say PFS_MAX_SERVER_ERROR= Total number of MySQL server errors 
                                (excluding obsolete errors).

When performance_schema_error_size = 0
            - No memory is allocated therefore no rows in error summary tables.

When performance_schema_error_size = PFS_MAX_SERVER_ERROR (default)
            - Stats for each MySQL server errors is collected in a separate row.

When performance_schema_error_size < PFS_MAX_SERVER_ERROR
            - Stats for errors, which fall short of PFS_MAX_SERVER_ERROR, are 
              collected on NULL row.

When performance_schema_error_size > PFS_MAX_SERVER_ERROR
            - Stats for each MySQL server errors is collected in a separate row.
              Extra rows (for extra allocated space) have NULL for ERROR_NAME, 
              ERROR_NUMBER, SQLSTATE.

NOTE: This variable, performance_schema_error_size,
has the proper default value based on error codes used in the server.
It is intended to be:
- left as is, in which case all errors are instrumented
- set to 0, in which case no errors are instrumented

Setting the variable to other numeric values is reserved for future use.

---------------------
Error Summary Tables:
---------------------
There would be new summary tables to display error statistics.
events_errors_summary_global_by_error
events_errors_summary_by_thread_by_error
events_errors_summary_by_user_by_error
events_errors_summary_by_host_by_error
events_errors_summary_by_account_by_error

Each row in above tables will be corresponding to an error and aggregation is
done based on different factors as indicated by tables' names above.

--------------------------
SUMMARY TABLE DESCRIPTION:
--------------------------

TABLE performance_schema.events_errors_summary_global_by_error
-------------------------------------------------------------------
CREATE TABLE performance_schema.events_errors_summary_global_by_error("
  "ERROR_NUMBER INTEGER,"
  "ERROR_NAME varchar(64),"
  "SQLSTATE varchar(5),"
  "SUM_ERROR_RAISED BIGINT unsigned not null,"
  "SUM_ERROR_HANDLED BIGINT unsigned not null,"
  "FIRST_SEEN TIMESTAMP(0),"
  "LAST_SEEN TIMESTAMP(0)"
  ")ENGINE=PERFORMANCE_SCHEMA;

TABLE performance_schema.events_errors_summary_by_thread_by_error
----------------------------------------------------------------------
CREATE TABLE performance_schema.events_errors_summary_by_thread_by_error("
  "THREAD_ID BIGINT unsigned not null,"
  <Same as _global_by_error>

TABLE performance_schema.events_errors_summary_by_user_by_error
--------------------------------------------------------------------
CREATE TABLE performance_schema.events_errors_summary_by_user_by_error("
  "USER CHAR(32) collate utf8_bin default null,"
  <Same as _global_by_error>

TABLE performance_schema.events_errors_summary_by_host_by_error
--------------------------------------------------------------------
CREATE TABLE performance_schema.events_errors_summary_by_host_by_error("
  "HOST CHAR(60) collate utf8_bin default null,"
  <Same as _global_by_error>

TABLE performance_schema.events_errors_summary_by_account_by_error
-----------------------------------------------------------------------
CREATE TABLE performance_schema.events_errors_summary_by_account_by_error("
  "USER CHAR(32) collate utf8_bin default null,"
  "HOST CHAR(60) collate utf8_bin default null,"
  <Same as _by_global_by_error>

-----------
Operations:
-----------
- SELECT is allowed.
- TRUNCATE is allowed, to reset statistics.

--------------------
COLUMNS DESCRIPTION:
--------------------

Example error:
-------------------------------------------------
Error: 1006 SQLSTATE: HY000 (ER_CANT_CREATE_DB)

Message: Can't create database '%s' (errno: %d) 
-------------------------------------------------

ERROR_NUMBER INTEGER
 - unique error number for an error.
 - For our example, it would be 1006.

ERROR_NAME varchar(32) not null
 - Unique name for error.
 - In our example, it would be ER_CANT_CREATE_DB.

SQLSTATE varchar(5) not null
 - SQLSTATE for the error.
 - In our example, it would be HY000.

SUM_ERROR_RAISED BIGINT unsigned not null
 - Number of times a specific error raised.

SUM_ERROR_HANDLED BIGINT unsigned not null
 - Number of times a specific error is handled by SQL Exception Handler.

FIRST_SEEN TIMESTAMP(0)
 - Timestamp, when first time this error occurred.

LAST_SEEN TIMESTAMP(0)
 - Timestamp, when last time this error occurred.

THREAD_ID BIGINT unsigned not null
 - Thread Id (PFS specific) based on which error stats are aggregated.

USER CHAR(32) collate utf8_bin default null
 - User name based on which errors stats are aggregated.

HOST CHAR(60) collate utf8_bin default null
 - Host name based on which errors stats are aggregated.

--------
NULL ROW
--------
There would be a specific row (very first row) in above tables where:
 - ERROR_NUMBER = 0
 - ERROR_NAME = NULL
 - SQLSTATE = NULL

This row will be used to aggregate stats of all errors which:
 - Fall out-of-range of MySQL Server Errors.

REQUIREMENTS
============

Install
-------

Func-Req (1): A fresh MySQL installation of CURRENT-VERSION must create the
following tables:
- 1.1, TABLE performance_schema.events_errors_summary_by_thread_by_error
- 1.2, TABLE performance_schema.events_errors_summary_by_account_by_error
- 1.3, TABLE performance_schema.events_errors_summary_by_user_by_error
- 1.4, TABLE performance_schema.events_errors_summary_by_host_by_error
- 1.5, TABLE performance_schema.events_errors_summary_global_by_error

Upgrade
-------

Func-Req (2): An upgrade from PREVIOUS-VERSION to CURRENT-VERSION must create
the following tables:
- 2.1, TABLE performance_schema.events_errors_summary_by_thread_by_error
- 2.2, TABLE performance_schema.events_errors_summary_by_account_by_error
- 2.3, TABLE performance_schema.events_errors_summary_by_user_by_error
- 2.4, TABLE performance_schema.events_errors_summary_by_host_by_error
- 2.5, TABLE performance_schema.events_errors_summary_global_by_error

Misc impact on doc
==================

Error
ER_CANT_SET_ENFORCE_GTID_CONSISTENCY_ON_WITH_ONGOING_GTID_VIOLATING_TRANSACTIONS
was renamed to
ER_CANT_ENFORCE_GTID_CONSISTENCY_WITH_ONGOING_GTID_VIOLATING_TX

and 

ER_SET_ENFORCE_GTID_CONSISTENCY_WARN_WITH_ONGOING_GTID_VIOLATING_TRANSACTIONS
was renamed to
ER_ENFORCE_GTID_CONSISTENCY_WARN_WITH_ONGOING_GTID_VIOLATING_TX

This affects:
http://dev.mysql.com/doc/refman/5.7/en/error-messages-server.html#error_er_cant_set_enforce_gtid_consistency_on_with_ongoing_gtid_violating_transactions

and

http://dev.mysql.com/doc/refman/5.7/en/error-messages-server.html#error_er_set_enforce_gtid_consistency_warn_with_ongoing_gtid_violating_transactions