WL#7698: PERFORMANCE SCHEMA, REDUCE MEMORY USAGE FOR TABLE IO / TABLE LOCK

Affects: Server-5.7   —   Status: Complete   —   Priority: Medium

The performance schema currently has instrumentation for table io and table locks.

This instrumentation is functional, but consumes too much memory when the
database contains a *lot* of tables, so that in practice it is not suitable for
deployments using co location (typically, ISV).

This task is mostly re factoring of performance schema internal buffers,
so that the memory overhead for the table instrumentation scales better when the
number of tables in the database is large.

F-1 A new system variable is defined, named
performance-schema-max-table-lock-stat
This variable is global and read only.

F-2 A new system variable is defined, named
performance-schema-max-index-stat
This variable is global and read only.

F-3 A new status variable is defined, named
performance-schema-table-lock-stat-lost
This status variable is global.

F-4 A new status variable is defined, named
performance-schema-index-stat-lost
This status variable is global.

PROBLEM
=======

Assuming the following use case:

* Very large deployment, 100,000 tables in the database
* Co located deployment, 1,000 users each using 100 tables each on average
* A DBA wants to instrument table io only for a few users, say 10,
so that table io instrumentation is needed for 1,000 tables only out of 100,000.

Problem 1: table statistics
---------------------------

Currently, the performance schema instrumentation must be sized to support
100,000 tables, even when no statistics are collected for most tables.

The memory cost for 100,000 entries in PFS_table_share is too high.

Problem 2: index statistics
---------------------------

Currently, MySQL supports a maximum of 64 indexes per table.
Not every user table however does use all 64 indexes.

The performance schema memory allocated, however, is sized at 64 + 1 index
statistics, for each table.

The memory cost for deployments using only a few indexes is too high.

SOLUTION
========

Part 1: Decouple PFS_table_share from the real table lock statistics:
----------------------------------------------------------

Currently real table lock stat is part of PFS_table_share therefore memory for table
stat is allocated even though stats are not being collected for that table.
Decoupling real table stat from PFS_table_share and allocating memory for table
stat only when its stats are being collected would save considerable amount of
memory.

Server global variable performance_schema_max_table_lock_stat
=============================================================

This variable is global, read only.
This variable indicates the maximum number of tables the performance schema
keeps lock statistics for.

The default value is -1 (auto sized), in which case the value
estimated for performance_schema_max_table_lock_stat
is the same as performance-schema-max-table-instances

Server global status performance-schema-table-lock-stat-lost 
============================================================

This variable is global.
It is a status variable, can be reset to 0 with FLUSH STATUS.
This variable indicates how many tables lock statistics were lost by the
performance schema instrumentation, due to insufficient space in
performance_schema_max_table_lock_stat.

Part 2: Allocate index statistics separately:
---------------------------------------------

Currently for each real table stat, we keep space for all 64 indexes stats
even though indexes are not being used (or less then 64 indexes are used) in a
table. To avoid this memory waste, memory space for indexes would be allocated
separately on need basis only.

Server global variable performance-schema-max-index-stats 
=========================================================

This variable is global, read only.
This variable indicates the maximum number of table indexes the performance
schema keeps statistics for.

The default value is -1 (auto sized), in which case the value
estimated for performance_schema_max_index_stat
is 10 * performance-schema-max-table-instances

(Implementation specific, subject to change)

Server global status performance-schema-index-stat-lost 
=======================================================

This variable is global.
It is a status variable, can be reset to 0 with FLUSH STATUS.
This variable indicates how many indexes' statistics were lost by the
performance schema instrumentation, due to insufficient space in
performance_schema_max_index_stat.

Part 1 :
--------
Add a new array table_stat_array where stats for tables will be stored. Size of
this array is deteremined by global server variable
Performance_schema_max_table_stat. 

Memory to table_stat_array will be allocated during initialization of
Performance Schema.

Modify PFS_table_share as follow :
truct PFS_ALIGNED PFS_table_share
{
 ...
-  PFS_table_stat m_table_stat;
+  PFS_table_stat *m_table_stat;
 ...
}

Now m_table_stat is a pointer to an element in table_stat_array.

Therfore, now if stats for any table is supposed to be stored, an empty slot in
table_stat_array will be utilized to stored that table's stat. If no slot is
available, global status variable performance-schema-table-stat-lost will be
incremented.

If a table is dropped, its stats from table_stat_array should be deleted? (TBD)

Part 2 :
--------
Add a new array index_stat_array where stats for table indexes will be stored.
Size of this array is deteremined by global server variable
Performance_schema_max_index_stat. 2

Memory to index_stat_array will be allocated during initialization of
Performance Schema.

Modify PFS_table_stat as follow :
struct PFS_table_stat
{
   ...
   /**
     Statistics, per index.
     Each index stat is in [0, MAX_INDEXES-1],
     stats when using no index are in [MAX_INDEXES].
   */
-  PFS_table_io_stat m_index_stat[MAX_INDEXES + 1];
+  PFS_table_io_stat* m_index_stat[MAX_INDEXES + 1];
+  /* This is place where stats are stored when no index is used. */
+  PFS_table_io_stat m_no_index_stat;
   ...
}

Here, 
m_index_stat is an array of pointers which are pointing to an element of
index_stat_array or else NULL. Last element of m_index_stat would always be
pointing to m_no_index_stat.
m_no_index_stat is where stats would be stored when no index is being used.

Therfore, now if stats for any table index is supposed to be stored, an empty
slot in index_stat_array will be utilized to stored that index stat. If no slot
is available, global status variable performance-schema-index-stat-lost will be
incremented.