WL#7698: PERFORMANCE SCHEMA, REDUCE MEMORY USAGE FOR TABLE IO / TABLE LOCK
Affects: Server-5.7
—
Status: Complete
The performance schema currently has instrumentation for table io and table locks. This instrumentation is functional, but consumes too much memory when the database contains a *lot* of tables, so that in practice it is not suitable for deployments using co location (typically, ISV). This task is mostly re factoring of performance schema internal buffers, so that the memory overhead for the table instrumentation scales better when the number of tables in the database is large.
F-1 A new system variable is defined, named performance-schema-max-table-lock-stat This variable is global and read only. F-2 A new system variable is defined, named performance-schema-max-index-stat This variable is global and read only. F-3 A new status variable is defined, named performance-schema-table-lock-stat-lost This status variable is global. F-4 A new status variable is defined, named performance-schema-index-stat-lost This status variable is global.
PROBLEM ======= Assuming the following use case: * Very large deployment, 100,000 tables in the database * Co located deployment, 1,000 users each using 100 tables each on average * A DBA wants to instrument table io only for a few users, say 10, so that table io instrumentation is needed for 1,000 tables only out of 100,000. Problem 1: table statistics --------------------------- Currently, the performance schema instrumentation must be sized to support 100,000 tables, even when no statistics are collected for most tables. The memory cost for 100,000 entries in PFS_table_share is too high. Problem 2: index statistics --------------------------- Currently, MySQL supports a maximum of 64 indexes per table. Not every user table however does use all 64 indexes. The performance schema memory allocated, however, is sized at 64 + 1 index statistics, for each table. The memory cost for deployments using only a few indexes is too high. SOLUTION ======== Part 1: Decouple PFS_table_share from the real table lock statistics: ---------------------------------------------------------- Currently real table lock stat is part of PFS_table_share therefore memory for table stat is allocated even though stats are not being collected for that table. Decoupling real table stat from PFS_table_share and allocating memory for table stat only when its stats are being collected would save considerable amount of memory. Server global variable performance_schema_max_table_lock_stat ============================================================= This variable is global, read only. This variable indicates the maximum number of tables the performance schema keeps lock statistics for. The default value is -1 (auto sized), in which case the value estimated for performance_schema_max_table_lock_stat is the same as performance-schema-max-table-instances Server global status performance-schema-table-lock-stat-lost ============================================================ This variable is global. It is a status variable, can be reset to 0 with FLUSH STATUS. This variable indicates how many tables lock statistics were lost by the performance schema instrumentation, due to insufficient space in performance_schema_max_table_lock_stat. Part 2: Allocate index statistics separately: --------------------------------------------- Currently for each real table stat, we keep space for all 64 indexes stats even though indexes are not being used (or less then 64 indexes are used) in a table. To avoid this memory waste, memory space for indexes would be allocated separately on need basis only. Server global variable performance-schema-max-index-stats ========================================================= This variable is global, read only. This variable indicates the maximum number of table indexes the performance schema keeps statistics for. The default value is -1 (auto sized), in which case the value estimated for performance_schema_max_index_stat is 10 * performance-schema-max-table-instances (Implementation specific, subject to change) Server global status performance-schema-index-stat-lost ======================================================= This variable is global. It is a status variable, can be reset to 0 with FLUSH STATUS. This variable indicates how many indexes' statistics were lost by the performance schema instrumentation, due to insufficient space in performance_schema_max_index_stat.
Part 1 : -------- Add a new array table_stat_array where stats for tables will be stored. Size of this array is deteremined by global server variable Performance_schema_max_table_stat. Memory to table_stat_array will be allocated during initialization of Performance Schema. Modify PFS_table_share as follow : truct PFS_ALIGNED PFS_table_share { ... - PFS_table_stat m_table_stat; + PFS_table_stat *m_table_stat; ... } Now m_table_stat is a pointer to an element in table_stat_array. Therfore, now if stats for any table is supposed to be stored, an empty slot in table_stat_array will be utilized to stored that table's stat. If no slot is available, global status variable performance-schema-table-stat-lost will be incremented. If a table is dropped, its stats from table_stat_array should be deleted? (TBD) Part 2 : -------- Add a new array index_stat_array where stats for table indexes will be stored. Size of this array is deteremined by global server variable Performance_schema_max_index_stat. 2 Memory to index_stat_array will be allocated during initialization of Performance Schema. Modify PFS_table_stat as follow : struct PFS_table_stat { ... /** Statistics, per index. Each index stat is in [0, MAX_INDEXES-1], stats when using no index are in [MAX_INDEXES]. */ - PFS_table_io_stat m_index_stat[MAX_INDEXES + 1]; + PFS_table_io_stat* m_index_stat[MAX_INDEXES + 1]; + /* This is place where stats are stored when no index is used. */ + PFS_table_io_stat m_no_index_stat; ... } Here, m_index_stat is an array of pointers which are pointing to an element of index_stat_array or else NULL. Last element of m_index_stat would always be pointing to m_no_index_stat. m_no_index_stat is where stats would be stored when no index is being used. Therfore, now if stats for any table index is supposed to be stored, an empty slot in index_stat_array will be utilized to stored that index stat. If no slot is available, global status variable performance-schema-index-stat-lost will be incremented.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.