WL#14079: P_S memory instrumentation for DD and runtime code
The purpose of this worklog is to review current performance schema memory keys owned by runtime, propose a uniform naming scheme, update key names and append descriptions.
The WL will also implement basic P_S memory keys for the major data dictionary data structures. Suggested keys for first step:
- dd::infrastructure
- dd::objects
Note that the bulk of the object data is allocated with the dd::String_type, which has the associated P_S key dd::String_type. Hence, at least for now, it probably makes sense to have a single P_S key covering all DD objects since the dd::String_type key is used for all character strings anyway.
Use cases
Below are the main use cases for this functionality.
Cache size on idle system
If the system runs with the expected load and the user load is then stopped, the counters show the memory used by the data dictionary cache to store the non-evicted data dictionary objects. We cannot distinguish between memory used by different types of DD objects, but it is probably safe to assume that the bulk of it is occupied by table objects. The cache size can then be tuned if the amount of allocated memory needs to change. The system needs to be idle because, with the current performance schema implementation, increments and decrements of the counters are not always properly serialized, so the values can be misleading while load is running.
Cache size on system with DML load
If the data dictionary cache has sufficient capacity, it should be able to hold all data dictionary objects needed by the queries that are executed with few cache misses. To see if this is the case, we can set the size of the table definition cache to some small value (to force new table shares to be created when tables are opened), then we can run the system with the expected DML load. If the data dictionary cache is sufficiently large, then there should be only small amounts of additional allocation and freeing of data dictionary objects. For a system with DDL load, this approach will not make sense since new data dictionary objects are allocated regardless of the cache size.
Memory usage on system with load
If the performance schema memory counters were incremented and decremented in a properly serialized order, it might also be possible to use the data dictionary memory keys to see the maximum memory usage by the data dictionary cache while the system is under load. Since this is not the case, the values observed under load may be misleading. However, once the external load is halted, it might be possible to analyze this in retrospect.
Counting memory allocations to constrain memory usage can be done by piggybacking on the performance schema framework, i.e., the performance schema counters are not used directly, but different counters are maintained while using part of the performance schema framework. Thus, introducing the data dictionary memory keys may also allow them to be part of a forthcoming mechanism to monitor and constrain allocation of global shared memory. The implementation of such a mechanism is not part of this WL, though.
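The idea of maintaining separate counters alongside the instrumentation hooks can be sketched as follows. This is only an illustration under stated assumptions: `Key_counters`, `on_alloc`, `on_free` and `fits` are hypothetical names, not part of the performance schema API, and atomics are used here so that the current and peak values stay consistent under concurrency.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical per-key counters kept next to the P_S instrumentation.
// Not a performance schema type; illustration only.
struct Key_counters {
  std::atomic<std::size_t> current{0};  // bytes currently allocated
  std::atomic<std::size_t> peak{0};     // highest value of 'current' seen

  // Would be called from an allocation hook with the size of the new block.
  void on_alloc(std::size_t bytes) {
    const std::size_t now = current.fetch_add(bytes) + bytes;
    std::size_t seen = peak.load();
    // Raise the high-water mark unless another thread already raised it more.
    while (now > seen && !peak.compare_exchange_weak(seen, now)) {
    }
  }

  // Would be called from the matching free hook.
  void on_free(std::size_t bytes) { current.fetch_sub(bytes); }

  // True if allocating 'bytes' more would stay within 'limit'.
  bool fits(std::size_t bytes, std::size_t limit) const {
    return current.load() + bytes <= limit;
  }
};
```

A constraint mechanism could consult something like `fits()` before allocating, while the regular P_S counters continue to be maintained independently.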
Functional requirements
- F1
- Add descriptions for performance schema memory keys owned by the Runtime team as listed in the HLS.
- F2
- Change key names as listed in the HLS to conform to a uniform naming scheme.
- F3
- Implement support for the additional data dictionary memory keys as listed in the HLS according to the design in the LLD.
Current runtime P_S memory keys
Below, we list all the P_S memory keys. For the keys owned by the runtime team, we will:
- Add a description for all keys.
- Propose a new name for some of the keys.
- List new keys with '-' as the existing key name.
For the key naming scheme, we suggest grouping related keys by using a key prefix, in the same style as C++ namespaces. For example, for the data dictionary keys we use the 'dd::' prefix, for the THD related keys we use the 'THD::' prefix, etc.
The naming scheme of the P_S keys follows the pattern 'memory/category/instrument_name'. Hence, we could also introduce new categories. However, this is presumably too much of a change for a maintenance release.
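As a small illustration of the naming pattern and the prefix grouping, the sketch below composes a full instrument name and extracts the grouping prefix of a key. The helper names `instrument_name` and `key_prefix` are hypothetical, and the category value "sql" used in the usage note is an assumption for illustration.

```cpp
#include <string>

// Builds 'memory/category/instrument_name' from its parts.
std::string instrument_name(const std::string &category,
                            const std::string &key) {
  return "memory/" + category + "/" + key;
}

// Returns the grouping prefix of a key name ("dd" for "dd::objects"),
// or an empty string if the key has no "::" separator.
std::string key_prefix(const std::string &key) {
  const std::string::size_type pos = key.find("::");
  return pos == std::string::npos ? std::string() : key.substr(0, pos);
}
```

With these helpers, `instrument_name("sql", "dd::objects")` yields `memory/sql/dd::objects`, and `key_prefix("THD::db")` yields `THD`.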
Existing key name | New key name | Description |
---|---|---|
acl_cache | ||
acl_map_cache | ||
binlog_cache_mngr | ||
binlog_pos | ||
binlog_statement_buffer | ||
bison_stack | ||
Blob_mem_storage::storage | ||
db_worker_hash_entry | ||
dd::column_statistics | - | Column statistics histograms allocated |
dd::default_values | - | Temporary buffer for preparing column default values |
dd::import | - | File name handling while importing MyISAM tables |
dd::String_type | - | Character strings used by data dictionary objects |
- | dd::infrastructure | Infrastructure of the data dictionary structures |
- | dd::objects | Memory occupied by the data dictionary objects |
debug_sync_control::debug_sync_action | THD::debug_sync_action | Debug sync actions to perform per thread. |
Delegate::memroot | ||
display_table_locks | - | Debug utility |
errmsgs | errmsgs::server | In-memory representation of server error messages |
Event_basic::mem_root | - | Event base class with root used for definition etc. |
Event_queue_element_for_exec::names | - | Copy of schema- and event name in exec queue element |
Event_scheduler::scheduler_param | - | Infrastructure of the priority queue of events |
File_query_log::name | ||
Filesort_buffer::sort_keys | ||
Filesort_info::merge | ||
Filesort_info::record_pointers | ||
Geometry::ptr_and_wkb_data | ||
Gis_read_stream::err_msg | ||
global_system_variables | ||
Gtid_set::Interval_chunk | ||
Gtid_set::to_string | ||
Gtid_state::group_commit_sidno_locks | ||
Gtid_state::to_string | ||
handler::errmsgs | errmsgs::handler | Handler error messages (HA_ERR_...) |
handlerton | handlerton::objects | Handlerton objects |
hash_index_key_buffer | ||
hash_join | ||
HASH_ROW_ENTRY | ||
help | - | Temporary memroot used to print help texts as part of usage description |
histograms | ||
host_cache::hostname | - | Hostname keys in the host_cache map |
JOIN_CACHE | ||
JSON | ||
load_env_plugins | ||
Locked_tables_list::m_locked_tables_root | - | Memroot for list of locked tables |
log_error_loaded_services | log_error::loaded_services | Memory allocated for duplicate log events |
log_error_stack | log_error::stack | Log events for the error log |
Log_event | ||
LOG_name | LOG::file_name | File name of slow log and general log |
LOG_POS_COORD | ||
MDL_context_backup_manager | MDL_context::backup_manager | MDL for prepared XA trans with disconnected client |
MDL_context::acquire_locks | - | Buffer for sorting lock requests |
MPVIO_EXT::auth_info | ||
Mutex_cond_array::Mutex_cond | ||
my_bitmap_map | ||
my_str_malloc | ||
MYSQL_BIN_LOG::basename | ||
MYSQL_BIN_LOG::index | ||
MYSQL_BIN_LOG::recover | ||
MYSQL_LOCK | - | Table locks per session |
MYSQL_LOG::name | ||
mysql_plugin | ||
mysql_plugin_dl | ||
MYSQL_RELAY_LOG::basename | ||
MYSQL_RELAY_LOG::index | ||
NAMED_ILINK::name | - | Names in the MyISAM key cache |
NET::buff | - | Buffer in the client protocol communications layer |
NET::compress_packet | - | Buffer used when compressing a packet |
opt_bin_logname | ||
Owned_gtids::sidno_to_hash | ||
Owned_gtids::to_string | ||
Partition_admin | Partition::admin | Buffer for printing messages into the client protocol |
Partition_share | Partition::share | Partition name and auto increment mutex |
partition_sort_buffer | Partition::sort_buffer | Record buffer for a partition |
partition_syntax_buffer | Partition::syntax_buffer | Buffer used for formatting the partition expression |
plugin_bookmark | ||
plugin_init_tmp | ||
plugin_int_mem_root | ||
plugin_mem_root | ||
plugin_ref | ||
Prepared_statement_map | Prepared_statement::infrastructure | Map infrastructure for prepared statements per session |
Prepared_statement::main_mem_root | - | Mem root for each prepared statement for items etc. |
PROFILE | ||
prune_partitions::exec | Partition::prune_exec | Mem root used temporarily while pruning partitions |
Queue::queue_item | ||
QUICK_GROUP_MIN_MAX_SELECT::alloc | ||
QUICK_INDEX_MERGE_SELECT::alloc | ||
QUICK_RANGE_SELECT::alloc | ||
QUICK_RANGE_SELECT::mrr_buf_desc | ||
Quick_ranges | ||
QUICK_ROR_INTERSECT_SELECT::alloc | ||
QUICK_ROR_UNION_SELECT::alloc | ||
READ_INFO | ||
READ_RECORD_cache | ||
Recovered_xa_transactions | XA::recovered_transactions | List infrastructure for recovered XA transactions |
Relay_log_info::mts_coor | ||
root | ||
Row_data_memory::memory | ||
rpl_filter memory | ||
Rpl_info_file::buffer | ||
Rpl_info_table | ||
rpl_slave::check_temp_dir | ||
servers | - | Note: Duplicate of the key below, will be deleted |
servers_cache | - | Cache infrastructure and mem root for servers cache |
Shared_memory_name | - | Communication through shared memory (Windows) |
show_slave_status_io_gtid_set | ||
Sid_map::Node | ||
SLAVE_INFO | ||
Slave_job_group::group_relay_log_name | ||
sp_head::call_mem_root | - | Mem root for objects with same life time as stored program call |
sp_head::execute_mem_root | - | Mem root per instruction |
sp_head::main_mem_root | - | Mem root for parsing and representation of stored programs |
sql_acl_mem | ||
sql_acl_memex | ||
ST_SCHEMA_TABLE | - | Structure describing an information schema table implemented by a plugin |
String::value | ||
Sys_var_charptr::value | ||
TABLE | - | Memory used by TABLE objects and their mem root |
table_mapping::m_mem_root | ||
TABLE_RULE_ENT | ||
TABLE_SHARE::mem_root | - | Cache infrastructure and individual table shares |
TABLE::sort_io_cache | ||
TC_LOG_MMAP::pages | - | In-memory transaction coordinator log |
test_quick_select | ||
thd_timer | - | Thread timer object |
THD::db | - | Name of currently used schema |
THD::debug_sync_control | - | Structure to control debug sync per thread |
THD::handler_tables_hash | - | Hash map of tables used by HANDLER statements |
thd::main_mem_root | THD::main_mem_root | Main mem root used for e.g. the query arena |
THD::Session_sysvar_resource_manager | ||
THD::Session_tracker | ||
THD::sp_cache | - | Per session cache for stored programs |
THD::transactions::mem_root | - | Transaction context information per session |
THD::variables | - | Per session copy of global dynamic variables |
tz_storage | - | Shared time zone data |
udf_mem | - | Shared structure of UDFs |
Unique::merge_buffer | ||
Unique::sort_buffer | ||
user_conn | - | Objects describing user connections |
User_level_lock | - | Per session storage of user level locks |
user_var_entry | ||
user_var_entry::value | ||
write_set_extraction | ||
XID | XA::transaction_contexts | Shared cache of XA transaction contexts |
Runtime P_S memory keys grouped by relation
Keys related to the same subsystem or data structure are grouped below. We also split the groups into shared resources and per-session resources. Shared resources are usually also allocated on request from a session. However, a shared resource may be used by other sessions as well, and is usually not freed when the session exits. Per-session resources, on the other hand, are only available to the session that allocated them, and are freed when the session ends.
- Data dictionary (shared):
- dd::column_statistics
- dd::default_values
- dd::import
- dd::String_type
- dd::objects
- dd::infrastructure
- Table definition cache (shared):
- TABLE_SHARE::mem_root
- Event scheduler (shared):
- Event_basic::mem_root
- Event_queue_element_for_exec::names
- Event_scheduler::scheduler_param
- Transaction coordinator (shared):
- TC_LOG_MMAP::pages
- XA (shared):
- Recovered_xa_transactions
- XID
- Host name cache (shared):
- host_cache::hostname
- Error messages, error- and query logging (shared):
- errmsgs
- handler::errmsgs
- log_error_loaded_services
- log_error_stack
- LOG_name
- MDL (shared):
- MDL_context_backup_manager
- MDL_context::acquire_locks
- Information schema (shared):
- ST_SCHEMA_TABLE
- Time zone storage (shared):
- tz_storage
- User defined functions (shared):
- udf_mem
- User connections (shared):
- user_conn
- Help (shared):
- help
- Handlerton (shared):
- handlerton
- Servers cache for Federated (shared):
- servers
- servers_cache
- MyISAM (shared):
- NAMED_ILINK::name
Per session resources
- Per THD memory (session):
- thd_timer
- THD::db
- THD::debug_sync_control
- THD::handler_tables_hash
- thd::main_mem_root
- THD::sp_cache
- THD::transactions::mem_root
- THD::variables
- MYSQL_LOCK
- Locked_tables_list::m_locked_tables_root
- Table cache (session):
- TABLE
- SP structures (session):
- sp_head::call_mem_root
- sp_head::execute_mem_root
- sp_head::main_mem_root
- Prepared statements (session):
- Prepared_statement_map
- Prepared_statement::main_mem_root
- User level locks (session):
- User_level_lock
- Partitioning (session):
- Partition_admin
- Partition_share
- partition_sort_buffer
- partition_syntax_buffer
- prune_partitions::exec
- Protocol / communication (session):
- NET::buff
- NET::compress_packet
- Shared_memory_name
- Debug (session):
- display_table_locks
Micro benchmarking
We will implement benchmarking to verify:
- The memory allocation overhead imposed by the P_S instrumentation.
- The performance overhead imposed by the P_S instrumentation.
Unit tests will be implemented to test these two issues. For the performance related test, we will use the existing micro benchmarking framework for the unit tests.
Small allocations
Adding P_S memory instrumentation implicitly adds some overhead in terms of allocating additional memory to store P_S meta data. For small objects, the overhead can be substantial. However, for most data dictionary objects, the size can be assumed to be sufficiently large that it makes sense to instrument all object allocations.
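To make the small-allocation concern concrete, the sketch below computes the fraction of an allocation taken by a per-allocation header. The header size of 32 bytes is an assumed value chosen for illustration, not a figure from this worklog, and `header_overhead` is a hypothetical helper.

```cpp
#include <cstddef>

// Assumed size of the per-allocation instrumentation header, in bytes.
// Illustrative value only; the real size depends on the server build.
constexpr std::size_t kAssumedHeaderBytes = 32;

// Fraction of the total allocation (header + payload) taken by the header.
constexpr double header_overhead(std::size_t payload_bytes) {
  return static_cast<double>(kAssumedHeaderBytes) /
         static_cast<double>(kAssumedHeaderBytes + payload_bytes);
}
```

Under this assumption, a 32-byte payload spends half of the allocation on the header, while a 992-byte payload spends about 3%.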
We will introduce monitoring and control of per-session memory. Here, allocations for certain P_S memory keys are counted. The counting itself is a separate mechanism, so the P_S instrumentation is added for all memory keys in the same way as it used to be. Hence, even if some allocations are not counted by the resource constraint framework, the overhead imposed by the P_S instrumentation is still present.
Possibly increase max number of keys
- We might have to increase performance_schema_max_memory_classes to accommodate the new classes. The current max is 450. There are 422 classes defined in the server, InnoDB and elsewhere.
- The 75 memory instruments defined by the Performance Schema are not counted against performance_schema_max_memory_classes.
- The status variable performance_schema_memory_classes_lost will be greater than 0 if performance_schema_max_memory_classes is insufficient, and at least one P_S mtr test will fail.
Change the key type for the dd::String_type
For future use by a resource control mechanism, we may need to see which threads have allocated the memory consumed by the data dictionary objects.
---------------------------- sql/psi_memory_key.cc ----------------------------
@@ -311,7 +311,7 @@ static PSI_memory_info all_server_memory[] = {
      PSI_DOCUMENT_ME},
     {&key_memory_TABLE, "TABLE", PSI_FLAG_ONLY_GLOBAL_STAT, 0, PSI_DOCUMENT_ME},
     {&key_memory_LOG_name, "LOG_name", 0, 0, PSI_DOCUMENT_ME},
-    {&key_memory_DD_String_type, "dd::String_type", PSI_FLAG_ONLY_GLOBAL_STAT,
+    {&key_memory_DD_String_type, "dd::String_type", 0,
      0, PSI_DOCUMENT_ME},
     {&key_memory_ST_SCHEMA_TABLE, "ST_SCHEMA_TABLE", 0, 0, PSI_DOCUMENT_ME},
     {&key_memory_PROFILE, "PROFILE", 0, 0, PSI_DOCUMENT_ME},
Define the new performance schema memory keys
---------------------------- sql/psi_memory_key.cc ----------------------------
@@ -33,9 +33,11 @@
   MAINTAINER: Please keep this list in order, to limit merge collisions.
 */

+PSI_memory_key key_memory_DD_cache_infrastructure;
 PSI_memory_key key_memory_DD_column_statistics;
 PSI_memory_key key_memory_DD_default_values;
 PSI_memory_key key_memory_DD_import;
+PSI_memory_key key_memory_DD_objects;
 PSI_memory_key key_memory_DD_String_type;
 PSI_memory_key key_memory_Event_queue_element_for_exec_names;
 PSI_memory_key key_memory_Event_scheduler_scheduler_param;
@@ -300,11 +302,14 @@ static PSI_memory_info all_server_memory[] = {
     {&key_memory_JOIN_CACHE, "JOIN_CACHE", 0, 0, PSI_DOCUMENT_ME},
     {&key_memory_TABLE_sort_io_cache, "TABLE::sort_io_cache", 0, 0,
      PSI_DOCUMENT_ME},
+    {&key_memory_DD_cache_infrastructure, "dd::cache_infrastructure", 0, 0,
+     PSI_DOCUMENT_ME},
     {&key_memory_DD_column_statistics, "dd::column_statistics", 0, 0,
      PSI_DOCUMENT_ME},
     {&key_memory_DD_default_values, "dd::default_values", 0, 0,
      PSI_DOCUMENT_ME},
     {&key_memory_DD_import, "dd::import", 0, 0, PSI_DOCUMENT_ME},
+    {&key_memory_DD_objects, "dd::objects", 0, 0, PSI_DOCUMENT_ME},
     {&key_memory_Unique_sort_buffer, "Unique::sort_buffer", 0, 0,
      PSI_DOCUMENT_ME},
     {&key_memory_Unique_merge_buffer, "Unique::merge_buffer", 0, 0,
----------------------------- sql/psi_memory_key.h -----------------------------
@@ -58,9 +58,11 @@ extern PSI_memory_key key_memory_string_service_iterator;

 /* These are defined in psi_memory_key.cc */

+extern PSI_memory_key key_memory_DD_cache_infrastructure;
 extern PSI_memory_key key_memory_DD_column_statistics;
 extern PSI_memory_key key_memory_DD_default_values;
 extern PSI_memory_key key_memory_DD_import;
+extern PSI_memory_key key_memory_DD_objects;
 extern PSI_memory_key key_memory_DD_String_type;
 extern PSI_memory_key key_memory_Event_queue_element_for_exec_names;
 extern PSI_memory_key key_memory_Event_scheduler_scheduler_param;
Use the appropriate key for various list infrastructures
-------------------------- sql/dd/cache/element_map.h --------------------------
@@ -29,6 +29,7 @@

 #include "my_dbug.h"
 #include "sql/malloc_allocator.h"  // Malloc_allocator.
+#include "sql/psi_memory_key.h"    // key_memory_DD_cache_infrastructure

 namespace dd {
 namespace cache {
@@ -84,9 +85,10 @@ class Element_map {

  public:
   Element_map()
-      : m_map(std::less<K>(),
-              Malloc_allocator<std::pair<const K, E *>>(PSI_INSTRUMENT_ME)),
-        m_missed(std::less<K>(), Malloc_allocator<K>(PSI_INSTRUMENT_ME)) {
+      : m_map(std::less<K>(), Malloc_allocator<std::pair<const K, E *>>(
+                                  key_memory_DD_cache_infrastructure)),
+        m_missed(std::less<K>(),
+                 Malloc_allocator<K>(key_memory_DD_cache_infrastructure)) {
   } /* purecov: tested */

   /**
------------------------ sql/dd/impl/cache/free_list.h ------------------------
@@ -27,6 +27,7 @@

 #include "my_dbug.h"
 #include "sql/malloc_allocator.h"  // Malloc_allocator.
+#include "sql/psi_memory_key.h"    // key_memory_DD_cache_infrastructure

 namespace dd {
 namespace cache {
@@ -54,7 +55,8 @@ class Free_list {
   List_type m_list;  // The actual list.

  public:
-  Free_list() : m_list(Malloc_allocator<E *>(PSI_INSTRUMENT_ME)) {}
+  Free_list()
+      : m_list(Malloc_allocator<E *>(key_memory_DD_cache_infrastructure)) {}

   // Return the actual free list length.
   size_t length() const { return m_list.size(); }
--------------------- sql/dd/impl/cache/shared_multi_map.h ---------------------
@@ -52,6 +52,7 @@
 #include "sql/dd/types/tablespace.h"
 #include "sql/malloc_allocator.h"  // Malloc_allocator.
 #include "sql/mysqld.h"            // max_connections
+#include "sql/psi_memory_key.h"    // key_memory_DD_cache_infrastructure
 #include "thr_mutex.h"

 namespace dd {
@@ -137,9 +138,10 @@ class Shared_multi_map : public Multi_map_base<T> {
    public:
     // Lock the multi map on instantiation.
     explicit Autolocker(Shared_multi_map<T> *map)
-        : m_objects_to_delete(Malloc_allocator<const T *>(PSI_INSTRUMENT_ME)),
-          m_elements_to_delete(
-              Malloc_allocator<const Cache_element<T> *>(PSI_INSTRUMENT_ME)),
+        : m_objects_to_delete(
+              Malloc_allocator<const T *>(key_memory_DD_cache_infrastructure)),
+          m_elements_to_delete(Malloc_allocator<const Cache_element<T> *>(
+              key_memory_DD_cache_infrastructure)),
           m_map(map) {
       mysql_mutex_lock(&m_map->m_lock);
     }
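The pattern in the changes above, where a container allocator carries a memory key and attributes every allocation to it, can be sketched in a self-contained way as follows. This is not the server's Malloc_allocator: `Keyed_allocator`, `Memory_key`, `kInfraKey` and `bytes_per_key` are hypothetical stand-ins, and a plain array tally replaces the performance schema accounting.

```cpp
#include <cstddef>
#include <cstdlib>
#include <list>
#include <new>

using Memory_key = unsigned;               // stand-in for PSI_memory_key
constexpr Memory_key kInfraKey = 1;        // hypothetical key value
static std::size_t bytes_per_key[4] = {};  // hypothetical per-key tally

// Minimal allocator that attributes every allocation to the key it carries.
template <class T>
class Keyed_allocator {
 public:
  using value_type = T;

  explicit Keyed_allocator(Memory_key key) : m_key(key) {}

  // Rebinding constructor: containers allocate internal node types
  // through a rebound copy, which must keep the same key.
  template <class U>
  Keyed_allocator(const Keyed_allocator<U> &other) : m_key(other.key()) {}

  T *allocate(std::size_t n) {
    void *p = std::malloc(n * sizeof(T));
    if (p == nullptr) throw std::bad_alloc();
    bytes_per_key[m_key] += n * sizeof(T);
    return static_cast<T *>(p);
  }

  void deallocate(T *p, std::size_t n) {
    bytes_per_key[m_key] -= n * sizeof(T);
    std::free(p);
  }

  Memory_key key() const { return m_key; }

 private:
  Memory_key m_key;
};

// Allocators are interchangeable only if they account to the same key.
template <class T, class U>
bool operator==(const Keyed_allocator<T> &a, const Keyed_allocator<U> &b) {
  return a.key() == b.key();
}
template <class T, class U>
bool operator!=(const Keyed_allocator<T> &a, const Keyed_allocator<U> &b) {
  return !(a == b);
}
```

A container declared as `std::list<int, Keyed_allocator<int>>` and constructed with `Keyed_allocator<int>(kInfraKey)` would then account all of its node allocations to that key, which mirrors how the list and map infrastructure is moved from a generic key to a dedicated one.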
Redefine new for the common DD object superclass
--------------------- sql/dd/impl/types/weak_object_impl.h ---------------------
@@ -1,4 +1,4 @@
-/* Copyright (c) 2014, 2017, Oracle and/or its affiliates. All rights reserved.
+/* Copyright (c) 2014, 2019, Oracle and/or its affiliates. All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License, version 2.0,
@@ -23,8 +23,11 @@
 #ifndef DD__WEAK_OBJECT_IMPL_INCLUDED
 #define DD__WEAK_OBJECT_IMPL_INCLUDED

-#include "sql/dd/object_id.h"          // Object_id
-#include "sql/dd/types/weak_object.h"  // dd::Weak_object
+#include "my_sys.h"                     // MY_WME
+#include "mysql/service_mysql_alloc.h"  // my_malloc
+#include "sql/dd/object_id.h"           // Object_id
+#include "sql/dd/types/weak_object.h"   // dd::Weak_object
+#include "sql/psi_memory_key.h"         // key_memory_DD_objects

 namespace dd {

@@ -46,6 +49,24 @@ class Weak_object_impl : virtual public Weak_object {

   virtual ~Weak_object_impl() {}

+  void *operator new(size_t size, const std::nothrow_t &) noexcept {
+    /*
+      Call my_malloc() with the MY_WME flag to make sure that it will
+      write an error message if the memory could not be allocated.
+    */
+    return my_malloc(key_memory_DD_objects, size, MYF(MY_WME));
+  }
+
+  void *operator new(size_t size) noexcept {
+    /*
+      Call my_malloc() with the MY_WME flag to make sure that it will
+      write an error message if the memory could not be allocated.
+    */
+    return my_malloc(key_memory_DD_objects, size, MYF(MY_WME));
+  }
+
+  void operator delete(void *ptr) noexcept { my_free(ptr); }
+
  public:
   virtual const Object_table &object_table() const = 0;
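The effect of redefining new and delete in a common superclass, namely that every derived object is allocated through the same instrumented path, can be illustrated with the self-contained sketch below. It assumes std::malloc and a simple counter in place of my_malloc() and the P_S key; `Tracked_base`, `Table_like` and `live_object_allocations` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t live_object_allocations = 0;  // hypothetical counter

// Base class whose allocation functions are inherited by all subclasses.
class Tracked_base {
 public:
  virtual ~Tracked_base() {}

  // Declared noexcept, so a failed allocation makes the new-expression
  // evaluate to nullptr instead of throwing.
  void *operator new(std::size_t size) noexcept {
    void *p = std::malloc(size);
    if (p != nullptr) ++live_object_allocations;
    return p;
  }

  void *operator new(std::size_t size, const std::nothrow_t &) noexcept {
    return operator new(size);
  }

  void operator delete(void *ptr) noexcept {
    if (ptr != nullptr) --live_object_allocations;
    std::free(ptr);
  }

  // Matching form for the nothrow new; used if a constructor throws.
  void operator delete(void *ptr, const std::nothrow_t &) noexcept {
    operator delete(ptr);
  }
};

// Any subclass is allocated through Tracked_base::operator new.
struct Table_like : public Tracked_base {
  char payload[64];
};
```

Creating a `Table_like` with either `new Table_like` or `new (std::nothrow) Table_like` increments the counter, and deleting it through a `Tracked_base` pointer decrements it again, because the virtual destructor selects the inherited class-level operator delete.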