WL#8706: Persistent storage of Histogram data
Affects: Server-8.0
—
Status: Complete
This worklog covers the work needed to store histogram data persistent. This is needed so that histogram data does not have to be created each time the server starts. We have considered several ways of storing histograms persistent, and we propose to store the histogram statistics in a new system table; mysql.column_stats. The histogram itself will be stored in a JSON column, which gives us a very flexible and powerful way of storing histograms. We also present some of the other approaches we have considered.
Functional Requirements ======================= F-1: The storage method chosen MUST be able to store histograms for all of the following data types: - BOOL/BOOLEAN, BIT, TINYINT, SMALLINT, MEDIUMINT, INT/INTEGER, BIGINT - FLOAT, DOUBLE, DECIMAL - DATE, TIME, YEAR, DATETIME, TIMESTAMP - CHAR, VARCHAR - TINYTEXT, MEDIUMTEXT, TEXT, LONGTEXT - BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB - ENUM, SET F-2: After booting the server or upgrading from a previous version, one new table MUST be present in the database "mysql": column_stats F-3: The new system table MUST be created in the InnoDB storage engine. F-4: After installing MySQL or upgrading from a previous version without histogram support, the new table MUST be empty F-5: SQL statements to mysql.column_stats SHALL be replicated unless anything else is specified at server startup. See "High-Level Specification" for a discussion around replication. F-6: This worklog shall not have any effect on the query optimizer. F-7: The new system table MUST be accessible by the root user only. Non-Functional Requirements =========================== NF-1: The storage SHOULD facilitate for expanding MySQL with table- and/or column statistics in the future. NF-2: The storage SHOULD facilitate for adding new histogram types in the future. These new histogram types might require additional attributes to function properly.
For the first implementation of histogram statistics, we will implement two different histogram types; a small variant of the equi-height histogram and a "singleton" histogram. For the equi-height histogram, we want to store the following attributes for each histogram bucket: - Lower inclusive value. - Upper inclusive value. - Cumulative frequency. - Number of distinct values. For the singleton histogram, we want to store the following attributes for each histogram bucket: - Value. - Cumulative frequency. We have considered multiple ways of modeling and storing the histograms, and we present some of the solutions we have considered below. DIFFERENT WAYS OF MODELING HISTOGRAMS ===================================== == STORING HISTOGRAM IN A SYSTEM TABLE, WITH ONE ROW FOR EACH BUCKET == This approach seems like the most common among other systems. Basically, a table will have one column for each attribute needed in the histogram. This normally includes upper inclusive value, lower inclusive value, bucket frequency, the bucket number and possibly a reference to the column the histogram belongs to. The data will be easily readable by humans, have a decent performance etc. This approach has a few drawbacks though: - Finding the "best" data type for storing lower and upper bucket values. Since we must support several data types, we would probably need to use BLOB, BINARY or VARCHAR. A different approach could be to convert everything to a BIGINT. This will however give us some big restrictions, especially with strings. - We might want to implement other histogram types in the future. This can lead to changes in the schema, which may (or may not) cause issues. With all of the attributes mentioned above, each bucket will require around 70 bytes on average, but this depends a lot on the data type stored in the histogram. == STORING HISTOGRAM IN A SINGLE FIELD == MariaDB has chosen this approach, where they store the entire histogram in a VARBINARY(255). This gives however some big limitations. A histogram over string columns is very limited (only a small prefix of the string is considered), and it also imposes a limitation on the number of buckets. So we want to use a "bigger" data type that can store more information. With the new JSON type, one option is to store the entire histogram as a JSON object (equi-height histogram shown): { "histogram-type": "equi-height", "last-updated": "2015-06-08 12:09:00.000000", "buckets": [ { "low": 3, "high": 3, "freq": 0.02, "distinct": 1 }, { "low": 4, "high": 4, "freq": 0.98, "distinct": 1 }, ... ] } Each bucket will require a bit over 100 bytes. In order to make it more compact, each bucket can be stored as an array instead of an object. This will save around 30% for each bucket: "buckets": [ [ 3, 3, 0.02, 1 ], [ 4, 4, 0.98, 1 ], ... ] By using this approach, it solves most of the challenges mentioned earlier: - Virtually unlimited number of buckets if desirable. - Any data type can be stored in a readable format. - Strings can be stored in their full length if desirable. - Adding new histogram types in the future is trivial. So, what are the drawbacks with this approach? - A very "untraditional" way of doing it. - May require a bit more storage. This however depends on the data types stored in the histogram. Numeric values favor this approach, while other types mostly favor the other approach. A different approach is to create a new format for storing histograms, and serialize it into a BLOB column. This will however increase the complexity, and most likely increase the amount of things that can go wrong. We have thus not considered this approach any further. TABLE DEFINITION ALTERNATIVES ============================= - A histogram must support multiple different data types. The choice of storage must reflect this. - It is possible that we want to extend MySQL with more statistics about tables and columns in the future. We should thus choose a storage option that facilitates this. - The histogram should not require unreasonable amount of storage space. - We might want to add more histogram types in the future, and the storage format chosen should facilitate this. Based on the above observations, we present the following four ways to store the histograms using the two modeling approaches descibed above: 1) Three new system tables in 'mysql': +----------------------------+ | mysql.table_stats | +----------------------------+ | PK table_id (INT) | | database_name (VARCHAR) | | table_name (VARCHAR) | +----------------------------+ +--------------------------+ | mysql.column_stats | +--------------------------+ | PK column_id (INT) | | PK table_id (INT) | | column_name (VARCHAR) | +--------------------------+ +---------------------------------+ | mysql.histograms | +---------------------------------+ | PK column_id (INT) | | PK bucket_number (INT) | | histogram_type (ENUM) | | bucket_high (VARBINARY) | | bucket_low (VARBINARY) | | bucket_frequency (DOUBLE) | | num_distinct (INT) | +---------------------------------+ -- ADVANTAGES -- - Easy to expand for both table and column statistics in the future. -- DISADVANTAGES -- - VARBINARY has a limited length of 255. - Quite a few new system tables, which leads to more FK constraints (complexity). - Using auto increment IDs imposes an upper limit a few places. 2) Two new system tables in 'mysql': +----------------------------+ | mysql.table_stats | +----------------------------+ | PK table_id (INT) | | database_name (VARCHAR) | | table_name (VARCHAR) | +----------------------------+ +--------------------------+ | mysql.column_stats | +--------------------------+ | PK table_id (INT) | | PK column_name (VARCHAR) | | histogram (JSON) | +--------------------------+ -- ADVANTAGES -- - Flexible to store both different data types, new histogram types and such in the future. - Possible to implement both table and column statistics in the future. -- DISADVANTAGES -- - Requires more storage space than proposal number 1. - Using auto increment IDs imposes an upper limit a few places. 3) Two new system tables in 'mysql': +----------------------------+ | mysql.column_stats | +----------------------------+ | PK column_id (INT) | | database_name (VARCHAR) | | table_name (VARCHAR) | | column_name (VARCHAR) | +----------------------------+ +---------------------------------+ | mysql.histograms | +---------------------------------+ | PK column_id (INT) | | PK bucket_number (INT) | | histogram_type (ENUM) | | bucket_high (VARBINARY) | | bucket_low (VARBINARY) | | bucket_frequency (DOUBLE) | | num_distinct (INT) | +---------------------------------+ -- ADVANTAGES -- - Less FK constraints/complexity than proposal number one. -- DISADVANTAGES -- - VARBINARY has a limited length of 255. - Does not facilitate for table statistics in the future. - Using auto increment IDs imposes an upper limit a few places. 4) One new system table in 'mysql': +----------------------------+ | mysql.column_stats | +----------------------------+ | PK database_name (VARCHAR) | | PK table_name (VARCHAR) | | PK column_name (VARCHAR) | | histogram (JSON) | +----------------------------+ -- ADVANTAGES -- - Flexible to store both different data types, new histogram types and such in the future. - No limit with auto increment column as mentioned above. - Simple, but flexible design. -- DISADVANTAGES -- - Repeating database name and table name is redundant. - Does not facilitate for table statistics in the future. THE CHOSEN APPROACH =================== We have chosen to go for suggestion number four, due to both the simplicity and the flexibility it gives. JSON HISTOGRAM EXAMPLES ======================= The following section will give some examples on how much storage space the chosen storage solution will use in different scenarios. # 100 buckets with DATETIME values - 4781 bytes - A sample from the histogram: { "last-updated": "2015-06-08 12:09:00.000000", "histogram-type": "equi-height", "buckets": [ [ "2015-10-21 13:13:04.000000", "2015-11-01 23:36:02.000000", 0.010009765625, 984 ], [ "2015-11-01 23:37:20.000000", "2015-11-13 09:19:31.000000", 0.02001953125, 984 ], ... [ "2018-11-30 07:13:54.000000", "2018-12-11 18:43:43.000000", 0.9909769694010417, 984 ], [ "2018-12-11 19:04:44.000000", "2018-12-21 23:12:28.000000", 1, 886 ] ] } # 512 buckets with DOUBLE values - 21793 bytes - A sample from the histogram: { "last-updated": "2015-06-08 12:09:00.000000", "histogram-type": "equi-height", "buckets": [ [ 4.296909448967231e303, 3.0266566256216324e305, 0.001983642578125, 65 ], [ 3.0912283005204318e305, 6.8428144850887495e305, 0.00396728515625, 65 ], ... [ 1.794196092769798e308, 1.7975437041712213e308, 0.999786376953125, 65 ], [ 1.7975458137001152e308, 1.797660823206325e308, 1, 7 ] ] } # 512 buckets with STRING values. The histogram only considers the 42 first # characters, and the average string length in the table is 22 characters. - 36059 bytes - A sample from the histogram: { "last-updated": "2015-06-08 12:09:00.000000", "histogram-type": "equi-height", "buckets": [ [ "0", "002a38227ecc7f0d952e85ffe37832d3f58910da", 0.001978728666831561, 10 ], [ "002de4494914a3a210d68f47b826df8d06a", "00b1dbd0070ba5ed", 0.003957457333663122, 32 ], ... [ "ff1f6d", "ff908e8fd6f", 0.9988251298540731, 32 ], [ "ff9725bad9bcb7d209a52c", "ffe82024b2e4870ef290b2cc7aae01695", 1, 19 ] ] } Since the table is created in the "mysql" database, it will only be visible to the root user. Doing updates directly to the table by executing SQL statements will be possible, but not recommended. NOTE: The following solution suggests to add the new table to the mysql.* database. However, it is expected that the new Data Dictionary will provide a new API that allows us to define system tables that aren't visible to the user. It will also most likely give us the ability to refer to a unique colum ID instead of the actual column name. This will make the job of maintaining the referential integrity to the original column much easier (that is, handling DDL statements which alters the column name and such). Defining the system table with the new DD API will hide the table from all users, including root. If we want to make the histogram data visible to the user, we will instead create a view over the system table with the correct/necessary permissions. This DD API is not expected to be available before early/mid 2016. When this is available, we will consider to reimplement this worklog to use these new features. REPLICATION OF HISTOGRAMS ========================= Manual changes to the new system table by running SQL statements will be replicated, to keep it in line with the normal behavior of MySQL. Note that manual changes is not recommended though. The histograms are populated with data by running "ANALYZE TABLE [table] UPDATE HISTOGRAMS". By default, "ANALYZE TABLE" is replicated. Running this command on a master, will thus populate histograms on the slaves as well. If specifying "ANALYZE LOCAL TABLE", the statement is not replicated. In order to avoid "double work" when running ANALYZE TABLE, binlogging will be temporarily disabled while inserting/updating mysql.column_stats. This will avoid that both ANALYZE TABLE AND the statements from inserting/updating mysql.column_stats will be replicated. Only ANALYZE TABLE will be replicated. GENERATED COLUMNS ================= Histograms in MySQL will support both "normal" columns and generated columns (both virtual and stored). The storage for normal and generated will be similar, and this worklog will thus not take this into account.
Note: All the J_* data types mentioned below are taken from enum_json_type in json_dom.h. System table definition ----------------------- CREATE TABLE IF NOT EXISTS column_stats ( database_name VARCHAR(64) NOT NULL, table_name VARCHAR(64) NOT NULL, column_name VARCHAR(64) NOT NULL, histogram JSON NOT NULL, PRIMARY KEY (database_name, table_name, column_name) ) ENGINE=InnoDB CHARACTER SET=utf8 COLLATE=utf8_bin COMMENT="Column statistics"; This definition will be added to "scripts/mysql_system_tables.sql". This will take care of creating the table on both initialize and upgrade. Equi-height JSON definition --------------------------- { // Last time the histogram was updated. As of now, this means "when the // histogram was created" (incremental updates are not supported). Date/time // is given in UTC. // -- J_DATETIME "last-updated": "2015-11-04 15:19:51.000000", // Histogram type. Always "equi-height" for equi-height histograms. // -- J_STRING "histogram-type": "equi-height", // Histogram buckets. This will always be at least one bucket. // -- J_ARRAY "buckets": [ [ // Lower inclusive value. // -- Data type depends on the source column. "0", // Upper inclusive value. // -- Data type depends on the source column. "002a38227ecc7f0d952e85ffe37832d3f58910da", // Cumulative frequence // -- J_DOUBLE 0.001978728666831561, // Number of distinct values in this bucket. // -- J_UINT 10 ] ] } Singleton JSON definition ------------------------- { // Last time the histogram was updated. As of now, this means "when the // histogram was created" (incremental updates are not supported). Date/time // is given in UTC. // -- J_DATETIME "last-updated": "2015-11-04 15:19:51.000000", // Histogram type. Always "singleton" for singleton histograms. // -- J_STRING "histogram-type": "singleton", // Histogram buckets. This will always be at least one bucket. // -- J_ARRAY "buckets": [ [ // Value value. // -- Data type depends on the source column. 42, // Cumulative frequence // -- J_DOUBLE 0.001978728666831561, ] ] }
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.