Documentation Home
MySQL 8.0 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 33.0Mb
PDF (A4) - 32.9Mb
PDF (RPM) - 30.9Mb
HTML Download (TGZ) - 7.9Mb
HTML Download (Zip) - 7.9Mb
HTML Download (RPM) - 6.7Mb
Man Pages (TGZ) - 143.7Kb
Man Pages (Zip) - 203.9Kb
Info (Gzip) - 3.0Mb
Info (Zip) - 3.0Mb


MySQL 8.0 Reference Manual  /  ...  /  Optimizing INFORMATION_SCHEMA Queries

Pre-General Availability Draft: 2017-08-21

8.2.3 Optimizing INFORMATION_SCHEMA Queries

Applications that monitor databases may make frequent use of INFORMATION_SCHEMA tables. To write queries for these tables most efficiently, use the following general guidelines:

  • Try to query only INFORMATION_SCHEMA tables that are views on data dictionary tables.

  • Try to query only for static metadata. Selecting columns or using retrieval conditions for dynamic metadata along with static metadata adds overhead to process the dynamic metadata.

Note

Comparison behavior for database and table names in INFORMATION_SCHEMA queries might differ from what you expect. For details, see Section 10.1.8.7, “Using Collation in INFORMATION_SCHEMA Searches”.

These INFORMATION_SCHEMA tables are implemented as views on data dictionary tables, so queries on them retrieve information from the data dictionary:

CHARACTER_SETS
COLLATIONS
COLLATION_CHARACTER_SET_APPLICABILITY
COLUMNS
EVENTS
KEY_COLUMN_USAGE
PARAMETERS
ROUTINES
SCHEMATA
STATISTICS
TABLES
TABLE_CONSTRAINTS
TRIGGERS
VIEWS

Some types of values, even for a non-view INFORMATION_SCHEMA table, are retrieved by lookups from the data dictionary. This includes values such as database and table names, table types, and storage engines.

Some INFORMATION_SCHEMA tables contain columns that provide table statistics:

STATISTICS.CARDINALITY
TABLES.AUTO_INCREMENT    
TABLES.AVG_ROW_LENGTH    
TABLES.CHECKSUM    
TABLES.CHECK_TIME    
TABLES.CREATE_TIME    
TABLES.DATA_FREE    
TABLES.DATA_LENGTH    
TABLES.INDEX_LENGTH    
TABLES.MAX_DATA_LENGTH    
TABLES.TABLE_ROWS    
TABLES.UPDATE_TIME

Those columns represent dynamic table metadata; that is, information that changes as table contents change.

The server has two sources from which to retrieve table statistics. The information_schema_stats system variable controls which table statistics source the server uses:

To determine whether to set information_schema_stats to CACHED or LATEST, consider the tradeoffs:

  • When the server starts, cached statistics are NULL. To update them for a given table, use ANALYZE TABLE. This incurs a one-time statistics-calculation cost, but the cached statistics remain up to date to the extent that the table changes slowly. To update the cached statistics at any time thereafter, use ANALYZE TABLE again.

  • For use of the latest statistics, ANALYZE TABLE is not needed, but each query that retrieves statistics incurs somewhat higher execution cost than for use of cached statistics.

Set the global information_schema_stats value to determine the default used initially by all sessions. Individual sessions can set the session information_schema_stats value to override the global value as desired.

Note

If the innodb_read_only system variable is enabled, ANALYZE TABLE may fail because it cannot update statistics tables in the data dictionary, which use InnoDB. For ANALYZE TABLE operations that update the key distribution, failure may occur even if the operation updates the table itself (for example, if it is a MyISAM table). To obtain the updated distribution statistics, set information_schema_stats=LATEST.

For INFORMATION_SCHEMA tables implemented as views on data dictionary tables, indexes on the underlying data dictionary tables permit the optimizer to construct efficient query execution plans. To see the choices made by the optimizer, use EXPLAIN. To also see the query used by the server to execute an INFORMATION_SCHEMA query, use SHOW WARNINGS immediately following EXPLAIN.

Consider this statement, which identifies collations for the utf8mb4 character set:

mysql> SELECT COLLATION_NAME
       FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY
       WHERE CHARACTER_SET_NAME = 'utf8mb4';
+----------------------------+
| COLLATION_NAME             |
+----------------------------+
| utf8mb4_general_ci         |
| utf8mb4_bin                |
| utf8mb4_unicode_ci         |
| utf8mb4_icelandic_ci       |
| utf8mb4_latvian_ci         |
| utf8mb4_romanian_ci        |
| utf8mb4_slovenian_ci       |
...

How does the server process that statement? To find out, use EXPLAIN:

mysql> EXPLAIN SELECT COLLATION_NAME
       FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY
       WHERE CHARACTER_SET_NAME = 'utf8mb4'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cs
   partitions: NULL
         type: const
possible_keys: PRIMARY,name
          key: name
      key_len: 194
          ref: const
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: col
   partitions: NULL
         type: ref
possible_keys: character_set_id
          key: character_set_id
      key_len: 8
          ref: const
         rows: 68
     filtered: 100.00
        Extra: NULL
2 rows in set, 1 warning (0.01 sec)

To see the query used to statisfy that statement, use SHOW WARNINGS:

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `mysql`.`col`.`name` AS `COLLATION_NAME`
         from `mysql`.`character_sets` `cs`
         join `mysql`.`collations` `col`
         where ((`mysql`.`col`.`character_set_id` = '45')
         and ('utf8mb4' = 'utf8mb4'))

As indicated by SHOW WARNINGS, the server handles the query on COLLATION_CHARACTER_SET_APPLICABILITY as a query on the character_sets and collations data dictionary tables in the mysql system database.


User Comments
  Posted by Shlomi Noach on October 3, 2011
The example for joining TABLES with COLUMNS shows good EXPLAIN plan for TABLES, but poor EXPLAIN plan for columns, since the table name and schema for COLUMNS cannot be deduced ahead.
But this is just due to poor optimizer's work.
Here's how to get the same results, but with far better execution plan; we push the constants down to COLUMNS:

EXPLAIN SELECT B.TABLE_NAME
-> FROM INFORMATION_SCHEMA.TABLES AS A, INFORMATION_SCHEMA.COLUMNS AS B
-> WHERE A.TABLE_SCHEMA = 'test'
-> AND A.TABLE_NAME = 't1'
-> AND B.TABLE_NAME = 't1'
-> AND B.TABLE_SCHEMA = 'test'
->
-> \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: A
type: ALL
possible_keys: NULL
key: TABLE_SCHEMA,TABLE_NAME
key_len: NULL
ref: NULL
rows: NULL
Extra: Using where; Skip_open_table; Scanned 0 databases
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: B
type: ALL
possible_keys: NULL
key: TABLE_SCHEMA,TABLE_NAME
key_len: NULL
ref: NULL
rows: NULL
Extra: Using where; Open_frm_only; Scanned 0 databases; Using join buffer
2 rows in set (0.00 sec)

Sign Up Login You must be logged in to post a comment.