In some cases it is not possible to use an index to optimize a JOIN. This may for example happen when you query the Performance Schema. As a result these kind of queries can be very slow; however in MySQL 5.6 and later you can use a trick to improve the performance considerably.
As a working example in this post, I will use the schema_table_statistics view in the sys schema. Since the view involves the schema, I will create a reasonable large number of databases and tables for the test:
1 2 3 4 5 6 7 |
shell$ for ((i=0; i<100; i++)); do > echo "Database ${i}" > mysql -e "CREATE DATABASE db${i}" > for ((j=0; j<100; j++)); do > mysql -e "CREATE TABLE db${i}.t${j} (id int unsigned NOT NULL auto_increment PRIMARY KEY, val varchar(10) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT INTO db${i}.t${j} (val) VALUES ('aaa'), ('bbb'), ('ccc'), ('ddd'), ('eee'), ('fff'), ('ggg'), ('hhh'), ('iii'), ('jjj');" > done > done |
The current view definition for sys.schema_table_statistics is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
CREATE OR REPLACE ALGORITHM = TEMPTABLE DEFINER = 'root'@'localhost' SQL SECURITY INVOKER VIEW schema_table_statistics ( table_schema, table_name, total_latency, rows_fetched, fetch_latency, rows_inserted, insert_latency, rows_updated, update_latency, rows_deleted, delete_latency, io_read_requests, io_read, io_read_latency, io_write_requests, io_write, io_write_latency, io_misc_requests, io_misc_latency ) AS SELECT pst.object_schema AS table_schema, pst.object_name AS table_name, sys.format_time(pst.sum_timer_wait) AS total_latency, pst.count_fetch AS rows_fetched, sys.format_time(pst.sum_timer_fetch) AS fetch_latency, pst.count_insert AS rows_inserted, sys.format_time(pst.sum_timer_insert) AS insert_latency, pst.count_update AS rows_updated, sys.format_time(pst.sum_timer_update) AS update_latency, pst.count_delete AS rows_deleted, sys.format_time(pst.sum_timer_delete) AS delete_latency, SUM(fsbi.count_read) AS io_read_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_read)) AS io_read, sys.format_time(SUM(fsbi.sum_timer_read)) AS io_read_latency, SUM(fsbi.count_write) AS io_write_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_write)) AS io_write, sys.format_time(SUM(fsbi.sum_timer_write)) AS io_write_latency, SUM(fsbi.count_misc) AS io_misc_requests, sys.format_time(SUM(fsbi.sum_timer_misc)) AS io_misc_latency FROM performance_schema.table_io_waits_summary_by_table AS pst LEFT JOIN performance_schema.file_summary_by_instance AS fsbi ON pst.object_schema = extract_schema_from_file_name(fsbi.file_name) AND pst.object_name = extract_table_from_file_name(fsbi.file_name) GROUP BY pst.object_schema, pst.object_name ORDER BY pst.sum_timer_wait DESC; |
The issue is the LEFT JOIN on performance_schema.file_summary_by_instance. The Performance Schema tables do not have any indexes, so the query will need to examine the cross product:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
mysql> EXPLAIN SELECT ... <original view definition>\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: pst type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1000 Extra: Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: fsbi type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1000 Extra: Using where; Using join buffer (Block Nested Loop) 2 rows in set (0.00 sec) |
To ensure that some activity has happened on all tables, a mysqldump backup is made:
shell$ mysqldump --all-databases > /dev/null
The original query takes 14 minutes and 24 seconds.
The trick to improve the performance is to rewrite the query to take advantage of the materialized temporary tables that MySQL 5.6 supports. To do this, change the LEFT JOIN on performance_schema.file_summary_by_instance to a LEFT JOIN on a subquery:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
SELECT ... FROM performance_schema.table_io_waits_summary_by_table AS pst LEFT JOIN ( SELECT extract_schema_from_file_name(file_name) AS table_schema, extract_table_from_file_name(file_name) AS table_name, count_read, sum_number_of_bytes_read, sum_timer_read, count_write, sum_number_of_bytes_write, sum_timer_write, count_misc, sum_timer_misc FROM performance_schema.file_summary_by_instance ) AS fsbi ON pst.object_schema = fsbi.table_schema AND pst.object_name = fsbi.table_name GROUP BY pst.object_schema, pst.object_name ORDER BY pst.sum_timer_wait DESC; |
Now the output of EXPLAIN looks like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
*************************** 1. row *************************** id: 1 select_type: PRIMARY table: pst type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1000 Extra: Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ref possible_keys: <auto_key0> key: <auto_key0> key_len: 390 ref: performance_schema.pst.OBJECT_SCHEMA,performance_schema.pst.OBJECT_NAME rows: 10 Extra: NULL *************************** 3. row *************************** id: 2 select_type: DERIVED table: file_summary_by_instance type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1000 Extra: NULL 3 rows in set (0.00 sec) |
The key here is the <auto_key0> index used for the LEFT JOIN between the pst table and the <derived2> table. This means the optimizer adds an ad-hoc index.
The performance difference is clear: the query now completes in around 20 seconds.
However the original query is a view definition and subqueries are not allowed in view definitions, so how to improve the view itself?
This can be done by creating a “helper view” instead of the subquery. The important thing to be aware of is that MySQL has two algorithms for creating a view:
- MERGE
- TEMPTABLE
MERGE will merge the corresponding parts of the view into the statement that refers to the view. So a view using the MERGE algorithm will not work as that will effectively take us back to the original view. However TEMPTABLE will materialize the view before using it with the rest of the statement, so that will do what we need.
So the view definition we need in this case is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
CREATE OR REPLACE ALGORITHM = TEMPTABLE DEFINER = 'root'@'localhost' SQL SECURITY INVOKER VIEW x$ps_schema_table_statistics_io ( table_schema, table_name, count_read, sum_number_of_bytes_read, sum_timer_read, count_write, sum_number_of_bytes_write, sum_timer_write, count_misc, sum_timer_misc ) AS SELECT extract_schema_from_file_name(file_name) AS table_schema, extract_table_from_file_name(file_name) AS table_name, count_read, sum_number_of_bytes_read, sum_timer_read, count_write, sum_number_of_bytes_write, sum_timer_write, count_misc, sum_timer_misc FROM performance_schema.file_summary_by_instance; |
which then can be used in the view definition for schema_table_statistics:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
CREATE OR REPLACE ALGORITHM = TEMPTABLE DEFINER = 'root'@'localhost' SQL SECURITY INVOKER VIEW schema_table_statistics ( table_schema, table_name, total_latency, rows_fetched, fetch_latency, rows_inserted, insert_latency, rows_updated, update_latency, rows_deleted, delete_latency, io_read_requests, io_read, io_read_latency, io_write_requests, io_write, io_write_latency, io_misc_requests, io_misc_latency ) AS SELECT pst.object_schema AS table_schema, pst.object_name AS table_name, sys.format_time(pst.sum_timer_wait) AS total_latency, pst.count_fetch AS rows_fetched, sys.format_time(pst.sum_timer_fetch) AS fetch_latency, pst.count_insert AS rows_inserted, sys.format_time(pst.sum_timer_insert) AS insert_latency, pst.count_update AS rows_updated, sys.format_time(pst.sum_timer_update) AS update_latency, pst.count_delete AS rows_deleted, sys.format_time(pst.sum_timer_delete) AS delete_latency, SUM(fsbi.count_read) AS io_read_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_read)) AS io_read, sys.format_time(SUM(fsbi.sum_timer_read)) AS io_read_latency, SUM(fsbi.count_write) AS io_write_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_write)) AS io_write, sys.format_time(SUM(fsbi.sum_timer_write)) AS io_write_latency, SUM(fsbi.count_misc) AS io_misc_requests, sys.format_time(SUM(fsbi.sum_timer_misc)) AS io_misc_latency FROM performance_schema.table_io_waits_summary_by_table AS pst LEFT JOIN x$ps_schema_table_statistics_io AS fsbi ON pst.object_schema = fsbi.table_schema AND pst.object_name = fsbi.table_name GROUP BY pst.object_schema, pst.object_name ORDER BY pst.sum_timer_wait DESC; |
Note: to take fully advantage of this rewrite, the extract_schema_from_file_name() and extract_table_from_file_name() functions needs to be changed from their current definitions (this has been done throughout the examples above). The current definition in the sys schema is that they return a varchar(512) – the same as the argument they take. However as the widest allowed schema and table name is a varchar(64), it is safe to change the return value to a varchar(64). The reason this is important is that the maximum width supported for a MyISAM index is 1000 bytes, so two columns each a UTF8 vachar(512) is too wide to allow for an auto key to be created.
Thanks a lot to Roy Lyseng and Mark Leith who is behind the ideas for this rewrite.
If you are interested to hear more about the Performance Schema, the MySQL sys schema, and performance tuning, then you should consider the following sessions at MySQL Central at Oracle OpenWorld:
- Mayank Prasad: MySQL’s Performance Schema: New Features in 5.6 and 5.7 [CON2222]
- Mark Leith: The MySQL sys schema [CON3751]
- Øystein Grøvlen: How to Analyze and Tune MySQL Queries for Better Performance [TUT3157]
- Dimitri Kravtchuk: MySQL Performance: Demystified Tuning and Best Practices [CON5097]
- Lig Isler-turmelle: MySQL Server Performance Tuning 101 [CON3942]