MySQL :: MySQL 5.7: InnoDB Intrinsic Tables

Introduction

The MySQL Optimizer sometimes needs a temporary data-store during query processing, for storing intermediate results. Before MySQL 5.7, this need was serviced exclusively using a combination of the HEAP/MEMORY storage engine (for smaller tables) and the MyISAM storage engine (for larger tables). You can find more information on when disk based temporary tables (MyISAM or InnoDB) are used instead of MEMORY tables here.

The InnoDB storage engine has been the default engine used for user tables since MySQL 5.6, but MyISAM was still used for internal disk based temporary tables (for some related notes, see this excellent blog post by Jaime Crespo). This legacy behavior lead to many headaches, both for our users and for our developers. For example, see Stewart Smith’s excellent write up on some of the related problems. In order to address the related issues in MySQL 5.7, we’ve had to do a lot of work within both InnoDB and the Optimizer to lay the groundwork for what we’ll talk about next. You can read about all of the related 5.7 work leading up to this point here.

Optimizer Switches to InnoDB Instrinsic Tables

InnoDB is our MVCC ACID compliant storage engine, and as part of our larger effort to decouple the MyISAM storage engine from MySQL itself so that it becomes an optional engine, starting with MySQL 5.7.6 InnoDB is now the default storage engine used for internal disk based temporary tables. The engine used can be controlled using the new internal_tmp_disk_storage_engine server option.

In our efforts to make this change in MySQL 5.7.6, we have made changes to InnoDB so that it performs as good or better (often far better) than MyISAM when used for internal disk based temporary tables. To fulfill this use-case, we created a new type of table within InnoDB called intrinsic tables, which have relaxed MVCC and ACID semantics. These tables are a special type of temporary InnoDB table that do not perform any UNDO or REDO logging (REDO logging is disabled for all temporary InnoDB tables). As the name suggests, these tables are meant for internal use only and thus can only be used by an internal MySQL module such as the Optimizer. In other words, end-users will not be able to explicitly create these new types of tables (they are also not visible in the innodb_temp_table_info table). We will, however, leverage this new type of table for future end-user focused work. So stay tuned!

A Deeper Dive Inside InnoDB Intrinsic Tables

Let’s now take a deeper dive into this new type of table to better understand some of its key features which allowed us to finally supplant the internal usage of MyISAM (and all of its related issues) for internal disk based temporary tables :

Instead of logging information about them within SYS_TABLES and the InnoDB Dictionary cache, all references to them are maintained in thread (or SESSION level) local storage. Given their limited usage and visibility, it is the perfect place to host them.
They are created within the shared temporary tablespace, which ensures that they are automatically removed when MySQL is shutdown or restarted (they are normally simply dropped automatically as part of the internal query processing, but this covers non-normal shutdowns such as a system crash or loss of power).
These tables don’t need to support ROLLBACK and thus the internal hidden DB_ROLL_PTR column that is normally appended to a table for this purpose is skipped. The only hidden columns for these tables are then DB_ROW_ID and DB_TRX_ID. For these tables, DB_TRX_ID is also not currently used, so the transaction number is simply a dummy counter that facilitates the creation of the read-only view while processing the writes to them.
Given that there is no support for ROLLBACK, atomicity can be limited to the row level (somewhat similar to MyISAM). This is really not an issue under the controlled environment used for internal temporary tables, and it also helps to provide a smooth transition for this usage, as we can then replace MyISAM usage here without changing the Optimizer semantics around how they handle the normal difference in atomicity levels between the two engines.
They follow a shared nothing architecture (as dictated by the semantics) and so are super fast and very light weight. We optimize the writes by, for example, caching the last inserted position in the default clustered index (based on DB_ROW_ID).
There are some additional new features added to InnoDB that are initially meant only for intrinsic tables. For example: delete_all_rows, and enable/disable index.
Localized DB_ROW_ID and DB_TRX_ID values are maintained at the table level, which saves a costly round trip to the global InnoDB generator for managing these values. (There are no common objects like UNDO logs generated by these tables so localized counters are fine). We also avoid unnecessary global management of the internal table_id and index_id values.
UPDATEs are executed as a series of atomic DELETE+INSERT operations. This avoids unnecessary overhead with regards to space management.
Searches are further optimized by caching the last search position and restoring the cursor directly at the block and record where it left off previously. This cursor is then simply invalidated if there is a change to the table structure while a read is happening as part of the larger UPDATE operation.
These tables do not use the double write buffer or the adaptive hash index (AHI) features, as they would simply be adding overhead without providing a performance benefit in the case of internal temporary tables.

The Performance Results

Not only has this work improved our internal semantics, making for a more robust and
reliable MySQL, but it has also improved the performance! Using the internal_tmp_disk_storage_engine option, we can easily benchmark the performance of MySQL 5.7.6 when using MyISAM or InnoDB for internal disk based temporary tables.

Let’s first look at some simple benchmarks using the 2 most popular suites for benchmarking MySQL, namely sysbench (OLTP workloads) and DBT-3 (OLAP workloads).

Sysbench

sysbench executed with 1 million seed size, read-only workload.

MySQL consistently performed better when using InnoDB for internal disk based temporary tables.

DBT-3

DBT3 with 10G seed size.

Again, these tests consistently show that MySQL performs as good or better when InnoDB is used for internal disk based temporary tables.

Conclusion

This work plays a large part in our bigger effort to make the InnoDB storage engine the native storage engine for MySQL. Stay tuned for further improvements to MySQL, as we make MySQL an MVCC and ACID compliant server from end-to-end (while allowing users to continue to use MyISAM and other non-transactional engines, but only at their explicit request).

We look forward to your feedback on this new work! You can leave a comment here on the blog post or in a support ticket. If you feel that you encountered any related bugs, please do let us know via a bug report.

As always, THANK YOU for using MySQL!