MySQL :: The MySQL 5.7.8 Release Candidate is Available

The MySQL Development team is very happy to announce that MySQL 5.7.8, the second 5.7 Release Candidate (RC2), is now available for download at dev.mysql.com (use the “Development Releases” tab).

We have fixed over 500 bugs in 5.7.8. This is on top of what we delivered in 5.7.7 and on top of bug fixes up-merged from 5.6. We have also added JSON support, which we previewed in a Labs release back in April. You can find the full list of changes and bug fixes in the 5.7.8 release notes. Here are some highlights. Enjoy!

Adding JSON Support to MySQL

With the newly added JSON support in MySQL, you can now combine the flexibility of NoSQL with the strength of a relational database.

JSON datatype and binary storage format (WL#8132) — This work by Knut Anders Hatlen develops a binary format that allows the server to efficiently store, retrieve and search JSON data. This work enhances CREATE TABLE and ALTER TABLE so that they can create JSON columns, and it extends the Field class to allow INSERT to and SELECT from JSON typed fields.

Server side JSON functions (WL#7909) — This work by Richard Hillegas and Dag Wanvik introduces a set of a built-in JSON functions. This work lets users construct JSON data values from other relational data, extract relational data from JSON data values, introspect the structure of JSON values and text (validity, length, depth, keys), search within, and manipulate JSON data.

JSON comparator (WL#8249) — This work by Knut Anders Hatlen introduces the JSON comparator, similar to the DATE/TIME/DATETIME comparator, which allow comparisons of JSON scalars vs SQL constants, and JSON scalars vs JSON scalars. The comparator relies on the DOM support added in the scope of WL#7909. The comparator converts the SQL constant to a JSON scalar and then compares the values.

Ordering of scalar JSON values (WL#8539) — This work by Knut Anders Hatlen implements a function that produces the sorting keys that the internal filesort function needs in order to properly sort JSON values. When ordering scalar JSON values with ORDER BY, they will be returned in the order defined by the JSON comparator in WL#8249.

Expression analyzer for generated columns (WL#8170) — This work by Evgeny Potempkin allows our range and ref optimizers to find opportunities to use any indexes defined over generated columns. One intended use case for this feature is to allow the creation and automatic use of indexes on JSON Documents.

Virtual Columns

B-tree Index Support on non-materialized virtual columns (WL#8149, WL#8114) — This work by Jimmy Yang supports the creation of secondary indexes on non-materialized virtual columns, as well as the usage of these indexes for fast computed-value retrieval and searches. The non-materialized virtual column support is described in WL#8114 and WL#411. These were designed in such way that they are not present in actual InnoDB index records, but their metadata is registered with InnoDB system tables and metadata caches. Virtual columns provide flexibility and space savings for the table, and more importantly, adding/dropping such columns does not require a table rebuild. These behaviors make it a much better choice for storing and processing non-relational data such as JSON. However, since the columns themselves are not materialized, a scan and search could be slower than on regular (materialized) columns. With this worklog, the virtual column value is materialized in the secondary index, thus making it much easier for value searches and processing. Thus this work greatly increases the practical value of virtual columns. With this work, creating an index on virtual generated columns also becomes an ONLINE operation.

Support SEs to create index on virtual generated columns (WL#8227) — This work by Benny Wang implements server layer support for WL#8149, for example storing information about generated columns in the internal Field::gcol_info data structure.

Callback for computation of virtual column index values from InnoDB purge threads (WL#8481) — This work by Jon Olav Hauglid provides a server layer function that can be called by InnoDB purge threads. Background: InnoDB needs to be able to compute virtual column index values in order to implement WL#8149 (B-tree Index Support on non-materialized virtual columns). Generally this is done from connection threads (see WL#8227). However, it also needs to happen from InnoDB purge threads, and these threads do not correspond to connections/sessions and thus don’t have THDs or access to TABLE objects. This work provides the necessary server layer callback which enables the purge threads to make the necessary computations. There are no user-observable changes by this worklog, it is only relevant in conjunction with WL#8149 and WL#8227.

Page Compression

InnoDB: Transparent page compression (WL#7696) — This work by Sunny Bains implements compression at the InnoDB layer. The feature works on any OS/file system combination that supports sparse files and has “punch hole” support (e.g. the FALLOC_FL_PUNCH_HOLE flag to fallocate). The high level idea is rather simple—given a 16K page we compress it using your favorite compression algorithm and write out only the compressed data. After writing out the data we “punch a hole” to release the unused part of the original 16K block back to the file system.

User Management

Increase the length of a user name (WL#2284) — This work by Robert Golebiowski increases the maximum length of MySQL user names from 16 to 32 characters.

Support IF [NOT] EXISTS clause in CREATE/DROP USER (WL#8540) — This work by Todd Farmer implements the IF [NOT] EXISTS clause in CREATE USER and DROP USER statements. This will allow for distribution of accounts using replication without triggering replication failures in the event of (intentionally) non-synchronized accounts within the replication group. It also simplifies user scripting of account management operations. See also: the feature request in Bug#15287.

Add server-side option to require secure transport (WL#7709) — This work by Todd Farmer adds the --require_secure_transport server option, which causes the MySQL Server to refuse any TCP/IP connection not using SSL. Before this work, the MySQL Server had only provided a mechanism to require SSL for individual user accounts using the REQUIRE SSL clause in various account management statements.

Control Storage Engine Usage

Provide an option to reject creation of user tables for a list of storage engines (WL#8594) — This work by Thayumanavar Sachithanantha provides a command-line option called --disabled-storage-engines which provides the means for a DBA to supply a list of storage engines to be disallowed, thus preventing users from using any particular storage engine (by mistake). This is done to support the use case where the customer has a policy to prevent the usage of, for example, MyISAM within their environment.

Super Read Only

Super-read-only that also blocks SUPER users (WL#6799) — This work by Todd Farmer introduces a new option, --super_read_only, which supplements the --read_only option. When --super_read_only is set to ON (which also automatically sets --read_only to ON) then the server also becomes READ-ONLY for admin users with the SUPER privilege.

New Data Export Utility

mysqlpump: Extend mysqldump functionalities (WL#7755) — This work by Marcin Babij and Bharathy Satish implements a new MySQL Server utility inspired by — but not 100% compatible with — mysqldump. The main feature of this new tool is the ability to parallelize backup and restore operations. (The existing mysqldump will continue to be supported.)

Cost Model

IO aware cost estimate function for data access (WL#7340) — This work by Olav Sandstå extends the optimizer cost model to use the estimates from the storage engines about approximately how much of a table’s rows and indexes are present in memory. The optimizer will then use different cost constants for calculating the cost of accessing data that is in memory and data that needs to be read from disk. In the initial implementation, these two cost constants have the same default value but can be changed by the server administrator to make the optimizer use different costs for data in a memory buffer versus data needed to be read from disk. Note that currently, the estimates about whether data is in memory or needs to be read from disk is just based on heuristics. The accuracy of these estimates will be greatly improved when support for these estimates are implemented by the individual storage engines.

Optimizer Hints

Hints for subquery strategies (WL#8244) — This work by Øystein Grøvlen adds hints for controlling subquery execution strategies. This includes whether or not to use semi-join, which semi-join strategy to use, and, in case semi-join is not used, whether to use subquery materialization or the in-to-exists transformation. This work uses the new syntax and infrastructure for hints provided by WL#8016 and WL#8017. Note that this work makes it possible to prevent any specific semi-join strategy, including the Duplicate Weed-out strategy which cannot be turned off using --optimizer_switch.

Observability / Monitoring

Instrumentation of connection type (WL#7729) — This work by Todd Farmer exposes information about connection types via standard interfaces—the PERFORMANCE_SCHEMA.THREADS table, the Audit log API, the Audit log file, and the General Query Log. Until now MySQL has not provided DBA visibility into the types of connections being established and used; e.g. to distinguish SSL connections from unencrypted TCP/IP connections, or socket, shared memory or named pipe connections.

InnoDB: Implement Information_Schema.Files (WL#7943) — This work by Kevin Lewis provides data for the INFORMATION_SCHEMA.FILES table, for all of the InnoDB datafiles that are in its internal cache. The INFORMATION_SCHEMA.FILES table contains fields to describe a tablespace file, including some statistical details. See also Bug#76182 reported by Marco Tusa.

PERFORMANCE SCHEMA, HISTORY PER THREAD (WL#7795) — This work by Marc Alff enables turning the history ON or OFF on a per thread basis. Until now, global consumer flags have been used to control whether or not to log history events. These flags are global to the server, so that the collection of historical data for different threads was all or nothing. With this feature, the DBA can now specify which sessions, accounts, users, and hosts for which they wish to collect historical data, separatly from turning the instrumentation ON or OFF. This allows the DBA to control more precisely what events are logged in the history tables, thus decreasing the runtime overhead when historical data is needed only for a subset of the instrumented sessions, as well as reducing unwanted noise in the Performance Schema tables—events_waits_history_long, events_stages_history_long, events_statements_history_long, events_transactions_history_long—which facilitate troubleshooting on busy servers (that also generate a lot of events).

Fabric Support

Detect transaction boundaries (WL#6631) — This work by Tatjana Nurnberg adds a server flag to detect that a new transaction has started. A new system variable called @@session_track_transaction_state has been introduced. In a load-balanced setup, it is necessary to know when a statement resulted in the start of a new transaction, which would then allow connectors to switch to using a different connection from the connection pool. The critical case to detect is when the transaction is “fresh” and does not have any reads or writes attached to it yet. This is the only case where a connector can switch to using a different connection. If a statement starts a new transaction and starts adding reads or writes to it, it is not possible to move the connection since it would imply a necessary ROLLBACK.

Server version token and check (WL#6940) — This work by Vamsikrishna Bhagi introduces a general synchronization mechanism based on DBA/Application defined tokens. A user with the SUPER privilege can set global tokens in a server (for example some ID) and clients can set session tokens. Before executing a query the session token is compared with the global token and an error generated when there is a discrepancy between the two.

Locking service for read/write named locks (WL#8161) — This work by Jon Olav Hauglid provides a locking service to protect global and session version token lists needed by WL#6940.

Refactoring

Refactoring of the protocol class (WL#7126) — This work by Catalin Besleaga makes an initial step towards decoupling the client-server protocol from SQL processing.

Remove fastmutex from the server sources (WL#4601) — This work by Jon Olav Hauglid removes a home grown spin-lock mutex implementation— called “fast mutex”—from the server sources because of its many shortcomings.

Test Infrastructure

Plugin(s) to test services of server (WL#8462) — This work by Horst Hunger introduces a framework for writing test plugins, for the purpose of testing new service APIs.

Changes to Defaults

Turn STRICT_MODE submodes ON by Default (WL#8596) — This work by Abhishek Ranjan reverts changes done by WL#7467, bringing back the individual NO_ZERO_DATE, NO_ZERO_IN_DATE, ERROR_FOR_DIVISION_BY_ZERO sql modes. These modes will instead be added to the set of sql modes which are enabled by default. WL#7467 merged those sql modes into STRICT mode. This added more checks to STRICT mode, causing some statements which used to pass in 5.6 with a warning in STRICT mode to to fail in 5.7. This in turn caused problems in some upgrade scenarios that could not be easily worked around. See also Bug#75439 reported by Simon Mudd.

Increase table_open_cache_instances compiled default (WL#8397) — This work by Praveenkumar Hulakund increases the --table_open_cache_instances compiled-in default from 1 (old) to 16 (new). This will improve performance on multi-core systems.

Enabled multiple page cleaners and purge threads by default in 5.7 (WL#8316) — This work by Bin Su changes the compiled-in defaults for --innodb_page_cleaners and --innodb_purge_threads from 1 (old) to 4 (new). This will improve performance and stability on multi-core systems.

Deprecation and Removal

Remove sql-bench from the 5.7 server code (WL#8406) — This work by Terje Røsten removes the sql-bench code from the 5.7 server code tree. We decided to remove it because it has not been well maintained for recent versions, and we are no longer using it internally as part of our testing process. See also Morgan Tocker’s blog post here.

Remove use of DATADIR in mysqld_safe for setting MYSQL_HOME in 5.7 (WL#7150) — This work by Yashwant Sahu removes the use of DATADIR in mysqld_safe. A warning about this had existed since MySQL 5.0. From now on, if MYSQL_HOME is not set, then it will be set to BASEDIR.

Bug Fixes

We have received a lot of valuable feedback from the MySQL community throughout the 5.7 DMR cycle. This input is invaluable, and our community members deserve a large THANK YOU! Here are some highlights of the community reported bugs that are now fixed in 5.7.8 RC2:

Bug#75913, Bug#76446, Bug#76346, Bug#76437, Bug#76436, Bug#76432, Bug#76434, Bug#76445, Bug#76424, Bug#76425, Bug#76419, Bug#76416, Bug#76406, Bug#75782, Bug#75736 — All reported by Roel Van de Paar
Bug#74891, Bug#76625, Bug#74833, Bug#74854, Bug#74843, Bug#72806, Bug#72807, Bug#72805 — All reported by Stewart Smith
Bug#73953: left join to view with <> test causing too many results — Reported by David Norman
Bug#77276: force index not working with groupby+orderby — Reported by bombzj bombzj
Bug#76748: the server crash when test st_intersects with st_buffer — Reported by zkong kong
Bug#76401: Can’t distinguish secure_file_priv = NULL and “” — Reported by tsubasa tanaka
Bug#76329: COLLATE option not accepted in generated column definition — Reported by Mario Beck
Bug#76328: Generated columns not printed correctly in SHOW CREATE TABLE — Reported by Mario Beck
Bug#76237: LOAD DATA INFILE ignores a specific row silently under Db Charset is utf8 — Reported by tsubasa tanaka
Bug#76182: The new created tablespace is not reported in the INFORMATION_SCHEMA — Reported by Marco Tusa
Bug#76164: InnoDB FTS with MeCab parser prints empty error message — Reported by tsubasa tanaka
Bug#75995: mysqld –help –verbose tries to lock files — Reported by Daniël van Eeden
Bug#75595: Compute InnoDB redo log block checksums faster — Reported by Laurynas Biveinis
Bug#75829: ST_Centroid produces incorrect results with MultiPolygon — Reported by Bryan Blakey
Bug#75539: max_allowed_packet error destroys original data — Reported by Oli Sennhauser
Bug#75372: incorrect code(or indenting) — Reported by Joshua Rogers
Bug#3083: User length limitation — Reported by Dracul Vlad
Bug#74177: –slave-preserve-commit-order causes slave to deadlock and break for some querie — Reported by Kristian Nielsen
Bug#74253: gtid_next with empty transactions does not work inside stored procedures — Reported by Davi Arnaut
Bug#70860: –tc-heuristic-recover option values are broken — Reported by Laurynas Biveinis
Bug#69425: St_Intersection produces incorrect results — Reported by Ivan Balashov
Bug#69538: Spatial geometry function returns wrong result — Reported by Vaclav Novotny
Bug#72056: Wrong comparsion on big DECIMAL values — Reported by Artem Zaytsev
Bug#69903: Stack corruption in vio_io_wait on Mac OS X — Reported by Sergei Glushchenko
Bug#67014: View evaluation is incorrect when joining to view that selects a literal — Reported by Luke Stevens
Bug#67300: table left join view, unmatched rows problem where view contains an if — Reported by David Norman
Bug#65936: wrong results on table left join view — Reported by Richard Kojedzinszky
Bug#55265: mysqlslap Option –auto-generate-sql-secondary-indexes doesnt work — Reported by Andreas Braza
Bug#37703: MySQL performance with and without Fast Mutexes using Sysbench Workload — Reported by Harita Chilukuri

You can also learn about the work that we did in 5.7, leading up to this second release candidate, through this series of milestone blog posts: 5.7.1, 5.7.2, 5.7.3, 5.7.4, 5.7.5, 5.7.6, 5.7.7.

That’s it for now, and thanks for using MySQL!