What's New in MySQL 5.6
With MySQL 5.5 generally available and being deployed all over the planet, the architects and strategists can start looking ahead to the exciting new thing with
SQL in its name: MySQL 5.6.
MySQL 5.6 builds on the momentum of 5.5, and Oracle's investment and commitment to MySQL, by delivering better performance and scalability.
At this year's MySQL Conference & Expo, you'll hear about:
- Optimizer improvements for all-around query performance.
- InnoDB improvements for higher transactional throughput.
- New NoSQL-style
- Partitioning improvements for querying and managing huge tables.
- Replication improvements covering many aspects.
- Better performance monitoring by expanding the data available through the
Here are the features that you will get as a graduated series of MySQL 5.6 development milestone releases. The first such milestone release, MySQL 5.6.2, is available for download right now at http://dev.mysql.com/downloads/. Start beta testing the 5.6 release now and give feedback to the MySQL engineering team.
Index Condition Pushdown
Moves more of the processing for
WHERE clauses to the storage engine. Instead of fetching entire rows to evaluate against a set of
WHERE clauses, ICP sends those clauses to the storage engine, which can prune the result set by examining index tuples. The result is less I/O overhead for the base table, and less internal communication overhead for the server and the storage engine. This feature works with InnoDB, MyISAM, and
NDBCLUSTER tables. Read more about index condition pushdown.
Until the day when you have all the SSDs you want, it's faster to read data sequentially from disk than to do random accesses. For secondary indexes, the order for the index entries on disk is different than the order of disk blocks for the full rows. Instead of retrieving the full rows using a sequence of small out-of-order reads, MRR scans one or more index ranges used in a query, sorts the associated disk blocks for the row data, then reads those disk blocks using larger sequential I/O requests. The speedup benefits operations such as range index scans and equi-joins on indexed columns. (Think InnoDB foreign keys.) Works all storage engines. Read more about multi-range read.
File Sort Optimization
For queries that combine
ORDER BY non_indexed_column and a
LIMIT x clause, this feature speeds up the sort when the contents of X rows can fit into the sort buffer. Works with all storage engines.
MySQL 5.6 builds on the platform of InnoDB fully integrated as the default storage engine, which happened way back in MySQL 5.5.
Persistent Optimizer Stats
Provides improved accuracy of InnoDB index statistics, and consistency across MySQL restarts. InnoDB precomputes statistics that help the optimizer decide which indexes to use in a query, by sampling a portion of the index. You can adjust the amount of sampling that InnoDB does for each index. The resulting statistics can now persist across server restarts, rather than being recomputed (and possibly changing) due to restarts and some runtime events. The more accurate statistics can improve query performance, and the persistence aspect can keep query performance stable. This feature is controlled by the configuration options
innodb_stats_transient_sample_pages. When the persistent stats feature is enabled, the statistics are only recomputed when you explicitly run
ANALYZE TABLE for the table. Read more about Persistent Optimizer Stats.
New INFORMATION_SCHEMA Tables
Once InnoDB information was made available for queries through the
INFORMATION_SCHEMA in MySQL 5.5, people clamored for more kinds of status and monitoring information. The SQL interface is more standardized and predictable than parsing the text output from
SHOW STATUS commands.
- Metrics table: Provides a generic and comprehensive resource and performance monitoring framework for InnoDB. The new I_S table is
- System Tables: Makes the InnoDB internal data dictionary available for SQL queries, for convenience of monitoring. The new I_S tables are
- Buffer Pool Information table: Displays buffer pool page information for tuning on large-memory or highly loaded systems. (Highly requested by customers and community users.) The new I_S tables are
- Read more about the new InnoDB-related
- See the corresponding documentation.
Additional Optimizer Support
InnoDB supports the ICP, MRR, and file sort optimizer features mentioned earlier.
Split Kernel Mutex
The InnoDB team continues to remove bottlenecks for busy systems. Now instead of a single mutex controlling concurrent access to core InnoDB operations, several more fine-grained mutexes and rw-locks reduce contention on a busy system. In particular, the subsystems for transaction control, MVCC views, and locking now all have separate mutexes or rw-locks. Read more about the kernel mutex split.
InnoDB now has multiple purge threads, making purge operations across multiple tables more efficient. Originally, purge operations were handled by the InnoDB master thread, leading to contention with other InnoDB operations; in MySQL 5.5, purge was moved into a single separate thread, and in MySQL 5.6 you can set
innodb_purge_threads to a value greater than 1. Read more about multiple purge threads.
Separate Flush Thread
A separate flush thread (
page_cleaner) makes flushing operations more efficient. This operation was formerly controlled by the InnoDB master thread, leading to contention with other InnoDB operations. Read more about the separate flush thread.
Pruning the InnoDB Table Cache
To ease the memory load on systems with huge numbers of tables, InnoDB now frees up the memory associated with an opened table. An LRU algorithm selects tables that have gone the longest without being accessed. To reserve more memory for open tables, increase the value of the
--table_definition_cache=# configuration option. Read more about the InnoDB table cache.
NoSQL Interface via
The ever-increasing performance demands of web-based services has generated significant interest in providing NoSQL access methods to MySQL – maintaining all of the advantages of your existing relational database infrastructure, while providing blazing fast performance for simple queries, using an API to complement regular SQL access to your data.
memcached API, web services can now directly access the InnoDB storage engine without transformations to SQL, ensuring low latency and high throughput for read/write queries. Operations such as SQL parsing are eliminated and more of the server's hardware resources (CPU, memory and I/O) are dedicated to servicing the query within the storage engine itself.
memcached, developers and DBAs are able to:
- Preserve investments in
memcachedinfrastructure by reusing existing memcached clients and eliminating the need for application changes.
- Access the full range of
memcachedclient libraries and platforms, providing maximum deployment flexibility and consistently high performance across all supported environments.
memcachedfunctionality by integrating a persistent, crash-safe, transactional database back-end offering ACID compliance.
The MySQL Cluster 7.2 Development Milestone Release also previews a
memcached interface for the
NDBCLUSTER storage engine. Read more about the NoSQL interface for NDBCLUSTER.
Data gets bigger. That's a fact of life. Partitioned tables help to segment that data into manageable portions, while keeping the convenience of a SQL interface to the full table. These new features make certain operations with partitions faster and more convenient.
Explicit Partition Selection
With partitioned tables, MySQL can restrict processing to only the relevant portions of a big data set. Now you can directly define which partitions are used in a query, DML, or data load operation, rather than repeating all the partitioning criteria in each statement.
SELECT * FROM employees PARTITION (p0, p2); DELETE FROM employees PARTITION (p0, p1); UPDATE employees PARTITION (p0) SET store_id = 2 WHERE fname = 'Jill'; SELECT e.id, s.city FROM employees AS e JOIN stores PARTITION (p1) AS s ...;
Import / Export for Partitioned Tables
To quickly bring a new data set into a partitioned table, or to export a partition or subpartition to manage it as a regular table, you can use the syntax
ALTER TABLE ... EXCHANGE PARTITION. You specify a partition or subpartition of a partitioned table, and a non-partitioned table with a compatible structure, and this operation swaps their places without any expensive copy operation.
ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLE e2;
This operation works with any storage engine that supports partitioned tables. Read more about importing and exporting partitions.
Replication is a fruitful area for enhancements to data integrity, availability, and let's not forget performance.
Replication Improvements to Performance
Optimized Row-Based Replication
By only replicating partial
after images for
DELETE events where primary keys or explicit columns were set in the SQL statement, performance can be increased while binary log disk space, network resources and server memory footprint are reduced.
Replication performance is improved by using multiple execution threads to apply replication events to slave servers. The multi-threaded slave splits work between worker threads based on the database name, allowing updates to be applied in parallel rather than sequentially.
As a result, replication throughput is increased and latency is reduced which minimizes the risk of replication lag, enabling slaves to serve the freshest updates to the application.
The multi-threaded slave functionality is available now as part of the
snapshot-next-mr-wl5563-labs build which can be downloaded from Labs.mysql.com.
Replication Improvements to Data Integrity
Extends the robustness and ease-of-use of MySQL replication by making the slaves crash-safe when using transactional storage engines such as InnoDB.
The slave can automatically recover from a failure and resume replicating DML updates, without the DBA having to access the
relaylog.info files to manually roll back replication to the last successfully committed transaction, or to skip transactions.
As a result, data integrity is enhanced and DBAs can be free to concentrate on more strategic data management activities.
Ensures the integrity of data being replicated to a slave by detecting data corruption and returning an error, preventing the slave itself from becoming corrupt.
Checksums are implemented in the binary and relay logs as well as to individual replication events, allowing errors to be detected whether they are caused by memory, disk or network failures, or by the database itself. Checksum checking can be implemented on a per-slave basis, giving maximum flexibility in how and where it is deployed.
You can define a time delay for events to be replicated from a master to each slave, defined in millisecond increments up to a maximum of 68 years!
Time-Delayed Replication affords protection against operational errors made on the master, for example accidently dropping tables, in which event the slave can be promoted to the new master in order to restore the database to its previous state. Time-Delayed Replication can also be useful in testing application behavior by emulating any instances of replication lag.
Time-Delayed Replication is implemented at the per-slave level (via holding execution of the
SQL_THREAD), so you could configure multiple slaves to apply replication events immediately, and another slave to apply only after a delay of 5 minutes, therefore providing deployment flexibility.
Replication Improvements to Usability
Informational Log Events
Enhances auditing and debugging when using Row-Based Replication by writing the original query to the binary log, which is then replicated with its associated row-based event to the slave.
Remote Binlog Back-up
Enhances operational efficiency by using the replication channel to create real-time back-ups from the binary log.
By adding a
raw flag, the binlog is written out to remote back-up servers, without having a MySQL database instance translating it into SQL statements, and without the DBA needing SSH access to each master server.
Automatically generates a Universally Unique Identifier (UUID) for each server, allowing MySQL Enterprise Monitor or any other monitoring tool to retrieve information about master and slave servers in a replication configuration. The UUID is available through a SQL query and in the output of the
SHOW SLAVE STATUS command. This technique requires fewer database connections and works better with servers that are monitored remotely or that use virtual IP addresses. This feature is especially useful in large and highly dynamic replication environments, making auto-discovery more reliable and simplifying systems management.
Instrumentation and Diagnostic Improvements via
MySQL 5.6 greatly enhances the
PERFORMANCE_SCHEMA features for performance monitoring and tuning. The information in the
performance_schema tables lets you see how various low-level items factor into overall database performance, which ones are the
hottest under various workloads and system configurations, and trace issues back to the relevant file and line in the source code so you can really see what's happening behind the scenes. Read more about Performance Schema.
performance_schema code is further tuned in 5.6 to reduce the performance overhead of running with this feature enabled.
Table I/O and Instrumentation
setup_object table enables a DBA to understand, analyze, and tune how an application generates I/O load related to table access on the MySQL database server.
Table Locks Instrumentation
setup_object table enables a DBA to identify
hot tables and other object-related bottlenecks caused by the data access patterns of an application.
Session/User Level Instrumentation
setup_actors table enables a DBA to monitor the load generated from a specific user or application by selectively instrumenting specific end user/application connections. This is valuable to narrow down the monitoring data collected: a) by thread or session, b) by connections originating from a given user account, c) by connections originating from a given machine, d) by connections originating from a given user account on a given machine. Reducing the amount of instrumented connections also reduces the runtime overhead of the instrumentation in production.
Global Performance Metric Summaries
Globally aggregates the data collected by the
PERFORMANCE_SCHEMA by thread / by object / by instrumentation point. Enables DBA to write monitoring scripts and applications.
Table/Index I/O Summary
setup_object table aggregates table I/O data collected by the
PERFORMANCE_SCHEMA, by index, by table. Aggregation shows which tables or indexes are
hot and most used by an application, and helps to identify the application data access patterns.
Table Lock Wait Summary
Aggregates Table lock data collected by the
PERFORMANCE_SCHEMA by table. This aggregation shows which
hot tables are often locked by an application, and helps to identify the application bottlenecks caused by table locking.
Enables a DBA to monitor statement execution, collect per-statement metrics, and analyze statements/executions by end user session or global aggregates. Quantify which statements are generated by an application, execution times/counts and access paths. Metrics reported give insight on the data volumes, the selectivity of
WHERE clauses, and index usage patterns. Available in Labs.mysql.com.
Show Contents of Host Cache - centralized logging of connection errors
(Highly requested by customers and community users. Delivered via community code contribution.) Provides centralized logging of connection-related errors. MySQL internal host cache has been instrumented and exposed in a relational table for SQL access, providing a centralized logging of connection errors, with details about the exact root cause (SSL, DNS, Authentication plugins, ...) This enables a DBA to easily troubleshoot large deployments involving numerous users and heterogeneous applications. Available from Labs.mysql.com.
Now that you have read about all the exciting new performance and scalability improvements, it's your turn to take MySQL 5.6 for a spin: