At a glance, MySQL 5.6 is simply a better MySQL with improvements that enhance every functional area of the database kernel, including:
This article serves as a DBA and Developer guide to MySQL 5.6 as it highlights the key new features in each of these areas, many with practical use case examples.
From an operational standpoint MySQL 5.6 provides better sustained linear performance and scale on systems supporting multi-processors and high CPU thread concurrency. Key to this are improvements to Oracle's InnoDB storage engine efficiency and concurrency that remove legacy thread contention and mutex locking within the InnoDB kernel. These improvements enable MySQL to fully exploit the advanced multi-threaded processing power of today's x86-based commodity-off-the-shelf hardware.
Internal benchmarks for SysBench Read/Write and Read Only workloads show a marked improvement in sustained scale over the most current version of MySQL 5.5. The following shows that MySQL 5.6 provides "up and to the right" linear read/write transactions per second ("TPS") scale on systems that support upwards of 60 concurrent CPU threads.
Read only TPS workload sustained scale is also improved as demonstrated here:
MySQL 5.6 improves InnoDB for better performance and scalability on highly concurrent, transactional and read intensive workloads. In these cases performance gains are best measured by how an application performs and scales as concurrent user workloads grow. In support of these use cases, InnoDB has a new re-factored architecture that minimizes mutex contentions and bottlenecks and provides a more consistent access path to underlying data. Improvements include:
The net result of these improvements is reflected in the SysBench read/write benchmarks shown here:
For Linux, MySQL 5.6 shows up to a 150% improvement in transactional TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 reveal a 47% performance gain.
New optimizations have been made to InnoDB for read only transactions that greatly improve the performance of high concurrency web-based lookups and report-generating applications. These optimizations bypass transactional overhead and are enabled by default when autocommit = 1, or can be atomically controlled by the developer using the new START_TRANSACTION_READ_ONLY syntax:
SET autocommit = 0; START_TRANSACTION_READ_ONLY; SELECT c FROM T1 WHERE id=N; COMMIT;
The results of these optimizations are shown here:
For Linux, MySQL 5.6 shows up to a 230% improvement in read only TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 show a 65% performance gain.
For context, all benchmarks shown above were run on the following platform configuration:
The SysBench benchmark tool is freely available for application use-case specific benchmarks and can be downloaded here.
You can also get in depth MySQL 5.6 performance and feature specific benchmarks by following Mikael Ronstrom's blog and Dimitri Kravtchuk's blog. Both share the test cases and configurations they use to arrive at the conclusions drawn above.
Spinning disks are among the most common bottlenecks on any system, simply because they have mechanical parts that physically limit the ability to scale as concurrency grows. As a result, many MySQL applications are being deployed on SSD enabled systems which provide the memory-based speed and reliability required to support the highest levels of concurrency on today's web-based systems. With this in mind, MySQL 5.6 includes several key enhancements designed specifically for use with SSD, including:
The MySQL 5.6 Optimizer has been re-factored for better efficiency and performance and provides an improved feature set for better query execution times and diagnostics. They key 5.6 optimizer improvements include:
Using semi-JOINs and materialization, the MySQL Optimizer delivers greatly improved subquery performance, simplifying how developers construct queries. Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution, greatly improving performance. Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval. Tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions.
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate LIMIT 100;
See the blog: "From Months to Seconds with Subquery Materialization" for more details.
For queries with ORDER BY and small LIMIT values, the optimizer now produces an ordered result set using a single table scan. These queries are common in web applications that display only a few rows from a large result set such as:
SELECT col1, ... FROM t1 ... ORDER BY name LIMIT 10;
Internal benchmarks have shown up to a 4x improvement in query execution times which helps improve overall user experience and response times. See the blog: for more details.
By default, the optimizer now pushes WHERE conditions down to the storage engine for evaluation, table scan and return of ordered result set to the MySQL server.
CREATE TABLE person ( personid INTEGER PRIMARY KEY, firstname CHAR(20), lastname CHAR(20), postalcode INTEGER, age INTEGER, address CHAR(50), KEY k1 (postalcode,age) ) ENGINE=InnoDB; SELECT lastname, firstname FROM person WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 22;
Internal benchmarks on this type of table and query have shown up to 15x improved execution times with the ICP default behavior.
The optimizer now provides the storage engine with all primary keys in batches and enables the storage engine to access, order and return the data more efficiently greatly improving query execution times.
Together, BKA and MRR show up to 280x improvement in query execution times for DBT-3 Query 13 and other disk-bound query benchmarks. Learn more here by reading the blog: Batched Key Access Speeds Up Disk-Bound Join Queries.
For a deep technical dive into the implementation, how to enable/disable where applicable and the use case specific performance improvements you can expect with each of these new features check out the MySQL Optimizer Engineering team blog.
Today's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generation requirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. So when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, most commonly while the application remains available for normal business operations. MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE DDL syntax additions:
DBAs and Developers can add indexes and perform standard InnoDB table alterations while the database remains available for application updates. This is especially beneficial for rapidly evolving applications where developers need schema flexibility to accommodate changing business requirements.
Many of the latest generation of web, cloud, social and mobile applications require fast operations against simple Key/Value pairs. At the same time, they must retain the ability to run complex queries against the same data, as well as ensure the data is protected with ACID guarantees. With the new NoSQL API for InnoDB, developers have all the benefits of a transactional RDBMS, coupled with the performance capabilities of Key/Value store.
MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API. Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates. The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end. The implementation is shown here:
So does this option provide a performance benefit over SQL? Internal performance benchmarks using a customized Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:
Not only do developers and DBAs get more performance and flexibility, they also reduce complexity as it is possible to compress previously separate caching and database layers into a single data management tier, as well as eliminate the overhead of maintaining cache consistency.
You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated the above results.
New MySQL 5.6 optimizations and features extend InnoDB into more use cases so developers can simplify applications by standardizing on a single storage engine.
Provided as a better alternative to MyISAM FTS, InnoDB now enables developers to build FULLTEXT indexes on InnoDB tables to represent text-based content and speed up application searches for words and phrases. InnoDB full-text search supports Natural language/Boolean modes, proximity search and relevance ranking. A simple use case example looks like:
CREATE TABLE quotes (id int unsigned auto_increment primary key , author varchar(64) , quote varchar(4000) , source varchar(64) , fulltext(quote) ) engine=innodb; SELECT author AS "Apple" FROM quotes WHERE match(quote) against (‘apple' in natural language mode);
InnoDB .ibd files created in file-per-table mode are now transportable between physical storage devices and database servers; when creating a table developers can now designate a storage location for the .idb file outside of the MySQL data directory. This enables "hot" or busy tables to be easily moved to an external network storage device (SSD, HDD) that does not compete with application or database overhead. This new feature also enables quick, seamless application scale by allowing users to easily export/import InnoDB tables between running MySQL servers, as shown here:
CREATE TABLE t(c1 INT) engine=InnoDB; FLUSH TABLE t FOR EXPORT; -- quiesce the table and create the meta data file $innodb_data_home_dir/test/t.cfg UNLOCK TABLES;
CREATE TABLE t(c1 INT) engine=InnoDB; -- if it doesn't already exist ALTER TABLE t DISCARD TABLESPACE; -- The user must stop all updates on the tables, prior to the IMPORT ALTER TABLE t IMPORT TABLESPACE;
The InnoDB improvements noted here are by no means exhaustive. Learn about the complete accounting of all MySQL 5.6 features, along with all technical documentation.
For a deep technical dive into the implementation, how to enable/disable where applicable and the use case specific improvements you can expect with each of these new features follow the MySQL InnoDB Engineering team blog.
Replication is the most widely used MySQL feature for scale-out and High Availability (HA) and MySQL 5.6 includes new features designed to enable developers building next generation web, cloud, social and mobile applications and services with self-healing replication topologies and high performance master and slaves. The key features include:
GTIDs enable replication transactional integrity to be tracked through a replication master/slave topology, providing a foundation for self-healing recovery, and enabling DBAs and developers to easily identify the most up to date slave in the event of a master failure. Built directly into the Binlog stream, GTIDs eliminate the need for complex third-party add-ons to provide this same level of tracking intelligence.
GTID positioning in the Binlog
A new set of Python Utilities are designed to leverage the new replication GTIDs to provide replication administration and monitoring with automatic fail-over in the event of a failed master, or switchover in the event of maintenance to the master. This eliminates the need for additional third party High-Availability solutions, protecting web and cloud-based services against both planned and unplanned downtime without operator intervention.
Splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems.
SysBench benchmarks using a graduated number of worker threads across 10 schemas show up to 5x in performance gain with multi-threading enabled.
In MySQL 5.6 replication masters now group writes to the Binlog rather than committing them one at a time, significantly improving performance on the master side of the topology. BGC also enables finer grained locking which reduces lock waits, again, adding to the performance gain, shown here:
MySQL 5.6 shows up to a 180% performance gain over 5.5 in master server throughput with replication enabled (Binlog=1). BGC largely eliminates the trade-off users had to make between performance overhead to the master and the scale-out, HA benefits offered by MySQL replication.
Finally, you can learn more about MySQL Replication and get practical, hands on experience using these resources:
The MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators. MySQL 5.6 improves the Performance Schema in answer to the most common DBA and developer problems. New instrumentations include:
The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema a production server to monitor the most common application use cases is less of an issue. In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.
MySQL Engineering has several champions behind the 5.6 Performance Schema, and many have published excellent blogs that you can reference for technical and practical details. To get started see Mark Leith's blog and Marc Alff's blog.
MySQL 5.6 introduces a major overhaul to how passwords are internally handled and encrypted. The new options and features include:
The addressable items for each are:
The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution:
mysql> DROP TABLE test.no_such_table; ERROR 1051 (42S02): Unknown table 'test.no_such_table' mysql> GET DIAGNOSTICS CONDITION 1 -> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT; mysql> SELECT @p1, @p2; +-------+------------------------------------+ | @p1 | @p2 | +-------+------------------------------------+ | 42S02 | Unknown table 'test.no_such_table' | +-------+------------------------------------+
MySQL 5.5 has been called the best release of MySQL ever. MySQL 5.6 builds on this by providing across the board improvements in performance, scalability, transactional throughput, availability and performance related instrumentation all designed to keep pace with requirements of the most demanding web, cloud and embedded use cases. MySQL 5.6 is now Generally Available and you can download the fully-functioning, production-ready product from the MySQL Developer Zone.
As stated earlier, this article introduces many of the key features in MySQL 5.6. For a complete picture, visit the MySQL Documentation.
As always, thanks for reading, and thanks for your continued support of MySQL!