DBA and Developer Guide to MySQL 5.6
Building the Next Generation of Web Applications and Services
At a glance, MySQL 5.6 is simply a better MySQL with improvements that enhance every functional area of the database kernel, including:
- Better Performance and Scalability
- Improved InnoDB storage engine for better transactional throughput
- Improved Optimizer for better query execution times and diagnostics
- Better Application Availability with Online DDL/Schema changes
- Better Developer Agility with NoSQL Access with Memcached API to InnoDB
- Improved Replication for high performance, self-healing distributed deployments
- Improved Performance Schema for better instrumentation
- Improved Security for worry-free application deployments
- And other Important Enhancements
This article serves as a DBA and Developer guide to MySQL 5.6 as it highlights the key new features in each of these areas, many with practical use case examples.
Better Performance and Scalability: Improved InnoDB Storage Engine
From an operational standpoint MySQL 5.6 provides better sustained linear performance and scale on systems supporting multi-processors and high CPU thread concurrency. Key to this are improvements to Oracle's InnoDB storage engine efficiency and concurrency that remove legacy thread contention and mutex locking within the InnoDB kernel. These improvements enable MySQL to fully exploit the advanced multi-threaded processing power of today's x86-based commodity-off-the-shelf hardware.
Internal benchmarks for SysBench Read/Write and Read Only workloads show a marked improvement in sustained scale over the most current version of MySQL 5.5. The following shows that MySQL 5.6 provides "up and to the right" linear read/write transactions per second ("TPS") scale on systems that support upwards of 60 concurrent CPU threads.
Read only TPS workload sustained scale is also improved as demonstrated here:
Better Transactional Throughput
MySQL 5.6 improves InnoDB for better performance and scalability on highly concurrent, transactional and read intensive workloads. In these cases performance gains are best measured by how an application performs and scales as concurrent user workloads grow. In support of these use cases, InnoDB has a new re-factored architecture that minimizes mutex contentions and bottlenecks and provides a more consistent access path to underlying data. Improvements include:
- Kernel mutex split to remove a single point of contention
- New thread for flushing operations
- New multi-threaded purge
- New adaptive hashing algorithm
- Less buffer pool contention
- Better, more consistent query execution via persistent optimizer statistics that are collected at more regular, predictable intervals
The net result of these improvements is reflected in the SysBench read/write benchmarks shown here:
For Linux, MySQL 5.6 shows up to a 150% improvement in transactional TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 reveal a 47% performance gain.
Better Read Only Workload Throughput
New optimizations have been made to InnoDB for read only transactions that greatly improve the performance of high concurrency web-based lookups and report-generating applications. These optimizations bypass transactional overhead and are enabled by default when autocommit = 1, or can be atomically controlled by the developer using the new START_TRANSACTION_READ_ONLY syntax:
SET autocommit = 0; START_TRANSACTION_READ_ONLY; SELECT c FROM T1 WHERE id=N; COMMIT;
The results of these optimizations are shown here:
For Linux, MySQL 5.6 shows up to a 230% improvement in read only TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 show a 65% performance gain.
For context, all benchmarks shown above were run on the following platform configuration:
- Oracle Linux 6
- Intel® Xeon(R) E7540 x86_64
- MySQL leveraging:
- 60 of 96 available CPU threads
- 2 GHz, 512GB RAM
The SysBench benchmark tool is freely available for application use-case specific benchmarks and can be downloaded here.
You can also get in depth MySQL 5.6 performance and feature specific benchmarks by following Mikael Ronstrom's blog and Dimitri Kravtchuk's blog. Both share the test cases and configurations they use to arrive at the conclusions drawn above.
Better Performance with Solid State Drives (SSD)
Spinning disks are among the most common bottlenecks on any system, simply because they have mechanical parts that physically limit the ability to scale as concurrency grows. As a result, many MySQL applications are being deployed on SSD enabled systems which provide the memory-based speed and reliability required to support the highest levels of concurrency on today's web-based systems. With this in mind, MySQL 5.6 includes several key enhancements designed specifically for use with SSD, including:
- Support for smaller 4k and 8k page sizes to better fit the standard storage algorithm of SSD.
- Portable .ibd (InnoDB data) files that allow "hot" InnoDB tables to be easily moved from the default data directory to SSD or network storage devices.
- Separate tablespaces for the InnoDB unlog log that optionally moves the undo log out of the system tablespace into one or more separate tablespaces. The read-intensive I/O patterns for the undo log make these new tablespaces good candidates to move to SSD storage, while keeping the system tablespace on hard drive storage.
Better Query Execution Times and Diagnostics: Improved Optimizer
The MySQL 5.6 Optimizer has been re-factored for better efficiency and performance and provides an improved feature set for better query execution times and diagnostics. They key 5.6 optimizer improvements include:
Using semi-JOINs and materialization, the MySQL Optimizer delivers greatly improved subquery performance, simplifying how developers construct queries. Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution, greatly improving performance. Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval. Tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions.
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in ( select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 313 ) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate LIMIT 100;
See the blog: "From Months to Seconds with Subquery Materialization" for more details.
File Sort Optimizations with Small Limit
For queries with ORDER BY and small LIMIT values, the optimizer now produces an ordered result set using a single table scan. These queries are common in web applications that display only a few rows from a large result set such as:
SELECT col1, ... FROM t1 ... ORDER BY name LIMIT 10;
Internal benchmarks have shown up to a 4x improvement in query execution times which helps improve overall user experience and response times. See the blog: for more details.
Index Condition Pushdown (ICP)
By default, the optimizer now pushes WHERE conditions down to the storage engine for evaluation, table scan and return of ordered result set to the MySQL server.
CREATE TABLE person ( personid INTEGER PRIMARY KEY, firstname CHAR(20), lastname CHAR(20), postalcode INTEGER, age INTEGER, address CHAR(50), KEY k1 (postalcode,age) ) ENGINE=InnoDB; SELECT lastname, firstname FROM person WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 22;
Internal benchmarks on this type of table and query have shown up to 15x improved execution times with the ICP default behavior.
Batched Key Access (BKA) and Multi-Range Read (MRR)
The optimizer now provides the storage engine with all primary keys in batches and enables the storage engine to access, order and return the data more efficiently greatly improving query execution times.
Together, BKA and MRR show up to 280x improvement in query execution times for DBT-3 Query 13 and other disk-bound query benchmarks. Learn more here by reading the blog: Batched Key Access Speeds Up Disk-Bound Join Queries.
- Better Optimizer Diagnostics - The MySQL 5.6 optimizer also provides better diagnostics and debugging with:
- EXPLAIN for INSERT, UPDATE, and DELETE operations,
- EXPLAIN plan output in JSON format with more precise optimizer metrics and better readability
- Optimizer Traces for tracking the optimizer decision-making process.
For a deep technical dive into the implementation, how to enable/disable where applicable and the use case specific performance improvements you can expect with each of these new features check out the MySQL Optimizer Engineering team blog.
Better Application Availability: Online DDL/Schema Changes
Today's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generation requirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. So when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, most commonly while the application remains available for normal business operations. MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE DDL syntax additions:
- CREATE INDEX
- DROP INDEX
- Change AUTO_INCREMENT value for a column
- ADD/DROP FOREIGN KEY
- Rename COLUMN
- Change ROW FORMAT, KEY_BLOCK_SIZE for a table
- Change COLUMN NULL, NOT_NULL
- Add, drop, reorder COLUMN
DBAs and Developers can add indexes and perform standard InnoDB table alterations while the database remains available for application updates. This is especially beneficial for rapidly evolving applications where developers need schema flexibility to accommodate changing business requirements.
Better Developer Agility: NoSQL Access to InnoDB
Many of the latest generation of web, cloud, social and mobile applications require fast operations against simple Key/Value pairs. At the same time, they must retain the ability to run complex queries against the same data, as well as ensure the data is protected with ACID guarantees. With the new NoSQL API for InnoDB, developers have all the benefits of a transactional RDBMS, coupled with the performance capabilities of Key/Value store.
MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API. Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates. The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end. The implementation is shown here:
So does this option provide a performance benefit over SQL? Internal performance benchmarks using a customized Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:
Not only do developers and DBAs get more performance and flexibility, they also reduce complexity as it is possible to compress previously separate caching and database layers into a single data management tier, as well as eliminate the overhead of maintaining cache consistency.
You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated the above results.
Better Developer Agility: Extended InnoDB Use Cases
New MySQL 5.6 optimizations and features extend InnoDB into more use cases so developers can simplify applications by standardizing on a single storage engine.
New Full Text Search (FTS)
Provided as a better alternative to MyISAM FTS, InnoDB now enables developers to build FULLTEXT indexes on InnoDB tables to represent text-based content and speed up application searches for words and phrases. InnoDB full-text search supports Natural language/Boolean modes, proximity search and relevance ranking. A simple use case example looks like:
CREATE TABLE quotes (id int unsigned auto_increment primary key , author varchar(64) , quote varchar(4000) , source varchar(64) , fulltext(quote) ) engine=innodb; SELECT author AS "Apple" FROM quotes WHERE match(quote) against (‘apple' in natural language mode);
New Transportable Tablespaces
InnoDB .ibd files created in file-per-table mode are now transportable between physical storage devices and database servers; when creating a table developers can now designate a storage location for the .idb file outside of the MySQL data directory. This enables "hot" or busy tables to be easily moved to an external network storage device (SSD, HDD) that does not compete with application or database overhead. This new feature also enables quick, seamless application scale by allowing users to easily export/import InnoDB tables between running MySQL servers, as shown here:
CREATE TABLE t(c1 INT) engine=InnoDB; FLUSH TABLE t FOR EXPORT; -- quiesce the table and create the meta data file $innodb_data_home_dir/test/t.cfg UNLOCK TABLES;
CREATE TABLE t(c1 INT) engine=InnoDB; -- if it doesn't already exist ALTER TABLE t DISCARD TABLESPACE; -- The user must stop all updates on the tables, prior to the IMPORT ALTER TABLE t IMPORT TABLESPACE;
The InnoDB improvements noted here are by no means exhaustive. Learn about the complete accounting of all MySQL 5.6 features, along with all technical documentation.
For a deep technical dive into the implementation, how to enable/disable where applicable and the use case specific improvements you can expect with each of these new features follow the MySQL InnoDB Engineering team blog.
Improved Replication and High Availability
Replication is the most widely used MySQL feature for scale-out and High Availability (HA) and MySQL 5.6 includes new features designed to enable developers building next generation web, cloud, social and mobile applications and services with self-healing replication topologies and high performance master and slaves. The key features include:
New Global Transactions Identifiers (GTIDs)
GTIDs enable replication transactional integrity to be tracked through a replication master/slave topology, providing a foundation for self-healing recovery, and enabling DBAs and developers to easily identify the most up to date slave in the event of a master failure. Built directly into the Binlog stream, GTIDs eliminate the need for complex third-party add-ons to provide this same level of tracking intelligence.
GTID positioning in the Binlog
New MySQL Replication utilities
A new set of Python Utilities are designed to leverage the new replication GTIDs to provide replication administration and monitoring with automatic fail-over in the event of a failed master, or switchover in the event of maintenance to the master. This eliminates the need for additional third party High-Availability solutions, protecting web and cloud-based services against both planned and unplanned downtime without operator intervention.
New Multi-threaded Slaves
Splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems.
SysBench benchmarks using a graduated number of worker threads across 10 schemas show up to 5x in performance gain with multi-threading enabled.
New Binary Log Group Commit (BGC)
In MySQL 5.6 replication masters now group writes to the Binlog rather than committing them one at a time, significantly improving performance on the master side of the topology. BGC also enables finer grained locking which reduces lock waits, again, adding to the performance gain, shown here:
MySQL 5.6 shows up to a 180% performance gain over 5.5 in master server throughput with replication enabled (Binlog=1). BGC largely eliminates the trade-off users had to make between performance overhead to the master and the scale-out, HA benefits offered by MySQL replication.
- New Optimized Row-based Replication - MySQL 5.6 provides a new option variable binlog-row-image=minimal that enables applications to replicate only data elements of the row image that have changed following DML operations. This improves replication throughput for both the master and slave(s) and minimizes binary log disk space, network resource and server memory footprint.
- New Crash-Safe Slaves - MySQL 5.6 stores Binlog positional data within tables so slaves can automatically roll back replication to the last committed event before a failure, and resume replication without administrator intervention. Not only does this reduce operational overhead, it also eliminates the risk of data loss caused by a slave attempting to recover from a corrupted data file. Further, if a crash to the master causes corruption of the binary log, the server will automatically recover it to a position where it can be read correctly.
- New Replication Checksums - MySQL 5.6 ensure the integrity of data being replicated to a slave by detecting data corruption and returning an error before corrupt events are applied to the slave, preventing the slave itself from becoming corrupt.
- New Time-delayed Replication - MySQL 5.6 provides protection against operational errors made on the master from propagating to attached slaves by allowing developers to add defined delays in the replication stream. With configurable master to slave time delays, in the event of failure or mishap, slaves can be promoted to the new master in order to restore the database to its previous state. It also becomes possible to inspect the state of a database before an error or outage without the need to reload a back up.
Finally, you can learn more about MySQL Replication and get practical, hands on experience using these resources:
- White Paper: Introduction to MySQL Replication
- Using MySQL Replication for Highly Available Applications
- MySQL Replication Hands On Tutorial: Configuration, Provisioning and Management
Improved Performance Schema
The MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators. MySQL 5.6 improves the Performance Schema in answer to the most common DBA and developer problems. New instrumentations include:
- What are my most resource intensive queries? Where do they spend time?
- Table/Index I/O, Table Locks
- Which application tables/indexes cause the most load or contention?
- Which application users, hosts, accounts are consuming the most resources?
- Network I/O
- What is the network load like? How long do sessions idle?
- Aggregated statistics grouped by statement, thread, user, host, account or object.
The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema a production server to monitor the most common application use cases is less of an issue. In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.
MySQL Engineering has several champions behind the 5.6 Performance Schema, and many have published excellent blogs that you can reference for technical and practical details. To get started see Mark Leith's blog and Marc Alff's blog.
MySQL 5.6 introduces a major overhaul to how passwords are internally handled and encrypted. The new options and features include:
- New alternative to password in master.info - MySQL 5.6 extends the replication START SLAVE command to enable DBAs to specify master user and password as part of the replication slave options and to authenticate the account used to connect to the master through an external authentication plugin (user defined or those provided under MySQL Enterprise Edition). With these options the user and password no longer need to be exposed in plain text in the master.info file.
- New encryption for passwords in general query log, slow query log, and binary log - Passwords in statements written to these logs are no longer recorded in plain text.
- New password hashing with appropriate strength - Default password hashing for internal MySQL server authentication via PASSWORD() is now done using the SHA-256 password hashing algorithm using a random salt value.
- New options for passwords on the command line - MySQL 5.6 introduces a new "scrambled" option/config file (.mylogin.cnf) that can be used to securely store user passwords that are used for command line operations.
- New change password at next login - DBAs and developers can now control when account passwords must be changed via a new password_expired flag in the mysql.user table.
- New policy-based Password validations - Passwords can now be validated for appropriate strength, length, mixed case, special chars, and other user defined policies based on LOW, MEDIUM and STRONG designation settings. See /doc/refman/5.6/en/validate-password-plugin.html for details and available configuration options.
Other Important Enhancements
- New default configuration optimizations - MySQL 5.6 introduces changes to the server defaults that provide better out-of-the-box performance on today's system architectures. These new defaults are designed to minimize the upfront time spent on changing the most commonly updated variables and configuration options.
See Changes to Server Defaults chapter in the MySQL Documentation for a complete rundown of what is changed, what parameters are auto sized based on environment, and what settings can be set and controlled at server start up.
- Improved TIME/TIMESTAMP/DATETIME Data Types:
- TIME/TIMESTAMP/DATETIME - Now allow microsecond level precision for more precise time/date comparisons and data selection.
- TIMESTAMP/DATETIME - Improves on 5.5. by allowing developers to assign the current timestamp, an auto-update value, or both, as the default value for TIMESTAMP and DATETIME columns, the auto-update value, or both.
- TIMESTAMP - Columns are now nullable by default. TIMESTAMP columns no longer get DEFAULT NOW() or ON UPDATE NOW() attributes automatically without them being explicitly specified and non-NULLable TIMESTAMP columns without explicit default value treated as having no default value.
- Better Condition Handling - GET DIAGNOSTICS
- MySQL 5.6 enables developers to easily check for error conditions and code for exceptions by introducing the new MySQL Diagnostics Area and corresponding GET DIAGNOSTICS interface command. The Diagnostic Area can be populated via multiple options and provides 2 kinds of information:
- Statement - which provides affected row count and number of conditions that occurred
- Condition - which provides error codes and messages for all conditions that were returned by a previous operation
The addressable items for each are:
The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution:
mysql> DROP TABLE test.no_such_table; ERROR 1051 (42S02): Unknown table 'test.no_such_table' mysql> GET DIAGNOSTICS CONDITION 1 -> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT; mysql> SELECT @p1, @p2; +-------+------------------------------------+ | @p1 | @p2 | +-------+------------------------------------+ | 42S02 | Unknown table 'test.no_such_table' | +-------+------------------------------------+
- Improved IPv6 Support
- MySQL 5.6 improves INET_ATON() to convert and store string-based IPv6 addresses as binary data for minimal space consumption.
- MySQL 5.6 changes the default value for the bind-address option from "0.0.0.0" to "0::0" so the MySQL server accepts connections for all IPv4 and IPv6 addresses. You can learn more in the MySQL Documentation
- Improved Partitioning
- Improved performance for tables with large number of partitions - MySQL 5.6 now performs and scales on highly partitioned systems, specifically for INSERT operations that span upwards of hundreds of partitions.
- Import/export tables to/from partitioned tables - MySQL 5.6 enables users to exchange a table partition or sub-partition with a table using the ALTER TABLE ... EXCHANGE PARTITION statement; existing rows in a partition or subpartition can be moved to a non-partitioned table, and conversely, any existing rows in a non-partitioned table can be moved to an existing table partition or sub-partition. Learn more by visiting the MySQL Documentation.
- Explicit partition selection - MySQL 5.6 supports explicit selection of partitions and subpartitions that are checked for rows matching a given WHERE condition. Similar to automatic partition pruning, the partitions to be checked are specified/controlled by the issuer of the statement, and is supported for both queries and a number of DML statements (SELECT, DELETE, INSERT, REPLACE, UPDATE, LOAD DATA, LOAD XML). Learn more by visiting the MySQL Documentation
- Improved GIS: Precise spatial operations - MySQL 5.6 provides geometric operations via precise object shapes that conform to the OpenGIS standard for testing the relationship between two geometric values. Learn more by referencing the MySQL Documentation
MySQL 5.5 has been called the best release of MySQL ever. MySQL 5.6 builds on this by providing across the board improvements in performance, scalability, transactional throughput, availability and performance related instrumentation all designed to keep pace with requirements of the most demanding web, cloud and embedded use cases. MySQL 5.6 is now Generally Available and you can download the fully-functioning, production-ready product from the MySQL Developer Zone.
As stated earlier, this article introduces many of the key features in MySQL 5.6. For a complete picture, visit the MySQL Documentation.
As always, thanks for reading, and thanks for your continued support of MySQL!