MySQL 5.5 is GA - Interview with Tomas Ulin
Oracle Magazine features editor Rich Schwerin interviews Tomas Ulin, Vice President, MySQL Engineering, who answers questions about the GA release of MySQL 5.5.
Q: Tomas, congratulations on releasing MySQL 5.5 GA! Tell us how MySQL 5.5 fits in with Oracle's strategy?
A: Oracle said it would invest in MySQL and make it better, the release of MySQL 5.5 GA illustrates that Oracle walks the talk. Oracle invests in MySQL because it has become the database of choice for web-based applications, and it complements very well Oracle's portfolio. MySQL helps Oracle deliver on its strategy to provide complete solutions to its customers, with best of breed components across the entire stack, from applications to disk. Plus, with MySQL Oracle now provides the most complete LAMP stack in the industry including Oracle Linux and Oracle's Unbreakable Enterprise Kernel, Oracle VM virtualization, and of course Apache and PHP. The performance & scalability improvements of MySQL 5.5 make MySQL an even better fit for demanding web-based applications.
Q: Can you explain why you made InnoDB the MySQL default storage engine and how Oracle has unified engineering efforts for MySQL and InnoDB?
A: InnoDB is the most popular storage engine for MySQL, and being able to have these two teams joined together the way they should have been from the beginning is really great. I think that already in MySQL 5.5 we're seeing some amazing performance improvements as a result of these two teams coming together. Something we decided right away, when we got together just after the acquisition was completed, was to make InnoDB our default storage engine. This is now done in MySQL 5.5, and it makes MySQL a true transactional ACID-compliant database by default. You may think that one could have used it anyway simply selecting it as storage engine, but the reality is that most people run MySQL out of the box, so we make it more convenient for them to get all the benefits InnoDB brings out of the box. This is especially important considering that most of the performance & scalability enhancements have been made within InnoDB. It's also an acknowledgement that InnoDB is part of the core of the product rather than some add-on that you choose. This is a major change as well.
Q: When speaking about performance, isn't recovery speed also an important point to consider?
A: Indeed. One of the key things around performance and availability is how fast you can recover and get your applications back up if you need to restart the database after unplanned downtime. We've optimized how InnoDB manages transactional data in order to significantly improve recovery times. The speed varies depending on the kinds and complexity of transactions running during a backup and when a restart actually happens. We are seeing customers with tenfold improvements in recovery times with MySQL 5.5 compared to 5.1, with some reporting up to 40 times faster recovery times. Obviously, we are very pleased with these results and expect 5.5 users to be happy as well.
Q: Tell us specifically about multiple buffer pool instances and rollback segments and the like…
A: We're talking about two types of scale here. The first one is when you add more client connections on top of the database, going from 5 to 10 to 100 to 1,000 connections simultaneously. The other one is when you move the database from a single-port machine to a dual-port, quad-port or 16-port. So in the first case, what you want to see is that as you add more connections to the database, you get the best performance. But then as you keep adding more connections and basically saturate it, the performance should not degrade. So if it peaks at 1,000, it continues to give you 1,000 in throughput whether you put 10 connections or 1,000 connections there. It has been great teamwork between the server team and the InnoDB team in terms of finding where the biggest bottlenecks were and removing them, one by one. And of course, one way to remove bottlenecks is to split different types of large resources shared across threads into smaller pieces such that each smaller piece is shared among fewer threads, thus giving less contention and removing the bottleneck. E.g. instead of having one big shared memory buffer pool in InnoDB, we've split it into many smaller, such that each pool thus gets less contended as many threads attempt to access the data in the memory pools. We've done such changes down from the very core of InnoDB code up into the upper layers of query execution. This has enabled us to achieve these great results with 5.5.
Regarding the second point, or type of scale, we've really improved the scalability on multi-core machines. If you look specifically at the 16-core machine, we're about three times faster on a 16-core machine than we were with 5.1.
Q: Still speaking about performance improvements, what about metadata locking within transactions?
A: Yes, the upper layer in the software handles the metadata, the metadata locking and so on, and there are a number of mutexes there. A very well known mutex, for those who are into the MySQL code, is called "lock open". The issue was that the "lock open," mutex was locking too large segments in the code base, hence we've splitted the "lock open" mutex into several mutexes so that they are being less contended, each one of them. And then there are a number of less known mutexes that we've eliminated from the code, such as "lock alarm" and "lock thread count", which were not really needed.
Q: And what about performance on Windows?
A: On Windows we have done some extra work, since a lot of the code originates from a UNIX environment and the Windows environment is somewhat different, specifically when it comes to mutexes and thread handling. So we have made changes to the code considering how Windows operates with mutexes and threads. This was really a distinct undertaking to specifically improve performance on Windows. Obviously we're extremely happy about the results. While we already obtained amazing performance improvements on Linux, +200% over 5.1 for Read Only and +370% for Read Write, at scale, the improvements we achieved on Windows are even more impressive: More than 500% performance improvement over 5.1 for Read Only operations at scale and over 1500% for Read/writes ones. This is huge.
Q: Can you walk us through some specific improvements in MySQL 5.5 around availability?
A: Yes. MySQL 5.5 introduces semi-synchronous replication to ensure data consistency and redundancy between master and at least one slave in the calling chain. In this configuration, a master and any number of its replica slaves are configured so that at least one slave in the replication topology must acknowledge updates have been received and written to its relay log before the parent master commits the transaction. In the event of a time-out, the originating master will temporarily switch to asynchronous replication until at least one of the slaves set up for semi-synchronous replication catches up. It actually originates from a patch that was contributed by Google. It was more custom-made towards InnoDB and we made it as a plug-in which is more flexible.
The second improvement is Replication Heartbeat. MySQL 5.5 provides a new replication heartbeat option that helps users know immediately when replication stops working. The heartbeat is a message sent at regular intervals from a master node to slave nodes. The slave can be configured to automatically check connection and message status; if the message is not received by the slave, it knows that a connection to the master node has failed in some way.
Q: What about improved usability in MySQL 5.5?
A: A key new usability feature is the SIGNAL-RESIGNAL support. It's being used to handle error conditions inside stored procedures. It has been a requested feature for quite some time, basically since stored procedures came out. We've also done a number of improvements in partitioning, offering more partitioning options so you have more flexibility to optimize queries.
There's also filtering on replication, so you can have more options around filtering, selecting which tables and so on. Another feature that we've added is slave-side precise data type conversions between masters and slaves for both statement-based and row-based replication. This means that column types can be different in master and slave tables as long as the underlying data has high-level compatibility, INT to TINYINT for example. This makes upgrading your schema across masters and slaves much easier as MySQL replication handles compatible conversions within integer, decimal, string, binary, BIT, ENUM and SET domains for you online, so you can upgrade masters and begin replicating to its slaves without the need to immediately upgrade the slaves.
Finally we've introduced Performance Schema to help users get much better information about MySQL performance. Previously when you wanted to get into the details of server performance you often had to rely on custom, unstructured ways of obtaining this type of information, like "Show InnoDB status" for example. And you would then just get text, which you could parse, and so on. What we've done with Performance Schema is create a schema where you can do regular select queries and ask about different parameters in the database, such as how you are doing with buffers and outstanding transactions for example.
Q: Tomas, you referred to performance benchmarks over 5.1 earlier, can you come back on those results and provide some more details?
A: Sure. We've run benchmarks at scale with about 1,000 database connections, both Read Only and Read/Write. Previously with MySQL 5.1 you would see performance drop as you added connections, but that's no longer the case with MySQL 5.5. We've managed to sustain the performance when scaling. On Linux you see, at scale with over 1000 connections, 200% performance improvement over 5.1 for Read Only operations, and about 370 % performance for Read/Write ones. On Windows, the performance improvements are even more impressive as I mentioned earlier due to the specific optimization we've performed. We're seeing at scale over 500% improvement over 5.1 for Read Only operations. So overall, both peak performance and scalability have gone up significantly with MySQL 5.5.
Q: Are there benchmarks that end users can run that are provided by MySQL or the community?
A: Yes, sysbench, the one that we normally run, is publicly available. We run DBT2 as well, which is also publicly available.
Thanks for your time, Tomas. Congratulations again on releasing MySQL 5.5 GA!