MySQL Query Analyzer: Interview with Mark Matthews
MySQL Query Analyzer
We talked with Mark Matthews, one of the architects of the MySQL Query Analyzer to give us an inside view into the product.
[Note: MySQL Enterprise is our commercial subscription offering. And, MySQL Enterprise Monitor and MySQL Query Analyzer are only available with this commercial subscription.]
Q. Hi Mark, give us a little background about yourself and tell us what you do at MySQL.
I have been using MySQL since 1996 and I started an open source project to develop a JDBC driver for MySQL which became the Connector/J driver. Before joining MySQL, I was a software consultant implementing web-based applications for retail, financial, health and publishing companies. I joined MySQL in 2002 and headed up the MySQL Connectors development team (JDBC, ODBC, ADO .Net.). Recently, I joined the MySQL Enterprise development team as an architect. Our team is responsible for MySQL Enterprise Monitor and the MySQL Proxy.
Q. Before we get to the new MySQL Query Analyzer, can you talk about why MySQL Performance Tuning is described as a "black art"?
Performance tuning is a core responsibility of DBAs and Developers. There is a feeling that performance tuning MySQL is some kind of black art. People feel that there is a club of performance tuners and you want to get in and learn what they have learned. Many are very good, having learned through experience and experimentation. They know their application, their architecture, and their performance requirements. They are willing to dive into the MySQL Server source code. They are willing to crawl across their entire application stack to see what is going on. They build a sort of mental model of how their applications work with MySQL. This involves a lot of manual effort and custom scripts.
Our goal is for the MySQL Query Analyzer to provide easy-to-use tools for DBAs and Developers and minimize the manual effort and "black art".
Q. To better understand the MySQL Query Analyzer, lets first talk about MySQL Proxy. How does MySQL Proxy fit into a MySQL environment?
When we were planning the next set of features for MySQL Enterprise Monitor, what we focused on was that nobody could see, especially across a large farm of MySQL servers, what queries were running at what time in one concise place. So we looked at how we could solve that problem. And that is how MySQL Proxy was born. The MySQL Proxy is being developed by Jan Kneschke and Kay Roepke as a generalized slicer and dicer of queries and result sets.
MySQL Query Analyzer uses Proxy to capture aggregate metadata about queries flowing through it and send that information to a repository for analysis. The rate at which people process data in MySQL sometimes makes it impossible to see what it is happening with existing tools, you can only observe the effects caused by the statements flowing through it. You have your SHOW STATUS snapshots and you graph them and you try to predict what the server is actually doing based on these factors you can measure. There has always been this desire to measure other things that have just have not been exposed by the MySQL Server.
That is where MySQL Proxy comes in. We can start to get in the middle of a query stream. With MySQL Query Analyzer we are watching from the sideline and capturing things that the MySQL server does not give you. For example, we capture every single query that flows through it in a way that we don’t touch it very much, unlike the general query log and the amount of I/O it produces, and measure things like result set sizes and row counts. Those are things that just aren’t available from the MySQL server itself for every query, but are very interesting to a DBA who is running MySQL in production.
Like everyone else, we are looking forward to the MySQL Server providing such performance metrics in a future version. However, if you don’t want to wait, you can use MySQL Proxy to get such performance metrics today for MySQL 4.1, 5.0 and 5.1.
Q. What does the MySQL Query Analyzer do and what types of metrics are monitored by MySQL Query Analyzer?
What the MySQL Query Analyzer does is capture every query, times them and reports them in aggregate, back up to the Service Manager repository. You can then use this data to do all kinds of interesting things. You can roll up these statistics across all your servers, or across groups of servers that we have automatically detected in replication topologies. You can sort by query type (
CALL, etc). You can search through them using regular expressions, you can see the minimum, maximum, average and distributions of execution times, execution counts, row and update counts, result set sizes, and actual amount of data transferred and you can even bring up
EXPLAIN plans of queries that have been captured without leaving the application.
Q. How does the MySQL Query Analyzer work?
There are a few moving parts. You have the MySQL Server and you have the Agent which you have always had if you use the MySQL Enterprise Monitor. The difference now is that the Agent has plug-ins. One of the plug-ins will load an instance of the MySQL Proxy with some scripts we have written that enable the MySQL Query Analyzer features. The Proxy runs and collects information. You basically have to redirect your application to connect to the Proxy port which by default is 4040. The MySQL Proxy is transparent to the application other than sending the queries to a different port. So the queries get looked at by the Proxy (but are not changed in any way) and forwarded on to MySQL.
We normalize the queries (similar to a prepared statement form) into a "class" of query inside the Proxy. When the results come back we count the rows and amount of data we have transferred for a given query "class" and aggregate them by "class" and store those aggregated counters locally in memory.
That data gets periodically pushed up to the Service Manager. That is where all the information is put into a repository, which enables all the powerful reporting with aggregates across servers, groups of servers and replication topologies.
Q. DBAs are always worried about optimizing performance. How does MySQL Query Analyzer help DBAs?
There have always been a set of tools that DBAs could bring to bear to solve performance problems with MySQL. Tools like Cacti, Ganglia, vmsat, the slow and general query logs, top, mytop and Innotop are in every MySQL DBA’s toolbox. However, they’re not integrated. It usually takes a lot of manual effort to either script these things together, or do ad-hoc analysis. However, with the MySQL Query Analyzer, the DBA can take a look at operating system and MySQL performance counters and correlate them with the queries that are flowing through their databases at the same time to diagnose performance issues, all in one place. It’s a big time saver.
Q. How is MySQL Query Analyzer different from using the Slow Query Log and custom scripts?
One of the things that makes MySQL Query Analyzer interesting is that it captures total execution time of total queries. We actually time queries at the microsecond level. You can capture queries that don’t take very long (e.g. in the millisecond or sub-millisecond range), which you would never want to turn the Slow Query Log on for, because that would end up capturing all queries. Since we don’t do any I/O to write each of those queries to disk, instead we operate in memory and just aggregate them and don’t hold on to them, you can get some interesting statistics.
We have a customer that is using MySQL Query Analyzer to test a application that was ready to roll out. Turns out, the application issued the same query 400,000 times per minute and it didn’t have to execute that query, yet it was consuming resources. That query would never have showed up in the Slow Query Log and they would never have known they had that overhead in their application unless they had a tool like this.
[Note: MySQL 5.1 does have microsecond slow query log]
Q. Any other exciting features that you would like to tell us about?
We have a column called "First Seen", which says that this is the first time that this query was seen by the MySQL Query Analyzer for the time span you selected. If you sort that by newest, when you see new queries pop up there, one of two things happened. Maybe somebody is trying to do a SQL injection attack on your application. As a seasoned DBA you know the family of queries the applications uses and you can see something that is odd. Or, you have a sneaky development team and they rolled out a new version of the application into production and they didn’t tell you. That has bitten a number of our customers as well.
Q. Can you share any additional use cases where the MySQL Query Analyzer could have a big impact?
The MySQL Query Analyzer is very useful for development and testing. Most developers don’t know exactly what queries their application are issuing and in what quantity. A lot of times they aren’t the queries you issue, rather they are the queries the infrastructure underneath you issues. For example, if you are using an ORM, or if you are using some kind of framework that is doing persistence for you, or if you are integrating modules from other applications that use the database, you would be pretty surprised some of the things they do that aren’t obvious from the outside.
Q. Developers often have to use 3rd party products which they do not control. How does MySQL Query Analyzer help Developers?
We have used MySQL Query Analyzer to tune MySQL Enterprise Monitor application itself. For example, we found out that there was a 3rd party component we were using which issued "
SET AUTOCOMMIT ..." way too frequently. This single statement was using up 33% of the total time that our application was interacting with the database. Once we fixed that issue, then other queries of course bubbled up to the top that we fixed. Sometimes Hibernate would generate queries in a form that wasn’t well optimized by MySQL, in some cases we could rewrite the HQL that Hibernate used, drop to "raw" SQL or more often than not just realized the query was a good candidate for caching in-memory.
Q. Would it be possible to use MySQL Query Analyzer to identify queries that would be good candidates for caching?
Whenever I do a performance tuning talk I ask this question, "What is the cheapest query you can ever execute?" Its kind of a trick question, but the answer is, "The one you don’t execute." There are different caching technologies out there – Memcached, Hibernate with it’s second level caches, etc.
How do you identify queries that are good candidates to cache? How do you know there is a query that your application runs 20,000 times a minute that it doesn’t really have to? Queries like that are not going to show up in the slow query log. With MySQL Query Analyzer they pop up at the top of the list and tell you "Hey, look at me! I’m a big time consumer" even though a single execution may be very, very short. The DBA or developer can ask themselves the question, "Overall, what is using up the most time in my MySQL database". This will help them better determine which queries they should be caching.
Q. What has the feedback been from customers?
We have been pleasantly surprised by the customers that have tried MySQL Query Analzyer. A lot of them have some form of this in the form of custom scripts. Our feedback indicates that they are finding things they did not know were there. Like the customer I mentioned earlier who found the 400,000 queries/minute that was going into production. We had another customer tell us, "This is the kind of stuff we were building anyway in a limited fashion, and now I don’t have to because its part of my MySQL Enterprise subscription. I can have my developers work on stuff that really matters to my business." We have taken the drudgery out of their work and allowed them to focus on the hard-core business and technical problems.
Q. What future enhancements can users look forward to?
In a future release, you will be able select a spike in a graph and drill into it further. So, you are graphing CPU, I/O, database activity, you highlight it and can go to a page where you have all the query analysis results that are filtered based on the time window you specified. I think that is a very powerful feature.
We would also like to have the MySQL Query Analyzer correlate with Events, so when you see Events in the MySQL Enterprise Monitor (e.g. lot of table scans, or a lot of index scans, or the query cache is getting invalidated a lot), you can look at that Event and click on it and be shown what queries were running at that same time.
One thing we don’t have is per query I/O counters in MySQL. But what you can do with the graphs is graph I/O usage on the system as a whole, and when you see a spike in I/O you can see what queries were running at the time, and then through looking at queries in that time window and EXPLAINs you can pretty quickly drill into which query was causing that I/O.
Q. Are you exploring ways to use MySQL Query Analyzer without needing Proxy middleware?
We don’t want MySQL Proxy to be the only source of the metrics we are collecting. How you get this data into the Service Manager is pretty flexible and is based on REST. There are some other sources we want to explore. We would like to consume log data, we would like to consume the slow query log, or the general query log, even if it is sampled from time to time, so that you don’t have to use MySQL Proxy to get this information.
Also, there are ways to instrument the MySQL Connectors to get this information, or to instrument your application. This would add extra information to send along that would be useful for the MySQL Query Analyzer.
That said, from the feedback we are getting, MySQL Proxy works for most people
Q. How can users start taking a look at the MySQL Query Analyzer?
We just released the MySQL Query Analyzer as part of the MySQL Enterprise Trial. It’s a 30 day trial, and anyone can try it – you don’t have to be a MySQL Enterprise customer.
Q. Any parting words you want to share?
Its been a while since I have been a DBA, but I have been in development for a long time. I would like to invite developers to try this tool. It will benefit DBAs and Developers You can learn a lot about your application using this tool. I know our team did. There were quite a few surprises.
Our application (MySQL Enterprise Monitor) is pretty much like any other Java application (e.g. uses Spring, Hibernate and Apache DBCP). When I have shown what early implementations of the MySQL Query Analyzer did at conferences, people look up and say, "I had no idea. My applications are probably doing the exact same thing and I am having performance problems because of it and just don’t know." DBAs and Developers definitely need MySQL Query Analyzer.