MySQL Blog Archive
For the latest blogs go to blogs.oracle.com/mysql
A Multi-Table Trick to Speed up Single-Table UPDATE/DELETE Statements

In MySQL, query optimization of single-table UPDATE/DELETE statements is more limited than for SELECT statements. I guess the main reason for this is to limit the optimizer overhead for very simple statements. However, this also means that optimization opportunities are sometimes missed for more complex UPDATE/DELETE statements.

Example

Using the DBT-3 database, the following SQL statement will increase prices by 10% on parts from suppliers in the specified country:

Visual EXPLAIN in MySQL Workbench shows that the optimizer will choose the following execution plan for this UPDATE statement:

update-subquery

That is, for every row in the part table, MySQL will check if this part is supplied by a supplier of the requested nationality.

Consider the following similar SELECT statement:

In MySQL 5.6, the query optimizer will apply semi-join transformation to this query. Hence, the execution plan is quite different from the similar UPDATE statement:

select-semijoin

As you can see, there is no sub-query in this plan. The query has been transformed into a three-way join. The great advantage of this semi-join transformation is that the optimizer is now free to re-arrange the order of the tables to be joined. Instead of having to go through all 179,000 parts, it will now start with the estimated 414 suppliers from the given country and find all parts supplied by them. This is obviously more efficient, and it would be good if MySQL would use the same approach for the UPDATE statement.

The Multi-Table Trick

Unlike single-table UPDATE statements, the MySQL Optimizer will use all available optimizations for multi-table UPDATE statements. This means that by rewriting the query as follows, the semi-join optimizations will apply:

Notice the extra dummy table in the first line. Here is what happens when I execute the single-table and multi-table variants on a DBT-3 database (scale factor 1):

As you can see, execution time is reduced from 2.63 seconds to 0.40 seconds by using this trick. (I had executed both statements several times before, so the reported execution times are for a steady state with all accessed data in memory.)

Multi-Table DELETE

The same trick can be used for DELETE statements. Instead of the single-table variant,

you can use the equivalent multi-table variant:

This rewrite gives a similar performance improvement as reported for the above UPDATE statement.