Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  How MySQL Optimizes WHERE Clauses How MySQL Optimizes WHERE Clauses

This section discusses optimizations that can be made for processing WHERE clauses. The examples use SELECT statements, but the same optimizations apply for WHERE clauses in DELETE and UPDATE statements.

Some examples of queries that are very fast:

SELECT COUNT(*) FROM tbl_name;

SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;

SELECT MAX(key_part2) FROM tbl_name
  WHERE key_part1=constant;

SELECT ... FROM tbl_name
  ORDER BY key_part1,key_part2,... LIMIT 10;

SELECT ... FROM tbl_name
  ORDER BY key_part1 DESC, key_part2 DESC, ... LIMIT 10;

MySQL resolves the following queries using only the entries from a secondary index, if the indexed columns are numeric:

SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;

  WHERE key_part1=val1 AND key_part2=val2;

SELECT key_part2 FROM tbl_name GROUP BY key_part1;

The following queries use the index data to retrieve the rows in sorted order without a separate sorting pass:

SELECT ... FROM tbl_name
  ORDER BY key_part1,key_part2,... ;

SELECT ... FROM tbl_name
  ORDER BY key_part1 DESC, key_part2 DESC, ... ;

You might be tempted to rewrite your queries to make arithmetic operations faster, while sacrificing readability. Because MySQL does similar optimizations automatically, you can often avoid this work, and leave the query in a more understandable and maintainable form. Some of the optimizations performed by MySQL follow:


Because work on the MySQL optimizer is ongoing, not all of the optimizations that MySQL performs are documented here.

  • Removal of unnecessary parentheses:

       ((a AND b) AND c OR (((a AND b) AND (c AND d))))
    -> (a AND b AND c) OR (a AND b AND c AND d)
  • Constant folding:

       (a<b AND b=c) AND a=5
    -> b>5 AND b=c AND a=5
  • Constant condition removal (needed because of constant folding):

       (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
    -> B=5 OR B=6
  • Constant expressions used by indexes are evaluated only once.

  • COUNT(*) on a single table without a WHERE is retrieved directly from the table information for MyISAM and MEMORY tables. This is also done for any NOT NULL expression when used with only one table.

  • Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and returns no rows.

  • HAVING is merged with WHERE if you do not use GROUP BY or aggregate functions (COUNT(), MIN(), and so on).

  • For each table in a join, a simpler WHERE is constructed to get a fast WHERE evaluation for the table and also to skip rows as soon as possible.

  • All constant tables are read first before any other tables in the query. A constant table is any of the following:

    • An empty table or a table with one row.

    • A table that is used with a WHERE clause on a PRIMARY KEY or a UNIQUE index, where all index parts are compared to constant expressions and are defined as NOT NULL.

    All of the following tables are used as constant tables:

    SELECT * FROM t WHERE primary_key=1;
    SELECT * FROM t1,t2
      WHERE t1.primary_key=1 AND;
  • The best join combination for joining the tables is found by trying all possibilities. If all columns in ORDER BY and GROUP BY clauses come from the same table, that table is preferred first when joining.

  • If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from tables other than the first table in the join queue, a temporary table is created.

  • If you use the SQL_SMALL_RESULT option, MySQL uses an in-memory temporary table.

  • Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size.

  • MySQL can sometimes produce query results using data from the index, without consulting the table data. If all columns used from the index are numeric, only the index data is used to resolve the query.

  • Before each row is output, those that do not match the HAVING clause are skipped.

Download this Manual
User Comments
  Posted by André Somplatzki on April 15, 2005
Indices lose their speed advantage when using them in OR-situations (4.1.10):

SELECT * FROM a WHERE index1 = 'foo'
SELECT * FROM a WHERE index2 = 'baar';

is much faster than

SELECT * FROM a WHERE index1 = 'foo' OR index2 = 'bar';
  Posted by Duncan Simpson on April 19, 2005
The query optimiser also does badly on examples like

SELECT id FROM foo WHERE bar IN (SELECT bar FROM baz WHERE qux='foo')

where foo is a large table and baz a small one. Doing the subselect first would allow the use on an index to eliminate most of foo, which is what happens if you say

SELECT FROM foo, baz WHERE and baz.qux='foo'

However the latter move is not possible if the first operation is DELETE and you want to avoid MySQL specific syntax (DELETE FROM .. USING .. WHERE ... also does the sensible thing but is not SQL-99).
  Posted by Henrik Grubbström on March 24, 2006
Indexes are ignored for the <> operator:

SELECT * FROM tab WHERE score <> 0;

This can be a problem if the table is very slanted (eg >99% of of the rows have the value that is filtered). The obvious workaround is to use a UNION:

(SELECT * FROM tab WHERE score > 0) UNION
(SELECT * FROM tab WHERE score < 0);

  Posted by Gints Plivna on September 19, 2008
SELECT * FROM tab WHERE score <> 0;
functionally IS NOT THE SAME AS
(SELECT * FROM tab WHERE score > 0) UNION
(SELECT * FROM tab WHERE score < 0);
because UNION filters out duplicates.
If one needs this one has to use UNION ALL.
  Posted by Pól Ua Laoínecháin on June 9, 2014
I think that Henrik implicitly assumed that the table (tab) has a primary key (and that score is not part of it), in which case he's correct.
  Posted by Tim Yan on May 10, 2015
Sometime this statement is not true:

SELECT * FROM a WHERE index1 = 'foo'
SELECT * FROM a WHERE index2 = 'baar';

is much faster than

SELECT * FROM a WHERE index1 = 'foo' OR index2 = 'bar';

If you have many records with 'foo' or 'baar', to satisfy your union, mysql has to pull all 'foo' and 'baar' records and do the 'union' merge. If you have a 5 million records with 3 million 'foo' and 2 'baar', your query will be very slow, much worse than the OR statement.

When your table is very flat (many columns), the problem can be worse since the temp table used for the "union" can be huge.

It can also be an issue when you try to pull "first 100" records using "limit". MySQL has to pull all records before it can do the UNION then pick the first 100 records for you. A table scan can actually stop earlier when MySQL finds enough result.

Most people will consider it a rare case. Think about a search on Ip addresses in your web log table, you may find surprisingly how many records matching one Ip address (or range).

Also, the Union query may generate different results. You need UNION ALL and this:

SELECT * FROM a WHERE index1 = 'foo'
SELECT * FROM a WHERE index2 = 'baar' and index1 !='foo'

Sign Up Login You must be logged in to post a comment.