The MySQL query optimizer uses estimated
statistics about key
distributions to choose the indexes for an execution plan, based
on the relative
selectivity of the index.
Certain operations cause InnoDB to sample random pages from each
index on a table to estimate the
cardinality of the index.
(This technique is known as
random dives.) These
operations include the
TABLE statement, the
STATUS statement, and accessing the table for the first
time after a restart.
To give you control over the quality of the statistics estimate
(and thus better information for the query optimizer), you can now
change the number of sampled pages using the parameter
Previously, the number of sampled pages was always 8, which could
be insufficient to produce an accurate estimate, leading to poor
index choices by the query optimizer. This technique is especially
important for large tables and tables used in
full table scans for
such tables can be a substantial performance issue. See
Section 22.214.171.124, “How to Avoid Full Table Scans” for tips on tuning such
You can set the global parameter
at runtime. The default value for this parameter is 8, preserving
the same behavior as in past releases.
The value of
affects the index sampling for all InnoDB
tables and indexes. There are the following potentially
significant impacts when you change the index sample size:
Small values like 1 or 2 can result in very inaccurate estimates of cardinality.
value might require more disk reads. Values much larger
than 8 (say, 100), can cause a big slowdown in the time it
takes to open a table or execute
The optimizer might choose very different query plans based on different estimates of index selectivity.
To disable the cardinality estimation for metadata statements such
SHOW TABLE STATUS, execute the statement
SET GLOBAL innodb_stats_on_metadata=OFF (or
0). The ability to set this option dynamically
is also relatively new.
All InnoDB tables are opened, and the statistics are re-estimated for all associated indexes, when the mysql client starts if the auto-rehash setting is set on (the default). To improve the start up time of the mysql client, you can turn auto-rehash off. The auto-rehash feature enables automatic name completion of database, table, and column names for interactive users.
Whatever value of
works best for a system, set the option and leave it at that
value. Choose a value that results in reasonably accurate
estimates for all tables in your database without requiring
excessive I/O. Because the statistics are automatically
recalculated at various times other than on execution of
ANALYZE TABLE, it does not make
sense to increase the index sample size, run
ANALYZE TABLE, then decrease sample
size again. The more accurate statistics calculated by
ANALYZE running with a high value of
can be wiped away later.
Although it is not possible to specify the sample size on a
per-table basis, smaller tables generally require fewer index
samples than larger tables do. If your database has many large
tables, consider using a higher value for
than if you have mostly smaller tables.