The MySQL query optimizer uses estimated statistics about key
distributions to choose the indexes for an execution plan, based
on the relative
selectivity of the index.
Certain operations cause InnoDB to sample random pages from each
index on a table to estimate the
cardinality of the index.
(This technique is known as
random dives.) These
operations include the
TABLE statement, the
STATUS statement, and accessing the table for the first
time after a restart.
To give you control over the quality of the statistics estimate
(and thus better information for the query optimizer), you can now
change the number of sampled pages using the parameter
Previously, the number of sampled pages was always 8, which could
be insufficient to produce an accurate estimate, leading to poor
index choices by the query optimizer. This technique is especially
important for large tables and tables used in
full table scans for
such tables can be a substantial performance issue.
You can set the global parameter
runtime. The default value for this parameter is 8, preserving the
same behavior as in past releases.
The value of
affects the index sampling for all tables
and indexes. There are the following potentially significant
impacts when you change the index sample size:
Small values like 1 or 2 can result in very inaccurate estimates of cardinality.
value might require more disk reads. Values much larger
than 8 (say, 100), can cause a big slowdown in the time it
takes to open a table or execute
The optimizer might choose very different query plans based on different estimates of index selectivity.
To disable the cardinality estimation for metadata statements such
SHOW TABLE STATUS, execute the statement
SET GLOBAL innodb_stats_on_metadata=OFF (or
0). The ability to set this option dynamically
is also relatively new.
All InnoDB tables are opened, and the statistics are re-estimated
for all associated indexes, when the
client starts if the auto-rehash setting is set on (the default).
To improve the start up time of the
client, you can turn auto-rehash off. The auto-rehash feature
enables automatic name completion of database, table, and column
names for interactive users.
Whatever value of
best for a system, set the option and leave it at that value.
Choose a value that results in reasonably accurate estimates for
all tables in your database without requiring excessive I/O.
Because the statistics are automatically recalculated at various
times other than on execution of
TABLE, it does not make sense to increase the index
sample size, run
then decrease sample size again. The more accurate statistics
ANALYZE running with a high value
be wiped away later.
Although it is not possible to specify the sample size on a
per-table basis, smaller tables generally require fewer index
samples than larger tables do. If your database has many large
tables, consider using a higher value for
innodb_stats_sample_pages than if
you have mostly smaller tables.