Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.
MySQL uses the average value group size in the following ways:
To estimate how may rows must be read for each
To estimate how many row a partial join will produce; that is, the number of rows that an operation of this form will produce:
As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases: For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.
The average value group size is related to table cardinality,
which is the number of value groups. The
SHOW INDEX statement displays a
cardinality value based on
N is the number of rows in the table
S is the average value group
size. That ratio yields an approximate number of value groups in
For a join based on the
NULL is not treated differently
from any other value:
NULL <=> NULL,
for any other
However, for a join based on the
NULL is different from
is not true when
expr2 (or both) are
NULL. This affects
ref accesses for comparisons
of the form
: MySQL will not access
the table if the current value of
because the comparison cannot be true.
= comparisons, it does not matter how
NULL values are in the table. For
optimization purposes, the relevant value is the average size of
NULL value groups. However, MySQL
does not currently enable that average size to be collected or
MyISAM tables, you have some control over
collection of table statistics by means of the
variable. This variable has three possible values, which differ
NULLvalues are treated as identical (that is, they all form a single value group).
NULLvalue group size is much higher than the average non-
NULLvalue group size, this method skews the average value group size upward. This makes index appear to the optimizer to be less useful than it really is for joins that look for non-
NULLvalues. Consequently, the
nulls_equalmethod may cause the optimizer not to use the index for
refaccesses when it should.
NULLvalues are not considered the same. Instead, each
NULLvalue forms a separate value group of size 1.
If you have many
NULLvalues, this method skews the average value group size downward. If the average non-
NULLvalue group size is large, counting
NULLvalues each as a group of size 1 causes the optimizer to overestimate the value of the index for joins that look for non-
NULLvalues. Consequently, the
nulls_unequalmethod may cause the optimizer to use this index for
reflookups when other methods may be better.
NULLvalues are ignored.
If you tend to use many joins that use
<=> rather than
NULL values are not special in comparisons
NULL is equal to another. In this
nulls_equal is the appropriate
variable has global and session values. Setting the global value
MyISAM statistics collection for all
MyISAM tables. Setting the session value
affects statistics collection only for the current client
connection. This means that you can force a table's statistics
to be regenerated with a given method without affecting other
clients by setting the session value of
MyISAM table statistics, you
can use any of the following methods:
Some caveats regarding the use of
You can force table statistics to be collected explicitly, as just described. However, MySQL may also collect statistics automatically. For example, if during the course of executing statements for a table, some of those statements modify the table, MySQL may collect statistics. (This may occur for bulk inserts or deletes, or some
ALTER TABLEstatements, for example.) If this happens, the statistics are collected using whatever value
myisam_stats_methodhas at the time. Thus, if you collect statistics using one method, but
myisam_stats_methodis set to the other method when a table's statistics are collected automatically later, the other method will be used.
There is no way to tell which method was used to generate statistics for a given
myisam_stats_methodapplies only to
MyISAMtables. Other storage engines have only one method for collecting table statistics. Usually it is closer to the