ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
    TABLE tbl_name [, tbl_name] ...
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
    TABLE tbl_name
    UPDATE HISTOGRAM ON col_name [, col_name] ...
        [WITH N BUCKETS]
    [{MANUAL | AUTO} UPDATE]
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL] 
    TABLE tbl_name
    UPDATE HISTOGRAM ON col_name [USING DATA 'json_data']
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL]
    TABLE tbl_name
    DROP HISTOGRAM ON col_name [, col_name] ...
        ANALYZE TABLE generates table
        statistics:
- ANALYZE TABLEwithout any- HISTOGRAMclause performs a key distribution analysis and stores the distribution for the named table or tables. For- MyISAMtables,- ANALYZE TABLEfor key distribution analysis is equivalent to using myisamchk --analyze.
- ANALYZE TABLEwith the- UPDATE HISTOGRAMclause generates histogram statistics for the named table columns and stores them in the data dictionary. Only one table name is permitted with this syntax. MySQL also supports setting the histogram of a single column to a user-defined JSON value.
- ANALYZE TABLEwith the- DROP HISTOGRAMclause removes histogram statistics for the named table columns from the data dictionary. Only one table name is permitted for this syntax.
        This statement requires SELECT
        and INSERT privileges for the
        table.
      
        ANALYZE TABLE works with
        InnoDB, NDB, and
        MyISAM tables. It does not work with views.
      
        If the innodb_read_only system
        variable is enabled, ANALYZE
        TABLE may fail because it cannot update statistics
        tables in the data dictionary, which use
        InnoDB. For ANALYZE
        TABLE operations that update the key distribution,
        failure may occur even if the operation updates the table itself
        (for example, if it is a MyISAM table). To
        obtain the updated distribution statistics, set
        information_schema_stats_expiry=0.
      
        ANALYZE TABLE is supported for
        partitioned tables, and you can use ALTER TABLE ...
        ANALYZE PARTITION to analyze one or more partitions;
        for more information, see Section 15.1.9, “ALTER TABLE Statement”, and
        Section 26.3.4, “Maintenance of Partitions”.
      
        During the analysis, the table is locked with a read lock for
        InnoDB and MyISAM.
      
        By default, the server writes ANALYZE
        TABLE statements to the binary log so that they
        replicate to replicas. To suppress logging, specify the optional
        NO_WRITE_TO_BINLOG keyword or its alias
        LOCAL.
          ANALYZE TABLE returns a result
          set with the columns shown in the following table.
| Column | Value | 
|---|---|
| Table | The table name | 
| Op | analyzeorhistogram | 
| Msg_type | status,error,info,note, orwarning | 
| Msg_text | An informational message | 
          ANALYZE TABLE without either
          HISTOGRAM clause performs a key
          distribution analysis and stores the distribution for the
          table or tables. Any existing histogram statistics remain
          unaffected.
        
If the table has not changed since the last key distribution analysis, the table is not analyzed again.
MySQL uses the stored key distribution to decide the order in which tables should be joined for joins on something other than a constant. In addition, key distributions can be used when deciding which indexes to use for a specific table within a query.
          To check the stored key distribution cardinality, use the
          SHOW INDEX statement or the
          INFORMATION_SCHEMA
          STATISTICS table. See
          Section 15.7.7.23, “SHOW INDEX Statement”, and
          Section 28.3.34, “The INFORMATION_SCHEMA STATISTICS Table”.
        
          For InnoDB tables,
          ANALYZE TABLE determines index
          cardinality by performing random dives on each of the index
          trees and updating index cardinality estimates accordingly.
          Because these are only estimates, repeated runs of
          ANALYZE TABLE could produce
          different numbers. This makes ANALYZE
          TABLE fast on InnoDB tables but
          not 100% accurate because it does not take all rows into
          account.
        
          You can make the
          statistics collected by
          ANALYZE TABLE more precise and
          more stable by enabling
          innodb_stats_persistent, as
          explained in Section 17.8.10.1, “Configuring Persistent Optimizer Statistics Parameters”. When
          innodb_stats_persistent is
          enabled, it is important to run ANALYZE
          TABLE after major changes to index column data, as
          statistics are not recalculated periodically (such as after a
          server restart).
        
          If innodb_stats_persistent is
          enabled, you can change the number of random dives by
          modifying the
          innodb_stats_persistent_sample_pages
          system variable. If
          innodb_stats_persistent is
          disabled, modify
          innodb_stats_transient_sample_pages
          instead.
        
          For more information about key distribution analysis in
          InnoDB, see
          Section 17.8.10.1, “Configuring Persistent Optimizer Statistics Parameters”, and
          Section 17.8.10.3, “Estimating ANALYZE TABLE Complexity for InnoDB Tables”.
        
          MySQL uses index cardinality estimates in join optimization.
          If a join is not optimized in the right way, try running
          ANALYZE TABLE. In the few cases
          that ANALYZE TABLE does not
          produce values good enough for your particular tables, you can
          use FORCE INDEX with your queries to force
          the use of a particular index, or set the
          max_seeks_for_key system
          variable to ensure that MySQL prefers index lookups over table
          scans. See Section B.3.5, “Optimizer-Related Issues”.
          ANALYZE TABLE with the
          HISTOGRAM clause enables management of
          histogram statistics for table column values. For information
          about histogram statistics, see
          Section 10.9.6, “Optimizer Statistics”.
        
These histogram operations are available:
- ANALYZE TABLEwith an- UPDATE HISTOGRAMclause generates histogram statistics for the named table columns and stores them in the data dictionary. Only one table name is permitted for this syntax.- The optional - WITHclause specifies the number of buckets for the histogram. The value of- NBUCKETS- Nmust be an integer in the range from 1 to 1024. If this clause is omitted, the number of buckets is 100.- The optional - AUTO UPDATEclause enables automatic updates of histograms on the table. When enabled, an- ANALYZE TABLEstatement on this table automatically updates the histogram, using the same number of buckets as last specified by- WITH ... BUCKETSif this was previously set for this table. In addition, when recalculating persistent statistics for the table (see Section 17.8.10.1, “Configuring Persistent Optimizer Statistics Parameters”), the- InnoDBbackground statistics thread also updates the histogram.- MANUAL UPDATEdisables automatic updates, and is the default setting if not specified.
- ANALYZE TABLEwith a- DROP HISTOGRAMclause removes histogram statistics for the named table columns from the data dictionary. Only one table name is permitted for this syntax.
Stored histogram management statements affect only the named columns. Consider these statements:
ANALYZE TABLE t UPDATE HISTOGRAM ON c1, c2, c3 WITH 10 BUCKETS;
ANALYZE TABLE t UPDATE HISTOGRAM ON c1, c3 WITH 10 BUCKETS;
ANALYZE TABLE t DROP HISTOGRAM ON c2;
          The first statement updates the histograms for columns
          c1, c2, and
          c3, replacing any existing histograms for
          those columns. The second statement updates the histograms for
          c1 and c3, leaving the
          c2 histogram unaffected. The third
          statement removes the histogram for c2,
          leaving those for c1 and
          c3 unaffected.
        
          When sampling user data as part of building a histogram, not
          all values are read; this may lead to missing some values
          considered important. In such cases, it might be useful to
          modify the histogram, or to set your own histogram explicitly
          based on your own criteria, such as the complete data set.
          ANALYZE TABLE  updates
          a column of the histogram table with data supplied in the same
          JSON format used to display tbl_name
          UPDATE HISTOGRAM ON col_name USING
          DATA 'json_data'HISTOGRAM
          column values from the Information Schema
          COLUMN_STATISTICS table. Only one
          column can be modified when updating the histogram with JSON
          data.
        
          We can illustrate the use of USING DATA by
          first generating a histogram on column c1
          of table t, like this:
        
mysql> ANALYZE TABLE t UPDATE HISTOGRAM ON c1;
+-------+-----------+----------+-----------------------------------------------+
| Table | Op        | Msg_type | Msg_text                                      |
+-------+-----------+----------+-----------------------------------------------+
| h.t   | histogram | status   | Histogram statistics created for column 'c1'. |
+-------+-----------+----------+-----------------------------------------------+
1 row in set (0.00 sec)
          We can see the histogram generated in the
          COLUMN_STATISTICS table:
        
mysql> TABLE information_schema.column_statistics\G
*************************** 1. row ***************************
SCHEMA_NAME: h
 TABLE_NAME: t
COLUMN_NAME: c1
  HISTOGRAM: {"buckets": [], "data-type": "int", "auto-update": false,
"null-values": 0.0, "collation-id": 8, "last-updated": "2024-03-26
16:54:43.674995", "sampling-rate": 1.0, "histogram-type": "singleton",
"number-of-buckets-specified": 100}   
1 row in set (0.00 sec)
          Now we drop the histogram, and when we check
          COLUMN_STATISTICS, it is empty:
        
mysql> ANALYZE TABLE t DROP HISTOGRAM ON c1;
+-------+-----------+----------+-----------------------------------------------+
| Table | Op        | Msg_type | Msg_text                                      |
+-------+-----------+----------+-----------------------------------------------+
| h.t   | histogram | status   | Histogram statistics removed for column 'c1'. |
+-------+-----------+----------+-----------------------------------------------+
1 row in set (0.01 sec)
mysql> TABLE information_schema.column_statistics\G
Empty set (0.00 sec)
          We can restore the dropped histogram by inserting its JSON
          representation obtained previously from the
          HISTOGRAM column of the
          COLUMN_STATISTICS table, and when we query
          that table again, we can see that the histogram has been
          restored to its previous state:
        
mysql> ANALYZE TABLE t UPDATE HISTOGRAM ON c1 
    ->     USING DATA '{"buckets": [], "data-type": "int", "auto-update": false,
    ->               "null-values": 0.0, "collation-id": 8, "last-updated": "2024-03-26
    ->               16:54:43.674995", "sampling-rate": 1.0, "histogram-type": "singleton",
    ->               "number-of-buckets-specified": 100}';   
+-------+-----------+----------+-----------------------------------------------+
| Table | Op        | Msg_type | Msg_text                                      |
+-------+-----------+----------+-----------------------------------------------+
| h.t   | histogram | status   | Histogram statistics created for column 'c1'. |
+-------+-----------+----------+-----------------------------------------------+
mysql> TABLE information_schema.column_statistics\G
*************************** 1. row ***************************
SCHEMA_NAME: h
 TABLE_NAME: t
COLUMN_NAME: c1
  HISTOGRAM: {"buckets": [], "data-type": "int", "auto-update": false,
"null-values": 0.0, "collation-id": 8, "last-updated": "2024-03-26
16:54:43.674995", "sampling-rate": 1.0, "histogram-type": "singleton",
"number-of-buckets-specified": 100}
          Histogram generation is not supported for encrypted tables (to
          avoid exposing data in the statistics) or
          TEMPORARY tables.
        
          Histogram generation applies to columns of all data types
          except geometry types (spatial data) and
          JSON.
        
Histograms can be generated for stored and virtual generated columns.
Histograms cannot be generated for columns that are covered by single-column unique indexes.
          Histogram management statements attempt to perform as much of
          the requested operation as possible, and report diagnostic
          messages for the remainder. For example, if an UPDATE
          HISTOGRAM statement names multiple columns, but some
          of them do not exist or have an unsupported data type,
          histograms are generated for the other columns, and messages
          are produced for the invalid columns.
        
Histograms are affected by these DDL statements:
- DROP TABLEremoves histograms for columns in the dropped table.
- DROP DATABASEremoves histograms for any table in the dropped database because the statement drops all tables in the database.
- RENAME TABLEdoes not remove histograms. Instead, it renames histograms for the renamed table to be associated with the new table name.
- ALTER TABLEstatements that remove or modify a column remove histograms for that column.
- ALTER TABLE ... CONVERT TO CHARACTER SETremoves histograms for character columns because they are affected by the change of character set. Histograms for noncharacter columns remain unaffected.
          The
          histogram_generation_max_mem_size
          system variable controls the maximum amount of memory
          available for histogram generation. The global and session
          values may be set at runtime.
        
          Changing the global
          histogram_generation_max_mem_size
          value requires privileges sufficient to set global system
          variables. Changing the session
          histogram_generation_max_mem_size
          value requires privileges sufficient to set restricted session
          system variables. See
          Section 7.1.9.1, “System Variable Privileges”.
        
          If the estimated amount of data to be read into memory for
          histogram generation exceeds the limit defined by
          histogram_generation_max_mem_size,
          MySQL samples the data rather than reading all of it into
          memory. Sampling is evenly distributed over the entire table.
          MySQL uses SYSTEM sampling, which is a
          page-level sampling method.
        
          The sampling-rate value in the
          HISTOGRAM column of the Information Schema
          COLUMN_STATISTICS table can be
          queried to determine the fraction of data that was sampled to
          create the histogram. The sampling-rate is
          a number between 0.0 and 1.0. A value of 1 means that all of
          the data was read (no sampling).
        
          The following example demonstrates sampling. To ensure that
          the amount of data exceeds the
          histogram_generation_max_mem_size
          limit for the purpose of the example, the limit is set to a
          low value (2000000 bytes) prior to generating histogram
          statistics for the birth_date column of the
          employees table.
        
mysql> SET histogram_generation_max_mem_size = 2000000;
mysql> USE employees;
mysql> ANALYZE TABLE employees UPDATE HISTOGRAM ON birth_date WITH 16 BUCKETS\G
*************************** 1. row ***************************
   Table: employees.employees
      Op: histogram
Msg_type: status
Msg_text: Histogram statistics created for column 'birth_date'.
mysql> SELECT HISTOGRAM->>'$."sampling-rate"'
       FROM INFORMATION_SCHEMA.COLUMN_STATISTICS
       WHERE TABLE_NAME = "employees"
       AND COLUMN_NAME = "birth_date";
+---------------------------------+
| HISTOGRAM->>'$."sampling-rate"' |
+---------------------------------+
| 0.0491431208869665              |
+---------------------------------+
          A sampling-rate value of 0.0491431208869665
          means that approximately 4.9% of the data from the
          birth_date column was read into memory for
          generating histogram statistics.
        
          The InnoDB storage engine
          provides its own sampling implementation for data stored in
          InnoDB tables. The default sampling
          implementation used by MySQL when storage engines do not
          provide their own requires a full table scan, which is costly
          for large tables. The InnoDB sampling
          implementation improves sampling performance by avoiding full
          table scans.
        
          The sampled_pages_read and
          sampled_pages_skipped
          INNODB_METRICS counters can be
          used to monitor sampling of InnoDB data
          pages. (For general
          INNODB_METRICS counter usage
          information, see
          Section 28.4.21, “The INFORMATION_SCHEMA INNODB_METRICS Table”.)
        
The following example demonstrates sampling counter usage, which requires enabling the counters prior to generating histogram statistics.
mysql> SET GLOBAL innodb_monitor_enable = 'sampled%';
mysql> USE employees;
mysql> ANALYZE TABLE employees UPDATE HISTOGRAM ON birth_date WITH 16 BUCKETS\G
*************************** 1. row ***************************
   Table: employees.employees
      Op: histogram
Msg_type: status
Msg_text: Histogram statistics created for column 'birth_date'.
mysql> USE INFORMATION_SCHEMA;
mysql> SELECT NAME, COUNT FROM INNODB_METRICS WHERE NAME LIKE 'sampled%'\G
*************************** 1. row ***************************
 NAME: sampled_pages_read
COUNT: 43
*************************** 2. row ***************************
 NAME: sampled_pages_skipped
COUNT: 843This formula approximates a sampling rate based on the sampling counter data:
sampling rate = sampled_page_read/(sampled_pages_read + sampled_pages_skipped)
          A sampling rate based on sampling counter data is roughly the
          same as the sampling-rate value in the
          HISTOGRAM column of the Information Schema
          COLUMN_STATISTICS table.
        
          For information about memory allocations performed for
          histogram generation, monitor the Performance Schema
          memory/sql/histograms instrument. See
          Section 29.12.20.10, “Memory Summary Tables”.
          ANALYZE TABLE clears table statistics from
          the Information Schema
          INNODB_TABLESTATS table and sets
          the STATS_INITIALIZED column to
          Uninitialized. Statistics are collected
          again the next time the table is accessed.