Documentation Home
MySQL 8.4 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 39.9Mb
PDF (A4) - 40.0Mb
Man Pages (TGZ) - 258.4Kb
Man Pages (Zip) - 365.5Kb
Info (Gzip) - 4.0Mb
Info (Zip) - 4.0Mb

MySQL 8.4 Reference Manual  /  ...  /  Optimizer Use of Generated Column Indexes

10.3.11 Optimizer Use of Generated Column Indexes

MySQL supports indexes on generated columns. For example:

CREATE TABLE t1 (f1 INT, gc INT AS (f1 + 1) STORED, INDEX (gc));

The generated column, gc, is defined as the expression f1 + 1. The column is also indexed and the optimizer can take that index into account during execution plan construction. In the following query, the WHERE clause refers to gc and the optimizer considers whether the index on that column yields a more efficient plan:

SELECT * FROM t1 WHERE gc > 9;

The optimizer can use indexes on generated columns to generate execution plans, even in the absence of direct references in queries to those columns by name. This occurs if the WHERE, ORDER BY, or GROUP BY clause refers to an expression that matches the definition of some indexed generated column. The following query does not refer directly to gc but does use an expression that matches the definition of gc:

SELECT * FROM t1 WHERE f1 + 1 > 9;

The optimizer recognizes that the expression f1 + 1 matches the definition of gc and that gc is indexed, so it considers that index during execution plan construction. You can see this using EXPLAIN:

mysql> EXPLAIN SELECT * FROM t1 WHERE f1 + 1 > 9\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: gc
          key: gc
      key_len: 5
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition

In effect, the optimizer has replaced the expression f1 + 1 with the name of the generated column that matches the expression. That is also apparent in the rewritten query available in the extended EXPLAIN information displayed by SHOW WARNINGS:

*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`gc`
         AS `gc` from `test`.`t1` where (`test`.`t1`.`gc` > 9)

The following restrictions and conditions apply to the optimizer's use of generated column indexes:

  • For a query expression to match a generated column definition, the expression must be identical and it must have the same result type. For example, if the generated column expression is f1 + 1, the optimizer does not recognize a match if the query uses 1 + f1, or if f1 + 1 (an integer expression) is compared with a string.

  • The optimization applies to these operators: =, <, <=, >, >=, BETWEEN, and IN().

    For operators other than BETWEEN and IN(), either operand can be replaced by a matching generated column. For BETWEEN and IN(), only the first argument can be replaced by a matching generated column, and the other arguments must have the same result type. BETWEEN and IN() are not yet supported for comparisons involving JSON values.

  • The generated column must be defined as an expression that contains at least a function call or one of the operators mentioned in the preceding item. The expression cannot consist of a simple reference to another column. For example, gc INT AS (f1) STORED consists only of a column reference, so indexes on gc are not considered.

  • For comparisons of strings to indexed generated columns that compute a value from a JSON function that returns a quoted string, JSON_UNQUOTE() is needed in the column definition to remove the extra quotes from the function value. (For direct comparison of a string to the function result, the JSON comparator handles quote removal, but this does not occur for index lookups.) For example, instead of writing a column definition like this:

    doc_name TEXT AS (JSON_EXTRACT(jdoc, '$.name')) STORED

    Write it like this:

    doc_name TEXT AS (JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name'))) STORED

    With the latter definition, the optimizer can detect a match for both of these comparisons:

    ... WHERE JSON_EXTRACT(jdoc, '$.name') = 'some_string' ...
    ... WHERE JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name')) = 'some_string' ...

    Without JSON_UNQUOTE() in the column definition, the optimizer detects a match only for the first of those comparisons.

  • If the optimizer picks the wrong index, an index hint can be used to disable it and force the optimizer to make a different choice.