MySQL Blog Archive
For the latest blogs go to blogs.oracle.com/mysql
Generated Columns in MySQL 5.7.5

Generated Columns is a new feature available in the latest lab release. This work is based on a contribution by Andrey Zhakov. Thanks, Andrey! The Optimizer team modified it to follow the current MySQL design, and to lift a number of limitations.

The syntax is:

<type>   [ GENERATED ALWAYS ]   AS   ( <expression> )   [ VIRTUAL|STORED ]
[ UNIQUE [KEY] ]   [ [PRIMARY] KEY ]   [ NOT NULL ]   [ COMMENT <text> ]

There are two kinds of Generated Columns: virtual (default) and stored. Virtual means that the column will be calculated on the fly when a record is read from a table. Stored means that the column will be calculated when a new record is written in the table, and after that it will be treated as a regular field. Both types can have NOT NULL restrictions, but only a stored Generated Column can be be a part of an index.

There are still a few limitations on Generated Columns:

  • The generation expression used can only call native deterministic functions; stored routines and UDFs are not yet supported.
  • The length of the generation expressions for a table are subject to the limitations of the .frm file, and thus they can’t be longer than 64K in total. So you can have one field with a generation expression length of 64K, or you can have 30 fields with an average expression length of 2K each.
  • A Generated Column can’t refer to itself or to other Generated Columns that are later defined, but it can refer to any previously defined Generated Column. This limitation does not apply to regular columns. A Generated Column can refer to any regular one, no matter where it’s defined.

What can Generated Columns can be used for? Many things. To name few:

  • As a materialized cache for often used expressions:

    The result of those two queries is exactly the same, but in the first one ExtractValue(doc,’/user/username’) will be evaluated 3 times per record read, while in the second only once per record read. If the “username” columns would be defined as STORED then the generation expression will be evaluated only when a record is inserted or updated.

    Another similar case is that Generated Columns (GC) could be used to add flexibility by replacing often used expressions with a GC. For example, if you have a bunch of applications that work on the same database then it might be practical to have a unified way to access data without need to keep all apps in sync.

  • Providing indexes for joins with non-relational data:

    Here ref access over an index is used to access a table with XML data.

  • Working around a limited set of partitioning functions:

    Note that only one partition is going to be scanned due to partition pruning.

As always, there are more than one way to do things. Generated Columns adds yet another means of solving a variety of interesting challenges. It now also becomes more convenient to deal with derivatives of relational and non-relational data. We look forward to seeing all of the interesting ways that you apply the feature!