WL#9418: Permit default value to be a function or expression, support for LOBs

Affects: Server-8.0   —   Status: Complete

Currently MySQL only permits a literal value as column default.  For example:

CREATE TABLE t1 (
 a INT DEFAULT 0
);

It is useful to be able to allow functions as default values.  The most common 
use case would be to emulate a UUID-like auto_increment:

> CREATE TABLE t2 (
    ->  a BINARY(16) DEFAULT uuid_to_bin(uuid())
    -> );
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that 
corresponds to your MySQL server version for the right syntax to use near 
'uuid_to_bin(uuid())
)' at line 2


It is also a major usability problem for GIS. If an user would like to add a
geometry column to an existing table and index on it, there is no straight forward 
way to do so. rtree index requires the column to be not nullable, however MySQL
does not allow adding a non-nullable geometry column, because it does not support
expression to be the default value, and for GIS, the default value is an
expression.

In addition, LOB fields do not currently support a default value.  This is very
much a desired feature for GIS and JSON e.g. "DEFAULT POINT(0,0)" or DEFAULT 
'[]' (default to an empty array to reduce nil handling in application).

=Functional requirements.=

;F1. Deterministic functions must be allowed as default values.
;F2. Non-deterministic functions must be allowed as default values.
;F3. The order of the columns in the table matters as values for each column in
a row are evaluated from left to right: only functions that refer columns
already defined in the table(columns on the left side of the current column)
must be allowed as default values.
;F4. When renaming, changing order or dropping a column which has a GC
or generated default value dependency,
ER_DEPENDENT_BY_DEFAULT_GENERATED_VALUE error or the generic: "Unknown column
'i' in 'default value expression'" is returned.
;F5. subqueries, parameters, variables, stored routines must not be allowed as
default values
;F6. Default values may be used when inserting new rows, by omitting an explicit
column value or using the special column value DEFAULT.
;F7. Calling DEFAULT(column_name) on columns having DEFAULT defined as as 
expression will return ER_DEFAULT_AS_VAL_GENERATED error.
;F8. Using non-deterministic functions as default values causes statement to be
marked as not suitable for statement-based replication.
;F9. It must be possible to also assign default values from generated default
value expressions to columns of type BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB, TEXT,
TINYTEXT, MEDIUMTEXT, LONGTEXT, JSON, GEOMETRY, POINT, LINESTRING, POLYGON,
GEOMETRYCOLLECTION, MULTIPOINT, MULTILINESTRING, and MULTIPOLYGON.
;F10. Default values defined as expression cannot depend on columns having 
AUTO_INCREMENT.
;F11. Functions as default values on tables having NDB as storage engine should
be rejected with ER_UNSUPPORTED_ACTION_ON_DEFAULT_VAL_GENERATED.
;F12. Irrespective of GTID variables, for ALTER TABLE .. ADD COLUMN .. (with 
DEFAULT expression): if replication is either MIXED or ROW and default expression 
is Non deterministic then ER_BINLOG_UNSAFE_SYSTEM_FUNCTION is always thrown.
;F13. For ALTER TABLE .. ADD COLUMN .. (with DEFAULT expression): If replication 
is STATEMENT and default expression is Non deterministic then
  i) for (GTID_MODE=OFF) + (ENFORCE_GTID_CONSISTENCY=OFF), the statement will be 
accepted
  ii) for (GTID_MODE=OFF) + (ENFORCE_GTID_CONSISTENCY=WARN), the statement will be 
accepted with a warning
  iii) for (GTID_MODE=OFF) + (ENFORCE_GTID_CONSISTENCY=ON), 
ER_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION is thrown
  iv) for (GTID_NEXT=UUID:NUMBER), 
ER_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION is thrown
  v) for (GTID_NEXT=AUTOMATIC AND GTID_MODE is ON or ON_PERMISSIVE), 
ER_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION is thrown
  vi) for (GTID_MODE=ON), ER_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION is 
thrown
*For simplicity, we will use gen_default_val for generated default value 
expression.

Syntax
======

The SQL standard defines the DEFAULT clause as:

 ::=
DEFAULT 
 ::=

| 
| USER
| CURRENT_USER
| CURRENT_ROLE
| SESSION_USER
| SYSTEM_USER
| CURRENT_CATALOG
| CURRENT_SCHEMA
| CURRENT_PATH
| 

where:

 ::=

| 
 ::=
NULL
 ::=
ARRAY  
| MULTISET  

 ::=
CURRENT_DATE
| CURRENT_LOCALTIME
| TIME
| CURRENT_TIMESTAMP
| LOCALTIMESTAMP


Allowing expressions as default value, is non-standard. But MySQL is not the
only one doing it, e.g. Postgresql allows it.
After the implementation of this WL, MySQL's default clause will allow
expressions to be set as the default value. Deterministic and non-deterministic
built-in functions are going to be allowed in gen_default_val. Similar to GC,
subqueries, parameters, variables, UDF/SP are not going to be allowed.

The standard specifies that the subject data type of a  is the
data type specified in the descriptor identified by the
containing , , ,
,
or .

But by default MySQL does not follow this and does conversions, and this will
also be allowed for gen_default_val. So the following example will be allowed:

CREATE TABLE t3 (a INT PRIMARY KEY,
c JSON DEFAULT (concat("[", "10", "]"))
);

INSERT INTO t3(a) values (10);


How can a table be created with a column having a gen_default_val?
==================================================================

1. Using the standard CREATE TABLE syntax
eg CREATE TABLE t1 (id INT, value JSON DEFAULT (JSON_OBJECT("key", id)));
#will now execute instead of raising an error, will create a table which will
store the definition of the DEFAULT value function into the dd

2. Using the CREATE TABLE .. AS SELECT ..
creates a table that include the gen_default_val from the initial table(if that
type of column exists)

3. Using the CREATE TABLE  LIKE ;

Both 2 and 3 will also set the expression for default value as it did before
with regular literals.

How can a table be altered to have a column with gen_default_val?
=================================================================

1. using the ALTER TABLE  ADD COLUMN   DEFAULT 
2. using the ALTER TABLE  ALTER COLUMN  SET DEFAULT 
eg ALTER TABLE t1 ALTER COLUMN id SET DEFAULT (RAND()); #Sets the function to be
used to generate values for the next INSERT.
3. using the ALTER TABLE  MODIFY [COLUMN] syntax
4. using the ALTER TABLE  CHANGE [COLUMN] syntax 

eg. 
How can a table DROP a gen_default_val?
=======================================

1. using the ALTER TABLE  DROP COLUMN 
eg ALTER TABLE t1 DROP COLUMN id;
#Drops the entire column including the DEFAULT function
2. using the ALTER TABLE  ALTER COLUMN  DROP DEFAULT

How to show the table having the gen_default_val
================================================

1. SHOW CREATE TABLE ;

Indexes
=======

Does not influence indexes.

DD CHANGES
==========

No changes to the DD structure will be made.
- default_option will store either "CURRENT_TIMESTAMP" (as it already
happens now) or the gen_default_val in its original form.
- default_value_utf8 will store either "CURRENT_TIMESTAMP" or
the utf8-converted generation expression. 

INFORMATION_SCHEMA
==================

I_S.Columns.COLUMN_DEFAULT will remain unchanged and will show
columns.default_value_utf8
I_S.Columns.EXTRA: will be set to "DEFAULT_GENERATED" if column.default_option
is not null

What can be in the gen_default_val
==================================

The expression may contain a built-in function deterministic or not(unlike GC),
operator, or a reference to any base column or GC within the same table.
Similar to GC no subqueries, parameters, variables, UDFs or SP are allowed.
If gen_default_val refers to a generated column or another gen_default_val, the
column referred to must be a ordered ahead in the table as each column is
evaluated in an orderly manner.


Error messages
==============

ER_GENERATED_COLUMN_NAMED_FUNCTION_IS_NOT_ALLOWED
  eng "Expression of generated column '%s' contains a disallowed function: %s."

ER_GENERATED_COLUMN_ROW_VALUE
  eng "Expression of generated column '%s' cannot refer to a row value."

ER_GENERATED_COLUMN_VARIABLES
  eng "Expression of generated column '%s' cannot refer user or system variables."

ER_DEPENDENT_BY_DEFAULT_GENERATED_VALUE
  eng "Column '%s' of table '%s' has a default value expression dependency and
cannot be dropped."

ER_DEFAULT_VAL_GENERATED_NON_PRIOR
  eng "Default value expression of column '%s' can refer only to generated
columns and columns with default value expressions defined prior to it."

ER_DEFAULT_VAL_GENERATED_REF_AUTO_INC
  eng "Default value expression of column '%s' cannot refer to an auto-increment
column."

ER_DEFAULT_VAL_GENERATED_FUNCTION_IS_NOT_ALLOWED
  eng "Default value expression of column '%s' contains a disallowed function."

ER_DEFAULT_VAL_GENERATED_NAMED_FUNCTION_IS_NOT_ALLOWED
  eng "Default value expression of column '%s' contains a disallowed function: %s."

ER_DEFAULT_VAL_GENERATED_ROW_VALUE
  eng "Default value expression of column '%s' cannot refer to a row value."

ER_DEFAULT_VAL_GENERATED_VARIABLES
  eng "Default value expression of column '%s' cannot refer user or system
variables."

ER_DEFAULT_AS_VAL_GENERATED
  eng "DEFAULT function cannot be used with default value expressions"

ER_UNSUPPORTED_ACTION_ON_DEFAULT_VAL_GENERATED
  eng "'%s' is not supported for default value expressions."

ER_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION
  eng "Statement violates GTID consistency: ALTER TABLE ... ADD COLUMN .. with
expression as DEFAULT."

ER_RPL_GTID_UNSAFE_ALTER_ADD_COL_WITH_DEFAULT_EXPRESSION
  eng "Statement violates GTID consistency: ALTER TABLE ... ADD COLUMN .. with
expression as DEFAULT."