WL#4095: Foreign keys: comparable data types

Affects: Server-6.x   —   Status: Un-Assigned   —   Priority: Very High

According to WL#148, a foreign key column must have
the same data type + the same length + the same scale
as the corresponding referenced column.

With "comparable data types", the restriction would be
looser. For example, lengths could differ. But the
columns must still be comparable. And some care must
be taken to ensure that cascading actions are still
possible.
For WL#148 "Foreign keys: Implement Foreign Keys (all storage engines)"
we said that parent-key columns and child-key columns must have
exactly the same data type and length. This worklog task is:
remove at least some restrictions.

Examples:

CREATE TABLE t1 (s1 INT PRIMARY KEY);
CREATE TABLE t2 (s1 BIGINT REFERENCES t1);

CREATE TABLE t1 (s1 CHAR(5));
CREATE TABLE t2 (s1 VARCHAR(6));

Comparable
----------

With most DBMSs, an important consideration for matching is:
are the parent and child keys "comparable"? This must be possible:
WHERE parent_column = child_column;
But in MySQL, almost everything is comparable.
Therefore, either we have to accept almost everything,
or we have to come up with some arbitrary rule like:
data types must be in the same 'class' (numbers, strings,
datetimes, etc.).

Assignable
----------

If an assignment is lossy, then after
CASCADE the foreign key won't equal the parent key.
For example:
CREATE TABLE t1 (s1 CHAR(2) PRIMARY KEY);
CREATE TABLE t2 (s1 CHAR(1) REFERENCES t1 ON UPDATE CASCADE);
INSERT INTO t1 VALUES ('a');
INSERT INTO t2 VALUES ('a');
UPDATE t1 SET s1 = 'aa';

Our possible safeguards are:
* Allow definition but not lossy assignment.
* Allow definition only for non-cascade.
* Allow definition but not any assignment
  (this seems to be what InnoDB is doing).

Other considerations
--------------------

We have to consider several details of each definition:
length, precision, scale, character set, collation.
There might be particular
combinations which affect whether items are comparable.

Other DBMSs
-----------

Extract from WL#148 Feature Comparison Chart

Feature                    ISO  SQL    DB2 Oracle Inno  Postgre   MySQL? 
                                Server            DB    SQL

Different Data Type        T201 No     Yes No     No    Yes       No

A detailed explanation of the above extract follows.

The ANSI/ISO SQL:2008 standard allows parent / child columns of
different data types, but only as part of optional Non-Core
Feature T201 "Comparable data types for referential constraints".
And, even with Feature T201, datetime can't match datetime WITH TIME ZONE.

SQL Server allows nothing. Tests showed:
Different lengths, for example CHAR(5) parent and CHAR(6) child: Error.
Char/Varchar difference, for example VARCHAR(5) parent and CHAR(5) child: Error.
Collation difference, for example latin1_general_ci_as parent and
latin1_general_ci_ai child: Error.

DB2 allows anything "compatible". The DB2 documentation says:
"There must be the same number of foreign key columns as there are
in the parent key and the data types of the corresponding columns
must be compatible (SQLSTATE 42830). Two column descriptions are
compatible if they have compatible data types (both columns are
numeric, character strings, graphic, date/time, or have the same
distinct type)."
http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000927.htm

Oracle allows a little. Tests showed:
Different lengths, for example CHAR(5) parent and CHAR(6) child: OK.
Different lengths, for example DECIMAL(6) parent and DECIMAL(5) child: OK.
Char/Varchar difference, for example CHAR(5) parent and VARCHAR(5) child: Error.

InnoDB allows a little. Tests showed:
Different lengths, for example BINARY(4) parent and BINARY(5) child: OK.
Different numeric data type, for example INT parent and SMALLINT child: Error.
Char/Varchar difference, for example CHAR(5) and VARCHAR(5): OK.

PostgreSQL allows a lot. Tests showed.
Wildly different data type, for example CHAR(5) parent and INT child: Error.
Different numeric data type, for example INT parent and SMALLINT child: OK.
Different numeric data type, for example DECIMAL parent and INT child: OK.
Different lengths, for example DECIMAL(5) parent and DECIMAL(7,1) child: OK.
Char/Varchar difference, for example CHAR(5) parent and VARCHAR(5) child: OK.

Historical notes and minutes
----------------------------

From Peter's "WL#148 Note#4"
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=4&mail=19700
"
Change#17.
Section:
"Checks during CREATE TABLE or ALTER TABLE"
Remove:
"
Data type must be comparable to corresponding pk_column data type.
("Comparable" means: if you can compare without needing to cast, okay
This is looser than the InnoDB requirement that data types be the same.)
"
Add:
"
Data type must be the same as the corresponding pk_column data type
(except for char/varchar), and the same length.
This is slightly stricter than the InnoDB requirement.
"
Explanation:
DMITRI SAYS: we should consider prohibiting incompatible
types. Incompatible means different data type (except for
binary/varbinary and char/varchar). Incompatible means
different lengths? For all "different data type" stuff,
mark as pending question.
DMITRI SAYS: maybe BINARY/VARBINARY won't be okay either.
"

From minutes of a Heidelberg discussions about foreign keys
https://inside.mysql.com/wiki/HD_Mtg:_Foreign_Keys_Workshop
"
Change#17:

Proposed removal: Agreed

Proposed addition should be changed to:Data type must be the same as the
corresponding pk_column data type, the same length, and the same scale. This is
slightly stricter than the InnoDB requirement.

Action item PeterG: Create worklog entry for intelligent comparison. 
"

See also: BUG#38882 Innodb is too restrict with FK data types