WL#148: Foreign keys: Implement Foreign Keys (all storage engines)
Affects: Server-6.1
—
Status: Assigned
Foreign Keys ------------ MySQL currently "supports" foreign keys in two ways: (1) if storage engine = InnoDB then let InnoDB handle them, (2) otherwise parse and ignore foreign-key clauses. The goals of this task are: - make foreign-key support possible for any storage engine (which could handle it) via a combination of API calls and server handling. - support SQL standard syntax instead of InnoDB syntax. I use the abbreviation FK when discussing the child / referencing table (the table with the foreign key), or its indexes and columns. I use the abbreviation PK when discussing the parent / referenced table (which is often a table with a primary key), or its indexes and columns. Note that this implementation is for immediately-checked constraints only. Deferred constraints -- i.e. constraints that are checked at transaction-end rather than statement-end are another task and will be implemented for all constraint types (not just foreign key constraints) when the time comes.
WL#3288: Multiple cursors per TABLE
WL#4026: Foreign keys: mysql_upgrade for foreign keys
WL#6049: Meta-data locking for FOREIGN KEY tables
WL#4026: Foreign keys: mysql_upgrade for foreign keys
WL#6049: Meta-data locking for FOREIGN KEY tables
Syntax (Table Constraint) ------------------------- Table constraints are separate clauses in CREATE TABLE or ALTER TABLE. [ CONSTRAINT [constraint_name] ] /* NOTE 1 */ FOREIGN KEY [constraint_name] (fk_column list) REFERENCES pk_table_name (pk_column list) /* NOTE 2 */ [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] /* NOTE 3 */ [ ON UPDATE { CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION } ] [ ON DELETE { CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION } ] /* NOTE 1 */ The "CONSTRAINT constraint_name" clause is the subject of a different worklog entry, WL#2226 "Constraint Names". Searches for constraint names are case insensitive. On 2008-02-11 there was a changed from "CONSTRAINT constraint_name" to "CONSTRAINT [constraint_name]", if constraint_name is omitted then it will be the default value. /* NOTE 2 */ Currently we support "(pk_column list)" not "[(pk_column list)]". That is, we are NOT changing so that the pk column list is optional. For that, see WL#3947 Foreign Keys: Default Primary Reference. /* NOTE 3 */ The first implementation of foreign keys will not support the MATCH PARTIAL mode of key comparison. The keyword PARTIAL is recognized and causes MySQL to return ER_FK_MATCH_PARTIAL It is legal to say either "ON UPDATE ... ON DELETE ..." or "ON DELETE ... ON UPDATE ...". Examples: CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1 (s1)); CREATE TABLE t2 (s1 INT, s2 INT, FOREIGN KEY (s1,s2) REFERENCES t1 (s2,s1), FOREIGN KEY (s2) REFERENCES t1 (s3) ON UPDATE CASCADE); Syntax (Column Constraint) -------------------------- Column constraints are part of a column definition. They are a shorthand form of the table constraint when there is only one field in the declaration. [ CONSTRAINT [constraint_name] ] REFERENCES parent_table_name (parent_column list) [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON UPDATE { CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION } ] [ ON DELETE { CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION } ] These clauses are precisely the same as the clauses that follow "FOREIGN KEY (fk_column list)" in the Syntax (Table Constraint) described previously and the same notes apply. The column constraint "colx data-type REFERENCES ..." can be converted to "colx data-type, FOREIGN KEY (colx) REFERENCES ..." with no change in the behavior of the constraint. InnoDB does not support column constraints in its native implementation of foreign keys. Examples: CREATE TABLE t2 (s1 INT REFERENCES t1(s1)); CREATE TABLE t2 (s1 INT CONSTRAINT `x` REFERENCES t1 (s1)); are equivalent to: CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1(s1)); CREATE TABLE t2 (s1 INT, CONSTRAINT `x` FOREIGN KEY REFERENCES t1 (s1)); Defaults -------- 1. If ON UPDATE is omitted, assume ON UPDATE NO ACTION. 2. If ON DELETE is omitted, assume ON DELETE NO ACTION. 3. Support only MATCH SIMPLE and MATCH FULL. Emit an error for MATCH PARTIAL: ER_FK_MATCH_PARTIAL. If MATCH clause is omitted, assume MATCH SIMPLE (this is standard). Checks during CREATE TABLE or ALTER TABLE ----------------------------------------- MySQL must check table constraints at CREATE/ALTER TABLE time thus: 1. The fk table must not be temporary. The pk table must not be temporary. (This restriction is not standard.) The pk table must not be a view. The fk and pk tables may be made with different storage engines, unless the storage engine objects, or unless one table is transactional and the other table is non-transactional. 2. The number of columns in fk_column list must be between 1 and the maximum number of index columns the storage engine can support. (The number 16 is the InnoDB maximum, but Oracle allows up to 32.) 3. The number of columns in fk_column list must = number of columns in pk_column list 4. No columns in fk_column list may be duplicates, e.g. you can't say FOREIGN KEY (s1,s1,s1). 5. For each column in fk_column list: Data type must not be TIMESTAMP WITH TIME ZONE (this restriction exists in other DBMSs, I left it in for safety), and data type must not be TIMESTAMP or ENUM or SET. Data type must be the same as the corresponding pk_column data type, and length must be same, and scale must be the same. This is slightly stricter than the InnoDB requirement. See also WL#4095 Foreign keys: comparable data types. Column must be nullable if on_update_rule or on_delete_rule is SET NULL. If column is auto_increment { If on_update_rule = SET NULL | SET DEFAULT | CASCADE, error If on_delete_rule = SET NULL | SET DEFAULT, error 6. For each column in pk_column list: If any column is nullable, error If user does not have REFERENCES privilege on column, error (see WL#4099 "Foreign keys: reference privilege") (InnoDB does not have a check for REFERENCES privilege.) If the set of pk columns is not the same as a primary-key or unique-key definition, error. (This is a limit that InnoDB does not have, InnoDB allows FKs to refer to anything.) If an index covering the foreign key definition doesn't exist, it will be created automatically and a warning will be emitted. This index will have an auto-generated name. If a user-defined index is added that can serve foreign key implementation, the autogenerated index is automatically dropped. Otherwise the index will not be droppable. This behavior in regard to autogenerated indexes is contrary to a decision of March 2006 Sorrento Architecture meeting. There is a separate task for supporting unindexed foreign keys: WL#4094 Foreign keys: without indexes on the child table. There is a separate task for checking the REFERENCES privilege: WL#4099 Foreign keys: references privilege. It is legal for FK and PK to be the same table. It is legal to have multiple referential constraints. It is legal to have referential constraints with the same parent and the same child and the same columns and the same referential action. DB2 would give a warning in such a circumstance (SQLSTATE '01543'), It is legal for either parent or child table, or both, to be partitioned. There is an error message mentioned in the LLD, but nothing activates it. See the comments for BUG#14353 Partitions: foreign key can't reference a partitioned table. Checks during INSERT -------------------- The same rules apply for single-row insert, multiple-row insert, update (in which case we are concerned with the new value only), load data, replace (again we are concerned with the new value only). These checks also apply for REPLACE when the row is actually inserted. These checks also apply for CREATE TABLE at time foreign key is made. Error checks take place after triggers are activated. If (it's INSERT and not INSERT IGNORE) If (table has a FOREIGN KEY) For each column in FK-column list: If column value is NULL and it's MATCH SIMPLE It "passes" -- no need for further checks SELECT * FROM PK WHERE pk-column(s) = 'fk-column value'(s) If not found: error The word "error" means "halt with an error message if sql_mode=TRADITIONAL, which causes rollback of statement if storage engine is transactional, but otherwise the statement is stopped". If a table has both triggers and foreign keys, or if a table has multiple foreign keys, the order of processing may not be the same as it is now for InnoDB tables. For SQL:2003 requirement details see https://intranet.mysql.com/secure/mailarchive/mail.php?folder=214&mail=447 Error checks during DELETE -------------------------- If (statement doesn't contain IGNORE): For each FK that refers to table: If (FK is ON DELETE CASCADE) DELETE FROM FK WHERE fk-columns = pk-columns If (FK is ON DELETE SET NULL) UPDATE FK SET all fk-columns = NULL WHERE fk-column(s) = 'pk-column value'(s) If (FK is ON DELETE SET DEFAULT) UPDATE FK SET all fk-columns = DEFAULT WHERE fk-column(s) = 'pk-column value'(s) If (FK is ON DELETE RESTRICT | NO ACTION) SELECT * FROM FK WHERE fk-column(s) = 'pk-column value'(s) LIMIT 1 If anything is returned: Error Error checks during TRUNCATE ---------------------------- We will have the same checks for TRUNCATE as we have for DROP. Error checks during UPDATE -------------------------- The same rules apply for single-row update, multiple-row update, and REPLACE when value is actually updated. No checks if IGNORE. For each fk-column in the table, referencing another table: [ Rule for the new column value is the same as for INSERT, above ] If (UPDATE and not UPDATE IGNORE): For each FK that refers to table: If (FK is ON UPDATE CASCADE) UPDATE FK SET fk-column(s) = new-pk-column(s) WHERE fk-column(s) = 'pk-column value'(s) If (FK is ON UPDATE SET NULL) UPDATE FK SET fk-column(s) = NULL WHERE fk-column(s) = 'pk-column value'(s) If (FK is ON UPDATE SET DEFAULT) UPDATE FK SET fk-column(s) = DEFAULT WHERE fk-column(s) = 'pk-column value'(s) If (FK is ON UPDATE RESTRICT | NO ACTION) SELECT * FROM FK WHERE fk-column(s) = 'pk-column value'(s) LIMIT 1 If anything is returned: Error Error checks during DROP TABLE / DATABASE ----------------------------------------- If (table has a parent key and foreign-key references can't be taken care of within the same statement): Error Error checks during REVOKE -------------------------- It's possible to REVOKE REFERENCES ON PK FROM user, so user no longer has a right to have an FK on the PK. In other DBMSs, this would cause restrict or cascade. But we'll just allow it -- no error check. Error checks during ALTER ------------------------- If an ALTER affects indexes or columns involved in foreign key relationships, error. Error checks during RENAME TABLE -------------------------------- If table has an FK referencing it (a child), Error. Flags ----- Right now we're ignoring the foreign-key clauses. We will start to enforce them. This might affect people upgrading from earlier versions, who have the clause in the table definition, but who have data that violates the clause already. So inevitably somebody will say "let's have a flag for 'parse and ignore' (as before) or 'enforce'. We will continue to support "SET FOREIGN_KEY_CHECKS = 0". We will not check for database-integrity violations when the flag goes back on. Warning in documentation and marketing material ----------------------------------------------- At absolutely no time can we ever claim that we enforce "data integrity" for MyISAM or any other non-transactional table. For such tables, we stop after discovering an error, but that's too late, and could conceivably harm data integrity more than it helps. Errors ------ These errors already exist: ER_CANNOT_ADD_FOREIGN 1215 'HY000' "Cannot add foreign key constraint" ER_NO_REFERENCED_ROW 1216 '23000' "Cannot add or update a child row: a foreign key constraint fails", ER_ROW_IS_REFERENCED 1217 '23000' "Cannot delete or update a parent row: a foreign key constraint fails", ER_WRONG_FK_DEF 1239 '42000' "Wrong foreign key definition for '%-.64s': %s", ER_KEY_REF_DO_NOT_MATCH_TABLE_REF 1240 'HY000' "Key reference and table reference doesn't match", ER_DUP_KEYNAME 1061 '23000' "Duplicate key name '%-.64s'", ER_TOO_MANY_KEY_PARTS 1070 '42000' "Too many key parts specified. Max %d parts allowed", ER_TOO_LONG_KEY 1071 '42000' "Specified key was too long; max key length is %d bytes", There will be new errors and/or changes to existing errors, details are in the Low-Level Design. Recursion --------- There will be support for circular foreign key definitions. When UPDATE/DELETE cascade eventually cycles back so that we're changing the original table again, if a row has already been changed in this statement, stop. InnoDB has a limit to the number of recursions. This is not standard. InnoDB ------ Foreign keys have worked with InnoDB since MySQL version 3.23, and are reasonably stable, although flawed. Possibly: if storage engine = InnoDB, we could continue to pass on foreign-key checking to InnoDB. But that is not what this description specifies. But foreign keys with InnoDB, version 4.1.2, are bad: -- "FOREIGN KEY (column_name) REFERENCES table_name" is illegal -- "column_name REFERENCES table_name" doesn't work -- self-referential insertion doesn't always work -- data types of foreign key and parent key must be the same -- it's compulsory to have an index on the foreign key (well, this is relaxed now, an index is created) -- REFERENCES privilege is ignored -- default for ON UPDATE|DELETE is RESTRICT instead of NO ACTION -- parent key needn't be primary/unique (at least for 'strict' mode it should be) https://intranet.mysql.com/secure/mailarchive/index.php? go=mail&folder=5&mail=21273 -- one gets errors when referring to a table in another database. See also WL#4142 Foreign Keys and InnoDB: implement storage engine API for Foreign Keys support. Feature Comparison Chart ------------------------ I list here the features that a DBMS can have for foreign keys, and see which DBMSs support them. Feature ISO SQL DB2 Oracle Inno Postgre MySQL? Server DB SQL References Primary/Unique Core Yes Yes Yes No Yes Yes Default Primary Reference Core Yes Yes Yes No Yes No Statement Check Core Yes Yes Yes No Yes Yes Column REFERENCES Core Yes Yes Yes No Yes Yes REFERENCES Privilege Core Yes Yes Yes No Yes Yes Recursion Core Yes Yes No Yes Yes Other database Core No No No Yes No Yes All data types Core No No No No Yes No DELETE SET NULL F191 Yes Yes Yes Yes Yes Yes DELETE CASCADE F191 Yes Yes Yes Yes Yes Yes DELETE SET DEFAULT F191 Yes No No No Yes Yes DELETE NO ACTION F191 Yes Yes [Yes] [Yes] Yes Yes DELETE RESTRICT F191 No Yes No Yes Yes Yes UPDATE SET NULL F701 Yes No No Yes Yes Yes UPDATE CASCADE F701 Yes No No Yes Yes Yes UPDATE SET DEFAULT F701 Yes No No No Yes Yes UPDATE NO ACTION F701 Yes Yes [Yes] [Yes] Yes Yes UPDATE RESTRICT F701 No Yes No No Yes Yes CONSTRAINT name F491 Yes Yes Yes Yes Yes No Temporary Table F531 No No No No Yes No Deferrable constraint F721 No No Yes No Yes No MATCH clause F741 No No No No SIMPLE,FULL SIMPLE,FULL Different Data Type T201 No Yes No No Yes No Automatic Data Type No No No Yes No No No Disable No Yes Yes Yes Yes Yes Yes Duplicate Test No No Yes No No No No Automatic index No No No No Yes No Yes View Constraints No No No Yes No No No Maximum columns No 16 16 32 16 32 Varies ALTER Core Yes Yes Yes [Yes] Yes Yes PENDANT No No No No No No No Explanations: In the ISO (i.e. ANSI/ISO Standard) column, "Core" means it's a core feature, "F191" "T201" etc. are the identifiers of non-core features, and "No" means it's not in the standard. In the vendor columns, "Yes" means supported, "[Yes]" means the action is supported but the syntax isn't, and "No" means unsupported. Information comes from the SQL:2003 standard document and the vendor manuals for Oracle10g, SQL Server 2005, DB2 8.1, PostgreSQL 8.1, MaxDB, or MySQL 5.1. I performed few tests, and sometimes I had to interpret silence as 'No'. Details for each feature: References Primary/Unique The parent key must be defined as PRIMARY KEY or UNIQUE. Default Primary Reference If one just says "REFERENCES primary_key_table_name", without specifying a column list, that's okay. Statement Check Checking should occur at end of statement, not for each row. But I can't think of serious examples of foreign-key constraints that fail due to row-by-row checking, except ones that are caused by parent-key changes -- which also have row-by-row checking. So I don't regard this as a foreign-key problem. A non-serious example: create table t1 (s1 int, primary key (s1)) engine=innodb; create table t2 (s1 int, foreign key (s1) references t1 (s1)) engine=innodb; insert into t1 values (1),(2),(3); insert into t2 values (1); update t1 set s1 = case when s1=1 then 4 when s1=2 then 1 else 3 end; UPDATE should succeed because at end-of-statement there is a parent key which matches the foreign key. Column REFERENCES One can say "column1 INT REFERENCES pk" instead of "column1 INT, FOREIGN KEY (column1) REFERENCES pk". REFERENCES Privilege A 'Yes' here means: there is a REFERENCES privilege, and it works, although it may be possible to disable or subvert. Recursion The foreign-key and parent-key tables may be the same, or the foreign-key may refer to a pk which refers to another pk which refers to the foreign-key table. There's inevitably a limit, for example our manual says "cascading operations may not be nested more than 15 levels deep.". Other database The foreign key and parent key can be in different databases. All data types Can the foreign-key column have any data type? The answer is always 'no' -- SQL Server can't have XML, DB2 can't have LONG VARCHAR, Oracle can't have BLOB, etc. DELETE|UPDATE SET NULL | CASCADE | SET DEFAULT | NO ACTION | RESTRICT Only DB2 makes the distinction between NO ACTION and RESTRICT. With RESTRICT: you can't change the pk if a matching fk exists. With NO ACTION: you can change the pk, provided that, by the end of the statement, the fk is not dangling. (Possibly the fk will now be changed, due to some "on update|delete" rule.) MySQL/InnoDB's RESTRICT is really a synonym for NO ACTION. CONSTRAINT name The optional clause [CONSTRAINT constraint_name] can precede the FOREIGN KEY clause, so identification is explicit. Temporary Table The foreign-key table can be a temporary table (the parent-key table should also be temporary). Deferrable constraint This is actually a matter for general constraint checking, it is not a characteristic for "foreign keys" specifically but I added it because some people seem to think it matters. MATCH clause The MATCH { FULL | PARTIAL | SIMPLE } clause. Different Data Type The foreign-key and parent-key data types have to be comparable, but don't have to be the same. For example, SMALLINT columns can reference INTEGER columns. Automatic Data Type You can say "column REFERENCES pk" without specifying the data type, the data type will be figured out from the pk. Disable If a DBMS allows any way to disable, then this is 'Yes'. For example, SQL Server is 'Yes' because it has a NOT FOR REPLICATION clause and a NOCHECK clause. With DB2 it's SET INTEGRITY. Oracle has ENABLE | DISABLE, and lets you re-validate when you re-enable. Generally, disabling is a feature of any constraint, not just foreign keys. Duplicate Test Issues a warning if there's already a foreign key with the same columns for the same parent key. Automatic index You don't need to create an index, or it's automatic. If you don't like this 'feature' you can drop the index. (I put 'No' in the SQL Server / DB2 / Oracle columns because I didn't see an explicit statement that an index is created, and I did see such statements for PRIMARY / UNIQUE.) View constraints You can have a foreign-key reference for a view. Only Oracle allows this, and it doesn't actually enforce the integrity, such views are always DISABLE NOVALIDATE. Apparently it's an optimizer hint. Maximum columns The standard doesn't put any limit on the number of columns that a foreign key can have. With most DBMSs, it depends on the maximum number of columns in an index. ALTER It's possible to use ALTER TABLE to (at least) add and drop constraints. PENDANT Early drafts of SQL-99 allowed this, but it was cancelled. If the last foreign-key reference is deleted, the PK row is deleted. The column titled "MySQL?" shows what MySQL may be able to do after this task is complete, with a transactional storage engine. Note added 2007-05-15: Quick tests with solidDB for MySQL show that it supports Default Primary Reference and Statement Check (better than InnoDB), and update|cascade with RESTRICT or SET NULL or CASCADE or NO ACTION. Column References = No, Other Databases = Yes, Temporary Table = No, Different Data type = No, Automatic Index = Yes, Maximum Columns = 16. Notes about other DBMSs, shifted from LLD ----------------------------------------- Here are brief descriptions how other DBMSs implement foreign keys. INNODB checks foreign key constraints on a row-by-row basis. Right after inserting / updating / deleting a row from a table, InnoDB checks if this operation violates a constraint or if some CASCADE action should be taken. It uses a simple routine written in C which performs these checks using InnoDB internal methods. Since InnoDB assumes that an index is always created for the foreign key, these are mostly index lookups. As constraint checks are done in row-by-row fashion rather than at the end of statement execution, InnoDB deviates from the standard in case of recursion (see appendix 1). INGRES implements foreign-key constraints with the help of statement-level triggers written in SQL. Internal temporary tables are used to store/accumulate old and new versions of rows that are inserted/updated/deleted by the statement until the statement end. The trigger is invoked during the end-of-statement processing. POSTGRESQL uses row-level triggers. It defers execution of the triggers until end of statement (or the end of transaction in case of deferred constraints). Instead of keeping separate copies of old and new versions of the modified rows, it utilizes copies which are stored until the end of transaction by its storage engine (which implements MVCC). Triggers are written in C but use the internal prepared statement API to access/modify data. RAIMA uses a special trick to optimize updates and deletes from the parent table. A hidden column to store the number of rows referencing this row is created in the parent table. In some cases, this allows handling deletes and updates on the parent table without accessing the child table. On the other hand, it requires the parent table to be updated each time the child table is updated. - Innodb source code (as bundled with 5.0) (innobase/dict/dict0crea.c, dict0dict.c, dict0load.c, innobase/include/dict0mem.h, innobase/row/row0ins.c, row0upd.c, row0mysql.c) - Postgres 8.0 source code (backend/utils/adt/ri_triggers.c, backend/commands/tablecmds.c, trigger.c, backend/executor/execMain.c) - Ingres r3 source code (back/qef/qeq/qeaddl.c, psf/psl/pslcons.c) Ingres SQL Reference Guide (http://opensource.ca.com/projects/ingres/documents) - Firebird2 source code (src/gpre/cmd.cpp, jrd/exe.cpp, ...) References ---------- We have a description of foreign keys in the MySQL User Manual: http://www.mysql.com/doc/en/InnoDB_foreign_key_constraints.html See also WL#1438 Improving 'foreign keys' functionality for InnoDB See also WL#1444 "ON UPDATE CASCADE should be recursive" See also WL#1799 Support for Foreign Keys in MySQL Cluster (Responding to a question about WL#1799's relation to WL#148, Jonas Oreland said "... a FK implementation for ndb in mysqld is bound to be useless for a very large portion of mysql cluster users".) See also WL#4099 "Foreign Keys: references privilege". WL#4099 depends on WL#148 but also WL#148 depends on WL#4099, that is, they must occur together. See dev-private email thread "Re: WL#4099 Foreign Keys: references privilege" https://intranet.mysql.com/secure/mailarchive/mail.php?folder=4&mail=25931 See also https://intranet.mysql.com/secure/wiki/ServerDevelopmentRoadMap: " Add support to MyISAM for showing stored foreign key definitions, which is something a lot of GUI applications need (no specific WL entry, Brian thinks this is done)" Feature requests: BUG#3013 DROP DATABASE "force" option (we won't support) BUG#3742 Foreign keys: DROP CONSTRAINT as a compatibility alias for DROP FOREIGN KEY BUG#4919 Bad Foreign Key Definition does not work nor does it produce an error BUG#3558 Foreign keys: ON UPDATE/ON DELETE SET DEFAULT BUG#11232 Return a warning if creating a FOREIGN KEY on storage engines w/out support BUG#13977 InnoDB foreign keys with multiple parent rows act incorrectly BUG#15046 constraint incorrectly enforced BUG#15746 Foreign Keys: Self-referential constraints incorrectly prevent deletes BUG#16039 Data integrity is not validated after foreign key constraints are enabled ! (we won't support) BUG#6085 DROP TABLE fails if table is referenced (from Georg Richter) BUG#17943 Inline foreign key constraint definition should give a warning BUG#19173 Foreign Key reference integrity BUG#22187 Can't create FOREIGN KEY BUG#34932 CREATE TABLE xyz like zyx (we won't support) These bugs will be fixed: BUG#8878: foreigh keys in innodb BUG#10883: foreign key constraint table references are lowercase BUG#11715: naming rules for constraints lead to illstructured information_schema BUG#13301: FK definition requires definition outside of column definition BUG#11472: Triggers not executed following foreign key updates/deletes BUG#25031: No foreign key definition passed to storage engine Pages 418-431 in SQL-99 Complete, Really.
This is "WL#148 LLD, Draft 2009-06-14". It supersedes "WL#148 LLD, Draft 2009-05-14". The change is: * Change wording of ER_FK_TRUNCATE error message. FOREIGN KEYS ============ Low-Level Design Draft: 2009-05-14 Summary contents: Common objects and routines Statements in alphabetical order Effects on non-runtime code groups HLS Contents ======== Documentation Conventions Common objects Variables. Error or warning messages Common routines Parsing Structures and constants Modifications to the process of opening and locking tables fk_prepare(), fk_prepare_statement_flags(), fk_cleanup(), fk_check_constraint_added(), fk_check_common_flags(), fk_given_constraint_name_return_frm_file_name(), fk_check_parent_list(), fk_check_child_list(), fk_check_record_in_memory(), fk_put_record_to_eos_buffer(), fk_cascading_action_child_list(), fk_end_of_statement_check_foreign_keys(), fk_find_changed_columns(), fk_ignore_error(), fk_check_before_drop_tables(), fk_check_before_illegal_statement(), fk_check_before_altering_column() SQL statements ALTER BACKUP CREATE DELETE DROP GRANT INSERT KILL LOAD LOCK RENAME REPAIR REPLACE RESTORE REVOKE SELECT SET SHOW TRUNCATE UPDATE Effects on non-runtime teams Storage engine, optimizer, qa, replication, docs HLS Additional tests ========== We start with a "Documentation" section which will be the seed for the MySQL Reference Manual additions. We then describe "common objects" -- variables and error messages held in common by more than one routine, the sort of thing you find in .h files. There are a lot of new error messages. We then describe "common routines". This is the longest section. It's about half pseudocode, half comment. The big sections are the functions for statement preparation, the functions for checks in the row-by-row loop and at end-of-statement, and the functions for metadata storage. We then list all statements that foreign keys affect, in alphabetical order. So if you want to know what happens specifically with TRUNCATE, look under 'T'. Finally we note what effects should interest developer teams other than dev-runtime. CONVENTIONS. In pseudocode a two-space indentation implies braces, so don't expect {}s everywhere. Parameters are also implied, for example, when you see a reference to 'constraint_list', you may assume that constraint_list was either passed when calling, or is part of the thd. There is a /* comment */ for every piece of code. There are some changes since the previous LLD version. They are mainly due to requests in a dev meeting in Heidelberg in September 2007. In a few places you will see "[ PRIVATE NOTE TO KONSTANTIN AND DMITRI ...]", which you may skip if you are not Konstantin and Dmitri. Documentation ============= Monty wrote: "We need to produce full documentation how the foreign keys should, work seen from the end user level in all the possible scenarios that MySQL offers." Reasonable. We can't list all possible scenarios but we can try to be as detailed as the current manual. The rest of this section is what we think end users should see. Foreign keys for all storage engines ------------------------------------ Till now, all foreign key support was in storage engines. The storage engine interpreted the foreign-key clauses, and handled foreign-key referential actions entirely by itself at data-change time. Starting with MySQL version 6.1 the server can support foreign keys. We call this the "foreign-key-all-engines" option because it means you can define and process foreign keys with the same syntax no matter what "engine=" clause you use. Here are the new things that you should know before switching to the "foreign-key-all-engines" option. SETTING. You must explicitly ask for the option. For example: mysqld --foreign-key-all-engines=1 This is not the default setting, although it might be the default setting in some later version. With 6.1, if you don't say "--foreign-key-all-engines" when you start the server, then the server will behave exactly the same way it did in 6.0. You cannot change the setting at runtime and we recommend that, when you switch, you switch permanently. The old and new ways are not fully compatible (we'll fully explain below), so don't commit to the new way until you're sure you won't need to uncommit later. "Maybe we'll reconsider (default] before 6.1 goes out." FOREIGN KEY CLAUSE IN 'CREATE TABLE'. The syntax is: [CONSTRAINT [constraint_name]] FOREIGN KEY [constraint_name] (child column list) REFERENCES parent_table_name (parent column list) [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON UPDATE { CASCADE | RESTRICT | SET NULL | SET DEFAULT | NO ACTION } ] [ ON DELETE { CASCADE | RESTRICT | SET NULL | SET DEFAULT | NO ACTION } ] If you used FOREIGN KEY with MyISAM before, you must notice: * "FOREIGN KEY ..." is no longer simply "parsed and ignored". With --foreign-key-all-engines, the server always processes and stores the definition. So there are more error tests. For example, the 'parent_table_name' must exist. If you used FOREIGN KEY with InnoDB before, you must notice: * The parent table must have a primary key or unique key. It's no longer acceptable to refer to any indexed table. * If you don't specify a "referential action" with an ON UPDATE or ON DELETE clause, the default is NO ACTION. It is no longer RESTRICT. * The parent columns must have exactly the same data types as the child columns. It is no longer legal to match, say, DECIMAL against NUMERIC or CHAR against VARCHAR. * The parent columns must have exactly the same lengths as the child columns. It is no longer legal to match, say, CHAR(5) and CHAR(6). * The "FOREIGN KEY foreign_key_name" clause is still supported for backward compatibility, but it's translated to "CONSTRAINT constraint_name" so (a) it's the same name space as other constraints like primary keys, (b) it's for the whole schema. * TRUNCATE does not do a row-by-row delete, it simply removes rows in a child table. PRIVILEGES. Privilege checks will occur both at CREATE time and for every statement, for tables that reference or are referenced by foreign keys, i.e. are children or parents. If you used FOREIGN KEY with InnoDB before, you must notice: * The REFERENCES privilege actually matters now, so breaching security will be more difficult. THREE STORAGE ENGINE CATEGORIES. Here is a chart. Capability Transactional Non-Transactional Basic ---------- ------------- ----------------- ----- Read. Write. Test uniqueness. YES YES NO Statement Rollback. Savepoint. YES NO NO "Transactional" engines (examples: Falcon, InnoDB, and most add-ons from third parties) can rollback to start-of-statement or to last-savepoint. "Non-transactional" engines (examples: Memory, MyISAM) can't do that, but at least they can read and write and check whether a value is unique, as is necessary for parent tables. "Basic" engines (examples: Archive, Blackhole, CSV, Example) lack one or more of the requirements, for example Archive can't have unique keys and Blackhole can't "read". BASIC STORAGE ENGINE CAPABILITIES. If you say CREATE TABLE ... (... FOREIGN KEY ...) engine=blackhole; that's fine. The "--foreign-key-all-engines" switch works with all storage engines, so MySQL parses the foreign-key definition, checks it for errors, and stores it. That is, you'll see the foreign-key information when you SELECT from INFORMATION_SCHEMA tables. But MySQL will ignore the foreign key during DML statements, or return errors. The net positive effect of this behaviour is that you can "round trip", that is, you can temporarily alter a table from engine=myisam to engine=csv and then back to engine=myisam again, without losing constraint information. NON-TRANSACTIONAL STORAGE ENGINE CAPABILITIES. If you say CREATE TABLE ... (... FOREIGN KEY ...) engine=myisam; then not only is the information stored, it's also used, with these restrictions: UPDATE and DELETE statements will fail if the referential action is CASCADE or SET NULL or SET DEFAULT. Both NO ACTION and RESTRICT are okay, but there is no "end of statement" checking -- so effectively everything is RESTRICT, and data-change on self-referencing tables won't always work. For example: CREATE TABLE t1 (s1 CHAR, s2 CHAR, PRIMARY KEY (s1), FOREIGN KEY (s2) REFERENCES t1 (s1)); (The parent and child tables are the same so this is a "self-referencing" table.) INSERT INTO t1 VALUES ('a','b'),('b','a'); (The INSERT fails for the first row because initially the child value 'b' doesn't exist in s1. It won't exist until the second row is inserted. With transactional tables there wouldn't be a failure because checks are at end of statement. This is a "false negative". There is no simple script for making a "false positive".) TRANSACTIONAL STORAGE ENGINE CAPABILITIES. If you say CREATE TABLE ... (... FOREIGN KEY ...) engine=innodb; then the information is stored, and it's used, and there are no restrictions. All referential actions including CASCADE and SET NULL and SET DEFAULT are okay, and checks are at end of statement. If you used FOREIGN KEY with InnoDB before, you must notice: * Old foreign-key information is gone. You'll have to make a script to reproduce the "FOREIGN KEY" clauses after you switch to "all storage engines". The script may encounter errors due to the incompatibilities described elsewhere. CROSS-STORAGE-ENGINE CAPABILITIES. If you say CREATE TABLE t1 (s1 INT PRIMARY KEY) engine=myisam; CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1 (s1)) engine=falcon; the reference "crosses storage engines" -- the parent table is non-transactional, the child table is transactional. The result will be an error message. 'Non-transactional' or 'basic' tables may not reference 'transactional' tables, or vice versa. However, it is legal for an InnoDB table to reference a Falcon table (both transactional), for a MyISAM table to reference a CSV table (both non-transactional or basic), and so on. ERRORS AND WARNINGS. There are about forty new error messages beginning with the words "Foreign key error ..." or "Foreign key warning ...". The things that cause errors are the same, but this was MySQL's opportunity to make messages more specific and explanatory. If you used FOREIGN KEY with InnoDB before, you must notice: * Error numbers are different, so any code that checked with hard-coded integers, e.g. "if error==1005", will need adjustment. Generally the SQLSTATE errors will remain the same, for example '23000' for "foreign key error". ILLEGAL STATEMENTS. The following statements will cause error returns if they would cause a change in a foreign key definition, or change of an object (column, table, or database) which depends on a foreign key definition. Usually the error is ER_FK_STATEMENT_ILLEGAL. ALTER ... CHANGE ALTER ... CONVERT TO CHARACTER SET ALTER ... DISABLE KEYS ALTER ... DROP COLUMN ALTER ... DROP PARTITION ALTER ... DROP PRIMARY KEY ALTER ... MODIFY COLUMN ALTER ... RENAME COLUMN DROP DATABASE DROP TABLE RENAME DATABASE RENAME TABLE REPAIR TABLE TRUNCATE EFFECT OF 'IGNORE'. If IGNORE is used in INSERT or DELETE or UPDATE or REPLACE, then "errors" cause a row to be skipped. (The specific errors that are applicable here will be specified in a future update of the manual.) If there is a BEFORE trigger, it happens before the error checks and its effects are rolled back if possible. Any use of 'IGNORE' turns off end-of-statement checking, and some referential actions (CASCADE, SET NULL, SET DEFAULT) are illegal (the error check for this happens at start of statement). See WL#4103 "Define IGNORE". EFFECT of 'MULTI-TABLE'. When a statement mentions multiple tables, for example with "DELETE FROM t1,t2,t3 ..." or UPDATE t1,t2,t3 ...", any "end-of-statement" checking happens when processing ends "for each table". In other words, "end-of-statement" really means "end of table processing". EFFECT of 'DELAYED'. We ignore the keyword DELAYED if a foreign key is involved. ALTER TABLE ... DROP CONSTRAINT. The proper way to remove a foreign key is by finding out its constraint name (which is either assigned automatically or specified by the user), and saying ALTER TABLE table_name DROP CONSTRAINT constraint_name. ALTER TABLE ... DROP FOREIGN KEY will mean the same thing. CREATE TABLE ... SELECT. There will be a two-stage process. First all the selections are made, then all the INSERTs are done. DELETE. When you delete from a parent table, there can be effects (referential actions) on child tables. That's all as before. If you used FOREIGN KEY with InnoDB before, you must notice: * "Modifications" caused by ON DELETE CASCADE will activate "delete" triggers; "modifications" caused by ON DELETE SET NULL or ON DELETE SET DEFAULT will activate "update" triggers. DROP INDEX. It will be illegal to drop an index which is used to enforce uniqueness for a parent table of a foreign-key constraint, or used for a child table. EXPLAIN. Changes to UPDATE|DELETE EXPLAIN are in WL#706 "Add EXPLAIN support for other statements (UPDATE/DELETE)". WL#148 is not dependent on WL#706. INSERT. When you insert into a child table, there must be a corresponding value in the parent table. That's all as before. INSERT ON DUPLICATE KEY. If an insert action occurs, then the foreign-key rules for INSERT apply. If an update action occurs, then the foreign-key rules for UPDATE apply. REPLACE. If a delete action occurs, then the foreign-key rules for DELETE apply. If an insert action occurs, then the foreign-key rules for INSERT apply. SHOW. The statements SHOW CREATE TABLE and SHOW TABLE STATUS will continue to have information about foreign keys, so that mysqldump will work. UPDATE. When you update a parent table, there can be effects (referential actions) on child tables. When you update a child table, there must be a corresponding value in the parent table. That's all as before. If you used FOREIGN KEY with InnoDB before, you must notice: * The change must be a change to something "distinct" from the original. Trivial changes, for example from 'A' to 'a ' when a collation is case insensitive, will no longer cause any referential actions. * "Modifications" caused by ON UPDATE CASCADE or ON UPDATE SET NULL or ON UPDATE SET DEFAULT will activate "update" triggers. VARIABLES. There is one new global variable. "@@foreign_key_all_engines" (possible values = 0 or 1, i.e. false or true, default 0, not settable at runtime) shows whether you used --foreign-key-all-engines when starting the server. There is one old variable which is still important: "@@foreign_key_checks" (possible values = 0 or 1, i.e. false or true, default 0, settable at runtime) turns off foreign-key checking completely, regardless of the value of the other @@foreign_key_... variables. There is a change in behaviour for the existing variable @@foreign_key_checks: if @@foreign_key_all_engines=1 then an attempt to execute "set @@foreign_key_checks=(either 0 or 1)", inside a function or trigger, will cause ERROR 1336 (0A000): SET FOREIGN_KEY_CHECKS is not allowed in stored function or trigger. MIGRATION FROM INNODB. InnoDB has its own foreign-key code, so we expect InnoDB users to say --foreign-key-for-all-engines=0, and use the InnoDB code. Migrating foreign key support from old (InnoDB) to the new system is hard. We do not supply a script. Instead we recommend that users take a dump of their old database (which will include foreign key definitions), re-start mysqld with --foreign-key-for-all-engines=1, and restore. The restoration will have problems because of the incompatibilities described earlier. Summarizing the big ones: * With InnoDB, defaults are ON UPDATE RESTRICT and ON DELETE RESTRICT. With all_engines, defaults are ON UPDATE NO ACTION and ON DELETE NO ACTION. Since mysqldump does not output defaults, metadata changes. * With InnoDB, NO ACTION is treated like RESTRICT anyway. With --foreign-key-all-engines, NO ACTION foreign keys are effectively checked at end of statement, RESTRICT foreign keys are effectively checked before end of statement. * With InnoDB, it's possible to refer to a non-unique 'key'. With --foreign-key-all-engines, it's impossible. * With InnoDB, there is no check for privileges of any kind. With all_engines, there is a check for REFERENCES privilege. * With InnoDB, cascading updates/deletes don't activate triggers. With all engines, they do. * With InnoDB, the name space for constraints may be different. With all engines, the name space is the schema. * With InnoDB, parent and child keys may have different lengths. With all engines, that won't be true. * With InnoDB, RENAME TABLE is allowed for parents and children. With all engines, that won't be true. * InnoDB does not have an "overlapping constraints" error. * With InnoDB, TRUNCATE of a parent table is row-by-row deletes. With all engines, TRUNCATE of a parent table is illegal unless it's a self reference. [ Note added 2010-12-03: now TRUNCATE of a parent table usually causes an error, see BUG#54678 comments. ] Regardless of the storage engine, you should always run mysql_upgrade before moving to any new version. See MySQL 4.4.8. mysql_upgrade ≈ Check Tables for MySQL Upgrade http://dev.mysql.com/doc/refman/5.1/en/mysql-upgrade.html If you start the server with --foreign-key-for-all-engines, and then run mysql_upgrade --force, you will see a list of all incompatibilities that will occur with the new option. 'Internals' ----------- We will change existing files in the /sql directory (notably table.h sql_base.cc delete.cc insert.cc and update.cc), and will request storage engine teams to support three 'handler interface' points, and exchange more information. The value of @@foreign_key_all_engines controls whether the old code remains effective so storage engines can handle foreign keys themselves, or the new code becomes effective and takes over foreign keys for all engines. MySQL already parses and discards foreign-key clauses; the parser will change only slightly. The clauses will be appended to tables' .FRM files in text form, for example, if table A has a foreign key referencing table B, the full "CONSTRAINT ..." clause appears in a.FRM and a "referenced by" note appears in B.FRM. INSERT/UPDATE/DELETE statements have a new 'prepare' phase (make constraint list and pre-lock as you would for triggers), changes in the row-by-row loop where each affected row is added/changed/removed, and a new end-of-statement (final check) phase. For TRANSACTIONAL actions (on tables controlled by storage engines that can handle statement rollback) ... The row-by-row loop may call fk_check_parent_list() to see if a parent key exists in the parent table, may call fk_check_list() if parent values change, may call fk_check_cascading_action_list() to take 'cascading-action referential action' (CASCADE SET NULL SET DEFAULT). The end-of-statement check, fk_eos_validity_check(), may re-check rows that failed during the row-by-row loop. For NON-TRANSACTIONAL actions (on tables controlled by storage engines that can't handle statement rollback) ... The row-by-row loop may 'check' but it may not do cascading action, so MySQL will reject CASCADE + SET NULL + SET DEFAULT. MySQL checks the constraint first for errors, then does the insert/update/delete (with "transactional" it's the other way around, first do the operation then do the check after that, which we call "optimistic" checking). The end-of-statement check does nothing. Statements with IGNORE are like non-transactional actions -- checking happens before the insert/update/delete, cascading-action referential actions won't happen, and end-of-statement checks aren't necessary. As the name implies, rows that 'fail' (due to integrity checks) are skipped, they cause no errors. Glossary -------- ACTION: data-change. The three possible actions are: "enum Action_type { ACTION_INSERT, ACTION_DELETE, ACTION_UPDATE };". Obviously DELETE statements cause DELETE actions, UPDATE statements cause UPDATE actions, INSERT statements cause INSERT actions. But 'statement' and 'action' are not synonyms! DELETE actions can happen due to ON DELETE CASCADE. UPDATE actions can happen due to ON DELETE SET NULL, etc. See also "referential action". BASIC: not only non-transactional, but also incapable of some other essential operation, such as unique checking. CASCADING ACTION: a referential action that affects child tables, that is, CASCADE or SET NULL or SET DEFAULT. The other referential actions, NO ACTION and RESTRICT, are "check" actions. CHECK: a referential action that looks for matches between parent and child, that is, RESTRICT or NO ACTION. The other referential actions, CASCADE and SET NULL and SET DEFAULT, are "cascading" actions. CHILD: also called "foreign key" or "referencing" or "dependent". The table which is mentioned after the words FOREIGN KEY in CREATE TABLE ... FOREIGN KEY ... REFERENCES, or the specified columns of that table, or the key for those columns of that table. Example: after "CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1 (s1))" table t2 is a child table, and t2.s1 is a child key. The standard word is 'referencing'; MySQL prefers 'child'. CONSISTENT READ: access which contradicts Heraclitus's saying "You can't step into the same river twice." Also called "snapshot". This is a requirement if we want to read "as at the start of the statement" (for searching). CROSS-STORAGE-ENGINE: affecting more than one storage engine. For example, if t1 is InnoDB and t2 is Falcon, and t2 references t1, then the foreign-key relationship is "cross-storage-engine". DATA-CHANGE: any DML except SELECT, that is, insertion or updating or deletion of data in the database. DML: Data Manipulation Language. The most common DML statements are DELETE + INSERT +SELECT + UPDATE + REPLACE. Most statements that change metadata (ALTER + CREATE + DROP) are not DML, but "CREATE ... SELECT" might be. END OF STATEMENT: after all rows are processed for a table, the "end-of-statement validity checks" occur for the "end-of-statement buffer", which contains rows that weren't matched in the row-by-row loop. The term "end of statement" can be misleading. For example, with "UPDATE t1,t2", the end-of-statement process occurs for t1 before anything happens with t2. EOS: abbreviation for end of statement. Use only in code. FK: abbreviation for foreign key. Use only in code. HLS: abbreviation for high-level specification of worklog task. MATCHING: If there is a value 'x' in row#1 of the child table and a value 'x' in row#7 of the parent table, then child-table row#1 and parent-table row#7 are matching rows. Any parent/child row which contains a value which has no match is a 'non-matching' row. NON-TRANSACTIONAL. A storage engine which cannot perform statement rollback is non-transactional. OPTIMISTIC: word used in earlier versions of this document when referring to FK_STATEMENT_FLAG_AFTER_CHECK. OVERLAPPING. Two constraints are overlapping if some or all of the child columns are the same (which means that they have same child table). PARENT: also called "target" or "referenced". In information_schema, it's called "unique" table. The table which is mentioned after the word REFERENCES in CREATE TABLE ... REFERENCES, or the specified columns of that table, or the key for those columns of that table. For example: after "CREATE TABLE t2 (s1 int, FOREIGN KEY (s1) REFERENCES t1 (s1))" table t1 is a parent table, and t1.s1 is a parent key. PK: abbreviation for 'primary key'. The meaning is always 'parent key' (i.e. key of parent table). We'll phase out use of 'PK' eventually. RECURSION: also called "cycling" or "definition looping". Reference to what's already been referenced. Unless the reference is CASCADE / SET NULL / SET DEFAULT, recursions are self-references. REFERENCED: synonym for PARENT. REFERENCING: synonym for CHILD. REFERENTIAL ACTION: what to do with a child table when updating or deleting a parent table. The possible actions are RESTRICT, CASCADE, SET DEFAULT, SET NULL -- not NO ACTION. [ This is a slight departure from SQL:2003 terminology. ] The end-of-statement check is not "referential action", it always occurs in theory, regardless how the foreign key is defined. SELF-REFERENCE: a type of recursion where the parent and child tables are the same (although keys may differ). For example: CREATE TABLE t1 (s1 INT PRIMARY KEY REFERENCES t1 (s1)); or: CREATE TABLE t1 (s1 INT, s2 INT, UNIQUE(s1), FOREIGN KEY (s2) REFERENCES t1 (s1)); STANDARD. "Standard" always means SQL:2003, never SQL:2008. STATEMENT ROLLBACK: restore database to where it was when the statement started. This is not standard terminology. MyISAM is an example of a storage engine that cannot do statement rollback. TABLE NAME: The schema identifier plus the table identifier. So when the LLD says "the table names must be equal", by implication "the schema identifier must be equal". TRANSACTIONAL. A storage engine which can do statement rollback is transactional. ============================================================= Common objects and routines =========================== Common objects: Variables ------------------------- Affected files: sql/set_var.cc There is one new variable. It begins with the prefix 'foreign_key_' so it'll sort with the existing variable, @@foreign_key_checks. @@foreign_key_all_engines is especially significant -- if it is off, none of the code in this worklog task takes effect. Name: @@foreign_key_all_engines Type: Boolean Visible: yes Global: yes Changeable: no, must be set with mysqld --foreign-key-all-engines=1 Default: 0 (FALSE) Effects: If it's on, then operate in the new style: * You can't see any InnoDB foreign keys * InnoDB is not supposed to do its own foreign-key work, as if foreign_key_checks=0 for it. * The code in this WL takes effect If it's off, then operate in the old style: * InnoDB as in version 5.1, MyISAM parse and ignore * The old code, from prior to this WL, remains in effect. Relation to HLS: HLS says: "So inevitably somebody will say "let's have a flag for 'parse and ignore' (as before) or 'enforce'." This is that flag. HLS subtitle is "Implement Foreign Keys (all storage engines)". That's the inspiration for the name @@foreign_key_all_engines. HLS also says there should be something for upgrade/downgrade. This is the only aid for upgrade/downgrade. @@foreign_key_all_engines is visible to users. Monty suggested one more variable foreign_key_checks_after_alter, for checking whether integrity is okay after an ALTER. We won't do that. Instead, we'll have WL#4175 "Foreign Keys: Check tables". Common objects: Error or warning messages ----------------------------------------- Affected files: include/mysqld_error.h, include/sql_state.h, sql/share/errmsg.txt There will be no use of these old errors (they are vague or obsolete): ER_CANNOT_ADD_FOREIGN HY000 Cannot add foreign key constraint ER_CANT_CREATE_TABLE HY000 Can't create table %s (errno: 150) ER_DROP_INDEX_FK HY000 Cannot drop index '%-.192s': needed in a foreign key constraint ER_FOREIGN_DUPLICATE_KEY 23000 Upholding foreign key constraints for table '%.192s', entry '%-.192s', key %d would lead to a duplicate entry ER_FOREIGN_KEY_ON_PARTITIONED HY000 Foreign key condition is not yet supported in conjunction with partitioning ER_NO_REFERENCED_ROW 23000 Cannot add or update a child row: a foreign key constraint fails ER_NO_REFERENCED_ROW_2 23000 Cannot add or update a child row: a foreign key constraint fails (%s) ER_ROW_IS_REFERENCED 23000 Cannot delete or update a parent row: a foreign key constraint fails ER_ROW_IS_REFERENCED_2 23000 Cannot delete or update a parent row: a foreign key constraint fails (%s) ER_WRONG_FK_DEF 42000 Incorrect foreign key definition for '%-.192s': %s There will be new warnings and errors. They'll be mentioned in later sections. SQLSTATE will usually be '42000' for definitions, '23000' for checks, 'HY000' for as-yet-undecided cases. We will ignore the more specific SQLSTATE values requested via BUG#21403 or WL#3421. ER_FK_CASCADING_ACTION_AND_SELF_REFERENCE Found during UPDATE|DELETE. The implementor has said he may need to specify different error-code and wording. sqlstate '42000' errmsg 'Foreign key error: cascading action and self-reference.' ER_FK_CHANGE_BY_SUBQUERY Found during INSERT|UPDATE|DELETE sqlstate 'HY000' errmsg 'Foreign key error: Subquery exists which might change an affected table' ER_FK_CHANGE_BY_TRIGGER Found during INSERT|UPDATE|DELETE sqlstate 'HY000' errmsg 'Foreign key error: trigger exists which might change an affected table' ER_FK_CHILD_COLUMN_DUPLICATED Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Child column duplicated' /* Since the statements that would cause ER_FK_CHILD_COLUMN_EXISTS are all in the list of "ILLEGAL STATEMENTS", it currently has no importance. */ ER_FK_CHILD_COLUMN_EXISTS Found during ALTER TABLE MODIFY, ALTER TABLE DROP, ALTER TABLE RENAME sqlstate '23000' errmsg 'Foreign key error: constraint X: cannot change because foreign key refers to column %s' ER_FK_CHILD_DATA_TYPE Found during CREATE|ALTER, for ENUM / TIMESTAMP / SET / BLOB / TEXT. GEOMETRY, etc.? sqlstate '42000' errmsg 'Foreign key error: data type of 'column_name' column is illegal for foreign key' ER_FK_CHILD_NO_MATCH Found during INSERT, UPDATE, or variations thereof. Controversial (user finds out what's in parent table). sqlstate '23000' errmsg 'Foreign key error: constraint X: no matching key for value X, it is not in parent table' ER_FK_CHILD_NO_MATCH_FULL sqlstate '23000' errmsg 'Foreign key error: constraint X: no matching key for partly null value with MATCH FULL' ER_FK_CHILD_SET_DEFAULT Found during ALTER DROP DEFAULT sqlstate '42000' errmsg 'Foreign key error: Constraint X: DROP DEFAULT illegal while foreign key exists with SET DEFAULT' ER_FK_CHILD_SO_CHILD_INDEX_UNDROPPABLE Found during DROP INDEX sqlstate '42000' errmsg 'Foreign key error: Constraint X: you cannot drop an index that the child key depends on" ER_FK_CHILD_TABLE_EXISTS Found during DROP DATABASE, DROP TABLE, RENAME DATABASE, RENAME TABLE sqlstate '23000' errmsg Foreign key error: constraint '%-.192s': you cannot use '%s' statement because table '%s' has a foreign key that refers to it. ER_FK_CHILD_TEMPORARY Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Child table is temporary' ER_FK_CHILD_TWICE Found during CREATE|ALTER. Obsolete. Will eentually be removed from LLD. sqlstate '01000' errmsg 'Foreign key warning: Constraint X: two references to the same parent table' ER_FK_CHILD_VALUE_EXISTS Found during DELETE, UPDATE, or variations thereof. Controversial (user finds out what's in parent table). sqlstate '23000' errmsg 'Foreign key error: constraint X: cannot change because foreign key refers to value %s' ER_FK_CONSTRAINT_NAME_DUPLICATE Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Duplicate constraint name' ER_FK_CONSTRAINT_NAME_ILLEGAL Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Illegal constraint name' ER_FK_CORRELATED Found during UPDATE|DELETE sqlstate '42000' errmsg 'Foreign key error: Correlated subquery refers to table affected by cascading' ER_FK_EOS_BUFFER Found during INSERT|UPDATE|DELETE sqlstate 'HY000' errmsg 'Foreign key error: End-of-statement buffer maximum length exceeded' ER_FK_FOREIGN_KEY_ID Found during CREATE|ALTER sqlstate '01000' errmsg 'Foreign key warning: Constraint X: Syntax 'FOREIGN KEY [id]' is old style, changing to CONSTRAINT [id]' ER_FK_IGNORE_AND_CASCADING_ACTION Found during UPDATE|DELETE. sqlstate 'HY000' errmsg 'Foreign key error: cannot IGNORE, foreign key has cascade / set null / set default" ER_FK_IGNORE_AND_NO_ACTION Found during UPDATE|DELETE. sqlstate '01000' errmsg 'Foreign key warning: IGNORE, so NO ACTION will be treated like RESTRICT" /* HLS says ER_UNSUPPORTED_FEATURE. We should change HLS. */ ER_FK_MATCH_PARTIAL Found during CREATE|ALTER. sqlstate '42000' errmsg 'Foreign key error: Constraint X: MATCH PARTIAL not supported' ER_FK_NO_CONSISTENT_READ Found during insert/update/delete. sqlstate '42000' errmsg 'Foreign key error: storage engine doesn't support consistent reads (snapshots)' ER_FK_NON_TRANSACTIONAL_CASCADING_ACTION Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Non-transactional engine and cascade / set null / set default' ER_FK_NON_TRANSACTIONAL_CHILD Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Child non-transactional, parent transactional' ER_FK_NON_TRANSACTIONAL_NO_ACTION Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Engine of table 't2' is non-transactional and NO ACTION is specified' ER_FK_NON_TRANSACTIONAL_NO_EOS Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Non-transactional engine so there is no end-of-statement check' ER_FK_NON_TRANSACTIONAL_PARENT Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Child transactional, parent non-transactional' /* This warning is unused. If it's never used, delete it. */ ER_FK_NON_TRANSACTIONAL_UNRELIABLE Found during CREATE|ALTER sqlstate '01000' errmsg 'Foreign key warning: Constraint X: Non-transactional engine so integrity checking is unreliable' ER_FK_PARENT_AUTO_AND_CASCADE Found during CREATE|ALTER. Obsolete. sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent column is AUTO_INCREMENT and referential action is CASCADE' ER_FK_PARENT_COLUMN_COUNT Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent column count (%s) not equal to child column count (%s) ER_FK_PARENT_COLUMN_DUPLICATED Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent column duplicated' /* Since the statements that would cause ER_FK_PARENT_COLUMN_EXISTS are all in the list of "ILLEGAL STATEMENTS", it currently has no importance. */ ER_FK_PARENT_COLUMN_EXISTS Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Column is referenced by %s' ER_FK_PARENT_COLUMN_MANDATORY Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: '(parent column list)' is mandatory in MySQL' ER_FK_PARENT_DATA_TYPE Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent data type %s not same as child data type %s ER_FK_PARENT_KEY_NOT_ALL Found during CREATE|ALTER. Synonym for ER_FK_PARENT_KEY_NO_PK_OR_UNIQUE sqlstate '42000' errmsg 'Foreign key error: Parent columns don't correspond to a PRIMARY KEY or UNIQUE constraint' ER_FK_PARENT_KEY_NOT_INDEXED Found during CREATE_ALTER. Synonym for ER_FK_PARENT_KEY_NO_PK_OR_UNIQUE sqlstate '42000' errmsg 'Foreign key error; Parent columns don't correspond to a PRIMARY KEY or UNIQUE constraint' ER_FK_PARENT_KEY_NOT_UNIQUE Found during CREATE|ALTER. Synonym for ER_FK_PARENT_KEY_NO_PK_OR_UNIQUE sqlstate '42000' errmsg 'Foreign key error: Parent columns don't correspond to a PRIMARY KEY or UNIQUE constraint' ER_FK_PARENT_MYSQL Found during CREATE|ALTER sqlstate 'HY000' errmsg 'Foreign key error: Constraint X: Parent table is in mysql database' ER_FK_PARENT_NULLABLE Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent key has nullable column' ER_FK_PARENT_SO_UNIQUE_INDEX_UNDROPPABLE Found during DROP INDEX sqlstate '42000' errmsg 'Foreign key error: Constraint X: references table, so you cannot drop any of its unique indexes' ER_FK_PARENT_TEMPORARY Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent table is temporary' ER_FK_PARENT_VIEW Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Parent table is a view' ER_FK_SET_NULL_NOT_NULL Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: SET NULL for a NOT NULL column' /* ER_FK_SET_NULL_TIMESTAMP won't happen in scope of this worklog task, Dmitri says TIMESTAMP is illegal anyway. */ ER_FK_SET_NULL_TIMESTAMP Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: SET NULL for a TIMESTAMP column' ER_FK_SKIP_CHECK_IS_0 Found during data-change sqlstate '01000' errmsg 'Foreign key warning: constraint %s: skipped because foreign_key_check = 0' /* ER_FK_SKIP_NON_TRANSACTIONAL_IS_0 won't happen in scope of this worklog task. */ ER_FK_SKIP_NON_TRANSACTIONAL_IS_0 Found during CREATE|ALTER. Controversial (might be better in EXPLAIN). sqlstate '01000' errmsg 'Foreign key warning: Constraint X: skipped because @@foreign_key_non_transactional=0' ER_FK_STATEMENT_ILLEGAL Found during RENAME etc. sqlstate 'HY000' errmsg 'Foreign key error: you cannot use %s statement because %s table is in a relationship' /* ER_FK_TOO_MANY_LEVELS_OF_RECURSION is not currently used. */ ER_FK_TOO_MANY_LEVELS_OF_RECURSION Found during ?. really 'too many levels of cascade'? sqlstate 'HY000' errmsg 'Foreign key error: Constraint X: Too many levels of recursion' /* ER_FK_TRIGGER is not currently used. */ ER_FK_TRIGGER Found during CREATE|ALTER sqlstate '42000' errmsg 'Foreign key error: Constraint X: Trigger and referential action on same table' ER_FK_OVERLAP Found during CREATE|ALTER. sqlstate '42000' errmsg 'Foreign key error: Constraints X and Y: Have overlapping columns and CASCADE/SET NULL/SET DEFAULT as cascading action' ER_FK_TRUNCATE Found during TRUNCATE. sqlstate '42000' errmsg 'Foreign key error: constraint '...': you cannot use 'TRUNCATE TABLE' statement because table '...' has a foreign key reference from table '...' Common routines --------------- common routines: Parsing ------------------------ Affected routines: sql_yacc.* * There will be a change regarding the parser. Currently MySQL accepts and ignores column references, but not with standard syntax. The requirements are; 1. Change parser to allow multiple occurrences of reference_definition. 2. Ensure parser allows reference_definition(s) after [COMMENT 'string'] [COLUMN_FORMAT {FIXED|DYNAMIC|DEFAULT}] [STORAGE {DISK|MEMORY}]. 3. Allow CONSTRAINT constraint_name in reference_definition. * MATCH PARTIAL is okay (error ER_FK_MATCH_PARTIAL happens later) * absence of parent column list is okay (ER_FK_PARENT_COLUMN_MANDATORY happens later) * 'FOREIGN KEY foreign_key_name' is okay (ER_FK_CONSTRAINT_NAME happens later) There is a transformation: 'FOREIGN KEY foreign_key_name' becomes 'CONSTRAINT constraint_name'. common routines: structures and constants ----------------------------------------- Affected files: sql/table.h We keep an in-memory list of constraints that are relevant for the current statement. We do not keep much foreign-key information in TABLE_LIST because: when a foreign key REFERENCES table t it isn't really "pointing to t", it is "pointing to a unique or primary key of t". We'll call this structure St_constraint_list rather than St_foreign_key_list because someday we might want other 'constraints' here, e.g. check constraints. If you UPDATE t1, and t1 is referenced by t2 (cascading), and t2 is referenced by t3 (cascading), then constraint_list occurs 3 times. Usually that's about as big as it gets. The list is created at statement start (see fk_prepare_... functions), and removed at statement end. It's not shared among connections. It's backed up by new information in .FRM files. /* Formerly this was 'enum Constraint_type'. We will often ask "is type = (x or y or z)" so it's more convenient to define bits. */ FK_FLAG_NO_ACTION 1 FK_FLAG_RESTRICT 2 FK_FLAG_CASCADE 4 FK_FLAG_SET_NULL 8 FK_FLAG_SET_DEFAULT 16 enum enum_fk_match_options { FK_SIMPLE, FK_FULL, FK_PARTIAL } /* Flag values for fk_mysql_flags. Inspired by other DBMSs. We will not use any of these flags but they will appear in the .h file anyway, so readers can see what the eventual purpose of 'fk_mysql_flags' is. */ FK_MYSQL_FLAG_DISABLED 1 /* See also WL#3992. */ FK_MYSQL_FLAG_NOT_FOR_REPLICATION 2 FK_MYSQL_FLAG_NOT_TRUSTED 4 FK_MYSQL_FLAG_SYSTEM_NAMED 8 FK_MYSQL_FLAG_COPIED_FROM_INNODB 16 /* The decision was that a foreign key may have no more than 16 columns. */ FK_MAX_COLUMNS 16 /* The in-memory list of constraints that are relevant for the current statement. Many fields are "names" and will be UTF-8 with up to 64 characters, but we leave storage details to the implementor. The fk_parent_table_columns and fk_child_table_columns fields must represent lists of columns (up to FK_MAX_COLUMNS), so they can be arrays of names, arrays of ordinal numbers, or bit maps. */ St_constraint_list { constraint_schema /* only needed for error messages */ constraint_name /* only needed for error messages */ constraint_definer /* not used in this LLD */ fk_parent_table_schema fk_parent_table_name fk_parent_table_columns fk_parent_constraint_name fk_child_table_schema /* technically unnecessary */ fk_child_table_name fk_child_table_columns fk_match_option /* see enum_fk_match_options */ fk_on_update_referential_action /* FK_FLAG_NO_ACTION etc. */ fk_on_delete_referential_action /* FK_FLAG_NO_ACTION etc. */ fk_child_index_name /* not used in this LLD */ fk_parent_index_name /* not used in this LLD */ fk_mysql_flags /* see FK_MYSQL_FLAG_DISABLED etc. */ fk_actions_handled /* bitmap which we use to mark actions of foreign key constraint which have been already processed by open_tables(). For details see fk_add_tables_for_fks(). */ } CONSTRAINT_LIST enum Action_type { ACTION_INSERT, ACTION_DELETE, ACTION_UPDATE }; /* bits for fk_statement_flags. see fk_prepare_statement_flags(). */ FK_STATEMENT_FLAG_NO_ACTION 1 /* assert = FK_FLAG_NO_ACTION */ FK_STATEMENT_FLAG_RESTRICT 2 /* assert = FK_FLAG_RESTRICT */ FK_STATEMENT_FLAG_CASCADE 4 /* assert = FK_FLAG_CASCADE */ FK_STATEMENT_FLAG_SET_NULL 8 /* assert = FK_FLAG_SET_NULL */ FK_STATEMENT_FLAG_SET_DEFAULT 16 /* assert = FK_FLAG_SET_DEFAULT */ FK_STATEMENT_FLAG_NO_READ 32 FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK 64 FK_STATEMENT_FLAG_NO_SAVEPOINT 128 FK_STATEMENT_FLAG_NO_CONSISTENT_READ 256 FK_STATEMENT_FLAG_NO_UNIQUE_CHECK 512 FK_STATEMENT_FLAG_AFTER_CHECK 1024 FK_STATEMENT_FLAG_CASCADING_ACTION 2048 FK_STATEMENT_FLAG_RECURSION 4096 FK_STATEMENT_FLAG_NO_EOS 8192 FK_STATEMENT_FLAG_IGNORE 16384 FK_STATEMENT_FLAG_PESSIMISTIC 32768 FK_STATEMENT_FLAG_OPTIMISTIC 65536 FK_STATEMENT_FLAG_FALCON 131072 /* The "record information" structure is for eos_buffer and row_recursion_stack. Don't be fooled by 'char *'; MySQL stores complete contents somewhere in Record_info. */ struct Record_info; { Action_type action_type; const char *table_name; void *record; Bitmap changed_column_map; }; /* The 'end-of-statement buffer' contains row values that eos_validity_check() will re-check after failures in the row-by-row loops. */ typedef ListEos_buffer; Eos_buffer eos_buffer; /* The 'recursion stack' contains row values that must be pushed before a cascading action starts and popped when a cascading action ends. */ typedef List Row_recursion_stack; Row_recursion_stack row_recursion_stack; /* Also, sql/handler.h currently has: " /* Bits in table_flags() to show what database can do */ #define HA_NO_TRANSACTIONS (1 << 0) /* Doesn't support transactions */ #define HA_CAN_INDEX_BLOBS (1 << 10) #define HA_CAN_INSERT_DELAYED (1 << 14) #define HA_HAS_OWN_BINLOGGING (LL(1) << 33) " We'll have to add new flags for questions in fk_prepare_statement_flags(). */ Modifications to the process of opening and locking tables ---------------------------------------------------------- The current implementation of table-level locking assumes that a statement opens and locks all tables that it uses directly or indirectly (via routines/triggers/views) at the beginning of its execution. To discover and open a full set of these tables, we use a recursive process. After that this set (actually a list) is used to lock all these tables at once (so-called "prelocking"). This (table open+lock) process also obtains (or at least should obtain) appropriate metadata locks on the objects, ensuring that there is proper protection from concurrent DDL statements. We will adjust this (table open+lock) process to take into account foreign key constraints, and open appropriate parent and child tables. Opening of tables for foreign keys has to be part of this recursive process since cascading action might mean that triggers have to be involved, and therefore more tables/routines/foreign keys have to be processed. We assume here that information about the foreign key constraints in which a table participates is loaded from .FRM when we open the table, and stored in the TABLE_SHARE object. Also we suggest there will be no separate metadata lock object for each foreign key; instead we rely on exclusively locking both the child table and the parent table when we want to add/modify/drop foreign-key constraints. Of course this somewhat reduces concurrency between DDL and DML statements. But such an approach should work well for storing data about foreign keys in .FRM files and in TABLE_SHARE, and it is simpler to implement. As a side effect of the above process we will build constraint_list -- a list of constraints which are going to be involved in the handling of this statement (see fk_add_tables_for_fks() for details). Here is the pseudocode describing the (sql_base.cc) open_tables() function which implements this process. Proposed modifications are marked by "+". /* As input, open_tables() gets table list containing all tables which are directly used by the statement (e.g. for multi-delete it will contain all tables from FROM clause and all tables from subqueries). As result of this function we will get a table list containing all tables which are directly and indirectly (through routines, views, triggers and foreign keys) used by the statement. */ open_tables(table_list) ... sp_cache_routines_and_add_tables() ... for (each table in statement's table_list) /* By opening a table we also implicitly take a shared meta-data lock on it. By opening both parent and child tables, and requiring that DDL operations that need to add/change/drop a foreign-key constraint should take exclusive metadata locks on both parent and child tables, we should provide proper isolation for both DDL and DML operations. */ open_table(table) if (table is a view) sp_cache_routines_and_add_tables_for_view(table) if ((table has triggers) and (table is going to be write-locked)) sp_cache_routines_and_add_tables_for_triggers(table) + /* Add tables which are going to be involved in foreign-key checks + and cascading actions to the table list. This ensures that these + tables will be open and processed in the same fashion as the + rest of the tables (e.g. that we will process triggers on these + tables). This also ensures that they will be locked along with + the rest of the tables used by the statement. */ + if ((table participates in FK) and (table is going to be write-locked)) + fk_add_tables_for_fks(table) And below are newly introduced functions: /* For particular element of table list process all FK constraints which are going to be involved in operation performed on this table and if needed add approapriate tables to the table list. Also if necessary add FK to statement's constraint_list. */ /* We might know not only [qualified.]table_name but also the column list. For INSERT|DELETE this is "all columns". For UPDATE statement it's "only the columns of the SET clause" plus "columns that can change automatically" (e.g. TIMESTAMP although currently TIMESTAMP is irrelevant) plus "columns that are targets in triggers". If we're not actually updating a column, then obviously a foreign key based on it won't be affected. Since we do not allow TIMESTAMP in foreign keys, nothing can change "automatically". When it's too hard to figure this out, the column list is "all columns". Unfortunately it *is* too hard to figure this out when the table is the main UPDATE statement, or the indirect object of a trigger or routine. So the 'if (table->trg_event_map ...)' lines in fk_add_tables_for_fks() are only checking thus: If the table element which we are handling was added as result of handling of some foreign-key action, we know "the list of columns to be updated". We can use this list to optimize out foreign keys which won't be affected and therefore should not be checked or processed when this foreign-key action takes place. In other cases, we don't know the "list of columns to be updated". For example, when the table is the direct object of the main UPDATE statement, or the indirect object of a trigger or routine, during open_tables we don't yet know "the list of columns to be updated". If we did try to find out now, or if we tried to optimize out foreign keys after finding the "list of columns to be updated", we'd have to take into account that a table's BEFORE trigger might change the list. */ */ fk_add_tables_for_fks(table) /* 'trg_event_map' bitmap in table list elements indicates what kind of operations are going to be carried out on this table. */ if (table->trg_event_map & (UPDATE | INSERT)) for (each fk from table->table->s->fk_as_child) if (table->trg_event_map & INSERT or table->list_of_fields_to_be_updated is not known or ((table->list_of_fields_to_be_updated + columns updated by before update triggers on table) intersects with fk->fk_child_table_columns)) /* Check if we already have processed this type of action for this constraint. Update constraint_list if we haven't. */ if (fk_add_to_constraint_list(constraint_list, fk, CHECK_PARENT) add_table_to_table_list(fk->parent_table_name, TL_READ, 0, 0); if (table->trg_event_map & (UPDATE | DELETE)) for (each fk from table->table->s->fk_as_parent) if (table->trg_event_map & UPDATE and (table->list_of_fields_to_be_updated is not known or ((table->list_of_fields_to_be_updated + columns updated by before update triggers on table) intersects with fk->fk_parent_table_columns)) and fk_add_to_constraint_list(constraint_list, fk, CHILD_UPDATE)) if (fk->fk_update_referential_action != NO ACTION and fk->fk_update_referential_action != RESTRICT) add_table_to_table_list(fk->child_table_name, TL_WRITE, UPDATE, fk->fk_child_table_columns); else add_table_to_table_list(fk->child_table_name, TL_READ, 0); if (table->trg_event_map & DELETE and fk_add_to_constraint_list(constraint_list, fk, CHILD_DELETE)) if (fk->fk_update_referential_action == NO ACTION or fk->fk_update_referential_action == RESTRICT) add_table_to_table_list(fk->child_table_name, TL_READ, 0); else if (fk->fk_update_referential_action == CASCADE) add_table_to_table_list(fk->child_table_name, TL_WRITE, DELETE, 0); else add_table_to_table_list(fk->child_table_name, TL_WRITE, UPDATE, fk->fk_child_table_columns); /* Check if this constraint is already in the list and we have already handled this specific type_of_action for it. This allows us to avoid infinite loops when building the list of used tables/constraints. It also ensures that for each constraint/type_of_action pair we have only one appropriate table that we open and later lock. */ bool fk_add_to_constraint_list(constraint_list, new_constraint, type_of_action) for(each constraint in constraint_list) if (constraint->constraint_schema == new_constraint->constraint_schema and constraint->constraint_name == new_constraint->constraint_name) if (constraint->fk_actions_handled & type_of_action) return FALSE; else constraint->fk_actions_handled|= type_of_action; return TRUE; /* This constraint has not been handled for any type of action. Add it to the list. */ new_constraint->fk_actions_handled= type_of_action; add(constraint_list, new_constraint) return TRUE; As a result of the above process, for each foreign key and action which is going to be involved in the processing of our statement, we will have one instance of the appropriate table opened (by adding an element to the statement's table list) and later locked. These instances are going to be used during statement execution for performing checks or cascading actions for corresponding foreign keys. Also, as a side effect of this process, we will get a list of the foreign key constraints (without duplicates) which are going to be involved in the execution of our statement. common routines: fk_prepare() ----------------------------- Called from: insert/update/delete/replace, and sometimes others. Affected files or routines: none. this is a new function. This fk_prepare() routine relies on the fact that the preceding call to open_tables() builds constraint_list for the statement. The fk_prepare() routine also uses the fk_prepare_statement_flags() subroutine, which decides the process to follow depending on capabilities of the weakest storage engine. Sometimes it might seem efficient to bypass fk_prepare() and invoke a subroutine directly, for example EXPLAIN doesn't need to lock like UPDATE does. But we'll make this a rule: The code for every SQL statement must begin with a call to fk_prepare() if there is any chance that it might use any other foreign-key routine later. (A "foreign-key routine" is any routine described in this LLD, with the prefix "fk_".) fk_prepare() /* Assume that all flags and persistent areas are reset by fk_cleanup(). */ /* Check @@foreign_key_all_storage_engines. If it's off, then all the later fk_... routines will see an empty constraint list and do nothing. */ if (@@foreign_key_all_storage_engines == 0) return; fk_prepare_statement_flags(); /* set fk_statement_flags */ /* The words "We do no special 'bulk' code" appear in later comments about update etc. The hope is to avoid things like bulk_update_row(). Comments elsewhere suggest that, if there is an AFTER trigger, the turnoff is possible with this flag. Dmitri would know. */ HA_EXTRA_UPDATE_CANNOT_BATCH flag = true; /* The rest of this routine has all the error tests and warning tests that we can make before actually going through the rows. */ /* Error if: all data-change operations are illegal if there is a trigger that changes one of the tables in table_list, or a subquery that changes one of the tables in table_list. The original LLD said, in a "Q+A" section: "Q. Should we handle in a special way the situation when foreign key constraint actions should affect a table which is used in statement causing these actions (we are not talking about self-referencing tables here, but about cases when we have delete from parent table which uses subquery with child table, or when one does multi-update which should update parent table but also uses child table for reading)? Or should we simply emit error like we do it for triggers? ... A. This is a restriction of this worklog, we will simply emit an error." Since Dmitri wrote the trigger check, he knows how to copy, here. */ for (each table in table_list other than main-statement tables) if (dmitri_check("trigger a change to another table_list table?")) error(ER_FK_CHANGE_BY_TRIGGER); if (dmitri_check("statement uses subquery which changes this?")) error(ER_FK_CHANGE_BY_SUBQUERY); /* Konstantin explains that more changes of this nature are not necessary: "The conflict will be discovered on the level of pre-locking. It will be a read/write lock conflict." */ /* Error if: correlated subquery table also in foreign-key list. This error is due to Heidelberg foreign-key meeting #4. See also WL#4100 Foreign Keys: correlated subqueries. */ for (each table in foreign-key list) for (each correlated subquery in statement) if (dmitri_check("table is in subquery?") error(ER_FK_CORRELATED); /* Error if: UPDATE|DELETE|REPLACE + IGNORE + 'cascading action' action. For example, if you "delete ignore from t1", and t2 references t1 with "on delete set null", that's an error. See comments for fk_ignore_error() function). This is very much like the test for ER_FK_NON_TRANSACTIONAL_CASCADING_ACTION. */ The FK_STATEMENT_FLAG_CASCADING_ACTION flag only goes on if such action is really possible. */ if (fk_statement_flags&FK_STATEMENT_FLAG_CASCADING_ACTION != 0) if (fk_statement_flags&FK_STATEMENT_FLAG_IGNORE != 0) error(ER_FK_IGNORE_AND_CASCADING_ACTION); /* Error if: UPDATE|DELETE|REPLACE, 'cascading action' action, self-reference. See comments in fk_eos_validate_foreign_key(). */ if (fk_statement_flags&FK_STATEMENT_FLAG_SELF_REFERENCE != 0) if (fk_statement_flags&FK_STATEMENT_FLAG_CASCADING_ACTION != 0) error(ER_FK_CASCADING_ACTION_AND_SELF_REFERENCE); /* Error if: missing appropriate privilege. Once WL#4099 "Foreign keys: references privilege" is complete we will have appropriate check for REFERENCES privilege here. */ /* Warning if: IGNORE and transactional. The original LLD said there should be "a warning for ignore and a transactional engine - that no eos checks are performed". There's no need to warn for INSERT, because INSERT doesn't cause cascading action anyway. */ if (action!=INSERT) if (fk_statement_flags&FK_STATEMENT_FLAG_IGNORE) if (fk_statement_flags&FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK == 0) warning(ER_FK_IGNORE_AND_NO_ACTION); /* Warning if: self-referencing columns and ON DELETE CASCADE. When a foreign key is self-referencing and the foreign-key columns = the parent-key columns, ON DELETE CASCADE is meaningless. There was consideration of adding a warning here, but the initial feeling is: it's too trivial to bother checking. */ /* We leave this reminder in because a certain person didn't get it: We have a list of the constraints that this operation will affect. And we have a list of the tables that this operation will affect. These are two different things, because one table might have multiple constraints. Tables have constraints that point to constraints, rather than "tables point to other tables via constraints". */ Common routines: fk_prepare_statement_flags ------------------------------------------- The decision "what process to use" depends on flags exported by the storage engine(s). Cross-storage-engine references are possible, so rather than depending on one storage engine's capability, we depend on a combination for all storage engines that a statement needs. So after MySQL knows what tables are involved in INSERT/UPDATE/DELETE, it sets a single set of flags that it will use in the row-by-row and end-of-statement processes. This set of flags is 'fk_statement_flags', which is per-statement. Capability flags are negative, that is, they're about what the storage engine cannot do. On the principle that "the strength of a chain depends on its weakest link", the loop causes fk_statement_flags to depend on the least capable engine. For example: if child-table storage engine can do savepoints, and parent-table storage engine cannot do savepoints, then fk_statement_flags=FK_STATEMENT_FLAG_NO_SAVEPOINT. The "fk_prepare_statement_flags" routine is also callable when a foreign key is defined in CREATE/ALTER, so that MySQL can "warn" what will happen later. We pass the constraint_list made in fk_prepare_foreign_key_tables(). We pass the table_list made in fk_prepare_foreign_key_tables(). We pass Action_type, which at this point is main-statement action. /* Assume "fk_statement_flags= 0" happened earlier, in fk_cleanup(). */ for (each table in table_list) if (engine cannot read) /* Probably blackhole + example engines will say "cannot read". In this case, matching rows will never be found. So assume all checks are TRUE. (If MySQL assumed that all checks are FALSE, every operation would fail. That's just as good. Nobody cares.) These are examples of BASIC storage engines. This is like running with all constraints disabled. */ set fk_statement_flags|=FK_STATEMENT_FLAG_NO_READ; if (engine cannot statement rollback) /* Probably the myisam, memory, archive and csv engines will say "cannot statement rollback" or will say "cannot support unique keys" etc. In this case, end-of-statement checks won't work. So assume that NO ACTION really means RESTRICT, and ignore all other referential actions. */ set fk_statement_flags|=FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK; if (engine cannot savepoint) /* Probably the pbxt engine will say "cannot savepoint". In this case, MySQL cannot back up after a failure with CASCADE. That would be important if we allowed cascading to happen with UPDATE IGNORE or DELETE IGNORE. But we don't. So at the moment "engine cannot savepoint" is unimportant. Monty suggested it's like no-transaction. */ set fk_statement_flags|=FK_STATEMENT_FLAG_NO_SAVEPOINT; if (engine cannot unique check) /* The original LLD had this item: "* support for unique index and index access operations for checking of parent side of foreign key relationships". It wasn't mentioned again, so perhaps it's regarded as unimportant. But an engine that can't check uniqueness is in the BASIC category (not even as good as "non-transactional"). This applies to some storage engines "forever", and it applies to some storage engines "if @@unique_checks=0 and the storage engine respects @@unique_checks". */ set fk_statement_flags|=FK_STATEMENT_FLAG_NO_UNIQUE_CHECK; if (engine cannot support "update_row()") /* The original LLD had this item: "non-transactional engines and/or engines without support for update_row() or delete_row() (MyISAM, Archive, CSV) can not be used as child tables in foreign key relationships with ON [DELETE|UPDATE] [CASCADE|SET NULL|SET DEFAULT] actions. There is no check for "cannot support update_row()" because (I hope) there is no engine that can't support transactions, can't support unique checks, and yet supports update_row(). */ /* The original LLD said: "We ask the engine for a consistent snapshot. If it's not supported, we error out." Okay, but it's commented out. 1. Engines like SolidDB with pessimistic flag (not MVCC) will say 'no' at isolation level READ COMMITTED. 2. Some people got the idea that foreign-key checks are not supposed to be affected by AFTER triggers. That's not what the standard says. It's not how Oracle works. 3. As noted in the original LLD, for self-referential constraints one does need to see rows that were changed during the statement, not "as of" statement start. 4. The original LLD's "test case" was: create table t1 (id int); create table t2 (sum int); create trigger t1_ai after insert on t1 for each row insert into t2 values ((select sum(id) from t1)); create trigger t1_au after update on t1 for each row insert into t2 values ((select sum(id) from t1)); insert into t1 (id) values (1), (2), (3); ... which gives different results for different storage engines. But there's no foreign key here! The test case shows that subqueries and unions are a general problem. Sure, they hurt foreign key handling, but only because they hurt everything. Write a bug report. [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: the original LLD had a statement: "InnoDB + MySQL currently allow a subquery or a trigger [to] read the table that is updated through a cascading foreign key relationship. Subqueries or triggers see the actual data of the table at each invocation. Since the standard requires that all changes made by a statement are not visible when it is evaluated, this behavior is incorrect ..." and then you give an example, which I have copied to the section labelled "Additional Tests". But the statement about the standard is wrong, and InnoDB behaviour is CORRECT. I can't believe your recommendations for "consistent read" until you have an example of a real foreign-key problem. So I removed several paragraphs about the "rationale" and "provided level of support" for consistent read. If you make a new worklog task, restore all the "rationale" stuff when you do. ] 5. In Heidelberg, Dmitri gave a better test case: create table parent (pk int primary key, b int); create table child (fk int, foreign key (fk) references parent (pk) on delete cascade, b int); insert into parent values (1,2),(2,2); insert into child values (1,2),(2,2); delete from parent where (select count(*) from child where b = parent.b) > 1; We will "disallow cascading change to a table which is referenced in a correlated subquery". That's what ER_FK_CORRELATED is for. */ /* if (engine cannot consistent read) error(ER_FK_NO_CONSISTENT_READ); or set fk_statement_flags|=NO_CONSISTENT_READ; and warn later. */ /* Probably Solid/Pessimistic will say "cannot consistent read" if isolation level is below REPEATABLE READ. MySQL will change the isolation level (for the duration of the statement) to REPEATABLE READ or higher if this is the flag setting. If that's still not enough, there will be no warning that some anomalies are possible, because with such an engine they are always possible. ??? it's not that. by 'consistent read' we mean 'snapshot' */ /* Flag if the storage engine is basically 'pessimistic' (like SolidDB with a pessimistic flag setting), or basically 'optimistic' (like Falcon and most engines that say 'MVCC'). Peter just fears that combining pessimistic+optimistic might cause a problem, but he has no evidence. So: fk_prepare() won't generate a warning if both flags are set. But EXPLAIN will show that both flags are on, so users can be made aware of this situation (if WL#706 is done). */ if (engine is pessimistic) set fk_statement_flags|=FK_STATEMENT_FLAG_PESSIMISTIC; else if (engine is optimistic) set fk_statement_flags|=FK_STATEMENT_FLAG_OPTIMISTIC; else /* we don't know or it's not applicable) /* do nothing */ /* Flag if the storage engine is Falcon, so we can warn. */ if (engine is Falcon) set fk_statement_flags|=FK_STATEMENT_FLAG_FALCON; /* Once the initial loop for fk_flags settings is complete, MySQL can set fk_statement_flags|=FK_STATEMENT_FLAG_AFTER_CHECK. Generally this flag is on for transactional engines. Usually it means that MySQL may be "optimistic", that is, it can do things knowing that it can undo them later if necessary. As it turns out, though, we tend to use a variety of other flags instead. In fact, the only significant use of FK_STATEMENT_FLAG_AFTER_CHECK is with IGNORE, where with some storage engines (e.g. Falcon) we can 'change-then-check', rolling back row change if error. */ if (fk_statement_flags&FK_STATEMENT_FLAG_NO_SAVEPOINT is off and fk_statement_flags&FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK is off) fk_statement_flags|=FK_STATEMENT_FLAG_AFTER_CHECK; /* If any parent table is also a child table, then recursion is possible. If there's a trigger that changes a parent table, that's recursion too, but that will be illegal. See also WL#1444 "ON UPDATE CASCADE should be recursive". */ for (each constraint in constraint_list) x= constraint->parent_table_name; for (each constraint in constraint list) if (constraint->child_table_name == x) fk_statement_flags|=FK_STATEMENT_FLAG_RECURSION; /* We will want to know "were there any cascading-action constraints". We will want to know "were there any NO ACTION constraints". To avoid going through all the constraints multiple times, and to allow for later questions based on the general formula "were there any X constraints", we'll turn on FK_STATEMENT_FLAG_NO_ACTION if there are any NO ACTION constraints, we'll turn on FK_STATEMENT_FLAG_SET_NULL if there are any SET NULL constraints, etc. This test is a bit vague, since not all deletes can cause an update referential action, but it's good enough for most cases. This test assumes FK_FLAG_NO_ACTION==FK_STATEMENT_FLAG_NO_ACTION etc. */ for (each constraint in constraint list) fk_statement_flags|= constraint->fk_on_update_referential_action; fk_statement_flags|= constraint->fk_on_delete_referential_action; /* If any constraint involves cascade|set_null|etc., then the time-consuming steps of fk_cascading_action_child_list() are possible. This won't happen with INSERT, only with UPDATE or DELETE or REPLACE. */ if (fk_statement_flags&(FK_STATEMENT_FLAG_CASCADE|FK_STATEMENT_FLAG_SET_NULL|FK_STATEMENT_FLAG_SET_DEFAULT)) fk_statement_flags|=FK_STATEMENT_FLAG_CASCADING_ACTION; /* If there is only one foreign key, and no self-referencing, and no recursion, and no triggers, then there is no effective difference between NO ACTION and RESTRICT. */ if (fk_statement_flags&FK_STATEMENT_FLAG_RECURSION==0) if (there is only one constraint in constraint_list) for (each constraint in constraint_list) /* Dmitri will know how to know if there are triggers. */ if (constraint->parent_table_name "has no triggers") and (constraint->child_table_name "has no triggers") fk_statement_flags|=FK_STATEMENT_FLAG_NO_EOS; /* If it is an INSERT statement, and there is no chance of a row appearing later in parent table (i.e. if there is no self-referential constraint), then errors will be found during row-by-row processing, end-of-statement is unnecessary. This is not true for REPLACE. */ if (INSERT) if (fk_statement_flags&FK_STATEMENT_FLAG_RECURSION==0) fk_statement_flags|=FK_STATEMENT_FLAG_NO_EOS; /* If it is a DELETE statement, and there is no chance of the DELETE turning into an "update" with a new foreign-key value that has to be matched, then EOS buffer will be unnecessary. Only ON DELETE SET DEFAULT clauses are trouble. If we supported MATCH PARTIAL, SET NULL would be trouble too. This is not true for REPLACE. */ if (DELETE) X= 0 for (each constraint in constraint list) if (constraint->on_delete_referential_action==FK_SET_DEFAULT) X= 1; break; if (X == 0) fk_statement_flags|=FK_STATEMENT_FLAG_NO_EOS; if (thd->lex->ignore) /* Keep IGNORE in the flags. Currently it affects only main statements because we disallow cascade+ignore. But if we someday allow it, cascading actions will need to know it is on. */ if (thd->lex->ignore) fk_statement_flags|=FK_STATEMENT_FLAG_IGNORE; /* But if a process is operating on a simple foreign-key check (only two tables, no recursion, no triggers) then turn on one more flag, FK_SIMPLE. It means "there is no effective difference between NO ACTION and RESTRICT", so if it is on, FK_STATEMENT_FLAG_AFTER_CHECK might be unnecessary. if (there are only two tables, the parent and child) and (neither table has a trigger) and (the tables are different, i.e. this is not a recursion) fk_flags|=FK_FLAG_SIMPLE /* If there is a regular index on the table for the same columns that are in the foreign key, with no prefixing, then this would be a good time to set, for each constraint, the index name. There are fields for that in constraint_list structure. But at the moment we do nothing, thinking that the storage engine might be responsible for the choice of index. */ common routines: fk_cleanup() ----------------------------- The fk_prepare_... routines assume that flags are initialized, buffers are cleared. This document does not mention fk_cleanup() in each statement description. Assume a call happens at end of every statement which affects foreign keys. /* Reset for all flags and areas that persist throughout the statement. The exact meaning of "clear" is up to the implementor; it could mean "free", it could mean "memset", it could merely mean "let counter=0". */ clear(constraint_list); clear(table_list); clear(eos_buffer); clear(fk_statement_flags); common routines: fk_add_constraint() ------------------------------------ Affect files: mysql_create_frm in sql/unireg.cc During CREATE|ALTER, the server stores parseable foreign-key definitions in the child table's .FRM file, and in the parent table's .FRM file. Each definition is a variable-length utf8 string, preceded by a length. The clause order will always be: CONSTRAINT, FOREIGN KEY, REFERENCES, [MATCH], [ON UPDATE], [ON DELETE]. Object names will be delimited with ``s (this happens so that appearance will be like SHOW CREATE TABLE appearance and so that names may be reserved words or contain spaces, it does not happen due to case sensitivity, MySQL assumes that lower_case_table_name rules will still be as in version 6.0, WL#922 "Delimited Identifiers" is for a later version of MySQL). Qualifiers, e.g. "database_name.", will appear only for tables in other databases. MATCH and UPDATE clauses will appear only if they describe non-defaults. There are no comments. For example, in a child table: CONSTRAINT `constraint1` FOREIGN KEY (`a`,`b`) REFERENCES `database2`.`table1` (`c`,`d`) ON UPDATE CASCADE ON DELETE CASCADE The same text is in the parent table, with slight differences: CONSTRAINT `constraint1` UNIQUE OR PRIMARY KEY (`c`,`d`) REFERENCED FROM `database2`.`table2` (`a`,`b`) ON UPDATE CASCADE ON DELETE CASCADE Parseable text in .FRM might not be the most efficient of all alternatives, but the decision was "Monty's requirement for foreign keys in 5.2: text format of frms files will be implemented". It is possible that storage engines or other tools will try to read .FRM files, so we will try to keep the format consistent. But MySQL may add clauses later to correspond to standard extensions (e.g. INITIALLY IMMEDIATE), or for internal use (e.g. DISABLED, INDEX=name, or an indication that constraint name is arbitrary). [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: In the original LLD, you had "In the parent-table .FRM, store identifier of child-table." That was too little. When we open a parent table, we want to know whether the particular columns that we're updating are referenced. So we need more than the child-table identifier. ] common routines: fk_check_constraint_added() -------------------------------------------- /* Called for CREATE TABLE ... FOREIGN KEY * Called for ALTER TABLE ... ADD FOREIGN KEY Called for ALTER TABLE ... ADD CONSTRAINT */ Affected functionss: mysql_create_table(), mysql_alter_table() /* Produce foreign-key text for the .FRM of table for which mysql_create_frm() invoking this routine is being called. */ /* This function is responsible for performing all checks before creation of a foreign key. It relies on the caller to perform proper metadata locking for child and parent tables, to pre-open parent tables for checks, and to perform proper metadata locking for constraint names. Because of the last requirement ("perform proper metadata locking for constraint names"), it also has to rely on the caller to pre-generate names for any constraints for which names were not explicitly specified. Finally this function might adjust the definition of the table which is going to be created (in the case of ALTER it is a new version of the table) to contain indexes which are necessary for foreign key functioning. */ fk_check_constraint_added(constraint_name, flag_if_constraint_name_is_foreign_key_id, user_supplied_match_clause, parent_table, create_info_of_child) { /* CONSTRAINT constraint_name clause */ /* Constraint name must be unique within a database. Check whether database already contains constraint named constraint_name. Method: Look for a .CNS file. */ /* Either user entered a "CONSTRAINT constraint_name" / "FOREIGN KEY [constraint_name]" clause, or caller generated name of constraint automatically -- this is done to avoid complexity with meta-data locking. See ALTER TABLE ADD CONSTRAINT for details. Assume that constraint_name is not NULL. This is not the way WL#2226 suggests. */ if (constraint_name = 'PRIMARY') /* All the primary-key constraints are named 'PRIMARY' */ error(ER_FK_CONSTRAINT_NAME_ILLEGAL) if (fk_given_constraint_name_return_frm_file_name()!=NULL) error(ER_FK_CONSTRAINT_NAME_DUPLICATE) /* Now we're sure constraint_name is legal and unique */ /* FOREIGN KEY [id] clause */ /* MySQL allows the non-standard syntax "FOREIGN KEY [id] ...". Here the 'id' is not the constraint name. InnoDB interprets it as an index name, that is, it makes 'id' the index for the foreign key, but it's not the constraint name, and ALTER TABLE DROP FOREIGN KEY doesn't use it. The HLS doesn't have the clause in the syntax description. So we parse it, and transform it to CONSTRAINT constraint_name, but it can lead to troubles. */ if (flag_if_constraint_name_is_foreign_key_id == TRUE) warning(ER_FK_FOREIGN_KEY_ID); if (child table is temporary) /* Probably a matter of checking create_info.options & HA_LEX_CREATE_TMP_TABLE. */ error(ER_FK_CHILD_TEMPORARY) /* (CHILD COLUMNS) clause */ for (i = 0; i < number_of_child_columns; ++i) /* It's illegal to specify the same child column twice */ if (i > 0) for (j = 0; j < i; ++j) if (user_supplied_child_column[j] == user_supplied_child_column[i]) error(ER_FK_CHILD_COLUMN_DUPLICATED) /* Dmitri said the foreign key cannot be ENUM or SET or TIMESTAMP. Most storage engines will also disallow GEOMETRY or BLOB or long VARCHAR, but we'll find that out later when we try to create an index. */ if (child_table.column[i].data_type == ENUM | TIMESTAMP | SET ) error(ER_FK_CHILD_DATA_TYPE) /* Dmitri said "maybe BINARY/VARBINARY won't be okay either." But in December 2008 we decided to allow. See email thread "Re: WL#148 Foreign Keys Milestone 7". So the check for BINARY | VARBINARY is commented out, and will eventually be removed from LLD. */ /* if (child_table.column[i].data_type == BINARY | VARBINARY ) error(ER_FK_CHILD_DATA_TYPE) */ /* PARENT TABLE clause */ /* We rely on caller to open the parent table and therefore we can assume that the .FRM file exists. */ /* Here we have error checks for non-referenceable tables. We forbid 'temporary table' because of HLS. We forbid 'viewed table' because of standard. We forbid 'table in `mysql` database because `mysql` data changes can occur without regular DML statements. We don't need to forbid references to information_schema tables because they are technically views, so they should cause ER_FK_PARENT_VIEW anyway. Someday, when such objects exist, we must add checks to forbid synonyms and encrypted tables. It's interesting that other DBMSs (e.g. DB2 v5r4) forbid partitioned tables, which may be a sign that they know something we don't know. */ if (parent description says it's a temporary table) error(ER_FK_PARENT_TEMPORARY) if (parent description says it's a view) error(ER_FK_PARENT_VIEW) if (parent file is in `mysql` database) error(ER_FK_PARENT_MYSQL); /* (PARENT COLUMNS) clause */ /* If there's no clause, that's an error until we do WL#3947 */ if (number_of_child_columns = 0) error(ER_FK_PARENT_COLUMN_MANDATORY) /* The parent count must equal the child count */ if (number_of_child_columns <> number_of_parent_columns) error(ER_FK_PARENT_COLUMN_COUNT) for (i=0; i < number_of_parent_columns; ++i) /* It's illegal to specify the same parent column twice */ if (i > 0) for (j = 0; j < i; ++j) if (user_supplied_parent_column[j] == user_supplied_parent_column[i]) error(ER_FK_PARENT_COLUMN_DUPLICATED) if (parent_column[i] is nullable) error(ER_FK_PARENT_NULLABLE) /* The following check is commented out and will eventually be removed from LLD. See email thread "Re: WL#148 Foreign Keys Milestone 7". */ /* if (parent_column[i] is auto_increment) if (user_supplied_update_clause == CASCADE) or (user_supplied_delete_clause == CASCADE) error(ER_FK_PARENT_AUTO_AND_CASCADE) */ /* Parent data type must equal child data type. The HLS now says: "Data type must be the same as the corresponding pk_column data type, and length must be same, and scale must be the same. This is slightly stricter than the InnoDB requirement. See also WL#4095 Foreign keys: comparable data types." */ if (parent_column[i].data_type <> child_column[i].data_type) or (parent_column[i].sign <> child_column[i].sign) or (parent_column[i].zerofill <> child_column[i].zerofill) or (parent_column[i].character_set <> child_column[i].character_set) or (parent_column[i].collation <> child_column[i].collation) or (parent_column[i].length <> child_column[i].length) or (parent_column[i].scale <> child_column[i].scale) error(ER_FK_PARENT_DATA_TYPE) /* All of the parent columns must be in a (single) unique key. A "good" unique index has all the columns in the parent-key list, and no more. A "bad" unique index has all the columns in the parent-key list, and more (InnoDB might accept that). This should be a function, we might need it again later. */ set index_count= unique_index_count= good_unique_index_count= 0 for (u=0; u < number of keys for parent table; ++u) if (key[u] is btree i.e. not hash and not rtree) if (number of key columns >= number_of_parent_columns) set number_of_matching_columns= 0 for (i=0; i < number_of_parent_columns; ++i) for (j=0; j < number of columns in key) if (column[i] <> column[j]) break if (column[j] is for a 'prefix' index) break ++number_of_matching_columns if (number_of_matching_columns==number_of_parent_columns) ++index_count if (key[u] is unique) ++unique_index_count if (number of key columns == number_of_parent_columns) ++good_unique_index_count if (good_unique_index_count == 0) if (index_count == 0) error(ER_FK_PARENT_KEY_NOT_INDEXED) else if (unique_index_count == 0) error(ER_FK_PARENT_KEY_NOT_UNIQUE) else error(ER_FK_PARENT_KEY_NOT_ALL) /* User must have appropriate privilege on all parent table, or on all columns of parent table. */ /* Once WL#4099 "Foreign keys: references privilege" is complete we will have appropriate check for REFERENCES privilege here. */ /* MATCH clause */ if (user_supplied_match_clause != NULL) /* User entered MATCH FULL or MATCH SIMPLE or MATCH PARTIAL */ if user_supplied_match_clause = 'PARTIAL') /* HLS says: parse MATCH PARTIAL but don't allow it. */ error(ER_FK_MATCH_PARTIAL) match_clause = user_supplied_match_clause else /* server must use default match clause option */ match_clause = 'SIMPLE' /* Now match clause = either 'SIMPLE' or 'FULL' */ /* ON UPDATE clause and ON DELETE clause */ /* As agreed in Heidelberg meetings, high-level specification says: "With all_engines, defaults are ON UPDATE NO ACTION and ON DELETE NO ACTION." */ if (user_supplied_update_clause == NULL) update_clause = NO ACTION; else update_clause= user_supplied_update_clause; if (user_supplied_delete_clause == NULL) delete_clause = NO ACTION; else delete_clause= user_supplied_delete_clause; /* For ON UPDATE SET NULL and ON DELETE SET NULL, the HLS says: "For each column in fk_column list: ... Column must be nullable if on_update_rule or on_delete_rule is SET NULL." That's InnoDB requirement too. DB2 v5r4 documentation seems to say that only one column needs to be NULL; we don't care. */ if (update_clause == SET NULL or delete_clause == SET NULL) for (each child-key column) if (column is not nullable) error(ER_FK_SET_NULL_NOT_NULL); /* The HLS says: "It is legal to have referential constraints with the same parent and the same child and the same columns and the same referential action," DB2 would give a warning in such a circumstance (SQLSTATE '01543'), MySQL should consider doing the same." However, a later meeting, see email "WL#148 Foreign Keys Milestone 7 meeting", decided: Duplicate constraints with 'no action' and 'restrict' will not cause warnings. Therefore the ER_FK_CHILD_TWICE check is commented out, and will eventually be removed entirely. */ /* for (each foreign-key constraint already in child table) if (it refers to the same table as this constraint) warning(ER_FK_CHILD_TWICE); */ /* In Orlando we decided "... we will also prohibit, when there are overlapping foreign keys and one of them has UPDATE | DELETE ... CASCADE or SET NULL or SET DEFAULT." We should allow overlapping constraints, including duplicate constraints, if and only if referential action is either 'no action' or 'restrict'. */ for (each foreign-key constraint already in child table) if (any or all referencing columns are the same as in this constraint) if (either constraint has UPDATE|DELETE CASCADE|SET NULL|SET DEFAULT) error(ER_FK_OVERLAPPING); if (parent table is transactional and child table is non-transactional) error(ER_FK_NON_TRANSACTIONAL_CHILD); if (parent table is non-transactional and child table is transactional) error(ER_FK_NON_TRANSACTIONAL_PARENT); /* Error tests that were moved due to Orlando Foreign Key Meeting #2 */ fk_check_common_flags(); /* Passed all syntax checks. */ /* After all the comments before this, you might expect that there would be a little more detail than "add foreign-key text in child .FRM". However, WL#103 "New .frm files format" (scheduled for 6.1) has not passed architecture review. Let's keep the description vague here. */ Add Foreign-key text in child .FRM. Add Foreign-key text in parent .FRM. release mutex ("constraint change") if (there is no index already which matches the foreign key column list) create index /* The HLS says to use an "auto-generated" name, so it doesn't have to be the same as the constraint-name / foreign-key-name. */ create index /* If the 'create index' fails, fk_add_constraint() must fail. Exception: if the 'create index' fails for a storage-engine specific reason, e.g. FK_STATEMENT_FLAG_NO_READ=true, then that is not a problem -- we don't support, but keep metadata, the constraint is 'disabled'. */ if (there is no index already which matches the foreign key column list) alter definition of table which is going to be created to create index /* The HLS says to use an "auto-generated" name, so it doesn't have to be the same as the constraint-name / foreign-key-name. If we won't be able to 'create index' due to a storage-engine-specific reason, e.g. FK_STATEMENT_FLAG_NO_READ=true, then that is not a problem -- we don't support, but keep metadata, the constraint is 'disabled'. Dmitri asks: should not we handle such error in the same way as errors from fk_check_common_flags()? Peter comments: probably Dmitri is right, the idea of "disabling" seems contrary to the decision in the Orlando meeting. */ common routines: fk_check_common_flags() ---------------------------------------- /* The error tests in fk_check_common_flags() are due to decisions made during Orlando Foreign Key Meeting #2. Originally these checks were in fk_prepare() for DML statements like UPDATE or DELETE. The decisions said: make them errors for CREATE or ALTER. But there was an additional decision: if @@foreign_key_checks = 0, no checks. Therefore it's probable that people will eventually realize that we have to do the checks during fk_prepare() too because @@foreign_key_checks can change after CREATE or ALTER. Therefore this is a separate routine which currently is called from fk_check_constraint_added() but could be called from fk_prepare(). */ /* Assume that we called fk_prepare_statement_flags(). */ if (@@foreign_key_checks==0) return; /* Error if: CREATE or ALTER, 'cascading action' action, non-transactional. This error check is controversial. Monty would like to at least see ON DELETE CASCADE with MyISAM. So this comment is long. With non-transactional update/delete, we disallow cascading (CASCADE | SET NULL | SET DEFAULT) and we treat NO ACTION as RESTRICT, so in effect only RESTRICT works. Some cases we feared are: * Some BEFORE trigger changes a table, then foreign-key check fails, without being able to rollback the trigger change. * End-of-statement checks can't do anything except say there was a failure during row-by-row processing. (The suggestion for "compensating" i.e. "reverse update" operations is rejected, since it hasn't been implemented for other MyISAM failures in the history of MySQL.) * A change to t1 causes a cascade change to two rows of t2, which fails for the second row. * A change to t1 causes a cascade change to t2 which causes a restrict change on t3, which fails. * A change to t1 causes a set-default change to t2, to a value which doesn't exist in t1. That doesn't matter for INSERT, which can't cause a 'cascading action'. */ if (fk_statement_flags&FK_STATEMENT_FLAG_CASCADING_ACTION != 0) if (fk_statement_flags&FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK != 0) error(ER_FK_NON_TRANSACTIONAL_CASCADING_ACTION); /* Error if: CREATE or ALTER, NO ACTION, non-transactional. The original LLD said: "ON DELETE NO_ACTION and a non-transactional engine behaves the same way as RESTRICT - and produces false negatives. We give a warning when a constraint is defined about it." Until 2008-01-23 this was a warning: ER_FK_NON_TRANSACTIONAL_RESTRICT, sqlstate '01000', message = 'Foreign key warning: Constraint X: Non-transactional engine so NO ACTION will be treated as RESTRICT'. This used to be a warning, now it's an error. */ if (action!=INSERT) if (fk_statement_flags&FK_STATEMENT_FLAG_NO_ACTION) if (fk_statement_flags&FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK) error(ER_FK_NON_TRANSACTIONAL_NO_ACTION); /* Error if: non-transactional. The original LLD said we want "a warning on anything and a non-transactional table - that [there are] no end-of-statement [checks]". That might be an error in the description, since MySQL wouldn't have end-of-statement checks anyway for INSERTs or for RESTRICT constraints, and we've already checked for UPDATE|DELETE and NO ACTION constraints. But what the heck. This used to be a warning, now it's an error. */ if (fk_statement_flags&FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK!=0) error(ER_FK_NON_TRANSACTIONAL_NO_EOS); common routines: fk_given_constraint_name_return_frm_file_name() ---------------------------------------------------------------- Called from: fk_add_constraint(), ALTER TABLE DROP CONSTRAINT, SELECT FROM INFORMATION_SCHEMA. Constraint names must be unique within a database. The HLS says: "The "CONSTRAINT constraint_name" clause is the subject of a different worklog entry, WL#2226 "Constraint Names". Searches for constraint names are case insensitive." [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: WL#2226 is in Server-Backlog, Supervisor='KonstantinOsipov', Implementor='Sanja'. I doubt it will be accepted for version 6.1 in its present form. So I don't think "Naming" is fully decided. I of course propose: * Case and accent sensitivity as in dev-private discussion "Identifier equivalence and events" (not something like .TRN files, and not something like ""BUG#15187 "Triggers: names are case sensitive") * 1 to 64 UTF8 characters * If non-quoted, has some restrictions as other objects, e.g. can't be reserved word, can't start with a digit, etc. * Unique within a database. ] To search for constraint names, we try to open a file named constraint_name.CNS. This contains a constraint definition. The mechanism is similar to the .TRN mechanism for triggers. Search may be case sensitive. As expected, Peter wrote a bug report similar to BUG#15187 "Triggers: names are case sensitive". It was BUG#35520 "Foreign keys: case sensitive constraint names". Peter and Dmitri discussed this bug on 2008-06-28 and agreed that a resolution can be deferred for a long long time. The creation and maintenance of .CNS files (for CREATE or ALTER or DROP) is not described elsewhere -- assume "Dmitri knows how to handle .CNS since he handled .TRN". /* Called from fk_add_constraint */ fk_given_constraint_name_return_frm_file_name () Construct a filename with concat(constraint_name, ".CNS") open .CNS file if (open succeeds) parse constraint description to discover table_name Construct a filename with concat(table_name, ".frm") return frm_file_name else return NULL common routines: fk_check_parent_list() --------------------------------------- old name = check_parent_list. For INSERT or UPDATE, we have the contents of a new row, and we want to know: for every case where the table we're changing has a foreign key defined on it, is there a matching row? That is, is the foreign-key value in the parent table? For REPLACE, we'll get to fk_check_parent_list() only if this is "insert", that is, there's no duplicate key. /** Go over the list of foreign key relationships for given table/record where given table/record is a child. */ bool fk_check_parent_list(TABLE *table, void *record, bool use_eos, Bitmap changed_column_map) { bool need_eos_check= false; for (each constraint in constraint_list) if (table == constraint->child_table_name) /* There's a foreign key defined on this table. */ /* Foreign key / match simple doesn't need to match if any column is NULL. Foreign key / match full doesn't need to match if all columns are NULL. Sometimes we could check for NULLs during preparation, rather than in this row-by-row loop, but probably that wouldn't save much. */ null_column_count= 0; for (each column of the foreign key) if (new value is NULL) if (constraint->fk_match_option == FK_SIMPLE) /* MATCH SIMPLE: one NULL, so "do not look for parent" */ continue 'for each constraint in constraint list' loop ++null_column_count; else break; if (constraint->fk_match_option == FK_FULL) if (null_column_count != 0) if (null_column_count == foreign_key_column_count) /* MATCH FULL: all NULL, so "do not look for parent" */ continue 'for each foreign key defined on this table' loop /* MATCH FULL: one NULL, but not all NULL, so lookup fails */ error(ER_FK_CHILD_NO_MATCH_FULL) for (each column of the foreign key) if (constraint->fk_child_table_columns[n] is in changed_column_map) /* Either it's INSERT, or it's REPLACE, or it's UPDATE and fk_find_changed_columns() says this column changed. */ goto lookup_child_value_in_parent_table; /* No column was significantly changed */ continue 'for each constraint in constraint list' loop lookup_child_value_in_parent_table: /* Foreign key needs to match, and at least one foreign-key column changed. */ We must look up the new value in the parent table. */ /* Handler interaction point #1: look up child value in parent table */ if (parent->check_parent_exists() == YES) continue 'for each constraint in constraint list' loop /* Handler returned "the parent does not exist" */ if (fk_statement_flags&FK_STATEMENT_FLAG_NO_EOS == 0) /* End-of-statement check is possible. So set flag + continue. */ need_eos_check= true; continue 'for each constraint in constraint list' loop else /* End-of-statement check is not possible. */ if (parent->relationship == SELF_REFERENCING_TABLE) /* Lack of end-of-statement checks limits the set of supported foreign key operations to non-recursive relationships, and recursive relationships for which record consistency may be verified using the current snapshot of the table at the time of checking plus the new value of the subject record. InnoDB currently operates with a similar limitation. */ if (fk_check_record_in_memory(record) == TRUE) /* Self-reference and it's in memory. */ continue 'for each constraint in constraint list' loop error(ER_FK_CHILD_NO_MATCH); if (need_eos_check) /* Foreign key needs to match, and at least one foreign-key-column changed, and the new value was not found in the parent table, and end-of-statement check is possible. So save record contents to be checked at end-of-statement. WE SAVE ONLY ONCE PER ROW -- we do not record which foreign key failed, so if there are many foreign keys the end-of-statement validity checker will check them all. */ fk_put_record_to_eos_buffer(table, record, changed_column_map, ACTION_INSERT); common routines: fk_check_child_list() -------------------------------------- old name = check_child_exists. For UPDATE or DELETE, we have the contents of an old row, and we want to know: for every case where the table we're changing is referenced by a foreign key, is there a matching row? That is, is the parent-key value in any child tables? For REPLACE, we'll get to check_child_exists() only if the action has changed to "delete" due to a duplicate. When an update is optimistic (because it's a transactional table), the call to fk_check_child_list() is *after* the invocation of table->file->ha_update_row. That is: first we check for duplicates, then we check for foreign-key violations. /** Go over the list of foreign key relationships for given table/record where given table/record is a parent. RESTRICT/NO ACTION keys */ bool fk_check_child_list(TABLE *table, void *record, bool use_eos, Bitmap changed_column_map) { bool need_eos_check= false; for (each constraint in constraint list) if (table == constraint->parent_table_name) for (each column of the parent key) if (constraint->fk_child_table_columns[n] is in changed_column_map) /* A parent column has significantly changed */ goto lookup_parent_value_in_child_table; /* No column was significantly changed */ continue 'for each foreign key that references this table' loop lookup_parent_value_in_child_table: /* Handler interaction point #2: look up parent value in child table */ if (child->check_child_exists(old record) == NO) /* For this constraint, for this row, there is no foreign-key referencing. That is, the handler returned "no child exists". */ continue 'for each constraint in constraint list' loop /* Handler returned "a child exists" */ if (fk_statement_flags&FK_STATEMENT_FLAG_NO_EOS == 0) { /* End-of-statement check is possible. But that doesn't matter unless it's a NO ACTION constraint. */ if (UPDATE && constraint->fk_on_update_referential_action==FK_FLAG_NO_ACTION) or (DELETE && constraint->fk_on_delete_referential_action==FK_FLAG_NO_ACTION) need_eos_check= true; continue 'for each constraint in constraint list' loop; else /* referential_action is RESTRICT */ error(ER_FK_CHILD_VALUE_EXISTS); } /* End-of-statement check is not possible. */ if (child->relationship == SELF_REFERENCING_TABLE) if (child->is_self_referencing_relationship) if (fk_check_record_in_memory(record) == TRUE) /* The child row is in memory. That is, it's being deleted too. */ /* This is the reverse of the check we could make when inserting. */ continue 'for each constraint in constraint list' loop; /* End-of-statement check is not possible, and there is no self-reference -- so the effect is "as if referential action is RESTRICT". */ error(ER_FK_CHILD_VALUE_EXISTS); } } if (need_eos_check) /* At least one parent-key-column changed, and the change was significant, and the new value was not found in the child table, and end-of-statement check is possible, and the constraint is NO ACTION rather than RESTRICT. So save record contents to be checked at end-of-statement. WE SAVE ONLY ONCE PER ROW -- we do not record which foreign key failed, so if there are many NO ACTION foreign keys the end-of-statement validity checker will check them all. */ fk_put_record_to_eos_buffer(table, record, changed_column_map, ACTION_DELETE); return OK; } common routines: fk_check_record_in_memory() -------------------------------------------- Called from fk_check_parent_list() or from fk_check_child_list(), if end-of-statement check is not possible and foreign key is self-referencing. When inserting a child row, or when updating either a child or a parent row, the conventional test is "is this value already in the child/parent table". But suppose we have a self-reference: CREATE TABLE t1 (s1 CHAR PRIMARY KEY, s2 CHAR, FOREIGN KEY (s2) REFERENCES t1(s1)); Now "INSERT INTO t1 VALUES ('a','a')" will fail because 'a' is not already in t1. But we know that 'a' would be in t1 if we allowed the insert to succeed -- because 'a' is the very thing that we are inserting! So fk_check_record_in_memory() will let us succeed, where we would otherwise have failed, for a non-transactional storage engine. And later "DELETE FROM t1" works even though foreign key exists. But only if the row references the same row. If the row references another row in the same table (the more common 'self-referencing' case), e.g. "INSERT INTO t1 VALUES ('a','b'),('b','a')", then error will occur anyway. So usually this function is useless. bool fk_check_record_in_memory(void *record) { /* Skip this procedure if we got here via a DELETE statement, because that could be buggy. The problem is: CREATE TABLE t1 (s1 CHAR PRIMARY KEY, s2 CHAR, FOREIGN KEY (s2) REFERENCES t1(s1)); INSERT INTO t1 VALUES (1,1),(2,1); DELETE FROM t1 WHERE s1 = 1; The DELETE statement should FAIL. But fk_check_record_in_memory() would return TRUE, so the DELETE would SUCCEED. So: when it's DELETE, always return FALSE. */ if (statement that caused invocation is "DELETE") or (statement that caused invocation is "REPLACE") return FALSE; for (each column of foreign key) if (child-value <> parent-value) return FALSE; return TRUE; common routines: fk_put_record_to_eos_buffer() ---------------------------------------------- Previous name: put_record_to_eos_buffer() We get here if fk_check_parent_list() fails to find a parent-table row but we hope to find it at end-of-statement. We get here if fk_check_child_list() fails to find a child-table row but we hope to find it at end-of-statement. Originally the thought was: "We get a copy of the whole row, not only a pointer to the row, and not only the foreign-key column values, and not only the columns necessary to identify the row. But we could skip BLOB contents." But the post-Orlando thought is: "We get a copy of the necessary parts of the row, which are (a) all columns which are in any child key. (b) all columns which are in any parent key." Peter suggested that a storage engine should track what has changed and if so, eos_buffer is not necessary. See dev-runtime email thread "Re: Coherent LLD for WL#148 "Foreign keys" 2007-07-05. Waiting for reply! fk_put_record_to_eos_buffer(table,record,changed_column_map,action) point to eos_buffer if (adding this row will cause buffer overflow) /* Since one parent-table row might have millions of child rows, we could need hundreds of megabytes for this buffer. At some point we have to say 'too much' even if the disk is big. */ error(ER_FK_EOS_BUFFER); /* If there are both CASCADE and NO ACTION foreign keys, there might be more than one kind of record in the eos_buffer. */ put(table_name, to eos_buffer) put(changed_column_map, to eos_buffer) put(action i.e. ACTION_INSERT or ACTION_DELETE, to eos_buffer) for (each column in table) if column is in a parent key or column is in a child key) put(complete column contents, to eos_buffer) update pointer to end of eos_buffer common routines: fk_cascading_action_child_list() ------------------------------------------------- Called for UPDATE | DELETE, or called for REPLACE if REPLACE is "deleting", for referential actions that can do cascading action on the child table (CASCADE, SET NULL, SET DEFAULT), for each row. For the actions that merely check the child table (RESTRICT, NO ACTION) see fk_check_child_list(). The standard doesn't specify, for multiple foreign keys, in what order they must be processed. We chose to do 'foreign keys that check' before 'foreign keys that do cascading action' because checking is less expensive. The Handler Problem. For the main statement, we opened a handler and are going through the rows via that handler. But now we need to open a second handler, possibly for a different storage engine, in order to go down the line. (That's not a problem that's specific to cascading-action actions, but it becomes particularly acute here.) The Traversal Problem. There is a set of rows that we are "traversing" in the main statement, e.g. Row#1, Row#2, Row#3, Row#4. We have reached Row#2. During the cascading-action action, with a different handler instance, we update Row#3. After the cascading-action action, we continue to traverse in the main statement -- so we read Row#3. The question came up "does this mean cursor stability is needed" and the answer was "no". (In fact we try to make that impossible by saying that triggers can't change anything in the constraint list, and saying that the same row won't be updated twice.) /* This function must perform all cascading actions or return an error. It may only return OK if not everything is checked. It may never return OK if some of the cascading operations are incomplete. This is critical for EOS checks to work. */ bool fk_cascading_action_child_list(TABLE *table, void *oldrecord, void *record, Bitmap changed_column_map, Action_type action_type) { DBUG_ASSERT(action_type == ACTION_UPDATE or ACTION_DELETE); if (fk_recursion_stack.find_record(oldrecord, action_type, changed_column_map, found) == ERROR) return ERROR; if (found) /* It's a DELETE recursing into itself */ return OK; fk_recursion_stack.push_record(oldrecord, action_type, changed_column_map); for (each foreign-key constraint in constraint_list) if (action_type==UPDATE) referential_action= constraint->on_update_referential_action; else (action_type==DELETE) referential_action= constraint->on_delete_referential_action; if (referential_action &(FK_FLAG_CASCADE|FK_FLAG_SET_NULL|FK_FLAG_SET_DEFAULT)) if (constraint->parent_table_columns intersects changed_column_map) /* It's a cascading-action constraint and a parent column has changed. */ /* Handler interaction point #3: cursor over child records */ /* Tell the handler: "we want to find (for UPDATE or DELETE purposes) all the rows in table (constraint->child_table_name) where values = (oldrecord values). Please set up a cursor." for (each child-table row that the handler returns) if (action_type == DELETE && referential_action == FK_FLAG_CASCADE) if (delete_record_and_check_foreign_keys(child->table, old_child_record) == ERROR) else /* Everything except DELETE CASCADE is a sort of "update". Set up the appropriate update instruction based on thee referential_action, and get it done. */ if (action_type == UPDATE) if (referential_action == FK_FLAG_CASCADE) ... if (referential_action == FK_FLAG_SET_NULL) ... if (referential_action == FK_FLAG_SET_DEFAULT) ... else /* action_type == DELETE */ /* if (referential_action == FK_FLAG_CASCADE) already done */ if (referential_action == FK_FLAG_SET_NULL) ... if (referential_action == FK_FLAG_SET_DEFAULT) ... construct_record(child_record, child_record, record, oldrecord); update_record_and_check_foreign_keys(child->table, child_record); fk_recursion_stack.pop_record(oldrecord); The 'recursion_stack' is used only by fk_cascading_action_child_list(). It is only necessary if cascading action (or chain of actions) can twice change the same table, that is, foreign-key definitions have recursion. The point should be: "if you get back to the original row, stop". Actually, that is not what is here. There are a few things we have to settle before getting into detail about what is pushed and what is popped and when it is popped, see comments elsewhere. Until we've settled the details, the routines are deliberately blank. /** If there is a record for this table, for this record, with this particular row, that intersects UPDATEs conflict with UPDATEs on per-column basis, UPDATEs conflict with DELETEs on per-row basis. Anything else does not conflict. In particular, DELETEs do not conflict with DELETEs. @retval ok the record is not found or an action is allowed @retval error there is a record and two actions conflict */ bool Recursion_stack::find_record(void *record, Action_type action_type, Bitmap changed_column_map, bool [out] found) { } bool Recursion_stack::push_record() { } bool Recursion_stack::pop_record() { } " /* vim: foldmethod=syntax */ /* To ensure the standard requirement that the same record is not updated twice through a cascading action in the same statement, we keep track of all the records updated in the statement. Example: t1 has two children: t2 and t3, t4 is a child of both t2 and t3, and all actions are CASCADE. When updating a record, make sure that table->write_set is updated from changed_column_map before the actual write. If we plan to reuse the same handler object for recursion, table->write_set also has to be restored after the write, since then it's a part of the recursive state. */ common routines: fk_end_of_statement_check_foreign_keys() --------------------------------------------------------- Called at end-of-statement. If end-of-statement checks are possible, then there might be rows in eos_buffer, because fk_put_record_to_eos_buffer() was called during the row-by-row loop. These are rows that failed during row-by-row check, so this is a second chance. We don't need to check again whether we need to match, and we don't need to check again whether there was a significant change, and we won't do fk_check_record_in_memory() again, and we won't need to find out whether end-of-statement checks are possible, so we won't call fk_check_parent_list() or fk_check_child_list() again. But we are re-testing for every defined foreign key. bool fk_end_of_statement_check_foreign_keys() { for each (record info in eos_buffer) /* The only possible actions are INSERT and DELETE. UPDATE is impossible. */ DBUG_ASSERT(action_type == ACTION_INSERT or ACTION_DELETE); if (info->action_type == ACTION_INSERT) /* TODO: we can't in a general case re-check only the failed columns because of recursion, but this can be optimized. (This comment apparently refers to the fact that the call is with ALL_COLUMNS instead of info->changed_column_map. It means that the server will be checking constraints for columns that were never updated.) */ /* We know info->table, info->record, info->changed_column_map */ if (fk_check_parent_list(info->table, info->record, disable_eos, ALL_COLUMNS) == ERROR) /* For this constraint, for this row, there is no foreign-key referenced (parent), and all hope of finding it is gone. */ /* PRIVATE NOTE TO KONSTANTIN AND DMITRI'S GRANDCHILDREN: When MySQL allows deferred constraints, it won't return an error here and won't clear all of eos_buffer when the next statement starts. Call fk_end_of_statement_check_foreign_keys() again during COMMIT. */ return ERROR; else /* if info->action_type == ACTION_DELETE) */ /* In the original LLD, the call was to child->check_child_exists(). */ if (fk_check_child_list(info->table, info->record, disable_eos, changed_column_map) == ERROR) /* For this constraint, for this row, there is a foreign-key referencing (child), and all hope of eliminating it is gone. */ [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: I changed the name from "eos_record_list" to "eos_buffer" because it was called a "buffer" in other places. ] [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: I could not think of a good place to "pop" eos_buffer. I don't think it's impossible, but I wrote to Konstantin about my fears. Waiting for reply! " Hi Konstantin, This is how I now understand your plan to store all rows that changed "in the same cycle", and how they are useful. * This happens only for cascades (CASCADE, SET NULL, SET DEFAULT) exist. * Push values in primary key, or in first unique key. If there are no unique keys: Store values for all columns, except columns that cannot be in a foreign key. (This would mean: non-indexable things like BLOBs aren't stored.) * Store values "as of start of statement", before any changes. * Store like a stack, that is, rows can be "pushed" and "popped". * When you're about to cascade: push "start of cascade" marker * When you're changing a row (either updating or deleting): check all values in the stack. if any are equal, error. push "values for all columns" * For determining if it's equal: comparison is 'binary' (?), and NULL=NULL. * When you're ending a cascade: pop until you see "start of cascade" marker * When you run out of rows to process at top level: clear the stack. These are my objections. * As I already indicated, a storage engine should be able to discover whether a row has changed. * You should not say that there is an "error" if the row has already been changed. You should only ignore it. That is, don't change it. * Probably saying that the idea is "for cascades" is too broad, it might only be valuable "for recursions", but it's hard to tell. (Actually you might already have said this is only "for recursions", and my understanding is incomplete.) * A foreign-key table doesn't need to have a PRIMARY KEY or UNIQUE column. So suppose there are two foreign-key rows, {'a',BLOB#1}, {'a',BLOB#2}. The foreign-key relation is ON UPDATE CASCADE. The user is updating the primary key 'a', changing it to 'b'. The cascade happens for {'a',BLOB#1}. So 'a' goes on the stack. The cascade does not happen for {'a',BLOB#2} because the rules say that "if any [non-BLOB] values are equal, error". (Possible answer -- don't compare everything in the stack, compare only for "higher levels" in the stack.) * Here is a way to cause two updates to the same row. CREATE TABLE t1 (s1 CHAR PRIMARY KEY); CREATE TABLE t2 (s1 CHAR DEFAULT 'b', FOREIGN KEY (s1) REFERENCES t1 (s1) ON UPDATE SET DEFAULT); INSERT INTO t1 VALUES ('a'),('b'); INSERT INTO t2 VALUES ('a'); 1 default 2. default always = 'b'. UPDATE t1 SET s1 = CASE s1 WHEN 'a' THEN 'c' ELSE 'b' END; Update t1,r#1='c'. So t2,r#1='b'. Update t1,r#2='b'. So t2,r#1='b'. Thus t2,r#1 has been updated twice. Your rules don't stop it. (Possible answer: it's harmless, the update is always to the same value. But if it were possible to activate triggers, it would not be harmless.) * Here is another way to cause two updates to the same row. CREATE TABLE t1 (s1 CHAR PRIMARY KEY) CREATE TABLE t2 (s1 CHAR, s2 CHAR, FOREIGN KEY (s1) REFERENCES t1 (s1) ON UPDATE CASCADE, FOREIGN KEY (s2) REFERENCES t1 (s1) ON UPDATE CASCADE) INSERT INTO t1 VALUES ('a'),('b') INSERT INTO t2 VALUES ('a','b') UPDATE t1 SET s1 = 'c' Again, your rules don't stop it, because you 'pop' too quickly. (Possible answer: disallow two references to the same table.) * Here is another way to cause two updates to the same row. CREATE TABLE pk1 (s1 CHAR PRIMARY KEY) CREATE TABLE pk2 (s1 CHAR PRIMARY KEY) CREATE TABLE fk1 (s1 CHAR, s2 CHAR, FOREIGN KEY (s1) REFERENCES pk1 (s1) ON UPDATE CASCADE FOREIGN KEY (s2) REFERENCES pk2 (s1) ON UPDATE CASCADE) INSERT INTO pk1 VALUES ('a') INSERT INTO pk2 VALUES ('a') INSERT INTO fk1 VALUES ('a','a') UPDATE pk1,pk2 Again, your rules don't stop it, because you 'pop' too quickly. (Possible answer: disallow multi-update.) " I will now add this example, which is the reason that I added an error check for 'cascading action' and self-reference. Let's keep it illegal until we are sure that odd cases cause no trouble. Example: CREATE TABLE t1 (s1 INT, s2 INT, PRIMARY KEY (s1), CONSTRAINT A FOREIGN KEY (s2) REFERENCES t1 (s1) ON UPDATE CASCADE); CREATE TABLE t2 (s1 INT, CONSTRAINT B FOREIGN KEY (s1) REFERENCES t1 (s1) ON UPDATE CASCADE, CONSTRAINT C FOREIGN KEY (s1) REFERENCES t1 (s2) ON UPDATE NO ACTION); INSERT INTO t1 VALUES (1,1),(2,3),(3,2); INSERT INTO t2 VALUES (1); UPDATE t1 SET s1 = 4 WHERE s1 = 1; /* FAILS. can't update same row twice! */ UPDATE t1 SET s1 = 5 WHERE s1 = 2; /* FAILS? or SUCCEEDS? */ The second UPDATE statement succeeds if the constraint evaluation order is A B C, but fails if the constraint evaluation order is B C A, right? Konstantin and Peter agree that, if multiple rows are changed, end-of-statement buffer can't be cleared until row loop ends. */ ] common routines: fk_find_changed_columns() ------------------------------------------ Called for UPDATE. (For DELETE, or for REPLACE with "delete" action, we know that "all" columns are changed.) Produce a bitmap of all columns that changed. We could check first whether the column is in a foreign-key|parent-key definition, or whether the column could be in such a definition (i.e. its data type is not BLOB/ENUM/TIMESTAMP/etc. and it's indexed) -- otherwise we don't care. But we don't check. It's too much trouble. We could check using a binary comparison, as InnoDB does. But that would be wrong. The standard says that the only changes that matter are the ones that make the new value distinct from the old value, so (for example) changing from 'a' to 'A ' does not matter if the column is varchar with case-insensitive collation. Bitmap changed_column_map is part of struct record_info. fk_find_changed_columns(record, old_record) { turn all bits off for (each column) /* Compare the same way that you would compare when doing an '=' search with a full-table scan. Except that NULL == NULL. */ if (compare(record-column,old_record-column) != NULL) turn bit on for this column common routines: fk_ignore_error() ---------------------------------- Called from: all row-by-row loops for INSERT | UPDATE | DELETE | REPLACE. Not called from fk_end_of_statement_check_foreign_keys(). Return TRUE if "error is ignorable", that is, one can skip this row and go on. Return FALSE if "error is not ignorable", that is, the statement must fail. The MySQL Reference Manual describes IGNORE vaguely and inconsistently, as described in the docs-private email "Inconsistency with IGNORE" https://intranet.mysql.com/secure/mailarchive/mail.php?folder=101&mail=10699 -- but we do not depend on what the manual says. We use the new definition in WL#4103 "Define IGNORE". Since this is a proposal about foreign keys, we will only give special consideration for errors related to foreign or unique constraints (sqlstate '23xxx'), following InnoDB behaviour. IGNORE can only happen for a main statement, because we disallow IGNORE when a relevant cascade | set_null | set_default foreign key exists (ER_FK_IGNORE_AND_CASCADING_ACTION). So we avoid inconsistencies that could result from ignoring a cascading operation, when the main statement depends on the cascading operation. If the return is TRUE, then the row is skipped. Effects of BEFORE triggers are cancelled if possible. (Or they're not ... it depends on flags.) When IGNORE is in effect, the constraint is checked after the row is changed, as if FK_STATEMENT_FLAG_AFTER_CHECK (or equivalent) is true. So it will matter if the storage engine can't statement-rollback. A 'true' return from fk_ignore_error() does not cause a warning, and there is no count of ignored foreign-key errors (that is, there is no equivalent to the 'Duplicates:' count for ignored unique errors). bool fk_ignore_error(error number) if (fk_statement_flags&FK_STATEMENT_FLAG_IGNORE != 0) /* IGNORE keyword was used in main statement */ if (this error is associated with sqlstate class '23') return TRUE return FALSE [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: I'm following the original LLD, e.g. "DELETE IGNORE and UPDATE IGNORE statement will fail if there is a foreign key relationship that involves a cascading action." But I threw away all the "rationale" stuff. ] The original LLD had "..._aux" functions with pseudocode like this: " if (is_update_ignore) transaction->savepoint(); bool res= update_record_and_check_foreign_keys(table, record, oldrecord); if (res == ERROR && is_update_ignore) transaction->rollback_to_savepoint(); return is_update_ignore ? ok : res; " I skipped it. Partial rollback won't happen since there's no cascading. Complete rollback is always going to happen for statement errors anyway. These "..._aux" codes look unnecessary. But the following (quote from original LLD) has bad implications: " * before triggers are called for skipped records ...". Well, if a BEFORE trigger changes a row, a transactional engine could roll that back, and the "..._aux" code will do that. Is that the point? If so, I have made a mistake. ] [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: This "rationale for IGNORE behaviour" appeared in the original LLD: " This is the only simple way to support this clause in foreign keys and not sacrifice the guiding principles of the task - never leave bad data around and be consistent across engines. IGNORE keyword is primarily used for non-transactional engines, and these engines do not support end-of-statement checks anyway. IGNORE keyword is used to insert as many records as possible, and skip bad records. This use case scenario will be preserved. New foreign keys will be compatible with InnoDB foreign keys in this scenario. " I don't think it's 100% true. You can't be consistent across engines, but that was never a realistic "guiding principle" anyway. If an engine can support savepoints, then before the cascading action you can set a savepoint, and if the cascading action fails you can rollback to savepoint. I'm not advocating that, I'm just saying that I don't like the "rationale", so I don't think it should end up in the code comments. ] common routines: fk_check_before_drop_tables() ---------------------------------------------- Affected files: sql/sql_table.cc Called for: DROP DATABASE, DROP TABLE For "DROP TABLE t1,t2" we want to make sure that there is no foreign-key reference to either t1 or t2, but if they are self-referencing tables it's okay, and if the reference is from t2 to t1, it's okay. For "DROP DATABASE d" we want to make sure that there is no foreign-key reference from something outside the database. fk_check_before_drop_tables(list-of-tables-to-drop) for (every table in list-of-tables-to-drop) fk_prepare(); for (every constraint in constraint list) if (child table is not in list-of-tables-to-drop) error(ER_FK_CHILD_TABLE_EXISTS); common routines: fk_check_before_illegal_statement() ---------------------------------------------------- Affected files: sql/sql_table.cc Called for: RENAME etc. For "RENAME" we want to make sure that there is no foreign-key reference to the tables in the list, and no tables which contain references, that is, can't be a parent and can't be a child. Even if they are self-referencing tables it's not okay, and if the reference is from t2 to t1, it's not okay. This is more severe than fk_check_before_drop_tables(). fk_check_before_illegal_statement(list-of-tables) for (every table in list-of-tables) fk_prepare(); if (constraint list is not empty) error(ER_FK_STATEMENT_ILLEGAL); common routines: fk_check_before_altering_column() ---------------------------------------------------- Affected files: sql/sql_table.cc Called for: ALTER TABLE For "ALTER" we want to ensure that it does not changes any column which is part of foreign or parent key. fk_check_before_altering_column(table, column) for (each fk from table->fk_as_child) if (column in fk->fk_child_table_columns) error(ER_FK_STATEMENT_ILLEGAL); for (each fk from table->fk_as_parent) if (column in fk->fk_parent_table_columns) error(ER_FK_STATEMENT_ILLEGAL); ================================================================= SQL statements ============== This section is a list, in alphabetic order, of every SQL statement that WL#148 might affect significantly. ALTER TABLE ----------- The main effects which our implementation will have on the ALTER TABLE statement stem from the following: - To avoid complex checks for changes in parent/foreign key values, we don't allow ALTER to change in any way columns which participate in foreign key relationships, either as part of the foreign key or as part of the parent key. - We don't allow any operations with indexes that support foreign or parent keys. - We don't allow deletion of rows in ALTER TABLE if the table serves as the parent in a foreign key relationship. This means that ALTER TABLE with IGNORE should emit an error when applied to tables which serve as a parent in a foreign key relationship. - And last but not least, when adding or dropping a foreign key constraint we should follow the metadata locking protocol to ensure proper isolation of our statement from other DML or DDL statements. Please note that, altough below we discuss changes for each clause of ALTER TABLE individually, actually from the implementation point of view there is no such thing as a separate ALTER TABLE ADD CONSTRAINT or ALTER TABLE CHANGE statement. Instead there is one ALTER TABLE implementation which is able to handle any combination of clauses described below. So in reality the code will be closer to a combination of pseudocode and descriptions for all clauses mentioned below. ALTER TABLE ... ADD CONSTRAINT ------------------------------ /* ALTER TABLE .. ADD FOREIGN KEY gets here too. Add the constraint.*/ /* In addition to opening and taking TL_WRITE_ALLOW_READ lock on ALTER's target table, which is done by most kinds of ALTER, we also open and take TL_WRITE_ALLOW_READ lock on the parent table. Also to avoid problems with concurrent ALTER statements which might try to do something with a constraint with the same name, we need to obtain an exclusive meta-data lock on the constraint name. This can be a bit problematic if we don't know this name in advance, e.g. when we are iterating through the names of existing constraints and trying to choose the first free name. In such cases we will have either to obtain a database-wide metadata lock or also take a lock on each name which, as we find during this iteration, is occupied. To avoid doing this, it is simpler to deterministically generate a unique name from the parameters of ALTER and its environment (e.g. based on table-name + time_t + thread_id + rand). Peter adds: but a user might have inserted such a name. That's why, in the original plan, he had a loop, looking for the first free name. */ open_and_lock_tables(); /* Check that the definition of the foreign key that we are adding is OK (e.g. number of columns and their types match etc.). */ fk_check_constraint_added() /* Create a new version of the table (i.e. with the foreign key and supporting index) with a temporary name. */ mysql_create_table_no_lock() /* Copy all rows from the current version of the table to the new one. If @@foreign_key_checks<>0, then this operation should also make sure the parent rows exist by calling fk_check_parent_list() for each row. If this fails for any of the rows, then the whole ALTER TABLE should be cancelled. */ copy_data_between_tables() /* For the parent table, create a new version of the .FRM file which includes information about the foreign key which we are adding. */ /* Upgrade shared metadata locks on both child (also known as target) and parent tables to exclusive metadata locks. */ /* Swap old and new versions of the table which we are ALTERing. */ /* Replace old versions of .FRM file for parent table with new version. */ /* Add an event to the binary log. */ /* Release all metadata locks which we hold. */ ALTER TABLE ... ADD FOREIGN KEY ------------------------------- /* Set a flag so that fk_check_constraint_added() will get flag_if_constraint_name_is_foreign_key_id = TRUE. This will cause warning(WARNING_ER_FOREIGN_KEY_ID). */ /* the rest is the same as for ALTER TABLE ADD CONSTRAINT */ ALTER TABLE ... ALTER [COLUMN] {SET DEFAULT literal | DROP DEFAULT} ------------------------------------------------------------------- If there is a foreign key with a reference to the column, with ON UPDATE|DELETE SET DEFAULT, error(ER_FK_CHILD_SET_DEFAULT). ALTER TABLE CHANGE | MODIFY --------------------------- The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... CHANGE and column is in PK If column data type, size, or collation changes: Error. (It would be possible to allow if columns are still comparable, but one would have to re-check every row.)" But we have a simpler, much more general, error check for the moment. alter() +fk_check_before_altering_column(table, column_to_be_changed) ALTER TABLE ... CHECK | NOCHECK ------------------------------- The original WL#148 LLD had a "Q+A" entry: Q. Should we check that the table satisfies to the constraint being added when we add it with ALTER TABLE? HLS says nothing about it. A. A feature request from Monty: ALTER TABLE should have an option that says that you add a constraint without having to check it. Default mode: CHECK. " Therefore there is a new worklog task: WL#3991 ALTER TABLE ... CHECK | NOCHECK. It is not scheduled for MySQL version 6.1. ALTER TABLE CONVERT TO CHARACTER SET ------------------------------------ The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... CONVERT TO CHARACTER SET: Error." ALTER TABLE DISABLE CONSTRAINT ------------------------------ Monty had some suggestions about validating and disabling. Therefore there is a new worklog entry: WL#3992 ALTER TABLE ... DISABLE CONSTRAINT. It is not scheduled for MySQL version 6.1. ALTER TABLE DISABLE KEYS ------------------------ The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... DISABLE KEYS: Error." fk_check_before_altering_column(table, column_to_be_deleted) ALTER TABLE DROP COLUMN ----------------------- The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... DROP [COLUMN] column_name and column is in PK: Error." fk_check_before_altering_column(table, column_to_be_deleted) ALTER TABLE DROP CONSTRAINT --------------------------- /* First we open the table which is mentioned as target table for this ALTER (i.e. the table which is supposed to be the child table in the relationship). From the definition of this table we find the name of the parent table and open it. After that we obtain TL_WRITE_ALLOW_READ locks on both tables. Also, similarly to the case of ALTER TABLE ADD CONSTRAINT, we should obtain an exclusive metadata lock on the name of the constraint which we are about to drop. */ /* We prepare new versions of .FRM filess for child and parent tables. */ /* We upgrade our shared metadata locks on the child and on the parent to exclusive meta-data locks. */ /* We replace old versions of .FRM files for child and parent tables with new versions. */ /* Write an event to the binary log. */ /* Release all metadata locks which are being held. */ ALTER TABLE DROP FOREIGN KEY --------------------------- We're going to allow the non-standard syntax "FOREIGN KEY [id]". DROP FOREIGN KEY is the same as DROP CONSTRAINT. So dropping named foreign keys must be -- alas -- legal. alter() warning(ER_FK_FOREIGN_KEY_ID); /* the rest is the same as for ALTER TABLE DROP CONSTRAINT */ ALTER TABLE DROP INDEX ---------------------- See "DROP INDEX". ALTER TABLE DROP PARTITION -------------------------- Partition dropping is roughly equivalent to 'truncating'. Truncating is illegal. Therefore partition dropping is illegal. alter() fk_check_before_illegal_statement(tables mentioned in ALTER) ALTER TABLE DROP PRIMARY KEY ---------------------------- The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... DROP PRIMARY KEY: Error." alter() fk_check_before_illegal_statement(tables mentioned in ALTER) ALTER TABLE ENABLE KEYS ----------------------- The HLS says: "If table has an FK referencing it ... If ALTER TABLE ... ENABLE KEYS: Error." alter() fk_check_before_illegal_statement(tables mentioned in ALTER) ALTER TABLE table_name RENAME [TO] table_name --------------------------------------------- See RENAME TABLE. BACKUP DATABASE --------------- No anticipated changes for foreign keys. CREATE INDEX ------------ The original LLD says: " Q. Should we allow creation of foreign keys which are not supported by indexes on referencing tables? " The new answer (2007-11-16) is: A. NO. We will create an index automatically if it doesn't exist, emit a warning that it was created automatically, and not allow it to be dropped. Exception: if the user adds an appropriate index (in the same statement or later), the automatic index should be dropped. " However, that is not appropriate as part of a foreign-key task. Every CREATE INDEX, regardless whether one supports foreign keys or not, can create a duplicate of an existing index. Therefore: There will be no changes to CREATE INDEX for WL#148. Instead, there is a new worklog task: WL#3993 Warn if duplicate index. This task is not scheduled for MySQL version 6.1. Eventually MySQL may allow dropping automatically-created indexes: WL#4094 "Foreign keys: without indexes on the child table". CREATE TABLE ------------ Affected files: sql_table.cc This is a multi-stage process. If we fail on one of the stages, we try to rollback all the changes which were done in previous stages in order to guarantee atomicity. Note that currently the only way to modify an .FRM file is to create its new version and replace the old one with it. Since we don't plan to change this in the scope of this worklog, the implementation suggested below tries to create all foreign keys for the table at once rather than adding them one by one. Here is a brief description of these stages: /* Once WL#4099 "Foreign keys: references privilege" is complete the first step will be to check for REFERENCES privilege for all parent tables. */ /* We obtain exclusive metadata locks on the table we are creating and all tables which are mentioned as parents (the latter gives protection from concurrent DML and DDL on the parent and allows us to modify its .FRM safely). Also, to avoid problems with concurrent DDL which might do something with constraints with same name, we should obtain exclusive metadata locks on constraint name. To do this we should pre-generate constraint names in cases when the user has not specified them explicitly (see ALTER TABLE ADD CONSTRAINT for rationale and details). */ /* If the table we are going to create is among parent tables (i.e. there are self-references), we create an auxiliary table which has the same structure as the table to be created except that it is stripped of foreign keys. We use this auxiliary table to simplify checks in fk_check_constraint_added() */ /* We open parent tables in order to be able to perform checks of foreign key definitions. */ /* We create the table. At this stage we also perform apropriate checks for the foreign key (e.g. for number of columns and their types, or whether this storage engine supports this kind of foreign key). If a problem arises and @@foreign_key_checks is set, then we emit an error. Otherwise the error is postponed until execution of the statement which will cause problems. */ for (each foreign key created) fk_check_constraint_added(); create table; /* We create new versions of .FRM file for parent tables (no need to perform checks here...) */ /* We substitute old versions of .FRM files for parent tables with new ones. */ /* We remove old versions of .FRM files and auxiliary tables which we have used to simplify compatibility checks, write an appropriate event to the binary log, and release metadata locks obtained by this statement. */ For specification of when automatic index creation occurs, look elsewhere. CREATE TABLE ... LIKE --------------------- "CREATE TABLE t1 LIKE t2" does not copy t2's foreign keys, so no special code is needed for this. CREATE TABLE ... SELECT ----------------------- As with ordinary CREATE TABLE, this is a two-stage process. For example: CREATE TABLE t2 (FOREIGN KEY (s1) REFERENCES t1(s1)) AS SELECT 1 AS s1; is broken into two statements: CREATE TABLE t2 AS SELECT s1 AS s1; ALTER TABLE t2 ADD FOREIGN KEY (s1) REFERENCES t1(s1)); Therefore, if ALTER TABLE fails, t2 exists anyway without a foreign key. [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: you said "It is not clear how we should perform locking if we have CREATE TABLE ... SELECT that additionally defines some foreign key constraints." The "two-stage" trick evades the problem. It's not what InnoDB does. The statement would be illegal in Oracle. ] DELETE ------ Affected files: sql/sql_delete.cc The changes, which probably will affect sql/sql_delete.cc mysql_delete(), are: we add fk_prepare() near the start, we add fk_check_child_list() and fk_cascading_action_child_list() in the row-by-row loop, we add fk_end_of_statement_check_foreign_keys() at "end of statement". There is no need for fk_check_parent_list() because a parent row can exist without any matching child rows (we don't support PENDANT). Oddly, there is a need for fk_end_of_statement_check_foreign_keys() because we support ON DELETE SET DEFAULT, which may cause a value to exist that is not in a parent table. If so, FK_STATEMENT_FLAG_NO_EOS will be off. We've arranged things so that fk_prepare() and fk_eos_valid_foreign_keys() are only called by the main statement, not by cascading actions. /* The row-by-row loop for foreign keys */ mysql_delete () fk_prepare(); for (each row) delete_record_and_check_foreign_keys(table,record); fk_end_of_statement_check_foreign_keys(); /* Called from mysql_delete(), or called from a cascading action. */ bool delete_record_and_check_foreign_keys(TABLE *table, void *record) { invoke_before_triggers(); /* A special case for IGNORE and some storage engines, e.g. Falcon. Do an optimistic delete: first delete, then check. But have a savepoint for each row. */ if (fk_statement_flags & FK_STATEMENT_FLAG_AFTER_CHECK != 0) and (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) create_savepoint('X'); table->delete_row(record); if (fk_check_child_list(table, record, is_eos, ALL_COLUMNS) == ERROR) if (fk_ignore_error(error) == TRUE) /* Don't return error, just skip to next row. */ rollback_to_savepoint('X'); else /* savepoint need not be released */ return ERROR; release_savepoint('X'); else if (fk_statement_flags & FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK != 0) or (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) /* Do a non-optimistic-delete: first check, then delete. */ if (fk_check_child_list(table, record, is_eos, ALL_COLUMNS) == ERROR) if (fk_ignore_error(error) == TRUE) /* Don't return error, just skip to next row. */ else return ERROR; /* Don't worry about fk_cascading_action_child_list, we don't allow cascading action. */ table->delete_row(record); /* e.g. able->file->ha_delete_row */ else /* fk_statement_flags & FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK == 0) /* Do an optimistic-delete: first delete, then check. */ table->delete_row(record); if (fk_check_child_list(table, record, is_eos, ALL_COLUMNS) == ERROR) return ERROR; if (fk_cascading_action_child_list(table, record, record, ALL_COLUMNS, ACTION_DELETE)) return ERROR; invoke_after_triggers(); return OK; } DELETE IGNORE ------------- DELETE IGNORE is handled by the regular DELETE routine, see "DELETE". See also comments for fk_ignore_error() function. DELETE multi-table ------------------ For "DELETE FROM t1,t2" it is normal to do all the t1 deletes, then all the t2 deletes -- no interleaving. So there are two separate "delete" calls. That means: * Multi-table DELETE is handled by the regular DELETE routine, see "DELETE". Except it uses the routine twice. * There is no special work to do, except ensuring that all affected tables are locked at the start (so call fk_prepare_foreign_key_tables() for all tables in the DELETE statement) * The "end-of-statement" buffer and "end-of-statement" validation aren't really happening at end of statement, but at end of row-by-row loop. The same considerations apply for "UPDATE multi-table". DROP DATABASE ------------- If there is a child table name from a different schema, error. drop_database() fk_check_before_drop_tables(list of all tables in database); DROP INDEX ---------- It should not be possible to drop an index if it's the index for supporting a foreign key. Do not assume that storage engines can do sequential searches. There's some code change necessary here. Also, a parent table must be unique. drop_index() fk_prepare(); for (each constraint in constraint list) if (constraint->parent_table = table we want to index) if (index we are dropping is unique) /* it's an error even if another unique index exists */ error(ER_FK_PARENT_SO_UNIQUE_INDEX_UNDROPPABLE); if (constraint->child_table = table we want to index) if (index we are dropping is necessary for this reference) error(ER_FK_CHILD_SO_CHILD_INDEX_UNDROPPABLE); DROP TABLE ---------- As with InnoDB, this means DROP TABLE RESTRICT. So if there are any child tables that are not being dropped, error. But don't cascade. drop_table() fk_check_before_drop_tables(list of tables mentioned in DROP TABLE statement) GRANT ----- There may be some problem with privilege checking; see the comment about privilege checking in fk_prepare(). Apparently, though, no change is necessary for GRANT. The privilege exists, it can be granted, it can be revoked. However, now that 'REFERENCES' is more than meaningless, the category and priority of this bug should be reconsidered: BUG#5503 Views: GRANT REFERENCES statement should be disallowed INSERT ------ Files affected: mysql_insert() and write_record() in sql/sql_insert.cc. During fk_prepare_statement_flags() we ensured that FK_STATEMENT_FLAG_NO_EOS is on unless there is recursion, that is, a self-referential constraint. So usually checks precede inserts and fk_end_of_statement_check_foreign_keys() does nothing. /* Preparation tip: during preparation, to turn off batching: (void) table->file->extra(HA_EXTRA_DELETE_CANNOT_BATCH); */ mysql_insert() fk_prepare(); /* NEW. set up FK tables */ for (each row) /* the row-by-row loop ... */ write_record_and_check_foreign_keys(); fk_end_of_statement_check_foreign_keys();/* NEW. check integrity a second time */ bool write_record_and_check_foreign_keys(TABLE *table, void *record) { invoke_before_triggers(); /* A special case for IGNORE and some storage engines, e.g. Falcon. Do an optimistic insert: first insert, then check. But have a savepoint for each row. */ if (fk_statement_flags & FK_STATEMENT_FLAG_AFTER_CHECK != 0) and (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) create_savepoint('X'); table->file->ha_write-row; if (fk_check_parent_list(table, record, is_eos, ALL_COLUMNS) == ERROR) if (fk_ignore_error(error) == TRUE) /* Don't return error, just skip to next row. */ rollback_to_savepoint('X'); else /* savepoint need not be released */ return ERROR; release_savepoint('X'); else if (fk_statement_flags & FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK != 0) or (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) /* Do a non-optimistic-insert: first check, then insert. */ if (fk_check_parent_list(table, record, is_eos, ALL_COLUMNS) == ERROR) if (fk_ignore_error(error) == TRUE) /* Don't return error, just skip to next row. */ else return ERROR; table->file->ha_write_row; /* e.g. ha_innodb.cc ha_innobase::write_row */ else /* fk_statement_flags & FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK == 0) /* Do an optimistic-insert: first insert, then check. */ table->file->ha_write_row; /* e.g. ha_innodb.cc ha_innobase::write_row */ if (fk_check_parent_list(table, record, is_eos, ALL_COLUMNS) == ERROR) return ERROR; table->file->is_fatal_error(... /* Check for duplicate key */ table->triggers->process_triggers /* also known as "invoke_after_triggers()" */ return OK; } INSERT DELAYED -------------- INSERT DELAYED only works for MyISAM and MEMORY, non-transactional engines. The original LLD said the work need to ensure deadlock-free lock acquisition is "not justifiable" and added that we don't do such work for REPLACE or INSERT ON DUPLICATE KEY UPDATE. So we ignore DELAYED. We act as if it's an ordinary INSERT. See "INSERT". INSERT IGNORE ------------- INSERT IGNORE uses the same routine as INSERT, so see "INSERT". See also comments for fk_ignore_error() function. INSERT ON DUPLICATE KEY UPDATE ------------------------------ ON DUPLICATE KEY UPDATE is a non-standard MySQL extension but we'll try do what the standard MERGE statement does, WL#2345. So the standard's rules for "effect of replacing rows in base tables" or "effect of inserting tables into base tables" apply. The process will be the same as for INSERT (see "INSERT" above), except that: * For fk_prepare() the action is UPDATE rather than INSERT. This ensures that all constraints get into constraint_list and all tables get into table_list. * During the insert row-by-row loop, a duplicate-key error must cause a call to update_record_and_check_foreign_keys(). So the duplicate test must happen before activating any AFTER triggers. Similar considerations apply for REPLACE. INSERT SELECT ------------- The process is the same as for INSERT VALUES. See "INSERT". There has been a suggestion that some new locking issues might arise if one is inserting into a table which references a table which appears in one of the SELECT clauses. If that is so, then we'll have to revisit this. KILL ---- There is no change in the syntax of intended semantics of KILL, but someone must check the code thus: For every place where MySQL aborts due to a KILL: if there is a foreign-key with cascading: if two child rows are being 'modified': can one row be updated, and not the other? Incidentally, the existence of KILL and of premature crashes might make 'compensation' or 'reverse update' (the idea of cancelling 'set a=new-value' by later executing 'set x=old-value') more difficult. But we don't plan to support compensation anyway. LOAD DATA INFILE ---------------- LOAD DATA will be transformed into a series of INSERT statements. We do no special 'bulk' code. See "INSERT". Since INSERT is a statement, this means that the "end-of-statement" processing takes place at the end of every row, not at the end of the whole LOAD DATA. LOAD DATA REPLACE ----------------- LOAD DATA REPLACE will be transformed into a series of INSERT REPLACE statements. We do no special 'bulk' code. See "INSERT". LOCK ---- There is no special code for LOCK. We do see that, if table t2 has a foreign key referencing t1, people who say "LOCK TABLE t1 WRITE" might also want to lock t2. But we solve that by documenting it: tell people to try "EXPLAIN UPDATE t1 ..." so they will know what other tables might have to be locked. RENAME DATABASE --------------- Since RENAME DATABASE is in effect a renaming of multiple tables, and since RENAME TABLE causes an error with foreign keys, RENAME DATABASE will cause an error with foreign keys. rename_database() fk_check_before_illegal_statement(list of all tables in database) RENAME TABLE ------------ The HLS says: If table has an FK referencing it ... If ALTER TABLE ... RENAME [TO] new_table_name: Error." The HLS says, for RENAME TABLE: "If table has an FK referencing it, Error." In addition, we must add this which is not stated in HLS: "If table references another table, Error." Affected routines: mysql_rename_table() in sql/sql_table.cc. rename_table() fk_check_before_illegal_statement(tables mentioned in RENAME TABLE) REPAIR TABLE ------------ REPAIR is not in the HLS, so it is not what we will use if a .FRM file becomes corrupt (see comments for WL#4175 "Foreign Keys: Check Tables"). There will be no change to REPAIR TABLE code. REPLACE ------- REPLACE is a non-standard MySQL extension, but we'll say that the standard's rules for "effect of deleting rows from base tables" or "effect of inserting tables into base tables" apply. The process will be the same as for INSERT (see "INSERT" above), except that: * For fk_prepare() the action is REPLACE rather than INSERT. This ensures that all constraints get into constraint_list and all tables get into table_list. In fact there are more constraints in constraint_list than "insert" or "delete" needs, but that should be harmless if we're careful. * During the insert row-by-row loop, a duplicate-key error must cause a call to delete_record_and_check_foreign_keys(). So the duplicate test must happen before activating any AFTER triggers. The deletion is followed by an insertion. Similar considerations apply for INSERT ON DUPLICATE KEY. REPLACE DELAYED --------------- The word DELAYED is ignored. See the comment for INSERT DELAYED. REPLACE IGNORE -------------- REPLACE IGNORE uses the main REPLACE routine, so see "REPLACE". See also the comments for the fk_ignore_error() function. RESTORE ------- RESTORE (i.e. restore database) is not in the HLS. It implies "insert". It's legal. Due to WL#4150 "Online Backup : Disable Foreign Key Constraint on Restore" the value of @@foreign_key_checks=0 during RESTORE. Once WL#148 is implemented, RESTORE code probably wouldn't call code responsible for new foreign key checking. But it still makes sense for RESTORE to set @@foreign_key_checks=0 in order to work properly when --foreign-key-all-engines is off. Therefore there are no anticipated changes for foreign keys. REVOKE ------ The HLS says "It's possible to REVOKE REFERENCES ON PK FROM user, so user no longer has a right to have an FK on the PK. In other DBMSs, this would cause restrict or cascade. But we'll just allow it -- no error check." The original LLD had this question in the Q+A: "Q. How we should handle REFERENCES privilege? Should we really do nothing when it is revoked ..." And the answer was -- we really should do nothing. To prevent a major security breach (cascaded updates changing tables that users have no privileges on), there will be runtime privilege checks, see fk_prepare(). There will be no code changes for REVOKE. SELECT FROM INFORMATION_SCHEMA ------------------------------ MySQL could call fk_prepare() for every table in every database, to make a huge constraint_list. But we deem that inefficient. The INFORMATION_SCHEMA implementation will open tables which match the SELECT's where clause (indeed this might mean opening of all tables in the system) and fetch information about foreign keys from the TABLE objects. An additional column could be: information_schema.table_constraints.constraint_state which would show if a constraint is "disabled" etc. That could be a different worklog task. See also WL#3992 ALTER TABLE ... DISABLE CONSTRAINT. The metadata for foreign keys is available by reading .FRM files. The simplest way will be: fk_prepare() for every table in every database, MySQL calls "new" code if @@foreign_key_all_engines is TRUE, and "old" code if it's FALSE. Following that: users will either see new foreign-key-for-all-engines, or old InnoDB foreign keys. Not both. There is no plan for "old" and "new" information_schema tables. SET --- There is nothing new for SET. The new variable @@foreign_key_all_engines is not settable at runtime. There will be no new code in this area. SHOW CREATE TABLE ----------------- The HLS does not mention SHOW CREATE TABLE, so there is no specification to change it for WL#148. SHOW CREATE TABLE will still work. For possible changes that might affect metadata of foreign keys, see section about information_schema. Column constraints do not have to be shown as table constraints. (For InnoDB foreign keys there is a FOREIGN KEY clause.) Foreign-key information should appear for mysqldump uses. There may be other ways to save foreign-key metadata, depending on progress of 'online backup for metadata' task, WL#3713. SHOW TABLE STATUS ----------------- The HLS does not mention SHOW TABLE STATUS, so there is no need to change it for foreign keys. To get metadata, see "SELECT FROM INFORMATION_SCHEMA". For InnoDB foreign keys there is a REFER clause in the 'Comment' field. There is no need to show it if @@foreign_key_all_engines=1. TRUNCATE -------- We now agree that TRUNCATE will be legal: * instead of disallowing TRUNCATE of child table always, allow always * instead of disallowing TRUNCATE of parent table always, allow if it's a self-reference. See Re: TRUNCATE TABLE in WL#148 "Foreign keys", cc dev-private. See WL#148 Foreign Keys Milestone 14 QA, cc dbg-runtime. if (@@foreign_key_checks<>0) For (each foreign key that references this table) If (it's not a self reference) Error("??") /* perhaps ER_FK_TRUNCATE */ There is no further special processing for foreign keys. Since it's a child, or a parent-and-child i.e. a self reference, the usual TRUNCATE behaviour applies and we don't have to worry about triggers etc. This is a slight incompatibility with InnoDB. [ Note added 2010-12-03: now with InnoDB TRUNCATE of a parent table usually causes an error, see BUG#54678 comments. ] UPDATE ------ This is an illustration of the 'essential' idea, showing the code changes in sql_update.cc mysql_update(). "Bulk update" code is not shown; we do no special 'bulk' code. The current routine: mysql_update() /* the row-by-row loop ... */ table->file->ha_update_row /* e.g. ha_innodb.cc ha_innobase::update_row */ table->file->is_fatal_error(... /* Check for duplicate key */ table->triggers->process_triggers /* AFTER triggers */ table->file_unlock_row ??? The new routine: fk_prepare() /* NEW. set up FK tables */ ... mysql_update() /* the row-by-row loop ... */ fk_check_parent_list() /* NEW. find referencing (parent) rows */ fk_check_child_list() /* NEW. find referencing (child) rows, check */ fk_cascading_action_child_list() /* NEW. find referencing (child) rows, do cascading action */ table->file->ha_update_row /* e.g. ha_innodb.cc ha_innobase::update_row */ table->file->is_fatal_error(... /* Check for duplicate key */ table->triggers->process_triggers /* AFTER triggers */ fk_end_of_statement_check_foreign_keys()/* NEW. check integrity a second time */ Now let us re-state the above, with more detail. update() fk_prepare(); for (each row) fk_update_record_and_check_foreign_keys(table,record,oldrecord); fk_end_of_statement_check_foreign_keys(); /* The row-by-row loop for foreign keys. Called for update() (see above), and for cascading actions. */ bool fk_update_record_and_check_foreign_keys(TABLE *table, void *record, void *oldrecord) { invoke_before_triggers(); Bitmap changed_column_map= fk_find_changed_columns(record, old_record); /* A special case for IGNORE and some storage engines, e.g. Falcon. Do an optimistic update: first update, then check. But have a savepoint for each row. */ if (fk_statement_flags & FK_STATEMENT_FLAG_AFTER_CHECK != 0) and (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) create_savepoint('X'); table->file->ha_update_row; if (fk_check_child_list(table, record, is_eos, ALL_COLUMNS) == ERROR) if (fk_ignore_error(error) == TRUE) /* Don't return error, just skip to next row. */ rollback_to_savepoint('X'); else /* savepoint need not be released */ return ERROR; release_savepoint('X'); else if (fk_statement_flags & FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK != 0) or (fk_statement_flags & FK_STATEMENT_FLAG_IGNORE != 0) /* Do a non-optimistic-update: first check, then update. */ fk_check_parent_list(table, newrecord, is_eos, changed_column_map); fk_check_child_list(table, oldrecord, is_eos, changed_column_map); table->file->ha_update_row /* e.g. ha_innodb.cc ha_innobase::update_row */ else /* FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK == 0 */ /* Do an optimistic-update: first update, then check. */ table->file->ha_update_row /* e.g. ha_innodb.cc ha_innobase::update_row */ fk_check_parent_list(table, newrecord, is_eos, changed_column_map); fk_check_child_list(table, oldrecord, is_eos, changed_column_map); table->file->is_fatal_error(... /* Check for duplicate key */ table->triggers->process_triggers /* i.e. "invoke_after_triggers()" */ UPDATE LIMIT ------------ To see how InnoDB handles it, here is a script: create table t1 (s1 int primary key) engine=innodb; create table t2 (s1 int, foreign key(s1) references t1 (s1) on update cascade) engine=innodb; insert into t1 values (1),(2),(3); insert into t2 values (1),(1),(1),(1),(1),(1); update t1 set s1 = s1 - 1 limit 2; select * from t1; select * from t2; Result: the first two rows of t1 are changed, and all of the rows in t2 are changed. This shows how LIMIT affects the 'number of rows found in the main statement', and not the 'number of rows affected' due to cascading. Conclusion: UPDATE LIMIT presents no special difficulties. Just handle as for UPDATE, reducing the number of found rows. UPDATE multi-table ------------------ For "UPDATE t1,t2" it is normal to do all the t1 updates, then all the t2 updates -- no interleaving. So there are two separate "update" calls. That means: * Multi-table UPDATE is handled by the regular UPDATE routine, see "UPDATE". Except it uses the routine twice. * There is no special work to do, except ensuring that all affected tables are locked at the start (so call fk_prepare_foreign_key_tables() for all tables in the UPDATE statement) * The "end-of-statement" buffer and "end-of-statement" validation are applicable aren't really happening at end of statement, but at end of loop. The same considerations apply for "DELETE multi-table". ======================================= Effects on non-runtime teams ============================ In this section we say what help or action we need from people who are not in the dev-runtime (Dmitri, Konstantin, Marc, Alik, Kristoffer) group. We describe what all storage engines need (including the class library which still needs discussion), and then list the storage engines, and then list non-dev-runtime teams that will feel some particular effects. Storage Engine teams -------------------- We assume that the storage engine will publish its capabilities truthfully so we can set FK_STATEMENT_FLAG_NO_READ, FK_STATEMENT_FLAG_NO_STATEMENT_ROLLBACK, FK_STATEMENT_FLAG_NO_SAVEPOINT, FK_STATEMENT_FLAG_NO_CONSISTENT_READ, FK_STATEMENT_FLAG_PESSIMISTIC, and FK_STATEMENT_FLAG_FALCON in fk_prepare_statement_flags(). We suggest that the storage engine should check the value of @@foreign_key_all_engines. If it's on, then the server is responsible for foreign-key checks. If it's off, then the engine is responsible, and the server will try not to interfere. See "Variables". There are points in the descriptions which we marked with comments: "Handler interaction point #1: look up child value in parent table" "Handler interaction point #2: look up parent value in child table" "Handler interaction point #3: cursor over child records" See also WL#3290 "Push down foreign key checks to the storage engine.". [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: you should cancel WL#3290. ] And we need approval for one of the following "handler" proposals: /* Proposal 1 */ class handler { /* Produces an identical copy of the handler: necessary to open a cursor for recursion */ handler *clone(); enum extra_opt { FOREIGN_KEY_OPERATIONS }; /* Inform the engine that index_read/index_next semantics should be changed in this handler instance (this means dirty reads in most cases). */ void extra(enum extra_opt); /* Foreign key read operations: in most engines these operations will use dirty read visibility and possibly next-key or gap locks. */ void index_first(void *record); void index_next(void *record); /* Delete the row under position of the cursor */ void delete_row(); /* Update the row under position of the cursor */ void update_row(void *record); }; /* Proposal 2 */ class handler { /* Produces an identical copy of the handler: necessary to open a cursor for recursion. */ handler *clone(); /* Foreign key read operations: in most engines these operations will use dirty read visibility and possibly next-key or gap locks. */ void fk_read(void *record); void fk_next(void *record); /* Delete the row under position of the cursor */ void delete_row(); /* Update the row under position of the cursor */ void update_row(void *record); }; /* Proposal 3 */ class handler { /* Addresses the recursion problem. */ void store_position(HA_POSITION *out_position); void restore_position(const HA_POSITION *in_position); /* Foreign key read operations: in most engines these operations will use dirty read visibility and possibly next-key or gap locks. */ void fk_read(void *record); void fk_next(void *record); /* Delete the row under position of the cursor */ void delete_row(); /* Update the row under position of the cursor */ void update_row(void *record); }; /* Proposal 4 */ class handler { enum extra_opt { FOREIGN_KEY_OPERATIONS }; /* Inform the engine that index_read/index_next semantics should be changed in this handler instance (this means dirty reads in most cases). */ void extra(enum extra_opt); /* Addresses the recursion problem. */ void store_position(HA_POSITION *out_position); void restore_position(const HA_POSITION *in_position); /* Foreign key read operations: in most engines these operations will use dirty read visibility and possibly next-key or gap locks. */ void index_read(void *record); void index_next(void *record); /* Delete the row under position of the cursor */ void delete_row(); /* Update the row under position of the cursor */ void update_row(void *record); }; /* Proposal 5 */ class Handler_cursor { virtual void open()= 0; virtual void next()= 0; virtual void fetch(void *record)= 0; virtual void positioned_delete()= 0; virtual void positioned_delete()= 0; }; class handler { enum Cursor_visibility { CURSOR_VISIBILITY_DEFAULT, CURSOR_VISIBILITY_FOREIGN_KEY }; /* Create a cursor for a given visibility mode. The plan is that existent index_read/index_next methods are phased out and replaced with operations that redirect the functionality to an instance of Handler_cursor */ virtual Handler_cursor *create_cursor(Cursor_visibility visibility); }; /* Proposal 6 */ class handler { enum extra_opt { FOREIGN_KEY_OPERATIONS }; /* Inform the engine that index_read/index_next semantics should be changed in this handler instance (this means dirty reads in most cases). */ void extra(enum extra_opt); /* Addresses the recursion problem. */ void create_cursor_state(CURSOR_STATE *cursor_state); /* Foreign key read operations: in most engines these operations will use dirty read visibility and possibly next-key or gap locks. */ void cursor_read(CURSOR_STATE *cursor_state, void *record); void cursor_next(CURSOR_STATE *cursor_state, void *record); /* Delete the row under position of the cursor */ void delete_row(); /* Update the row under position of the cursor */ void update_row(void *record); }; Storage Engines In Alphabetic Order ----------------------------------- In this section we list the problems that we anticipate with specific storage engines, See previous section, "storage engine teams", for problems that are common, not for specific particular storage engines. ARCHIVE Archive doesn't work with UPDATE or REPLACE or DELETE, and doesn't have indexes. Uniqueness won't work so archive tables can't be parent tables. And we require indexes on child keys so archive tables can' be child tables. Foreign keys will do nothing. BLACKHOLE Blackhole should return "engine cannot read" so fk_statement_flags will have FK_STATEMENT_FLAG_NO_READ = true, and foreign keys will do nothing. CSV CSV doesn't have indexes. Uniqueness won't work so csv tables can't be parent tables. And we require indexes on child keys so csv tables can't be child tables. Foreign keys will do nothing. DB2 FOR I5/OS DB2 has its own foreign-key code, so we expect DB2 users to start with --foreign-key-all-engines=FALSE, and use the DB2 support. So far the only incompatibility we know of, based on documentation, is: for UPDATE|CASCADE SET NULL, DB2 requires that only one column be nullable (we require that all columns be nullable). EXAMPLE Example should return "engine cannot read" so fk_statement_flags will have FK_STATEMENT_FLAG_NO_READ = true, and foreign keys will do nothing. FALCON Falcon will have to work, we've promised Falcon+Foreign-Key publicly. We will not be responsible for any migration problems if users do not start mysqld with --lower-case-table-names=1. See also WL#3289 "Falcon: Implement Storage Engine API for foreign key support". FEDERATED Federated has bugs and limitations (e.g. see bugs.mysql.com). We must trust any engine that says it can handle transactions etc. But when it doesn't work, dev-runtime shouldn't have to lift a finger to fix it. INNODB See the "Documentation" section for notes about incompatibilities and migration. We expect InnoDB users to start mysqld with --foreign-key-all-engines=FALSE, but we also expect InnoDB to provide support for the handler interfaces, like any other storage engine. [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: you should cancel WL#3291 "Reuse InnoDB native foreign key implementation when possible." ] MARIA Maria capabilities are still unclear. We know that savepoints are planned (WL#3069, WL#3074), versioning is planned (WL#3066), and indexes exist. Perhaps we'll know more after WL#148 is done. MEMORY Memory will have the limitations of "non-transactional" storage engines, but foreign keys will work with RESTRICT. MYISAM MyISAM will have the limitations of "non-transactional" storage engines, but foreign keys will work with RESTRICT. See also WL#4133 MyISAM and Foreign Keys NDB NDB should work, but the NDB people seem to have their own plans (WL#1799 "Support for Foreign Keys in MySQL Cluster"). See also WL#3292 "Foreign Keys: NDB Cluster requirements for SQL-level foreign keys". This observation appeared in the original LLD "Q+A" section: "Q. What shall happen with a foreign key when an insert into a table happens that bypasses the SQL layer? Example: use of NDB API, direct insert into the source table in a federated setup. A. Inserts through NDB API will bypass foreign key checks in the current architecture. We won't guarantee consistency of foreign keys if NDB API is used or a direct insert into the source federated table is used. The problem of NDB API should be dealed with in WL#1799 "Support for Foreign Keys in MySQL Cluster"." PBXT PBXT has its own foreign-key code. Peter tested PBXT in May 2007. He posted the results to dev-private in thread "[Fwd: Re: PBXT stability?]" in May 2008. SOLID SolidDB has its own foreign-key code, so we expect Solid users to start with mysqld --foreign-key-all-engines=FALSE, and use the Solid support. optimizer team -------------- There is no effect on the optimizer team. Especially: don't depend on integrity, since foreign key checks can be turned off and we don't use the "can't trust" flag. For example: CREATE TABLE t1 (s1 INT PRIMARY KEY); CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1(s1)); SELECT * FROM t2 WHERE s1 IN (SELECT s1 FROM t1); In this example, the WHERE clause can't be optimized away. qa team ------- >From the QA team, we need a "dedicated resource" for looking at what happens when supporting foreign keys across different storage engines. From the Q+A in the original LLD: " Q. Shall we support foreign keys across different storage engines? A. The initial implementation will allow this possibility. A QA resource will be dedicated to explore possible consequences of this feature, and based on the results of the investigation this feature will be prohibited or allowed in the release. " [ In February 2009 we decided that transactional-storage-engine tables can't reference, or be referenced by, non-transactional-storage-engine tables. ] replication team ---------------- We want the replication team to confirm that this is okay: Statement replication will "work", usually, if master and slave both use transactional storage engines, or if master and slave both use non-transactional storage engines. Row-by-row replication will "work", period. "We should address rolling upgrade in the specification." docs team --------- See the above section, "Documentation". We will send that to docs-private@mysql.com and wait for approval. =========================================== ADDITIONAL TESTS ================ This section is just some notes about additional tests. The LLD had a few examples of failures by InnoDB, etc. These tests will be added to the lists of tests in the dev-runtime thread "Re: Designing & Implementing test coverage for Foreign keys." Once they are there, they will be removed from this LLD. Example of InnoDB failure with recursive foreign keys and row-by-row processing. SET @@storage_engine = innodb; CREATE TABLE t1 (a INT PRIMARY KEY, b INT, FOREIGN KEY (b) REFERENCES t1 (a) ON UPDATE CASCADE); INSERT INTO t1 VALUES (2,1),(1,1); /* The INSERT fails because, when we try to insert child-key=1, there is no parent-key=1. This is wrong, because we would insert child-key=1 later in the same statement. */ INSERT INTO t1 VALUES (1,1); /* This succeeds. */ UPDATE t1 SET a = 3; /* The UPDATE fails because we check before cascading. */ Also this test must go in, because of BUG#29624 and thread "Re: Should we substitute 0/NULL on ALTER TABLE ?" ... ... Example of InnoDB skipping errors for IGNORE. mysql> drop table t1; Query OK, 0 rows affected (0.04 sec) mysql> create table t1 (a int primary key) engine=innodb; Query OK, 0 rows affected (0.13 sec) mysql> create table t2 (a int primary key, b int, foreign key (b) references t1 (a) on delete restrict); Query OK, 0 rows affected (0.07 sec) mysql> insert into t1 (a) values (1); Query OK, 1 row affected (0.09 sec) mysql> insert into t2 (a, b) values (1,1), (2,1), (3,1); Query OK, 3 rows affected (0.03 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> insert ignore into t2 (a,b) values (2,2); Query OK, 0 rows affected (0.00 sec) mysql> select * from t2; +---+------+ | a | b | +---+------+ | 1 | 1 | | 2 | 1 | | 3 | 1 | +---+------+ 3 rows in set (0.00 sec) InnoDB "buggy scenario". mysql> drop table t1, t2; Query OK, 0 rows affected (0.07 sec) mysql> create table t1 (a int primary key); Query OK, 0 rows affected (0.09 sec) mysql> drop table t1; Query OK, 0 rows affected (0.00 sec) mysql> set @@storage_engine=innodb; Query OK, 0 rows affected (0.00 sec) mysql> create table t1 (a int primary key) engine=innodb; Query OK, 0 rows affected (0.08 sec) mysql> create table t2 (a int primary key, b int, foreign key (b) -> references t1 (a) on delete restrict); Query OK, 0 rows affected (0.10 sec) mysql> insert into t1 (a) values (1); Query OK, 1 row affected (0.05 sec) mysql> insert into t2 (a, b) values (1,1); Query OK, 1 row affected (0.04 sec) mysql> insert into t2 (a, b) values (2,1); Query OK, 1 row affected (0.04 sec) mysql> insert into t1 (a) values (2); Query OK, 1 row affected (0.09 sec) mysql> insert into t2 (a, b) values (2,2); ERROR 1582 (23000): Duplicate entry '2' for key 'PRIMARY' mysql> insert into t2 (a, b) values (3,2); Query OK, 1 row affected (0.04 sec) mysql> insert into t2 (a, b) values (4,2); Query OK, 1 row affected (0.05 sec) mysql> insert into t1 (a) values (3); Query OK, 1 row affected (0.10 sec) mysql> delete ignore from t1; Query OK, 0 rows affected, 1 warning (0.09 sec) mysql> select * from t1; +---+ | a | +---+ | 1 | | 2 | | 3 | +---+ 3 rows in set (0.00 sec) -- no rows are deleted, the statement is aborted, but no error is produced (the error is converted to a warning ... /* InnoDB passes. */ /* The idea here seems to be: even though t1 row is deleted, t2 row is deleted. Umm, kind of obvious. */ set @@storage_engine=innodb; drop table if exists t2, t1; create table t1 (a int primary key); create table t2 (a int primary key, b int, foreign key (b) references t1 (a) on delete cascade); insert into t1 (a) values (1); insert into t2 (a, b) values (1,1), (2,1), (3,1); delete from t1 where a in (select a from t2); select * from t1; /* should return 0 rows */ select * from t2; /* should return 0 rows */ Test that Konstantin and Dmitri thought showed "incorrect" behaviour by InnoDB. Actually behaviour is correct. InnoDB passes. -- Expected result: all records in call_log are identical drop table if exists t2, t1, call_log; drop trigger if exists t1_au_trg; create table t1 (a int primary key) engine=innodb; create table t2 (a int primary key, b int, foreign key (b) references t1 (a) on update cascade on delete cascade) engine=innodb; create table call_log (new_parent int, sum_child int); insert into t1 (a) values (1), (2); insert into t2 (a, b) values (1,1), (2, 2); create trigger t1_au_trg after update on t1 for each row insert into call_log (new_parent, sum_child) select new.a, sum(b) from t2; update t1 set a=a+2; select * from call_log; -- mysql> select * from call_log; -- +------------+-----------+ -- | new_parent | sum_child | -- +------------+-----------+ -- | 3 | 5 | -- | 4 | 7 | -- +------------+-----------+ -- 2 rows in set (0.00 sec) -- Update of the child table is also possible, despite a conflict -- with the update through a cascading action. -- The update query executed inside the trigger sees the changes -- of the cascading action, and vice versa. ... /* This test was introduced as a reason to disallow cascading action with non-transaction tables. The original LLD comment: " cascading actions plus cross-storage engine support may lead to dangling references or records that are impossible to do cascading action on. In the example below we create a valid table structure that is impossible to delete records from: to ensure consistency, non-transactional engines require row-by-row checks, but it is impossible to verify the self-referencing relationship in table t2 in row-by-row mode." To pass, we will allow the ALTER TABLE statements but ignore the foreign keys. */ create table t1 (a int primary key) engine=MyISAM; create table t2 (a int unique, b int unique) engine=Falcon; insert into t1 (a) values (1), (2); insert into t2 (a, b) values (2,1), (1,2); alter table t2 add foreign key (b) references t2 (a) on update cascade on delete cascade; alter table t2 add foreign key (a) references t1 (a) on update cascade on delete cascade; ... The comments re REPLACE mentioned how non-transactional engines could end up with half-done operations. So we're trying to replace a foreign-key which is also a unique-key. We succeed in inserting one row. We get a duplicate-key error for the second row. We delete it. This causes a cascade. We insert again. This fails, there is no parent key. With a non-transactional engine, REPLACE has caused a DELETE. We'lll test a "triple play" -- a single statement that causes an INSERT, and a DELETE, and an UPDATE. CREATE TABLE t1 (s1 INT PRIMARY KEY DEFAULT 3); CREATE TABLE t2 (s1 INT, UNIQUE (s1), FOREIGN KEY (s1) REFERENCES t1 (s1)); CREATE TABLE t3 (s1 INT DEFAULT 3, FOREIGN KEY (s1) REFERENCES t2 (s1) ON DELETE SET DEFAULT); INSERT INTO t1 VALUES (1),(2); INSERT INTO t2 VALUES (1),(2); INSERT INTO t3 VALUES (1),(2); INSERT INTO t2 VALUES (1),(2); ...... Additional test: There was a recent bug report about changing a parent_column value due to alter table auto_increment. This change is not going through 'UPDATE', so see what happens with it when @@foreign_key_all_engines=1. ....... Test: Foreign keys and partitions. We're doing a full-table scan, meaning all partitions will be done. While we're updating, a row moves from an early to a late partition. CREATE TABLE t1 (s1 INT PRIMARY KEY) PARTITION BY LIST (s1) (PARTITION p1 VALUES IN (1), PARTITION P2 VALUES IN (2)); CREATE TABLE t2 (s1 INT, FOREIGN KEY (s1) REFERENCES t1 (s1)) PARTITION BY LIST (s1) (PARTITION pn VALUES IN (NULL), PARTITION p1 VALUES IN (1), PARTITION P2 VALUES IN (2)); INSERT INTO t1 VALUES (1),(2); INSERT INTO t2 VALUES (NULL),(NULL),(1),(1); UPDATE t2 SET s1 = 2; /* Moves rows from early to late */ UPDATE t2 SET s1 = NULL WHERE s1 > 0; /* Might skip due to pruning */ Test: self-reference + partition? CREATE TABLE t1 (s1 INT, PRIMARY KEY (s1), FOREIGN KEY (s1) REFERENCES t1 (s1)) PARTITION BY LIST (s2) (PARTITION p1 VALUES IN (1), PARTITION p2 VALUES IN (2)); INSERT INTO t1 VALUES (1); UPDATE t1 SET s1 = s1 + 1; [ PRIVATE NOTE TO KONSTANTIN AND DMITRI: If there is any further correspondence about Monty's "axioms" in his 2007-04-10 email "URGENT: Draft 1.1 of foreign key planing proposal" I think that there are more answers now. Suggestions: * "IGNORE ... must be supported for all engines" Answer: Yes, to the extent that it's possible. * "Foreign key handling should work ... even if there isn't a key in the child table for the FK relation" Answer: Not a problem, the storage engine worries about finding rows. * "For non transactional tables we don't support cascaded updates" "... we need to provide recursive foreign key checks and cascaded deletes also for not transactional engines" Answer: No. * "Interface assumes cursor stability on updates, which is may hard to achieve for all engines." Answer: Don't understand. * He is assuming that one will do read-next table scans and index scans. Answer: That could be a problem for read consistency. If a row appears twice, that's not our fault. * "I would like that the FK document we produce, will go through every possible FK problem and step by step explain how to solve this." Answer: "Documentation" section, as requested. * He agrees that self-referencing is okay, although the Sorrento meeting obviously said they are not. Answer: Still okay, though apparently Sorrento meeting was for nothing. * "What to do with not transactional tables when you have recursive definitions ..." Answer: It works if the row is in memory. It works if the parent value has already been added. Otherwise, either INSERT (NULL) and then update to the desired value, or set @@foreign_key_checks=0. * "Note that this should work according to ANSI SQL: INSERT IGNORE into family_tree(1, NULL, NULL, "Base ancestor");" Answer: no, it should not work according to ANSI SQL. * "it would make it possible for a non transactional storage engine to implement these with reverse updates" Answer: No. No reverse updates. No compensating transactions. * "In the first version I suggest we should not allow FK that will cause the same table to be used twice through some paths. (Ie, no self referential and no circles); This will greatly simply non transactional table handling as we can always do check before write." ??? From the context, I think he only means "no cycles for non-transactional tables". Answer: We decided to allow cycles. * "Plan for reverse operations to restore rows in case of row-by-row checks (not needed until we have a full solution for self-referential non transactional tables)" Answer: No. No reverse updates. No compensating transactions. * "If a handler doesn't support savepoints it should be treated as a non-transactional table." Answer: No. Savepoints would be necessary if we supported IGNORE with 'cascading action', but we don't. * "We need to produce full documentation how the foreign keys should, work seen from the end user level in all the possible scenarios that MySQL offers." Answer: see "Documentation".
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.