WL#6614: Define and reimplement IGNORE
Affects: Server-5.7 — Status: Complete — Priority: Medium
This worklog aims to define what the IGNORE keyword actually should mean and to reimplement it so that it is consistent across the supported statements and more easy to maintain. The current IGNORE implementation has resulted in quite a number of bugs: BUG#30191, BUG#49539, BUG#55421, BUG#54543, BUG#54106, BUG#49534, BUG#47788, BUG#46539, and BUG#46425. IGNORE is supported for the following statements: DELETE - http://dev.mysql.com/doc/refman/5.6/en/delete.html "The IGNORE keyword causes MySQL to ignore all errors during the process of deleting rows. (Errors encountered during the parsing stage are processed in the usual manner.) Errors that are ignored due to the use of IGNORE are returned as warnings." INSERT - http://dev.mysql.com/doc/refman/5.6/en/insert.html "If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row still is not inserted, but no error is issued. IGNORE has a similar effect on inserts into partitioned tables where no partition matching a given value is found. Without IGNORE, such INSERT statements are aborted with an error; however, when INSERT IGNORE is used, the insert operation fails silently for the row containing the unmatched value, but any rows that are matched are inserted. For an example, see Section 17.2.2, “LIST Partitioning”. Data conversions that would trigger errors abort the statement if IGNORE is not specified. With IGNORE, invalid values are adjusted to the closest values and inserted; warnings are produced but the statement does not abort. You can determine with the mysql_info() C API function how many rows were actually inserted into the table." UPDATE - http://dev.mysql.com/doc/refman/5.6/en/update.html "With the IGNORE keyword, the update statement does not abort even if errors occur during the update. Rows for which duplicate-key conflicts occur are not updated. Rows for which columns are updated to values that would cause data conversion errors are updated to the closest valid values instead." LOAD DATA INFILE - http://dev.mysql.com/doc/refman/5.6/en/load-data.html "If you specify IGNORE, input rows that duplicate an existing row on a unique key value are skipped. If you do not specify either option, the behavior depends on whether the LOCAL keyword is specified. Without LOCAL, an error occurs when a duplicate key value is found, and the rest of the text file is ignored. With LOCAL, the default behavior is the same as if IGNORE is specified; this is because the server has no way to stop transmission of the file in the middle of the operation." LOAD XML - http://dev.mysql.com/doc/refman/5.6/en/load-xml.html (not explicitly documented) ALTER TABLE - http://dev.mysql.com/doc/refman/5.6/en/alter-table.html "IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value." It has been decided to depricate "ALTER IGNORE" in 5.6 and remove it in 5.7. See WL#7395 for more details. CREATE TABLE ... IGNORE SELECT ... - http://dev.mysql.com/doc/refman/5.6/en/create-table.html (not explicitly documented)
Functional requirements: ------------------------ F-1: IGNORE keyword should affect only specified set of errors. The list of errors is : ER_SUBQUERY_NO_1_ROW ER_ROW_IS_REFERENCED_2 ER_NO_REFERENCED_ROW_2 ER_BAD_NULL_ERROR ER_DUP_ENTRY ER_DUP_ENTRY_WITH_KEY_NAME ER_DUP_KEY ER_VIEW_CHECK_FAILED ER_NO_PARTITION_FOR_GIVEN_VALUE ER_NO_PARTITION_FOR_GIVEN_VALUE_SILENT ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET These errors occur for each record. F-2: The errors affected by IGNORE keyword should be downgraded from error to warning when IGNORE keyword is used in the supported statements. F-3: Changing (Downgrading) an error to warning will cause the statement to skip to the next row instead of aborting the statement. F-4: Warning message should be displayed when an error is downgraded to warning.
This worklog consist of two parts: 1) Define what IGNORE should mean 2) Re-implement IGNORE 1) Define IGNORE ================ As a basis, consider Kostja's quote in BUG#30191: "The purpose of this clause is to insert as many records as possible, skipping records that violate various constraints (i.e. taking no action on those records).The same principle applies to DELETE IGNORE, UPDATE IGNORE." IGNORE should work by downgrading error to warning, causing a record to be skipped without aborting the statement. It should be defined exactly which errors (and/or classes of errors) are affected. Based on the study of the existing implementation and existing behavior, the errors affected by IGNORE are listed below: ER_SUBQUERY_NO_1_ROW 1242 "Subquery returns more than 1 row" ER_ROW_IS_REFERENCED_2 1451 "Cannot delete or update a parent row: a foreign key constraint fails (%.192s)" ER_NO_REFERENCED_ROW_2 1452 "Cannot add or update a child row: a foreign key constraint fails (%.192s)" ER_BAD_NULL_ERROR 1048 "Column \'%-.192s\' cannot be null" ER_DUP_ENTRY 1062 "Duplicate entry \'%-.192s\' for key %d" ER_DUP_ENTRY_WITH_KEY_NAME 1586 "Duplicate entry \'%-.64s\' for key \'%-.192s\'" ER_DUP_KEY 1022 "Can\'t write; duplicate key in table \'%-.192s\'" ER_VIEW_CHECK_FAILED 1369 "CHECK OPTION failed \'%-.192s.%-.192s\'" ER_NO_PARTITION_FOR_GIVEN_VALUE 1526 "Table has no partition for value %-.64s" ER_NO_PARTITION_FOR_GIVEN_VALUE_SILENT 1591 "Table has no partition for some existing values" ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET 1748 "Found a row not matching the given partition set" These errors occur for each record. 2) Re-implement IGNORE ====================== Existing implementation: ----------------------- The current implementation of IGNORE sets the ignore flag in the object of LEX class for the statements which support IGNORE keyword. We set it to true if we want to downgrade errors to warnings for specific statements. From the object of LEX class, 'ignore' flag is passed as bool argument in some functions. This 'ignore' flag is used to: - Downgrade an error to warning if 'ignore' flag is set. - pass to a lower level function, which will again do the same. - Set flags like COPY_INFO->ignore, SELECT_LEX->no_error. New Implementation : -------------------- In order to get consistent behavior of IGNORE across different statements which support IGNORE keyword, IGNORE should be re-implemented using Internal_error_handler. - A class Ignore_error_handler should be created which inherits from Internal_error_handler class. - This error handler for IGNORE will change the severity level of specified set of errors to warning level in case IGNORE keyword is specified in the statement. - The Ignore_error_handler object should be pushed only once for one statement. - The error handler for IGNORE should be pushed if IGNORE keyword is used in the supported statement. - Ignore flag should be eliminated from all possible places in the code. Behaviour Changes ------------------ There are some behavior changes that should be done in this worklog to make the behavior of the IGNORE more consistent. a. In the current behaviour, some handler errors ( returned from storage engines) are ignored by IGNORE keyword but no warning is shown instead. This is because Ignore flag is used to skip print_error() which maps handler errors to MySQL errors. The errors are : HA_ERR_FOUND_DUPP_KEY -> ER_DUP_ENTRY -> ER_DUP_ENTRY_WITH_KEY_NAME When UNIQUE KEY constraint is violated. HA_ERR_ROW_IS_REFERENCED -> ER_ROW_IS_REFERENCED_2 When Foreign key constraint on parent table fails. HA_ERR_NO_REFERENCED_ROW -> ER_NO_REFERENCED_ROW_2 When Foreign key constraint on child table fails on insert / update. HA_ERR_NOT_IN_LOCK_PARTITIONS -> ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET When a row does not matches the given partition set. HA_ERR_NO_PARTITION_FOUND: -> ER_NO_PARTITION_FOR_GIVEN_VALUE ER_NO_PARTITION_FOR_GIVEN_VALUE_SILENT When a table has no partition for a given row. A warning message should be displayed when above errors are ignored. In STRICT mode, errors will not be downgraded to warnings. So, we have behavior changes for handler errors mentioned above as: Current behavior : .................. Default -Error. in SRICT mode -Error (same as Default behavior). With IGNORE -No Error / Warning. Statement skips to the next row instead of aborting. With IGNORE in STRICT Mode -No Error / Warning. Statement skips to the next row instead of aborting. New Behavior : .............. Default -Error. in SRICT mode -Error (same as Default behavior). With IGNORE -Warning. Statement skips to the next row instead of aborting. With IGNORE in STRICT Mode -Warning. Statement skips to the next row instead of aborting. b. For a view with CHECK constraint, when the CHECK constraint is violated it should give an error. Using IGNORE keyword should downgrade errors to warnings. In current behavior, inserting single row which violates CHECK constraint resulted in error even if IGNORE keyword is used. Inserting multiple rows with IGNORE, in which one or more rows violate CHECK constraint results in warnings. This behaviour was introduced by the fix of bug : http://bugs.mysql.com/bug.php?id=5995 This is inconsistent behaviour. If a statement affects single record and IGNORE is used, errors should be downgraded to the warnings. c. The proposed behavior change in IGNORE + STRICT mode is mentioned in the section below along with the current behavior of IGNORE + STRICT. Behavior with STRICT mode -------------------------- Following behavior should be followed for the statements : a. When the default behavior of the statement is error. Default : Error in SRICT mode : Error (same as Default behavior). With IGNORE : Warning With IGNORE in STRICT mode : Warning This behavior is same as the behavior in the trunk. It should be followed for all the errors mentioned above. ER_BAD_NULL_ERROR error behaves differently which is described below. b. When the default behavior of the statement is warning. Default : Warning With IGNORE : Warning (same as Default behavior). in STRICT mode : Error With IGNORE in STRICT mode : Warning This behavior should be followed for the following list of errors : ER_TRUNCATED_WRONG_VALUE ER_WRONG_VALUE_FOR_TYPE ER_WARN_DATA_OUT_OF_RANGE ER_DIVISION_BY_ZERO ER_TRUNCATED_WRONG_VALUE_FOR_FIELD WARN_DATA_TRUNCATED ER_DATA_TOO_LONG ER_BAD_NULL_ERROR ER_NO_DEFAULT_FOR_FIELD ER_NO_DEFAULT_FOR_VIEW_FIELD ER_CUT_VALUE_GROUP_CONCAT ER_DATETIME_FUNCTION_OVERFLOW ER_WARN_TOO_FEW_RECORDS ER_WARN_NULL_TO_NOTNULL ER_INVALID_ARGUMENT_FOR_LOGARITHM ER_WRONG_ARGUMENTS Note : - This list is based on the current behavior checked on trunk. - This list of errors is compiled from the errors mentioned in WL#6891. - WL#6891 may change some behavior to make STRICT mode more consistent which should be documented there. - It is a subset of the errors which are affected by STRICT mode as in the current behavior, IGNORE keyword does not downgrade all errors (which were warning by default but upgraded to error because of STRICT mode) to warnings in the STRICT mode. c. The case with ER_BAD_NULL_ERROR. There will be no behavior change in the worklog for NOT NULL constraint violation. Effect of behavior changes on Replication: ------------------------------------------ If we have a scenario where an error which is being converted to warning now but will remain error after this worklog, the replication may break for the case : Old master server --> New slave server(with reimplemented IGNORE) Besides this, New implementation of IGNORE and above behavior changes should not affect replication. Effect of behavior changes on Upgrade/Downgrade: ------------------------------------------------- This worklog will not affect upgrade / downgrade scenarios. Places in the code to remove ignore flag check for errors --------------------------------------------------------- a. From the definition of functions mysql_alter_table() mysql_insert() mysql_update() mysql_multi_update() mysql_load() view_check_option() fill_record_n_invoke_before_triggers() fill_record() check_record() select_create() select_insert() multi_update() mysql_delete() JOIN::exec() read_fixed_length() read_sep_field() read_xml_field() write_execute_load_query_log_event() copy_data_between_tables() b. From the Class variable COPY_INFO->ignore Load_log_event->ignore Note: From the discussion with Replication team, Load_log_event is pre 5.1 GA. It is kept to maintain backward compatibility in replication. Other use cases of Ignore Flag : -------------------------------- I. To decide if Internal_error_handler for IGNORE and STRICT should be pushed. II. To print the Info about the number of rows affected and number of duplicates found. III. Pass HA_EXTRA_IGNORE_DUP_KEY flag to handler. IV. Set IGNORE_FLAG for Load_log_event object. V. To use row based logging in case of unsafe statements. Unsafe statements are: - UPDATE IGNORE. - CREATE TABLE...IGNORE/REPLACE SELECT... - INSERT...SELECT...ON DUPLICATE KEY UPDATE/REPLACE SELECT/ INSERT...IGNORE...SELECT , unless ORDER BY PRIMARY KEY clause is used in SELECT statement. VI. To set ignore flag of class Alter_inplace_info. It is later used by storage engine. ( to be removed in 5.7 as ALTER IGNORE will be removed by WL#7395). VII. To check if check_read_removal() function should be called. VIII.Return VIEW_CHECK_SKIP or VIEW_CHECK_ERROR in view_check_option() depending on the ignore flag. IX. To push suitable warning message based on IGNORE keyword and STRICT mode.
Copyright (c) 2000, 2017, Oracle Corporation and/or its affiliates. All rights reserved.