WL#6891: Define and reimplement STRICT mode

Affects: Server-Prototype Only   —   Status: Complete

This worklog is similar to WL#6614 "Define and reimplement IGNORE",
but instead for STRICT mode.

IGNORE downgrades errors to warnings to allow e.g. UPDATE, INSERT and DELETE 
statements to skip a row which would have otherwise have caused the statement
to abort. STRICT mode does the opposite - upgrade warnings to errors.

Similarly to IGNORE, STRICT mode is currently not very clearly defined and the 
implementation is a source of bugs like: BUG#42910, BUG#5929, BUG#43880,
BUG#48637, BUG#5912, and BUG#5913.

This is the motivation behind this worklog.
It is assumed that the implementation of these two worklogs can
be quite similar - both with regard to code design (using Internal_error_handler)
and test coverage.

NOTE:
====
- It is decided to deprecate ERROR_FOR_DIVISION_BY_ZERO, NO_ZERO_DATE and
  NO_ZERO_IN_DATE SQL MODES and make their functionality part of STRICT MODE.
  See WL#7467 for details.
  Here STRICT MODE refers to STRICT_ALL_TABLES and STRICT_TRANS_TABLES.

- STRICT mode currently works differently for SELECT. Warnings are NOT escalated 
  to errors. 
Functional Requirements:
========================

F-1: User should experience a consistent behaviour of 
     STRICT MODE across all supported statements. 

F-2: User should know which statements are affected by STRICT MODE and which 
     errors are upgraded from Warnings to Errors in STRICT MODE.
     It should be explicitly enlisted in the Documentation.

F-3: STRICT MODE should affect only specified set errors. The below list of   
     errors is based on existing test coverage in mysql-trunk:
     
     ER_TRUNCATED_WRONG_VALUE 		
     ER_WRONG_VALUE_FOR_TYPE 		
     ER_WARN_DATA_OUT_OF_RANGE		
     ER_TRUNCATED_WRONG_VALUE_FOR_FIELD	
     WARN_DATA_TRUNCATED			
     ER_DATA_TOO_LONG			
     ER_BAD_NULL_ERROR			
     ER_DIVISION_BY_ZERO			
     ER_NO_DEFAULT_FOR_FIELD		
     ER_NO_DEFAULT_FOR_VIEW_FIELD		
     ER_CUT_VALUE_GROUP_CONCAT	
     ER_DATETIME_FUNCTION_OVERFLOW		
     ER_WARN_NULL_TO_NOTNULL
     ER_WARN_TOO_FEW_RECORDS	
     ER_TOO_LONG_KEY		
     ER_WRONG_ARGUMENTS
     ER_INVALID_ARGUMENT_FOR_LOGARITHM

     The above errors represent class of errors in which input value either
     its invalid (either it has the wrong data type for the column or it might
     be out of range out of range) or it is missing (a value is missing when a
     new row to be inserted does not contain a value for a non-NULL column 
     that has no explicit DEFAULT clause in its definition).

F-4: STRICT MODE should be applicable to DML statements. INSERT, UPDATE, LOAD,
     MULTI UPDATE, DELETE, MULTI DELETE, ALTER TABLE, INSERT SELECT in which     
     case an invalid tuple can be inserted / deleted into / from the table and 
     some DDL statements like CREATE TABLE, CREATE TABLE SELECT, and SELECT 
     sleep() statements in which case some value might be out of range. This 
     list of statements is based on existing test coverage.
     
F-5: The errors affected by STRICT MODE and supported statements should be
     upgraded from warning to error when STRICT MODE is set in the session.

F-6: Relationship of STRICT with IGNORE, should basically have No Effect,
     when IGNORE is specified in a statement and STRICT MODE is set in a
     session. Errors and Warnings should remain as it is.

F-7: An error message should be displayed if a Warning is upgraded to Error in
     STRICT MODE.

Non-functional Requirements:
============================

NF-1: Implicit Requirement: This Worklog should not break any of the existing
      features.

NF-2: Behaviour of STRICT MODE should be consistent across all supported 
      statements backed by a good test coverage.
This Worklog consists of two parts:

1) Define what STRICT MODE should mean
2) Re-implement STRICT MODE

1) Define STRICT MODE
=====================
According to manual, http://dev.mysql.com/doc/refman/5.7/en/server-sql-mode.html
Strict mode controls how MySQL handles input values that are invalid or missing.
A value can be invalid for several reasons. For example, it might have the wrong
data type for the column, or it might be out of range. A value is missing when a
new row to be inserted does not contain a value for a non-NULL column that has 
no
explicit DEFAULT clause in its definition. (For a NULL column, NULL is inserted
if the value is missing.)

The existing implementation has been studied and these are the errors that
are affected by STRICT MODE according to current implementation and 
existing test coverage for STRICT MODE.

Errors affected by INSERT, UPDATE Statements
--------------------------------------------

NOTE: For all these errors, Coverage there in strict.test

ER_TRUNCATED_WRONG_VALUE: 		1292, "Truncated incorrect %-.32s value: 
\'%-.128s\'"
ER_WRONG_VALUE_FOR_TYPE: 		1411, "Incorrect %-.32s value: 
\'%-.128s\' for
function %-.32s"
ER_WARN_DATA_OUT_OF_RANGE:		1264, "Out of range value for column 
\'%s\' at row %ld"
ER_TRUNCATED_WRONG_VALUE_FOR_FIELD:	1366, "Incorrect %-.32s value: 
\'%-.128s\'
for column \'%.192s\' at row %ld"
WARN_DATA_TRUNCATED:			1265, "Data truncated for column \'%s\' 
at row %ld"
ER_DATA_TOO_LONG:			1406, "Data too long for column \'%s\' 
at row %ld"
ER_BAD_NULL_ERROR:			1048, "Column \'%-.192s\' cannot be 
null"
ER_DIVISION_BY_ZERO:			1365, "Division by 0"

/* Coverage there in strict.test as well as view.test line 1834 */
ER_NO_DEFAULT_FOR_FIELD:		1364, "Field \'%-.192s\' doesn\'t have a 
default value"

/* Coverage there in view.test line 1836 */
ER_NO_DEFAULT_FOR_VIEW_FIELD:		1423, "Field of view \'%-.192s.%-.192s\'
underlying table doesn\'t have a default value"

/* Coverage there in func_gconcat.test line 800 */
ER_CUT_VALUE_GROUP_CONCAT:		1260, "Row %u was cut by GROUP_CONCAT()"

/* Coverage there in func_date_add.test line 54 */
ER_DATETIME_FUNCTION_OVERFLOW:		1441, "Datetime function: %-.32s field 
overflow"

/* Coverage there in func_math.test line 600 */
ER_INVALID_ARGUMENT_FOR_LOGARITHM       1903 "Invalid argument for logarithm"

Affected by LOAD DATA Statement.
--------------------------------

/* Coverage there in wl_6030.test */
ER_WARN_NULL_TO_NOTNULL:		1263, "Column set to default value;  
NULL supplied to NOT NULL column \'%s\' at row %ld"

/*New Coverage added in draft patch submitted */
ER_WARN_TOO_FEW_RECORDS, 1261, "Row %ld doesn\'t contain data for all columns"  

Errors affected by CREATE Statement
-----------------------------------

/* For these 2 errors, coverage there in strict.test */
ER_TOO_LONG_TABLE_COMMENT:		1628, "Comment for table \'%-.64s\' is 
too long (max
= %lu)"
ER_TOO_LONG_FIELD_COMMENT:		1629, "Comment for field \'%-.64s\' is 
too long (max
= %lu)"

/* For this error, Coverage there in comment_index.test */
ER_TOO_LONG_INDEX_COMMENT:		688, "Comment for index \'%-.64s\' is 
too long (max
= %lu)"

/* Affected by CREATE, Coverage there in i_main.strict test, myisam.test line
347. */
ER_TOO_LONG_KEY:			1071, "Specified key was too long; max 
key length is %d bytes"

Error Affected by SELECT sleep. 
------------------------------

/* Coverage there in i_main.sleep */
ER_WRONG_ARGUMENTS:			1210, "Incorrect arguments to %s"

2) Re-implement STRICT MODE:
===========================
Existing implementation:
------------------------
The implementation of STRICT MODE is such that we have abort_on_warning
flag in THD class. Generally we set abort_on_warning flag if either of
STRICT_ALL_TABLES or STRICT_TRANS_TABLES are active and we don't have IGNORE.
We set it to true if we want to upgrade from warnings to errors. Warnings are
set by push_warning() or push_warning_printf() function which in 
turn calls THD::raise_condition() where these warnings are converted to errors 
due to this code snippet.

  if ((level == Sql_condition::SL_WARNING) &&
      really_abort_on_warning())
  {
    /*
      FIXME:
      push_warning and strict SQL_MODE case.
    */
    level= Sql_condition::SL_ERROR;
    killed= THD::KILL_BAD_DATA;
  }
abort_on_warning is not set for all statements (= STRICT MODE applies to
only some statements). And it's not set for the duration of the whole 
statements like INSERT, UPDATE etc. Initially it is switched ON in DML's
and in Stored Routines it is switched OFF. Also setting it ON and OFF
and vice versa is sprinkled at many places in the code, which makes
it quite difficult to understand the existing implementation. Only resort
was test coverage which we had for STRICT MODE. This test coverage was
studied and these are the statements which are affected by STRICT MODE.

Statements affected ARE INSERT, UPDATE, LOAD, CREATE TABLE, 
CREATE TABLE SELECT, MULTI UPDATE, DELETE, MULTI DELETE ALTER TABLE,
INSERT SELECT, and SELECT sleep().
CALL Stored Procedure(), SELECT Stored function() are affected
in such a way that we store the sql_mode at the time when stored 
routine was created and use it when executing it irrespective of
the mode in which it is invoked. Any individual INSERT, UPDATE
statements that are affected by STRICT MODE give error if the
stored routine is created in STRICT MODE.

Motivation and Guidelines for New Implementation:
------------------------------------------------
The motivation behind reimplementation of STRICT MODE is we
want to get consistent behaviour and clean implementation of
STRICT MODE across the different supported statements.
This worklog is essentially about replacing abort_on_warning
with an Internal_error_handler activated just for STRICT MODE.

Instead of turning this flag on and off and checking it inside
THD::raise_condition(), we push an Internal_error_handler (Strict error
handler) at one point during a statement execution and pop it at another point
during the statement execution (so only one push and one pop per statement).
This Internal_error_handler upgrades a specific set of warnings to
errors. Currently those errors which were affected by STRICT MODE in old
implementation are converted to errors in STRICT MODE according to new 
implementation. 

Effect on Replication, Upgrade/Downgrade:
=========================================
Effect of STRICT MODE on replication and upgrade/downgrade issue has
been discussed at length in the email threads exchanged between replication
and runtime team. In brief, Bug fixes/new features affecting operations in
STRICT MODE may result in replication failure on upgraded slave.

Since we are reimplementing STRICT MODE and some statements or
Stored Routines/Triggers which were not giving error in 
STRICT MODE will start giving error, or previously they were
producing errors in STRICT MODE and now they are not 
replication can break if 
 
Older master (WL is not there) --> Newer slave (WL is there)

Therefore, the proper procedure to do upgrade/downgrade during replication must
be thoroughly documented. In brief, to do upgrade stop all new statements on the
master and wait until the slaves catch up. Then upgrade the slaves followed by
the master.
 
Alternatively, if new statements cannot be stopped, temporarily
change to row-based logging on the master (binlog_format=ROW) and
wait until all slaves have processed all binary logs produced up to
the point of this change. Then upgrade the slaves followed by the
master and change the master back to statement-based logging.
This will ensure Replication is not broken if we follow this procedure of
upgrade/downgrade.

Other than that, I think it should not have any affect
on the above modules.

Behaviour Changes:
==================
Initially in the draft patch submitted for this WL, the behavioral changes are
kept at minimum to keep the backward compatibility. All the MTR testcases pass.
However, based on the feedback from support and replication, the behaviour can
change. 
Also, With this draft patch it is verified that some existing bugs related to
STRICT MODE are handled. Here is the description of the bugs/behaviour changes
addressed with the draft patch.

1. According to MySQL manual
http://dev.mysql.com/doc/refman/5.7/en/create-trigger.html "MySQL stores the
sql_mode system variable setting that is in effect at the time a trigger is
created, and always executes the trigger with this setting in force, regardless
of the server SQL mode in effect when the event begins executing."
With the previous implementation this rule was not functioning properly.
With the new implementation, this bug has been solved in the draft patch 
submitted for this WL and the behaviour of STRICT MODE is consistent in triggers
now.

2. Also with the draft patch submitted for this WL, it is verified
that in sp-vars.test it was aborting for SELECT Stored function() when the
function is not created in STRICT MODE, and vice versa breaking the established
rule that we execute Stored Routines with the sql mode that is in effect at the
creation time irrespective of the sql mode that is in effect when the routine is
called. 
Although there was no bug logged for this issue, this issue has also 
been addressed in the draft patch.

3.ER_DUP_INDEX will be removed from being affected by STRICT MODE, since the
text of this error says "This is deprecated and will be disallowed in a future
release" A warning that something is deprecated makes sense. But an error that
something is deprecated does not make sense.

NOTE:
=====
It is decided to create a single tree or branch for both IGNORE (WL#6614) and
STRICT MODE (WL#6891) Worklogs
1. Since the implementation of both these WL's is quite similar in nature.
2. It will be easier to review and to perform QA.

NOTES TO DOCUMENTATION:
======================

LIST OF STATEMENTS AFFECTED BY STRICT MODE:
------------------------------------------


LIST OF ERRORS AFFECTED BY STRICT MODE:
---------------------------------------
To reimplement Strict mode, we are basically reusing the Internal error handler
class. A class named Strict error handler is created which inherits from
Internal_error_handler class.
Strict error handlers are pushed in sql_parse.cc file before the function calls
like mysql_insert(), mysql_update(), mysql_multi_update(), mysql_load(),
mysql_create_table(), select_create(), mysql_alter_table() etc. which are 
affected by STRICT MODE.
It is also pushed during execution of Stored Routines CALL SP(), SELECT SF() to 
adhere to the established behaviour that the sql_mode at the time of creation 
of Stored Routines will be honoured irrespective of the sql_mode in which
it is invoked.
Also unnecessary setting and unsetting of abort_on_warning flag is completely
removed from the code.