WL#13456: Inject CAST nodes for comparisons with STRING non-const expressions

Affects: Server-8.0   —   Status: Complete

WL#12108 created the infrastructure around injecting CAST nodes in comparisons between non-const expressions of different data types. This WL will implement support for these string-based data type expressions too.

F-1 to F-5 requirements from WL#12108

F-6 Casts to DOUBLE should be injected when comparing a string based data type

   with a numeric one (or YEAR). The cast needs to be injected to the
   arguments that are not of type FLOAT/DOUBLE/REAL.

F-7 Cast to DATETIME should be injected when comparing a string based data type

   with an argument of type DATETIME or TIMESTAMP. The cast needs to be injected
   only to the arguments that is not of DATETIME/TIMESTAMP data type.

F-8 Cast to DATETIME should be injected to both arguments when comparing DATE with

   TIME.

F-9 Cast to DATE should be injected when comparing a string based data type with

   an argument of type DATE. The cast will be injected for the argument that has
   the string based data type.

Where numeric data types are defined as:

   * Integer Types (Exact Value): INTEGER, INT, SMALLINT, TINYINT, MEDIUMINT, 
                                  BIGINT
   * Fixed-Point Types (Exact Value) - DECIMAL, NUMERIC
   * Floating-Point Types (Approximate Value) - FLOAT, DOUBLE, REAL
   * Bit-Value Type - BIT

String-based data types: CHAR, VARCHAR, BINARY, VARBINARY, BLOB, TEXT, ENUM, and SET

When comparing two columns, one of which is of a STRING-based data type, the comparator used will be:

(1) DOUBLE: If one of the columns is of type INT (or any of the sub-types), DECIMAL, DOUBLE/FLOAT/REAL or YEAR (internally also represented as INT). For this case one or both arguments that are not of type DOUBLE/FLOAT/REAL will get wrapped into CASTs to DOUBLE.

example:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_int * 3 = t2.d_varchar;

becomes:

 SELECT * FROM t1 INNER JOIN t2 ON CAST (t1.d_int AS DOUBLE) =
   CAST(t2.d_varchar AS DOUBLE);

(2) DATETIME: If one of the column is of type DATETIME/TIMESTAMP or one of the arguments is of type DATE and the other one of type TIME. For these cases the arguments that are not of DATETIME/TIMESTAMP type will get wrapped into CASTs to DATETIME.

example:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_date = t2.d_time;

becomes:

 SELECT * FROM t1 INNER JOIN t2 ON CAST(t1.d_date AS DATETIME) = 
   CAST(t2.d_time AS DATETIME);

2nd example:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_datetime = t2.d_varchar;

becomes:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_datetime = 
   CAST(t2.d_varchar AS DATETIME);

(3) DATE: If one of the column is of type DATE the argument that is not of DATE type will get wrapped into CAST to DATE.

example:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_date = t2.d_str;

becomes:

 SELECT * FROM t1 INNER JOIN t2 ON t1.d_date = 
   CAST(t2.d_str AS DATE);

The CAST should be injected in the same manner for non-constant expressions as it is done for columns (fields).

example:

 SELECT * FROM t WHERE col_str  = 1 * col_int;

becomes:

 SELECT * FROM t WHERE CAST(col_str AS DOUBLE) =
   CAST((1 * col_int) AS DOUBLE));

Note:

For comparison as double(real) there are only two scenarios where we skip injecting the CAST nodes:

- when comparison is between numeric arguments
- when arguments are of type YEAR and TIME (we fallback to comparison as DOUBLE as a last resort) 

The Arg_comparator's inject_cast_nodes method previously returned early when it recognized that one of the arguments was a STRING based data type (for the REAL comparator) or a STRING based data type excluding the temporal ones (for the DATETIME comparator). This was done by checking that the arguments are only of temporal type or numeric. With this WL, STRING based arguments are also accepted and wrapped with CAST nodes as needed.

Identifying the data type for an argument in MySQL is done using the following functions belonging to the Item class:

- result_type(): which can be INT_RESULT, REAL_RESULT and DECIMAL_RESULT for
  the numeric data types (plus YEAR), or STRING_RESULT for the STRING related
  data types (CHAR, VARCHAR, BLOB, VARBINARY..), or the DATE/TIME types.
- data_type(): returns an exact data type (some of which are only used
  internally like MYSQL_TYPE_TIME2) to more precisely discriminate between the
  arguments