WL#10778: Parser: output deprecation warnings on utf8 references, where utf8mb3 is an alias of utf8

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Currently the utf8 charset is the alias for utf8mb3.

Since we are going to completely replace utf8mb3 with utf8mb4, it is logical
to notify customers that the utf8 character set alias and the related syntax
constructs will change their meaning soon:

* `utf8` charset alias itself,

* `_utf8` charset prefix,

* `N'...'` string literals,

* `NATIONAL`, `NCHAR` etc. data types.

The current WL is intended to investigate all affected cases in the grammar and
force deprecation warnings on them where applicable.

* NF-1 A warning will be output whenever the parser sees `NATIONAL`, `N[VAR]CHAR` or `N'...'`.
* NF-2 A warning will be output whenever the parser sees `utf8` used as a character set name, or `_utf8'...'`.

Affected grammar cases

There are a number of places in the current MySQL grammar where UTF8 is resolved as an alias of *utf8mb3* rather than *utf8mb4*, so a warning should be output in each of them:

1. String literals (both DML and DDL)

* String constants with the "national" charset: `N'...'`.

* Charset-prefixed string constants: `_utf8 '...'`.
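
The two literal forms above can be tried with a short session like the following. This is a minimal sketch, assuming a MySQL 8.0 server where this worklog is in effect; each `SHOW WARNINGS` is expected to list the deprecation warning raised by the statement before it, and the exact warning text is deliberately not reproduced here.

```sql
-- "National" string literal: the parser is expected to warn here (NF-1).
SELECT N'abc';
SHOW WARNINGS;

-- Charset-prefixed literal using the utf8 alias: expected to warn (NF-2).
SELECT _utf8 'abc';
SHOW WARNINGS;

-- Unambiguous spelling for comparison: names utf8mb4 explicitly,
-- so it should not trigger the utf8 alias warning.
SELECT _utf8mb4 'abc';
SHOW WARNINGS;
```

`SHOW WARNINGS` follows every statement because it reports diagnostics only for the most recent non-diagnostic statement.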

2. "National" data types (DDL, DML)

There are a number of "national" data types, such as `NCHAR`, `NVARCHAR` and types declared with the `NATIONAL` keyword, that default to the *utf8mb3* charset (*utf8_general_ci* collation).

Note: UNICODE is for UCS2, not UTF8.
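
As an illustration of the "national" data types in DDL, the sketch below declares a few such columns and then inspects the result. The table names are hypothetical and chosen only for this example; it assumes a MySQL 8.0 server where this worklog is in effect.

```sql
-- Hypothetical table name, used only for illustration.
CREATE TABLE wl10778_national_demo (
  a NCHAR(10),
  b NVARCHAR(10),
  c NATIONAL CHAR(10),
  d NATIONAL VARCHAR(10)
);
-- Expected to list the deprecation warning(s) raised by the national types.
SHOW WARNINGS;

-- Shows which character set the columns were actually resolved to.
SHOW CREATE TABLE wl10778_national_demo;

-- Unambiguous alternative: spell out the character set explicitly.
CREATE TABLE wl10778_explicit_demo (
  a CHAR(10) CHARACTER SET utf8mb4
);
SHOW WARNINGS;

-- Clean up the demo tables.
DROP TABLE wl10778_national_demo, wl10778_explicit_demo;
```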

3. *utf8* charset references

The UTF8 charset can be referenced from DML and DDL as a part of:

* type declarations (columns, SP parameters/variables/return data types),

* table and schema declarations,

* string conversion function calls,

* administrative statements -- connection/client charset manipulation.

In those contexts the charset can be referenced in the form of:

* identifier: `utf8`,

* quoted identifier (MySQL extension): `` `utf8` ``,

* standard quoted identifier/MySQL string: `"utf8"`,

* standard string: `'utf8'`.
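
The four spellings listed above can be compared side by side. The sketch below uses `SET NAMES` only because it accepts all of these forms; it assumes a MySQL 8.0 server where this worklog is in effect, and each form is expected to produce the same deprecation warning.

```sql
-- Plain identifier.
SET NAMES utf8;
SHOW WARNINGS;

-- Backtick-quoted identifier (MySQL extension).
SET NAMES `utf8`;
SHOW WARNINGS;

-- Double-quoted: a string by default, an identifier under ANSI_QUOTES.
SET NAMES "utf8";
SHOW WARNINGS;

-- Standard single-quoted string.
SET NAMES 'utf8';
SHOW WARNINGS;

-- Restore the session's default connection character set.
SET NAMES DEFAULT;
```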


* Data type declarations:

        CHAR [BINARY] CHARSET utf8,
        VARCHAR [BINARY] CHARSET `utf8`,
        TEXT [BINARY] CHARSET "utf8",


* Some of them can be used as cast types in string conversion functions:

        CAST(... AS <cast type>)
        CONVERT(..., <cast type>)

* Direct *utf8* charset references in `CHARSET`/`CHARACTER SET`/`COLLATION` clauses of various DDL and administrative statements:

        SET CHARSET utf8

* Direct charset references (without a type definition) in function calls:

        CHAR(... USING utf8)
        CONVERT(... USING utf8)

* `SET NAMES utf8`
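
The conversion-function cases above can be exercised as follows. This is a sketch under the assumption of a MySQL 8.0 server where this worklog is in effect; the literal values are arbitrary, and each `SHOW WARNINGS` is expected to report the deprecation warning for the preceding statement.

```sql
-- Cast type carrying a utf8 charset reference.
SELECT CAST('abc' AS CHAR CHARACTER SET utf8);
SHOW WARNINGS;

-- Same cast type in the two-argument CONVERT form.
SELECT CONVERT('abc', CHAR CHARACTER SET utf8);
SHOW WARNINGS;

-- Direct charset reference without a type definition.
SELECT CONVERT('abc' USING utf8);
SHOW WARNINGS;

SELECT CHAR(0x41 USING utf8);
SHOW WARNINGS;

-- Unambiguous spelling for comparison: should not trigger the alias warning.
SELECT CONVERT('abc' USING utf8mb4);
SHOW WARNINGS;
```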