WL#3811: Deprecate GLOBAL @@collation_database, @@character_set_database; assignment of SESSION counterparts

Affects: Server-5.7   —   Status: Complete

Session collation_database and character_set-database variables
have broken support now and completely unsupported by 5.x-objects,
such as view, stored routine, triggers, events. Thus, we have
two options:

  - fix full support for all these objects, which seems to be
    quite a complex task, which requires thorough design phase;

  - change the semantics of the variables. The proposal is to
    make them readonly so that they provide a way to get character
    set/collation of the current database only.

If SESSION collation_database, character_set_database are readonly,
there is no reason in having GLOBAL counterparts of them.

So, the proposal of this item is:

  1) Deprecate GLOBAL collation_database, character_set_datavase
    variables -- deprecate in 5.7, remove in 5.8;

  2) Make SESSION collation_database, character_set_database variables
    read only -- deprecate in 5.7, remove in 5.8;

Affected bugs:
BUG#35357 "character_set_database does not effects CREATE DATABASE without
character set"
BUG#27208 If no current database, character_set_database !=character_set_server
BUG#27687 ALTER TABLE ... CONVERT TO uses @@collation_database


Documented behavior
====================

MySQL manual never mentions global variables @@collation_database,
@@character_set_database. 

Speaking of their SESSION counterparts, the following information
is provided in the manual:

--cut: charset.xml  
The character set and collation for the default database can be
determined from the values of the @@character_set_database and
@@collation_database system variables. The
server sets these variables whenever the default database
changes. If there is no default database, the variables have the
same value as the corresponding server-level system variables,
@@character_set_server and @@collation_server.
--end cut


--cut: charset.xml
SET CHARACTER SET is similar to SET NAMES but sets
@@character_set_connection and @@collation_connection
to @@character_set_database and @@collation_database.
--end cut

--cut: database-administration.xml
@@character_set_database
  The character set used by the default database. The server
  sets this variable whenever the default database changes.
  If there is no default database, the variable has the same
  value as character_set_server.

@@collation_database
  The collation used by the default database. The server
  sets this variable whenever the default database changes.
  If there is no default database, the variable has the same
  value as collation_server.
--end cut

--cut: sql-syntax.xml
LOAD DATA INFILE documentation:

The character set indicated by the
@@character_set_database system variable is used to interpret the
information in the file.
--end cut

--cut: stored-procedures.xml
(DECLARE var TYPE description)

  For character data types, if there is a CHARACTER
  SET clause in the declaration, the specified character
  set and its default collation are used. If there is no such
  clause, the database character set and collation that are in
  effect at the time the routine is created are used. (These are
  given by the values of the character_set_database and
  collation_database system variables.
--end cut


Original intent of the implementor
==================================
The original intent was to allow one to reset SESSION-level
*_database variable to temporarily "alter" the default character
set of the current database. 

E.g. to allow one to do:

SET @@collation_database=latin5;
LOAD DATA infile 'foo.txt' ... ;
-- foo.txt is interpreted as if it was in latin5 character set

Or: 

SET @@collation_database=latin1;
CREATE TABLE t1 (a CHAR 10)); -- latin1 is created in latin1
character set, regardless of the character set of the current
(default) database.

What we have now and how it got broken.
=======================================

1. GLOBAL counterparts of
   character_set_database/collation_database are never
   used.

   The only scenario when they can have any effect is when one
   issues:

   SET GLOBAL @@character_set_database=latin5;
   
   
   SELECT @@character_set_database; -- session counterpart takes
   its initial value from the global counterpart if there is no 
   current database selected.

kostja@bodhi:~/work/mysql-5.0-runtime/sql> mysql -uroot -A -s
mysql> SET GLOBAL collation_database=latin5_turkish_ci;

kostja@bodhi:~/work/mysql-5.0-runtime/sql> mysql -uroot -A -s
mysql> select @@collation_database;
@@collation_database
latin5_turkish_ci

   This can have no effect on any SQL statement or data
   inserted/modified in the database. They were never considered
   as of any practical use and apparently were added by mistake.

2. Local variables.

a) Before BUG#2326 "Charset of table is determined by charset of
db only if "USE db;" was fixed, collation_database would have
effect on all created tables (regardless of what database they are
created in).

After it was fixed, it would have no effect on CREATE TABLE
statement (presumably a bug in the fix for the bug, as it
contradicts with the intent. But this fact plays out nice with our
intent to deprecate setting of these variables).

Example (current 5.0):

mysql> drop database db1;
Query OK, 1 row affected (0.00 sec)
mysql> drop database db2;
Query OK, 1 row affected (0.00 sec)
mysql>  create database db1 character set utf8;
Query OK, 1 row affected (0.01 sec)
mysql> create database db2 character set latin5;
Query OK, 1 row affected (0.00 sec)
mysql> use db1;
Database changed
mysql> set @@character_set_database=latin1;
Query OK, 0 rows affected (0.00 sec)
mysql>  create table t1 (a char(10));
Query OK, 0 rows affected (0.02 sec)
mysql> create table db2.t1 (a char(10));
Query OK, 0 rows affected (0.02 sec)
mysql> show create table db1.t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `a` char(10) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show create table db2.t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `a` char(10) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin5
1 row in set (0.01 sec)

Version when the bug was fixed: 5.0.0

b) Before BUG#15126 character_set_database is not replicated
   (LOAD DATA INFILE need it) was fixed, setting of a local
   variable would break replication, since it was never
   replicated.

   After this bug got fixed, it has also become possible to
   explicitly specify the file character set in the LOAD DATA
   syntax.

Version: 5.0.36.

c) The manual is wrong saying that if there is no current
database, the setting of @@character_set_database is taken from
global variable @@character_set_server.

Example:

mysql> set global character_set_server=latin5;
Query OK, 0 rows affected (0.00 sec)

mysql> create database foo;
Query OK, 1 row affected (0.00 sec)
mysql> use foo;
Database changed
mysql> drop database foo;
Query OK, 0 rows affected (0.00 sec)
mysql> select @@character_set_database;
+--------------------------+
| @@character_set_database |
+--------------------------+
| latin1                   | 
+--------------------------+

This is reported as BUG#27208 "If no current database,
character_set_database !=character_set_server"


To sum up, the only effect that setting of a local variable can
currently have is on LOAD DATA INFILE statement, when it
influences the alleged character set of the file.

Setting of these variables has no effect on stored programs
(triggers, functions, procedures), base tables or views -- all
these objects use the character set and collation of the database
they belong to.