WL#2073: Schemas in ANSI style

Affects: Server-7.1   —   Status: Un-Assigned

Given that we will map databases to schemas 
(as in WL#2071 Database = Schema), our next 
step is to support all ANSI/ISO "schema" 
statements and capabilities. 
 
Most of the description of this task was 
salvaged from WL#942 progress reports. 
Originally WL#942 was "Database = schema", 
but as it slowly became a "Catalogs" task 
the original requirement was destroyed. 
Despite my efforts to recover all the 
original, the text is now quite a mess.  
 
A MySQL "database" is analogous to a       
standard-SQL "schema". Given that       
decision, we can do some mapping.       
       
There will be a full hierarchy       
"catalog.schema.table" but it is       
equivalent to       
"catalog.database.table" and not       
"database.schema.table".       
       
The catalog level does not need       
to be implemented immediately:       
catalogs are a non-core feature,       
not required for the NIST tests,       
and not used in all DBMSs. But 
there is a separate worklog task 
for catalogs now, WL#942.     
       
When the catalog level is implemented,        
this means that the OS directory       
structure must change. The top level        
is catalog, within that are the schema       
directories. We did not discuss whether        
InnoDB would need any change.       
       
The catalog directory does not have to "include"       
the schema directories. As long as        
there is a way to "point to" the       
schema directories from the catalog        
directory, and vice versa, the DBMS       
can know what the hierarchy is.        
This might make a transition easier.       
       
Connector/J and MyODBC shall both       
return "schema" instead of "catalog"        
for the functions that ask what the       
equivalent name is for "database".        
       
These statements are equivalent:        
       
USE DATABASE = SET SCHEMA        
       
DROP DATABASE = DROP SCHEMA        
       
CREATE DATABASE = CREATE SCHEMA       
   
Monty has also suggested that we could   
add support for these statements,   
presumably to make transition easier:   
   
 USE CATALOG catalog_name;   
 USE SCHEMA|DATABASE schema_name;   
 USE catalog_name.schema_name;   
 
The following details have not been       
agreed upon, but seem uncontroversial.       
       
There must be a "default catalog".       
In the present structure (which has       
no real 3-level hierarchy) the default       
catalog is always "SYSTEM", or "PUBLIC",       
or whatever.       
       
Every schema has an "owner". For us,       
this "owner" is the user who said        
"CREATE SCHEMA" or "CREATE DATABASE".       
For always-present databases like mysql       
or test, there is an arbitrary owner,       
which I'll again call "SYSTEM".       
In standard SQL it is illegal for       
any user to create a table in a       
schema that he/she does not own --       
but we cannot follow standard SQL       
in that respect. Also, the "owner"        
of a schema has ALL PRIVILEGES on       
all tables in the schema automatically,       
even on the tables he/she did not       
create. (This may be undesirable,  
so it is a detail that's subject  
to change.).       
       
There is a schema INFORMATION_SCHEMA.       
It is virtual, that is, there is no        
directory structure of this name.       
The owner of INFORMATION_SCHEMA is        
a user named _SYSTEM (notice that       
this name is illegal in standard        
SQL, so that one cannot connect using       
this name). The description of 
INFORMATION_SCHEMA is in WL#173. 
       
A table is always within a schema.        
 
A character set or a collation is        
always within INFORMATION_SCHEMA.       
However, these objects are always        
the same in MySQL, so there is       
never a need to qualify them with        
INFORMATION_SCHEMA.latin1_general_ci       
or whatever. That should be legal,        
but isn't necessary.       
       
A user is NOT within a schema. One       
does not qualify a user name. In        
standard SQL, a user is not even       
within a catalog. It is a system-wide        
object. 
       
A privilege IS within a schema. The       
privilege is subsidiary to the object        
and not subsidiary to the user. Since    
we already store privilege information    
in the "user" database (i.e. the "user"    
SCHEMA), this might be okay.       
       
A procedure is always within a       
schema. In standard SQL, a built-in        
procedure (such as SUBSTRING) is in       
INFORMATION_SCHEMA, but I understand        
that we will not try to implement       
such a fiction.        
       
We have discussed the concept of a        
"public schema" or "public database".       
It could be named mysql.        
My preference would be a new one,       
named PUBLIC, which anyone can read,        
but only a limited number of users       
can write. Primarily we want        
this schema to store common-to-all       
procedures in.        
       
On startup, the public schema is the        
CURRENT schema. The CURRENT schema is       
usually not the same as the DEFAULT        
schema. The mechanism for changing       
the CURRENT schema is not yet known.        
       
The notation .table shall mean        
" . table".       
In other words, an empty space before        
a dot shall be interpreted as the       
current schema name.        
       
I suggested that we do not need a        
public or current schema, that we       
should use the standard-SQL PATH statement,       
but that motion did not pass.       
       
So a basic installation looks like this:       
       
----------- --------       
- catalog - - user -       
----------- --------       
|       
|________________________________       
|                 |             |       
----------------  -----------  -------------------       
- information_ -  - public  -  - schemas created -       
- schema       -  -         -  - by users       -       
----------------  -----------  -------------------       
|                 |            |       
----------------- ------------ -------------------       
- metadata view - - common   - - tables created  -       
- or            - - routines - - by users        -       
- _system       - ------------ -------------------       
- object        -       
-----------------        
       
Remember that everything in information_schema        
is virtual, and we don't need a public schema       
until 5.0, so the above diagram doesn't really        
represent a drastic set of changes.       
       
Eventually we want to claim compliance      
with SQL standard feature F031      
"Basic schema manipulation".      
      
We would also like to support SQL standard feature   
F311 "Schema definition statement" (core)   
   We have to support "CREATE SCHEMA    
   + authorizations + object-creation". For example:   
   CREATE SCHEMA S AUTHORIZATION PETER;   
   CREATE SCHEMA S CREATE TABLE T (S1 INT);   
   CREATE SCHEMA S   
   ... CREATE TABLE T ...   
   ... CREATE VIEW V ...   
   ... GRANT ...   
   ;   
The "AUTHORIZATION user-name" clause has no analogue   
in current MySQL, it should become the owner.   
The "CHARACTER SET character-set-name" clause   
(not illustrated) has an analogue in current   
MYSQL, it should become the database character set.   
 
The "CHARACTER SET character-set-name" clause    
(not illustrated) has an analogue in current    
MYSQL, it should become the database character set.  
 
These statements are equivalent:         
        
USE DATABASE = SET SCHEMA         
        
DROP DATABASE = DROP SCHEMA         
        
CREATE DATABASE = CREATE SCHEMA        
     
The full syntax for CREATE SCHEMA is       
in SQL-99 Complete, Really (pages 343ff).        
The only things I should add are:       
       
-- NIST assumes that CREATE SCHEMA implies        
SET SCHEMA, that is, CREATE DATABASE       
implies USE DATABASE. I don't see why.   
It's not certain that we'll allow this.       
Monty has also suggested that we could    
add support for these statements,    
presumably to make transition easier:    
    
 USE CATALOG catalog_name;    
 USE SCHEMA|DATABASE schema_name;    
 USE catalog_name.schema_name;    
    
    
The following details have not been        
agreed upon, but seem uncontroversial.        
         
-- SQL:2003 allows SET COLLATION       
.. But we'll worry about that later.   
       
Additional note, we should look at following the complete spec for CREATE 
SCHEMA and allow all create tables that occur as a part of it to not test for 
foeign key attributes until the schema is loaded (aka this allows two tables 
that rely on each other to be loaded). 
 
This task is a requirement for SAP DB compatibility,  
Milestone2 "M2 Schemas". 
 
See also: 
WL#2071 "Database = Schema" 
 
Write specification for implementation of schemas in MySQL.   
Note: Done, Worklog entry #942.   
Add CREATE SCHEMA and DROP SCHEMA, as in SQL-99.   
Add the framework for an implicit schema, for cases where no CREATE SCHEMA  
statement has been executed.   
Map the AUTHORIZATION clause of the CREATE SCHEMA statement to the owner of all  
tables created for the schema. This will fulfill the requirement that SAP DB  
tables have owners.   
Note that this task is related to that of establishing standard security  
handling; MySQL must have the ability to create users and user groups before  
CREATE SCHEMA can define an owner.   
Modify the CREATE TABLE statement to allow the owner to be explicitly defined,  
i.e. CREATE TABLE . and CREATE TABLE TEMP..   
Add a user (i.e. an owner) called TEMP, to specify a temporary table.   
Modify the CREATE TEMPORARY TABLE statement to support the creation of a  
temporary table with the syntax CREATE TABLE TEMP..   
For each user, add a silent schema with the same name as the user name. SAP  
DB's qualification level is users/owners, while MySQL's is schemas. By making  
sure there is a one-to-one relationship between a schema and a user you can  
ensure that there will be no conflict.   

Feature requests:
BUG#5754 Please: SCHEMA & GROUP support