WL#6394: Bootstrap code for new DD

Affects: Server-8.0   —   Status: Complete

1. Introduction
===============

Bootstrapping is done when starting the mysqld server process. In the context of
this worklog, we focus on aspects related to the new data dictionary (DD). There
are four basic aspects of this:

i)   Creating the data dictionary while starting a new server instance.
ii)  (Re)starting a server with existing data.
iii) Upgrading from a previous version. 
iv)  Upgrading from 5.7 to 8.0, i.e., start using the new DD on already
     existing user data.

2. Related work
===============

- WL#6392: Upgrade and downgrade tool for new data dictionary.

3. Definitions
==============

DD: Data dictionary.

SE: Storage engine.

DDSE: Storage engine in which the DD tables are created.

Unconstrained table definition: A table definition without foreign key constraints.

Constrained table definition: A table definition with foreign key constraints.

System tables: The tables that are created by the MySQL server when the server
is initialized. E.g., the dictionary tables that store table definitions, the
help tables, time zone information, etc.

Dictionary (DD) tables: The system tables that are used for storing meta data
about database items. E.g., the table 'mysql.tables' is a dictionary table,
while the help tables are not.

DDSE tables: System tables that are created by the server on behalf of the
dictionary storage engine. These behave in the same way as the core tables (see
below), and are hard coded based on DD version, but their hard coding is in the
DDSE rather than in the SQL layer.

Table definition tables: The DD tables that are used for (re)storing an
arbitrary table definition. These tables are opened when a table definition is
to be read from, or written to, persistent storage.

Core tables: The table definition tables that are needed to store the
unconstrained definitions of the table definition tables.

Core table parents: System tables that are needed to enable creating the
constrained core tables, i.e., the tables which are the parents of foreign keys
of a core table.

Inert tables: Tables that have an inert, unalterable definition which can always
be assumed to be valid. Currently, this is only the 'version' table.

DD schema: The collection of all DD table definitions.

DD version: A unique version number associated with the dictionary schema. When
there is a change in the DD schema, the version number is increased. 

Supported DD version: A (set of) DD version(s) from which it is possible to upgrade.

Actual DD table definition: The meta data of a DD table that is actually stored
persistently within the DD tables.

Actual DD schema: The collection of all actual DD table definitions.

Actual DD version: The version of the actual DD schema. The actual DD version is
stored persistently in a dedicated DD table 'dd.version'.

Target DD table definition: The DD table definition which is created when a
MySQL server starts the first time. The actual- and target DD table definitions
are the same unless there is a dictionary upgrade. 

Target DD schema: The collection of all target DD table definitions.

Target DD version: The version of the target DD schema. The target DD version is
represented in the server source code. If the server restarts, and the actual DD
version (in the dd.version table) equals the target DD version, we know that
dictionary upgrade is not needed.

Hardcoded DD table definition: The DD table definition represented by a
hardcoded internal data structure. 

Synthesized DDL statement: An SQL DDL statement that is generated based on the
hardcoded definition. 

Synthesized TABLE_SHARE: A TABLE_SHARE structure that is generated based on the
internal definition. 

3.1 Table categorization
------------------------

With the definitions above, the following categorization may be applied:

INERT_TABLE:      Needed to start initialization. Immutable.
CORE_TABLE:       Needed to read/store unconstrained table definitions.
TABLE_DEF_TABLE:  Needed to read/store table definitions.
DDSE_TABLE:       Needed by DDSE.
DICTIONARY_TABLE: Needed to read/store database item meta data.
SYSTEM_TABLE:     Needed to read/store system data.

With the current (8.0.0) predefined tables, we may apply the categorization as
follows:

Inert tables
............
Version

Core tables
...........
Schemata
Tables
Columns
Column_type_elements
Indexes
Index_column_usage
Tablespaces
Tablespace_files

Table definition tables
.......................
Foreign_keys
Foreign_key_column_usage
Table_partitions
Table_partition_values
Index_partitions
View_table_usage

Dictionary tables
.................
Character_sets
Collations
Catalogs

DDSE_tables
...........
innodb_table_stats
innodb_index_stats
innodb_table_metadata

System tables
.............
...


4. Short term goals
===================

The short term goals of this WL are:

4.1 Create new DD tables
------------------------

The MySQL server should be able to create new DD tables during the bootstrap
procedure. We will do this internally in the server, not by means of executing
SQL scripts as has been done previously.

We suggest to use hard coded structures from which SQL DDL statements can be
syntesized, and then execute the DDL statements for two purposes:

i)  Create the DD tables in the DDSE.
ii) Generate the meta data representing the DD tables.

The DDL execution must be modified to avoid storing the generated meta data into
the DD tables (all of which are not yet created), and instead store the meta
data temporarily in memory until table creation is finished. Then, eventually,
the generated meta data can be stored persistently in the DD tables, when they
are all created in the DDSE.

Additionally, some foreign key constraints must be collected and added after the
tables are created due to circular foreign key relationships. 

4.2 Upgrade to a new DD version
-------------------------------

Upgrade to a new DD version must be supported. This means that it must be
possible to open the existing DD tables, provided their DD version is supported,
and alter them to match the target DD. Thus, we must be able to obtain the
following information for each of the DD tables:

i)   The target table definition.
ii)  For each supported version: 
     a) The corresponding table definition.
     b) The necessary changes to match the target schema.

Upgrade from the supported version to the target version may be done by
assembling and executing SQL statements. It may be possible to deduce these
statements based on the difference between i) and iia), so we may not need to
hard code all of the information listed above. However, this decision is more
of an implementation issue.

Downgrade is not supported. If it should be supported, this may be done within
the same framework as described above. However, downgrade is not only a
dictionary related procedure, so even if downgrade may be doable from a
dictionary point of view, there may be other obstacles preventing it from being
supported.

4.3 Read serialized meta data, execute SQL statements
-----------------------------------------------------

This is necessary in order to support upgrade from 5.7 to 8.0. A dedicated tool
will generate serialized dictionary information, which will be read by
the starting 8.0 server, which will: 

- Create the DD tables and initialize the MySQL server.
- De-serialize the serialized meta data, and store it in the DD tables.

Additionally, the upgrade tool will generate SQL statements necessary for e.g.
dropping the old and abandoned dictionary tables. These statements must be read
and executed by the server.