WL#9532: InnoDB_New_DD: Move data dictionary tables to a single tablespace

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Currently, data dictionary tables are created and stored in different InnoDB
tablespaces and .ibd files. The design is to have the data dictionary
storage engine (InnoDB) to create a data dictionary tablespace, which is used to
store all the data dictionary tables. 

In this worklog, InnoDB will hard-code the data dictionary tablespace like:


With this tablespace defined, all data dictionary tables would be created there.
And WL#6391 can add new non-hardcoded data dictionary tables as follows:

CREATE TABLE mysql.triggers … TABLESPACE=mysql;

With this change, all undo tablespace metadata is stored in this tablespace too.
Because undo logs can only be accessed after undo tablespaces has been located,
on startup, InnoDB should be able to scan the data dictionary tablespace to
get all the undo tablespace metadata and locate all the undo tablespaces.

Tablespace objects for undo tablespaces and temporary tablespace should be
created in the data dictionary tablespace too, so that they are visible by
a SELECT from data dictionary tables.
FR-1: The data dictionary tablespace will be created or must exist in the 
datadir, as mysql.ibd. This name cannot be overridden. So does the innodb_system 

On startup, InnoDB will need to know the locations of 3 sets of files:

1.1 The InnoDB system tablespace
1.2 The data dictionary tablespace
1.3 The InnoDB redo log (if missing, the 2 tablespaces are assumed to be clean)

All other InnoDB user tablespaces will be located based on metadata stored in the 
data dictionary tablespace.

FR-2: The metadata for DD tables should be adjusted correctly in 

FR-3: DDTableBuffer should be moved to the data dictionary tablespace too, and 
now its name is mysql.innodb_dynamic_metadata.

FR-4: Upgrade has to be changed accordingly, but this is not in this worklog's 

FR-5: Undo tablespaces and temporary tablespace now can be found in 
mysql.tablespaces and got by a SELECT on it.

FR-6: sys.sys_config will be moved to the data dictionary tablespace too.
1. Data dictionary tablespace name is hard-coded as 'mysql' and its file name
is hard-coded as 'mysql.ibd'. The internal tablespace id of it is also
hard-coded as 0xFFFFFFFE while the global tablespace id is 1.

2. All data dictionary tables, including core and non-core ones shall reside in
the data dictionary tablespace. This requires that SQLs in
scripts/mysql_system_tables.sql should be adjusted accordingly, so that non-core
DD tables would be created in the DD tablespace.

3. InnoDB system tablespace name is hard-coded as 'innodb_system' and no changes 
to its file name and internal tablespace id. The global tablespace id is 2.

4. InnoDB temporary tablespace name is hard-coded as 'innodb_temporary' and its
internal tablespace id is hardcoded as 0xFFFFFFFD while its global tablespace id
is 3.

5. Undo tablespaces tablespace name is defined as 'innodb_undo*'. Entries of
them would be created during initializing database into mysql.tablespaces.

6. DDTableBuffer would be created like innodb_table(index)_stats tables by SQL
in data dictionary tablespace. Its name is now mysql.innodb_dynamic_metadata.
6.1 With this change, table row format would become dynamic, and it should not
be created in innodb_system tablespace, so some adjustments are necessary for
accessing this table.
6.2 With new DD, after redo logs applied, some DD tables have to be opened first
before mysql.innodb_dynamic_metadata is opened, which is managed by Server.
So we could not apply changed dynamic metadata to these DD tables, also some
user tables because mysql.innodb_dynamic_metadata has not been opened. InnoDB
will only open this table in innobase_dict_recover. To fix this, after we can
open the mysql.innodb_dynamic_metadata, we have to iterate over opening tables,
and check if there are corresponding entries for them in
mysql.innodb_dynamic_metadata, if so, apply them now.

7. During recovery, all tablespaces, including undo tablespaces and user
tablespaces can be scanned after retrieving all of them via
dd::cache::Dictionary_client::fetch_global_components(tablespaces). The max
tablespace id can be calculated then. With this change, old scanning of
SYS_TABLES and SYS_TABLESPACES should be removed.

8. Since all DD tables are now stored in one tablespace, the root page of their
indexes would change accordingly. Thus these numbers of root pages in
ha_innobase::get_se_private_data() should be adjusted.
Regarding to the point 8 in HLS, although DD tables and indexes are created
sequentially, which means tables are created before data are inserted, there
are still gaps between root pages.

Let's take innodb-page-size=4K for example to understand how root pages are

a. In InnoDB, every index will reserve two segments in the tablespace, one for
non-leaf pages allocation, one for leaf pages allocation. This is done during
the index is created.
b. Inode is used to represent a segment, and Inodes are allocated/stored in
the Inode pages too.
c. The key point here is page of different sizes consist of different Inode
entries. In this example, a 4K page can only have 7 Inode entries.
d. In DD tablespace, first page is header page and second is for ibuf bitmap.
To allocate a new segment, an Inode page with 7 Inode entries would be created.
So the third page(page no=2) is created as Inode page. Root pages would be
allocated from page no=3.
e. Supposing that 7 Inode entries are (0, 1, 2, 3, 4, 5, 6), when new indexes
are created, (0, 1) are assigned to first index whose root page no is 3, and
(2, 3) are for next whose root page no is 4, (4, 5) are mapped to root page no
5. (6) is not enough for next index whose root page no is 6(allocated already).
So a new Inode page has to be allocated, whose page no is 7. And its first
Inode together with above (6) are allocated to the index. Since page no 7 is
already assigned, next root page no can only start from 8. Thus why there
are gaps between the root pages of DD table indexes.
f. Also, the gaps appear with some rules, every 5 pages then 4 pages, then 5,
and so on. This is because of 7 Inode entries in one Inode page.

For other page sizes, it should be the same principle, just the gap should be
bigger than the ones with 4K page size.