WL#9532: InnoDB_New_DD: Move data dictionary tables to a single tablespace
Affects: Server-8.0 — Status: Complete — Priority: Medium
Currently, data dictionary tables are created and stored in different InnoDB tablespaces and .ibd files. The design is to have the data dictionary storage engine (InnoDB) to create a data dictionary tablespace, which is used to store all the data dictionary tables. In this worklog, InnoDB will hard-code the data dictionary tablespace like: CREATE TABLESPACE mysql ADD DATAFILE 'mysql.ibd'; With this tablespace defined, all data dictionary tables would be created there. And WL#6391 can add new non-hardcoded data dictionary tables as follows: CREATE TABLESPACE mysql ADD DATAFILE 'mysql.ibd'; CREATE DATABASE mysql; CREATE TABLE mysql.triggers … TABLESPACE=mysql; With this change, all undo tablespace metadata is stored in this tablespace too. Because undo logs can only be accessed after undo tablespaces has been located, on startup, InnoDB should be able to scan the data dictionary tablespace to get all the undo tablespace metadata and locate all the undo tablespaces. Tablespace objects for undo tablespaces and temporary tablespace should be created in the data dictionary tablespace too, so that they are visible by a SELECT from data dictionary tables.
FR-1: The data dictionary tablespace will be created or must exist in the datadir, as mysql.ibd. This name cannot be overridden. So does the innodb_system tablespace. On startup, InnoDB will need to know the locations of 3 sets of files: 1.1 The InnoDB system tablespace 1.2 The data dictionary tablespace 1.3 The InnoDB redo log (if missing, the 2 tablespaces are assumed to be clean) All other InnoDB user tablespaces will be located based on metadata stored in the data dictionary tablespace. FR-2: The metadata for DD tables should be adjusted correctly in get_se_private_data(). FR-3: DDTableBuffer should be moved to the data dictionary tablespace too, and now its name is mysql.innodb_dynamic_metadata. FR-4: Upgrade has to be changed accordingly, but this is not in this worklog's scope. FR-5: Undo tablespaces and temporary tablespace now can be found in mysql.tablespaces and got by a SELECT on it. FR-6: sys.sys_config will be moved to the data dictionary tablespace too.
1. Data dictionary tablespace name is hard-coded as 'mysql' and its file name is hard-coded as 'mysql.ibd'. The internal tablespace id of it is also hard-coded as 0xFFFFFFFE while the global tablespace id is 1. 2. All data dictionary tables, including core and non-core ones shall reside in the data dictionary tablespace. This requires that SQLs in scripts/mysql_system_tables.sql should be adjusted accordingly, so that non-core DD tables would be created in the DD tablespace. 3. InnoDB system tablespace name is hard-coded as 'innodb_system' and no changes to its file name and internal tablespace id. The global tablespace id is 2. 4. InnoDB temporary tablespace name is hard-coded as 'innodb_temporary' and its internal tablespace id is hardcoded as 0xFFFFFFFD while its global tablespace id is 3. 5. Undo tablespaces tablespace name is defined as 'innodb_undo*'. Entries of them would be created during initializing database into mysql.tablespaces. 6. DDTableBuffer would be created like innodb_table(index)_stats tables by SQL in data dictionary tablespace. Its name is now mysql.innodb_dynamic_metadata. 6.1 With this change, table row format would become dynamic, and it should not be created in innodb_system tablespace, so some adjustments are necessary for accessing this table. 6.2 With new DD, after redo logs applied, some DD tables have to be opened first before mysql.innodb_dynamic_metadata is opened, which is managed by Server. So we could not apply changed dynamic metadata to these DD tables, also some user tables because mysql.innodb_dynamic_metadata has not been opened. InnoDB will only open this table in innobase_dict_recover. To fix this, after we can open the mysql.innodb_dynamic_metadata, we have to iterate over opening tables, and check if there are corresponding entries for them in mysql.innodb_dynamic_metadata, if so, apply them now. 7. During recovery, all tablespaces, including undo tablespaces and user tablespaces can be scanned after retrieving all of them via dd::cache::Dictionary_client::fetch_global_components(tablespaces). The max tablespace id can be calculated then. With this change, old scanning of SYS_TABLES and SYS_TABLESPACES should be removed. 8. Since all DD tables are now stored in one tablespace, the root page of their indexes would change accordingly. Thus these numbers of root pages in ha_innobase::get_se_private_data() should be adjusted.
Regarding to the point 8 in HLS, although DD tables and indexes are created sequentially, which means tables are created before data are inserted, there are still gaps between root pages. Let's take innodb-page-size=4K for example to understand how root pages are allocated. a. In InnoDB, every index will reserve two segments in the tablespace, one for non-leaf pages allocation, one for leaf pages allocation. This is done during the index is created. b. Inode is used to represent a segment, and Inodes are allocated/stored in the Inode pages too. c. The key point here is page of different sizes consist of different Inode entries. In this example, a 4K page can only have 7 Inode entries. d. In DD tablespace, first page is header page and second is for ibuf bitmap. To allocate a new segment, an Inode page with 7 Inode entries would be created. So the third page(page no=2) is created as Inode page. Root pages would be allocated from page no=3. e. Supposing that 7 Inode entries are (0, 1, 2, 3, 4, 5, 6), when new indexes are created, (0, 1) are assigned to first index whose root page no is 3, and (2, 3) are for next whose root page no is 4, (4, 5) are mapped to root page no 5. (6) is not enough for next index whose root page no is 6(allocated already). So a new Inode page has to be allocated, whose page no is 7. And its first Inode together with above (6) are allocated to the index. Since page no 7 is already assigned, next root page no can only start from 8. Thus why there are gaps between the root pages of DD table indexes. f. Also, the gaps appear with some rules, every 5 pages then 4 pages, then 5, and so on. This is because of 7 Inode entries in one Inode page. For other page sizes, it should be the same principle, just the gap should be bigger than the ones with 4K page size.
Copyright (c) 2000, 2018, Oracle Corporation and/or its affiliates. All rights reserved.