WL#13895: InnoDB: SET TABLESPACE AUTOEXTEND SIZE

Affects: Server-8.0   —   Status: Complete

Tablespace growth size is hardcoded inside InnoDB to 1MB (page_size * number of 
pages in an extent). This 1MB limit may have been sufficient on HW from 20 years 
ago. With today's loads, this becomes a bottleneck, especially where we want to 
do high ingestion of data.

In WL#13782 the idea was to use fallocate() on Linux to increase tablespace 
space. This is very fast but has the side effect of fragmenting the file system 
which results in a lot of random IO and more importantly, dropping a tablespace 
becomes very expensive. The time goes up from sub-second to 25 seconds for a 15GB 
tablespace. For very large tablespaces this is unacceptable.

The proposal is to make the auto-extend a tablespace attribute that can be 
updated by the following extension to the existing DDLs to manipulate both the 
implicit or explicit tablespaces:

1) For explicit/general tablespaces
CREATE TABLESPACE T [AUTOEXTEND_SIZE [=] ];
ALTER TABLESPACE T [AUTOEXTEND_SIZE [=] ];

2) For implicit or file_per_table tablespaces, CREATE/ALTER TABLE statements will 
be modified to support the AUTOEXTEND_SIZE clause:
CREATE TABLE T [AUTOEXTEND_SIZE [=] ];
ALTER TABLE T [AUTOEXTEND_SIZE [=] ];
Functional requirements:
1. FR-1: Provide AUTOEXTEND_SIZE clause to the following DDL statements to
manipulate explicit or general tablespaces:
    a) CREATE TABLESPACE T [AUTOEXTEND_SIZE [=] ]
    b) ALTER TABLESPACE T [AUTOEXTEND_SIZE [=] ]

2. FR-2: Provide AUTOEXTEND_SIZE clause to the following DDL statements to
manipulate implicit or file_per_table tablespaces:
    a) CREATE TABLE T [AUTOEXTEND_SIZE [=] ]
    b) ALTER TABLE T [AUTOEXTEND_SIZE [=] ]

3. FR-3: Return error to the user if ALTER TABLESPACE is used to modify
autoextend_size value for an implicit or file_per_table tablespace.

4. FR-4: Increment the size of the tablespace by the size mentioned with the
AUTOEXTEND_SIZE clause. If the AUTOEXTEND_SIZE is set to 0, use the existing
algorithm for allocating space in a tablespace.

5. FR-5: Return an error if the AUTOEXTEND_SIZE is not a multiple of the
minimum AUTOEXTEND_SIZE.

6. FR-6: INFORMATION_SCHEMA.INNODB_TABLESPACES should display the
autoextend_size and max_size values for the tablespaces.

7. FR-7: SHOW CREATE TABLE should display the AUTOEXTEND_SIZE clause if its
value is greater than 0. Please note that a 0 value of AUTOEXTEND_SIZE means
that the allocation will happen as per the current legacy allocation algorithm.

8. FR-8: CREATE [TABLE | TABLESPACE] AUTOEXTEND_SIZE [=]  should create 
the tablespace file of size  if the size is greater than 0.

9. FR-9: If the AUTOEXTEND_SIZE value for a tablespace is altered using
ALTER [TABLE | TABLESPACE] AUTOEXTEND_SIZE [=] , then the first
allocation that takes place after the ALTER should extend the tablespace
size to make it a multiple of the new AUTOEXTEND_SIZE value. Any subsequent 
allocations to the tablespace will extend the tablespace by AUTOEXTEND_SIZE 
value.

Non-functional requirements:
1. NFR-1: DROP TABLE should be faster while dropping large tables.
Introduction
============
The current code used posix_fallocate() to extend the file by allocating new
pages at the end of the file. However, at present, the size by which space is
extended is very small and often in multiple of few pages. This results in a 
fragmented file system resulting in an increased amount of random IO.

This worklog aims towards addressing the fragmentation issue by extending the
tablespace file with a larger chunk. To achieve this, the following new 
attribute will be added to the tablespaces:

AUTOEXTEND_SIZE: This attribute will define the size by which the tablespace
will be extended every time the server runs out of space in that tablespace.
The current legacy algorithm for extending a tablespace is to extend the
tablespace by 4 extents at a time after the tablespace is larger
than 32 extents. The value for this attribute will be a multiple of 4 extents.
Please note that the extent size depends on the page size as given in the
table below. This will ensure that the allocation size is at least equal
to the current largest allocation size:

Page size      Extent size (in pages) Min Autoextend_size
---------------------------------------------------------
4KB            256                    4MB
8KB            128                    4MB
16KB           64                     4MB
32KB           64                     8MB
64KB           64                     16MB

The maximum allowed value for AUTOEXTEND_SIZE will be 64MB. If the value of
this attribute is set to 0, then the server will fall back to the existing 
allocation algorithm.

InnoDB supports the following types of tablespaces:

1) Explicit or general tablespaces
These tablespaces are user-defined tablespaces and can be shared by multiple
tables. These tablespaces are created by the "CREATE TABLESPACE" DDL statement.
The user needs to specify the tablespace while creating the table if the table
needs to be created in an explicit tablespace. By default, InnoDB creates all
the tables with implicit or file-per-table tablespaces (described below).

2) Implicit or file-per-table tablespaces
As the name suggests, these tablespaces are automatically created while
creating a table. These tablespaces are not shared across tables. By default,
InnoDB creates all the tables with their own implicit tablespace.

This WL will support the following syntax to allow the user to specify the
AUTOEXTEND_SIZE attribute for a tablespace:

1) DDL to create an explicit or general tablespace with AUTOEXTEND_SIZE 
attribute:
CREATE TABLESPACE T [AUTOEXTEND_SIZE [=] ]

2) DDL to update the AUTOEXTEND_SIZE attribute for a general tablespace:
ALTER TABLESPACE T [AUTOEXTEND_SIZE [=] ]

3) DDL to create a table with file-per-table tablespace with AUTOEXTEND_SIZE
attribute:
CREATE TABLE T [AUTOEXTEND_SIZE [=] ]

4) DDL to update the AUTOEXTEND_SIZE attribute to a file-per-table tablespace:
ALTER TABLE T [AUTOEXTEND_SIZE [=] ]

Book-keeping
============
The autoextend_size will be stored in the data dictionary as a tablespace
attribute.

Effects of AUTOEXTEND_SIZE attribute
====================================
If the value of AUTOEXTEND_SIZE is greater than 0, then the server will extend 
the tablespace by this much size every time the space allocation happens. If
the value of autoextend_size is 0, then the server will fall back to the
existing logic of extending the space.

In the current implementation, the allocation logic allocates new pages to the
tablespace as per the following rule:
1) Add one page at a time if the size of the tablespace is less than an extent
2) Add one extent at a time if the size of the tablespace is greater than 1
extent but smaller than 32 extents
3) Add four extents at a time if the size of the tablespace is more than 32
extents

If the value of AUTOEXTEND_SIZE is set while creating the table or tablespace,
then the corresponding table or tablespace will be created with that many
pages. In the current implementation, a new table or tablespace is always 
created with 7 pages.

If the value of AUTOEXTEND_SIZE is modified using ALTER TABLE, then the
immediate next allocation will make sure that the size of the file is a
multiple of AUTOEXTEND_SIZE. For example, consider the following sequence of
events:

Scenario-1:
-----------
1) CREATE TABLE t1(c1 INT) AUTOEXTEND_SIZE 4M ----> Assuming 4k page size,
this table will be created with 1024 pages
2) A sequence of DML transactions happen on the table and all the 4 extents
(4MB) are consumed
3) ALTER TABLE t1 AUTOEXTEND_SIZE 8M

The allocation that happens immediately after the ALTER TABLE statement will 
allocate only 4M of space to make it a multiple of 8M. All the subsequent
allocations will be then allocated 8M of space.

Scenario-2:
-----------
1) CREATE TABLE t1(c1 INT) ----> This creates table with 7 pages
2) ALTER TABLE t1 AUTOEXTEND_SIZE 4M
3) The sequence of DML transactions. The server runs out of space in the
tablespace and requests for allocating more space

Assuming a 4k page size, the server will allocate 1017 pages (1024 pages to
an extent - 7 pages already allocated) to make the total allocation a
multiple of the AUTOEXTEND_SIZE value.

Scenario-3:
-----------
CREATE TABLE t1(c1 INT, c2 TEXT) AUTOEXTEND_SIZE 4M
  PARTITION BY RANGE (c1) (
    PARTITION p0 VALUES LESS THAN (100),
    PARTITION p1 VALUES LESS THAN maxvalue) 

Since the AUTOEXTEND_SIZE is specified while creating the partitioned table,
the partitions will be created with an initial size of 4M.

Scenario-4:
-----------
1) CREATE TABLE t1(c1 INT, c2 TEXT) AUTOEXTEND_SIZE 4M
2) ALTER TABLE t1 PARTITION BY RANGE (c1) (
     PARTITION p0 VALUES LESS THAN (100),
     PARTITION p1 VALUES LESS THAN maxvalue)

The above statements will create the partitions with an initial size of 4M.

Scenario-5:
-----------
1) CREATE TABLE t1(c1 INT) AUTOEXTEND_SIZE 8M ----> This will create the
tablespace with an initial size of 8M.
2) ALTER TABLE t1 ADD COLUMN c2 TEXT AUTOEXTEND_SIZE 4M

Some operations like The ALTER TABLE ADD COLUMN and TRUNCATE TABLE initialize
a new temporary table and copy data from the original table to this one and 
renames the newly created temporary table to the original table. All such
will initialize the tablespace with the size specified with the AUTOEXTEND_SIZE
attribute. This may result in a situation where the size of the tablespace is
different than the original table.

Error ER_INNODB_AUTOEXTEND_SIZE_OUT_OF_RANGE will be returned to the user if
the AUTOEXTEND_SIZE value being set is greater than 0 and is not between the
size of 4 extents and 64M.

Error ER_INNODB_INVALID_AUTOEXTEND_SIZE_VALUE will be returned to the user
if the AUTOEXTEND_SIZE value being set is greater than 0 and is not a
multiple of the size of 4 extents.

New error messages
==================
1) ER_INNODB_INVALID_AUTOEXTEND_SIZE_VALUE
   This error will be returned to the user if the AUTOEXTEND_SIZE attribute is 
not a multiple of the size of 4 extents. The error message will also mention the 
nearest valid value to help the user.
2) ER_INNODB_AUTOEXTEND_SIZE_OUT_OF_RANGE
   This error will be returned to the user if the AUTOEXTEND_SIZE is greater
than 0 and not between the size of 4 extents and 64M.
3) ER_CANNOT_USE_AUTOEXTEND_SIZE_CLAUSE
   This error will be returned if the user attempts to set the AUTOEXTEND_SIZE 
attribute to temporary tables.
4) ER_INNODB_INCOMPATIBLE_WITH_TABLESPACE
   This error will be returned if the user attempts to set the AUTOEXTEND_SIZE
attribute to a general tablespace using a CREATE TABLE or ALTER TABLE statement.

Interface changes
=================
1) The information_schema.innodb_tablespaces will be modified to display
AUTOEXTEND_SIZE value for tablespaces.
2) SHOW CREATE TABLE SQL will be modified to display the AUTOEXTEND_SIZE
clause if its value is greater than 0. This clause will not be displayed if
its value is set to 0.