Online DDL improves several aspects of MySQL operation:
Applications that access the table are more responsive because queries and DML operations on the table can proceed while the DDL operation is in progress. Reduced locking and waiting for MySQL server resources leads to greater scalability, even for operations that are not involved in the DDL operation.
Instant operations only modify metadata in the data dictionary. An exclusive metadata lock on the table may be taken briefly during the execution phase of the operation. Table data is unaffected, making operations instantaneous. Concurrent DML is permitted.
Online operations avoid the disk I/O and CPU cycles associated with the table-copy method, which minimizes overall load on the database. Minimizing load helps maintain good performance and high throughput during the DDL operation.
Online operations read less data into the buffer pool than table-copy operations, which reduces purging of frequently accessed data from memory. Purging of frequently accessed data can cause a temporary performance dip after a DDL operation.
By default, MySQL uses as little locking as possible during a DDL operation. The LOCK clause can be specified for in-place operations and some copy operations to enforce more restrictive locking, if required. If the LOCK clause specifies a less restrictive level of locking than is permitted for a particular DDL operation, the statement fails with an error. LOCK clauses are described below, in order of least to most restrictive:
LOCK=NONE: Permits concurrent queries and DML.
For example, use this clause for tables involving customer signups or purchases, to avoid making the tables unavailable during lengthy DDL operations.
LOCK=SHARED: Permits concurrent queries but blocks DML.
For example, use this clause on data warehouse tables, where you can delay data load operations until the DDL operation is finished, but queries cannot be delayed for long periods.
LOCK=DEFAULT: Permits as much concurrency as possible (concurrent queries, DML, or both). Omitting the LOCK clause is the same as specifying LOCK=DEFAULT.
Use this clause when you do not expect the default locking level of the DDL statement to cause any availability problems for the table.
LOCK=EXCLUSIVE: Blocks concurrent queries and DML.
Use this clause if the primary concern is finishing the DDL operation in the shortest amount of time possible, and concurrent query and DML access is not necessary. You might also use this clause if the server is supposed to be idle, to avoid unexpected table accesses.
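Whichever level you choose, the LOCK clause is appended to the ALTER TABLE statement after the list of operations. A brief sketch, using a hypothetical table t1 with a column c1: the first statement requests fully concurrent access while an index is added, and the second requests exclusive access while the index is dropped:
mysql> ALTER TABLE t1 ADD INDEX idx_c1 (c1), LOCK=NONE;
mysql> ALTER TABLE t1 DROP INDEX idx_c1, LOCK=EXCLUSIVE;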
Online DDL operations can be viewed as having three phases:
Phase 1: Initialization
In the initialization phase, the server determines how much concurrency is permitted during the operation, taking into account storage engine capabilities, operations specified in the statement, and user-specified ALGORITHM and LOCK options. During this phase, a shared upgradeable metadata lock is taken to protect the current table definition.
Phase 2: Execution
In this phase, the statement is prepared and executed. Whether the metadata lock is upgraded to exclusive depends on the factors assessed in the initialization phase. If an exclusive metadata lock is required, it is only taken briefly during statement preparation.
Phase 3: Commit Table Definition
In the commit table definition phase, the metadata lock is upgraded to exclusive to evict the old table definition and commit the new one. Once granted, the duration of the exclusive metadata lock is brief.
Due to the exclusive metadata lock requirements outlined above, an online DDL operation may have to wait for concurrent transactions that hold metadata locks on the table to commit or roll back. Transactions started before or during the DDL operation can hold metadata locks on the table being altered. In the case of a long-running or inactive transaction, an online DDL operation can time out waiting for an exclusive metadata lock. Additionally, a pending exclusive metadata lock requested by an online DDL operation blocks subsequent transactions on the table.
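The length of time a statement waits for a metadata lock is controlled by the lock_wait_timeout system variable. As a precaution when altering a busy table, you might lower it for the session issuing the DDL statement so that a blocked ALTER TABLE gives up quickly instead of queuing behind a long-running transaction and blocking other sessions. The table name and timeout value below are only an illustration:
mysql> SET SESSION lock_wait_timeout = 60;
mysql> ALTER TABLE t1 ADD COLUMN x INT, ALGORITHM=INPLACE, LOCK=NONE;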
The following example demonstrates an online DDL operation waiting for an exclusive metadata lock, and how a pending metadata lock blocks subsequent transactions on the table.
Session 1:
mysql> CREATE TABLE t1 (c1 INT) ENGINE=InnoDB;
mysql> START TRANSACTION;
mysql> SELECT * FROM t1;
The session 1 SELECT statement takes a shared metadata lock on table t1.
Session 2:
mysql> ALTER TABLE t1 ADD COLUMN x INT, ALGORITHM=INPLACE, LOCK=NONE;
The online DDL operation in session 2, which requires an exclusive metadata lock on table t1 to commit table definition changes, must wait for the session 1 transaction to commit or roll back.
Session 3:
mysql> SELECT * FROM t1;
The SELECT statement issued in session 3 is blocked waiting for the exclusive metadata lock requested by the ALTER TABLE operation in session 2 to be granted.
You can use SHOW FULL PROCESSLIST to determine whether transactions are waiting for a metadata lock.
mysql> SHOW FULL PROCESSLIST\G
...
*************************** 2. row ***************************
Id: 5
User: root
Host: localhost
db: test
Command: Query
Time: 44
State: Waiting for table metadata lock
Info: ALTER TABLE t1 ADD COLUMN x INT, ALGORITHM=INPLACE, LOCK=NONE
...
*************************** 4. row ***************************
Id: 7
User: root
Host: localhost
db: test
Command: Query
Time: 5
State: Waiting for table metadata lock
Info: SELECT * FROM t1
4 rows in set (0.00 sec)
Metadata lock information is also exposed through the Performance Schema metadata_locks table, which provides information about metadata lock dependencies between sessions, the metadata lock a session is waiting for, and the session that currently holds the metadata lock. For more information, see Section 29.12.13.3, “The metadata_locks Table”.
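For example, assuming the metadata lock instrumentation (the wait/lock/metadata/sql/mdl instrument) is enabled, as it is by default in recent MySQL versions, a query along the following lines shows which sessions hold or are waiting for metadata locks on the table t1 from the example above:
mysql> SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, OWNER_THREAD_ID
       FROM performance_schema.metadata_locks
       WHERE OBJECT_NAME = 't1';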
The performance of a DDL operation is largely determined by whether the operation is performed instantly, whether it is performed in place, and whether it rebuilds the table.
To assess the relative performance of a DDL operation, you can compare results using ALGORITHM=INSTANT, ALGORITHM=INPLACE, and ALGORITHM=COPY. A statement can also be run with old_alter_table enabled to force the use of ALGORITHM=COPY.
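For example, to compare the in-place and table-copy paths for the same change, you might time an index addition under each algorithm on a hypothetical table t1, or force the copy behavior for the session with old_alter_table; a statement fails rather than silently falling back if the requested algorithm is not supported for the operation:
mysql> ALTER TABLE t1 ADD INDEX idx_c1 (c1), ALGORITHM=INPLACE;
mysql> ALTER TABLE t1 DROP INDEX idx_c1;
mysql> ALTER TABLE t1 ADD INDEX idx_c1 (c1), ALGORITHM=COPY;
mysql> SET SESSION old_alter_table = ON;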
For DDL operations that modify table data, you can determine whether a DDL operation performs changes in place or performs a table copy by looking at the “rows affected” value displayed after the command finishes. For example:
Changing the default value of a column (fast, does not affect the table data):
Query OK, 0 rows affected (0.07 sec)
Adding an index (takes time, but 0 rows affected shows that the table is not copied):
Query OK, 0 rows affected (21.42 sec)
Changing the data type of a column (takes substantial time and requires rebuilding all the rows of the table):
Query OK, 1671168 rows affected (1 min 35.54 sec)
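The kinds of statements behind output like the above are sketched below, using a hypothetical table big_table; the first two report 0 rows affected because the table data is not copied, while the data type change reports one row affected for each row rebuilt:
mysql> ALTER TABLE big_table ALTER COLUMN c1 SET DEFAULT 0;
mysql> ALTER TABLE big_table ADD INDEX idx_c2 (c2);
mysql> ALTER TABLE big_table MODIFY COLUMN c1 BIGINT;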
Before running a DDL operation on a large table, check whether the operation is fast or slow as follows:
Clone the table structure.
Populate the cloned table with a small amount of data.
Run the DDL operation on the cloned table.
Check whether the “rows affected” value is zero or not. A nonzero value means the operation copies table data, which might require special planning. For example, you might do the DDL operation during a period of scheduled downtime, or on each replica server one at a time.
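A minimal sketch of those steps, using hypothetical names (big_table is the production table, big_table_test is the clone, the 10,000-row sample is arbitrary, and the MODIFY COLUMN statement stands in for whatever DDL operation you are evaluating):
mysql> CREATE TABLE big_table_test LIKE big_table;
mysql> INSERT INTO big_table_test SELECT * FROM big_table LIMIT 10000;
mysql> ALTER TABLE big_table_test MODIFY COLUMN c1 BIGINT;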
For a greater understanding of the MySQL processing associated with a DDL operation, examine Performance Schema and INFORMATION_SCHEMA tables related to InnoDB before and after DDL operations to see the number of physical reads, writes, memory allocations, and so on.
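One simple way to do this is to snapshot a few InnoDB status counters from the Performance Schema global_status table immediately before and after the DDL statement and compare the differences; the counters listed here are examples rather than an exhaustive set:
mysql> SELECT VARIABLE_NAME, VARIABLE_VALUE
       FROM performance_schema.global_status
       WHERE VARIABLE_NAME IN ('Innodb_data_reads', 'Innodb_data_writes',
                               'Innodb_buffer_pool_read_requests');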
Performance Schema stage events can be used to monitor ALTER TABLE progress. See Section 17.16.1, “Monitoring ALTER TABLE Progress for InnoDB Tables Using Performance Schema”.
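In outline, and per the procedure in the section referenced above, the InnoDB ALTER TABLE stage instruments and the stage event consumers are enabled, and events_stages_current is polled from another session while the ALTER TABLE runs; the WORK_COMPLETED and WORK_ESTIMATED columns give an approximate progress figure:
mysql> UPDATE performance_schema.setup_instruments
       SET ENABLED = 'YES'
       WHERE NAME LIKE 'stage/innodb/alter%';
mysql> UPDATE performance_schema.setup_consumers
       SET ENABLED = 'YES'
       WHERE NAME LIKE 'events_stages_%';
mysql> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
       FROM performance_schema.events_stages_current;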
Because there is some processing work involved with recording the changes made by concurrent DML operations, then applying those changes at the end, an online DDL operation could take longer overall than the table-copy mechanism that blocks table access from other sessions. The reduction in raw performance is balanced against better responsiveness for applications that use the table. When evaluating the techniques for changing table structure, consider end-user perception of performance, based on factors such as load times for web pages.