Documentation Home
MySQL 5.7 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 37.9Mb
PDF (A4) - 37.9Mb
PDF (RPM) - 37.3Mb
HTML Download (TGZ) - 10.2Mb
HTML Download (Zip) - 10.3Mb
HTML Download (RPM) - 8.9Mb
Man Pages (TGZ) - 214.6Kb
Man Pages (Zip) - 327.6Kb
Info (Gzip) - 3.4Mb
Info (Zip) - 3.4Mb
Excerpts from this Manual

MySQL 5.7 Reference Manual  /  ...  /  Online DDL Performance, Concurrency, and Space Requirements

14.13.2 Online DDL Performance, Concurrency, and Space Requirements

Online DDL improves several aspects of MySQL operation, such as performance, concurrency, availability, and scalability:

  • Because queries and DML operations on the table can proceed while the DDL is in progress, applications that access the table are more responsive. Reduced locking and waiting for other resources throughout the MySQL server leads to greater scalability, even for operations not involving the table being altered.

  • For in-place operations, by avoiding the disk I/O and CPU cycles to rebuild the table, you minimize the overall load on the database and maintain good performance and high throughput during the DDL operation.

  • For in-place operations, because less data is read into the buffer pool than if all the data was copied, you avoid purging frequently accessed data from memory, which formerly could cause a temporary performance dip after a DDL operation.

If an online operation requires temporary sort files, InnoDB creates them in the temporary file directory by default, not the directory containing the original table. If this directory is not large enough to hold such files, you may need to set the tmpdir system variable to a different directory. Alternatively, you can define a separate temporary directory for InnoDB online ALTER TABLE operations using the innodb_tmpdir configuration option. For more information, see Space Requirements for Online DDL Operations, and Section B.5.3.5, “Where MySQL Stores Temporary Files”.

Locking Options for Online DDL

While an InnoDB table is being changed by a DDL operation, the table may or may not be locked, depending on the internal workings of that operation and the LOCK clause of the ALTER TABLE statement. By default, MySQL uses as little locking as possible during a DDL operation; you specify the clause either to make the locking more restrictive than it normally would be (thus limiting concurrent DML, or DML and queries), or to ensure that some expected degree of locking is allowed for an operation. If the LOCK clause specifies a level of locking that is not available for that specific kind of DDL operation, such as LOCK=SHARED or LOCK=NONE while creating or dropping a primary key, the clause works like an assertion, causing the statement to fail with an error. The following list shows the different possibilities for the LOCK clause, from the most permissive to the most restrictive:

  • For DDL operations with LOCK=NONE, both queries and concurrent DML are allowed. This clause makes the ALTER TABLE fail if the kind of DDL operation cannot be performed with the requested type of locking, so specify LOCK=NONE if keeping the table fully available is vital and it is OK to cancel the DDL if that is not possible. For example, you might use this clause in DDLs for tables involving customer signups or purchases, to avoid making those tables unavailable by mistakenly issuing an expensive ALTER TABLE statement.

  • For DDL operations with LOCK=SHARED, any writes to the table (that is, DML operations) are blocked, but the data in the table can be read. This clause makes the ALTER TABLE fail if the kind of DDL operation cannot be performed with the requested type of locking, so specify LOCK=SHARED if keeping the table available for queries is vital and it is OK to cancel the DDL if that is not possible. For example, you might use this clause in DDLs for tables in a data warehouse, where it is OK to delay data load operations until the DDL is finished, but queries cannot be delayed for long periods.

  • For DDL operations with LOCK=DEFAULT, or with the LOCK clause omitted, MySQL uses the lowest level of locking that is available for that kind of operation, allowing concurrent queries, DML, or both wherever possible. This is the setting to use when making pre-planned, pre-tested changes that you know do not cause any availability problems based on the workload for that table.

  • For DDL operations with LOCK=EXCLUSIVE, both queries and DML operations are blocked. This clause makes the ALTER TABLE fail if the kind of DDL operation cannot be performed with the requested type of locking, so specify LOCK=EXCLUSIVE if the primary concern is finishing the DDL in the shortest time possible, and it is OK to make applications wait when they try to access the table. You might also use LOCK=EXCLUSIVE if the server is supposed to be idle, to avoid unexpected accesses to the table.

In most cases, an online DDL operation on a table waits for currently executing transactions that are accessing the table to commit or roll back because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, the online DDL operation requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement also waits for transactions that are started while the DDL is in progress to commit or roll back before completing. Consequently, in the case of long running transactions that perform inserts, updates, deletes, or SELECT ... FOR UPDATE operations on the table, it is possible for online DDL operation to time out waiting for exclusive access to the table.

A case in which an online DDL operation on a table does not wait for a currently executing transaction to complete can occur when the table is in a foreign key relationship and a transaction is run explicitly on the other table in the foreign key relationship. In this case, the transaction holds an exclusive metadata lock on the table it is updating, but only holds shared InnoDB table lock (required for foreign key checking) on the other table. The shared InnoDB table lock permits the online DDL operation to proceed but blocks the operation at the commit phase, when an exclusive InnoDB table lock is required. This scenario can result in deadlocks as other transactions wait for the online DDL operation to commit. (See Bug #48652, and Bug #77390)

Because there is some processing work involved with recording the changes made by concurrent DML operations, then applying those changes at the end, an online DDL operation could take longer overall than the old-style mechanism that blocks table access from other sessions. The reduction in raw performance is balanced against better responsiveness for applications that use the table. When evaluating the ideal techniques for changing table structure, consider end-user perception of performance, based on factors such as load times for web pages.

A newly created InnoDB secondary index contains only the committed data in the table at the time the CREATE INDEX or ALTER TABLE statement finishes executing. It does not contain any uncommitted values, old versions of values, or values marked for deletion but not yet removed from the old index.

Performance of In-Place versus Table-Copying DDL Operations

The raw performance of an online DDL operation is largely determined by whether the operation is performed in-place, or requires copying and rebuilding the entire table. See Table 14.10, “Online Status for DDL Operations” to see what kinds of operations can be performed in-place, and any requirements for avoiding table-copy operations.

The performance speedup from in-place DDL applies to operations on secondary indexes, not to the primary key index. The rows of an InnoDB table are stored in a clustered index organized based on the primary key, forming what some database systems call an index-organized table. Because the table structure is closely tied to the primary key, redefining the primary key still requires copying the data.

When an operation on the primary key uses ALGORITHM=INPLACE, even though the data is still copied, it is more efficient than using ALGORITHM=COPY because:

  • No undo logging or associated redo logging is required for ALGORITHM=INPLACE. These operations add overhead to DDL statements that use ALGORITHM=COPY.

  • The secondary index entries are pre-sorted, and so can be loaded in order.

  • The change buffer is not used, because there are no random-access inserts into the secondary indexes.

To judge the relative performance of online DDL operations, you can run such operations on a big InnoDB table using current and earlier versions of MySQL. You can also run all the performance tests under the latest MySQL version, simulating the previous DDL behavior for the before results, by setting the old_alter_table system variable. Issue the statement set old_alter_table=1 in the session, and measure DDL performance to record the before figures. Then set old_alter_table=0 to re-enable the newer, faster behavior, and run the DDL operations again to record the after figures.

For a basic idea of whether a DDL operation does its changes in-place or performs a table copy, look at the rows affected value displayed after the command finishes. For example, here are lines you might see after doing different types of DDL operations:

  • Changing the default value of a column (super-fast, does not affect the table data at all):

    Query OK, 0 rows affected (0.07 sec)
  • Adding an index (takes time, but 0 rows affected shows that the table is not copied):

    Query OK, 0 rows affected (21.42 sec)
  • Changing the data type of a column (takes substantial time and does require rebuilding all the rows of the table):

    Query OK, 1671168 rows affected (1 min 35.54 sec)

    Changing the data type of a column requires rebuilding all the rows of the table with the exception of changing VARCHAR size, which may be performed using online ALTER TABLE. See Modifying Column Properties for more information.

For example, before running a DDL operation on a big table, you might check whether the operation is fast or slow as follows:

  1. Clone the table structure.

  2. Populate the cloned table with a tiny amount of data.

  3. Run the DDL operation on the cloned table.

  4. Check whether the rows affected value is zero or not. A non-zero value means the operation requires rebuilding the entire table, which might require special planning. For example, you might do the DDL operation during a period of scheduled downtime, or on each replication slave server one at a time.

For a deeper understanding of the reduction in MySQL processing, examine the performance_schema and INFORMATION_SCHEMA tables related to InnoDB before and after DDL operations, to see the number of physical reads, writes, memory allocations, and so on.

Space Requirements for Online DDL Operations

Online DDL operations have the following space requirements:

  • Space for temporary log files

    There is one such log file for each index being created or table being altered. This log file stores data inserted, updated, or deleted in the table during the DDL operation. The temporary log file is extended when needed by the value of innodb_sort_buffer_size, up to the maximum specified by innodb_online_alter_log_max_size. If a temporary log file exceeds the upper size limit, the ALTER TABLE operation fails and all uncommitted concurrent DML operations are rolled back. Thus, a large value for this option allows more DML to happen during an online DDL operation, but also extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log.

    If the operation takes so long, and concurrent DML modifies the table so much, that the size of the temporary online log exceeds the value of the innodb_online_alter_log_max_size configuration option, the online DDL operation fails with a DB_ONLINE_LOG_TOO_BIG error.

  • Space for temporary sort files

    Online DDL operations that rebuild the table write temporary sort files to the MySQL temporary directory ($TMPDIR on Unix, %TEMP% on Windows, or the directory specified by the --tmpdir configuration variable) during index creation. Each temporary sort file is large enough to hold all columns defined for the new secondary index plus the columns that are part of the primary key of the clustered index, and each one is removed as soon as it is merged into the final table or index. Such operations may require temporary space equal to the amount of data in the table plus indexes. An online DDL operation that rebuilds the table can cause an error if the operation uses all of the available disk space on the file system where the data directory (datadir) resides.

    As of MySQL 5.7.11, you can use the innodb_tmpdir configuration option to define a separate temporary directory for online DDL operations. The innodb_tmpdir option was introduced to help avoid temporary directory overflows that could occur as a result of large temporary sort files created during online ALTER TABLE operations that rebuild the table.

  • Space for an intermediate table file

    Some online DDL operations that rebuild the table create a temporary intermediate table file in the same directory as the original table as opposed to rebuilding the table in place. An intermediate table file may require space equal to the size of the original table. Operations that rebuild the table in place are noted in Section 14.13.1, “Online DDL Overview”.

User Comments
Sign Up Login You must be logged in to post a comment.