WL#11819: InnoDB: Support ACID Undo DDL
Affects: Server-8.0 — Status: Complete
Undo DDL was introduced in WL#9508 using the existing method for creating undo tablespaces, which does not use redo logging. For atomicity, WL#9508 relied on the mechanism that already existed for undo tablespace truncation. That mechanism creates an undo trunc.log file, without redo logging, which causes the truncation or creation process to be continued at startup if a crash occurred. The current process also suffers from a performance issue: when the undo tablespace is replaced, the old tablespace pages are removed from the buffer pool, causing a checkpoint, and the new tablespace pages are flushed to disk in a second checkpoint. This process needs to change to become similar to the redo logged method used for CREATE TABLESPACE and the file-per-table datafile handling used during table DDL. Using REDO logging avoids the two checkpoints during undo truncation and makes the process more ACID.
FR1: Background undo truncate must be crash safe.
FR2: The change should have no impact on related features like MEB and Clone.

NFR1: Background undo truncate should have minimal impact on foreground performance. "Minimal impact" is left up to QA to define.
NFR2: When innodb-max-undo-log-size is set too small and undo truncation is happening too often, the performance should be much better than before this change.
NFR3: There should be no stalls in foreground activity when an undo truncation occurs.
Primary Goals
=============
1. Use REDO logging during undo truncation.
2. Avoid checkpoints during undo truncation.

Tasks
=====
1. Do not immediately remove pages from an existing undo tablespace when it is being truncated. Instead, mark the file as pending delete once all its buffers are released. This is done by introducing a new way to remove buffer pages called buf_remove_t::BUF_REMOVE_NONE. If BUF_REMOVE_NONE is set in the call to Fil_shard::space_delete(), do the following:

   * Skip the call to buf_LRU_flush_or_remove_pages().
   * Put the current lsn into fil_space_t::m_deleted_lsn. This allows the fil_space_t to be finally deleted once the checkpoint low-water-mark has advanced past this lsn.
   * Wait for any pending writes.
   * Add the fil_space_t to a list called Fil_shard::m_deleted.
   * Keep the fil_space_t in the Fil_shard::m_spaces list so that it can be found by space_id. (Do not call Fil_shard::space_free_low()).
   * Delete the tablespace name from Fil_shard::m_names so that a new tablespace can replace it.
   * Remove the fil_space_t from Fil_shard::m_unflushed_spaces.
   * Call Fil_shard::file_close_to_free() to close any file handle since there will be no more writing to the file.

   Only UNDO tablespace files are deleted with this new method, BUF_REMOVE_NONE. This means that pages can exist in the buffer pool for files that have already been deleted. These pages can be removed passively. Once the checkpoint low-water-mark (lwm) has risen higher than the lsn at which the file was deleted, the fil_space_t can be fully deleted. This is done in a new routine called Fil_shard::checkpoint(). The Fil_shard::m_deleted list is traversed looking for a fil_space_t::m_deleted_lsn that is less than the current lwm. For each one it finds, it calls:

   * Fil_shard::space_delete(space_id);
   * Fil_shard::space_delete_low(fil_space_t *);
   * Fil_shard::m_deleted.erase();

   There are some places in InnoDB that need to be aware of the existence of pages in the buffer pool for deleted files:
   * Clone_Snapshot::add_node()
   * Clone_Snapshot::add_page()

   If undo truncation is happening too often and the buffer pool is very large, it is possible that all available reserved space IDs for a single undo tablespace may be used by previous truncations. In this case, all 512 reserved space IDs for an undo space may be in Fil_shard::m_deleted, pending a checkpoint that would free them up. To prevent this from occurring, a maximum number of reserved space IDs for the same undo space is enforced in Fil_shard::m_deleted. If 64 reserved space IDs are already pending delete, then undo truncation is not allowed on that undo tablespace even if it has become too big. When this condition is found, the following warning message is put into the error log:

   "Undo Truncation is occurring too often. Consider increasing --innodb-max-undo-log-size."

2. Redo log the undo truncate changes. This mostly consists of redo logging the creation of the initial 128 undo segment header pages.

3. During normal operations, add a time delay between undo truncations by the purge thread. Unless it is a slow shutdown or an explicit SET INACTIVE has been called, wait at least one second after an undo truncate is completed before searching for another big undo tablespace to truncate.

4. Since two new 'throttles' are added to the decision to start a truncate of an undo tablespace (the pending-delete limit in task 1 and the time delay in task 3), the code that makes that decision is refactored and new comments are added to make it easier to understand. The decision condition checks are moved into trx_purge_mark_undo_for_truncate() and its subroutine undo::tablespace::needs_truncation().

5. Remove the innodb_metrics associated with removing and flushing the buffer pool during undo truncation. These are the metrics for both checkpoints that used to happen. This also involves updating a few testcases.
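
The deferred deletion described in task 1 can be sketched as a small standalone model. This is only an illustration and not the InnoDB implementation: the names Deleted_space, Deferred_deletes, pending(), can_truncate() and mark_deleted() are hypothetical, and the list stands in for Fil_shard::m_deleted. It shows the two rules stated above: an entry is freed only once the checkpoint low-water-mark is greater than its deletion lsn, and truncation is refused while 64 reserved space IDs of the same undo space are still pending.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

using lsn_t = std::uint64_t;       // log sequence number
using space_id_t = std::uint32_t;  // tablespace ID

// Hypothetical stand-in for a fil_space_t that is pending final deletion.
struct Deleted_space {
  space_id_t space_id;   // reserved space ID used by the replaced undo file
  std::size_t undo_num;  // undo tablespace that the space ID was reserved for
  lsn_t deleted_lsn;     // models fil_space_t::m_deleted_lsn
};

// Hypothetical model of the bookkeeping kept in Fil_shard::m_deleted.
class Deferred_deletes {
 public:
  // Limit from the text: at most 64 pending space IDs per undo space.
  static constexpr std::size_t MAX_PENDING_PER_UNDO_SPACE = 64;

  // Number of space IDs of this undo space still waiting for a checkpoint.
  std::size_t pending(std::size_t undo_num) const {
    std::size_t n = 0;
    for (const auto &d : m_deleted) {
      if (d.undo_num == undo_num) ++n;
    }
    return n;
  }

  // Throttle: truncation is allowed only while the limit is not reached.
  bool can_truncate(std::size_t undo_num) const {
    return pending(undo_num) < MAX_PENDING_PER_UNDO_SPACE;
  }

  // Record a replaced undo file together with the lsn at which it was deleted.
  void mark_deleted(space_id_t id, std::size_t undo_num, lsn_t current_lsn) {
    assert(can_truncate(undo_num));
    m_deleted.push_back({id, undo_num, current_lsn});
  }

  // Free every entry whose deletion lsn is below the checkpoint
  // low-water-mark. Returns how many entries were freed.
  std::size_t checkpoint(lsn_t lwm) {
    std::size_t freed = 0;
    for (auto it = m_deleted.begin(); it != m_deleted.end();) {
      if (it->deleted_lsn < lwm) {
        it = m_deleted.erase(it);
        ++freed;
      } else {
        ++it;
      }
    }
    return freed;
  }

 private:
  std::list<Deleted_space> m_deleted;
};
```

In this model, a caller would check can_truncate() before starting a truncation, call mark_deleted() with the current lsn when the old file is replaced, and call checkpoint() with the new low-water-mark after each checkpoint. A plain list is used because entries are erased from arbitrary positions during the traversal; the real container type is not specified in this worklog.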
Copyright (c) 2000, 2021, Oracle Corporation and/or its affiliates. All rights reserved.