WL#6915: InnoDB: Undo logs for temp-tables (and related objects) should reside in temp-tablespace
Status: Complete
Introduction: ------------ The UNDO logs currently reside in the system-tablespace or in dedicated undo- tablespaces (undo tablespaces are configurable by user and are optional) for all objects created in system. This work log adds an additional type of UNDO log, one that is not redo logged and resides in the temp-tablespace. These non-redo-logged UNDO logs are not required during recovery but only used for rollback action. (Purging is blocked for temp-tables, MVCC is not applicable for temp-tables) Note: The temp-tablespace is always recreated on startup. This new type of UNDO log is being introduced for the use by temp-tables, both compressed and Non-compressed. WL helps in addressing this requirement. Visible Difference (if any): --------------------------- Note change in info message that explicitly notify user about number of redo and non-redo rollback segment active. 2013-05-10 17:20:42 7609 [Note] InnoDB: 96 redo rollback segment(s) found. 1 redo rollback segment(s) are active. 2013-05-10 17:20:42 7609 [Note] InnoDB: 32 non-redo rollback segment(s) are active. How would this help ? ---------------------- - Since they are not required for crash recovery we can avoid REDO logging. On crash/startup temp-tablespace is re-created therefore there is no reference of the temp-tables on restart. This small change helps in saving lot of un-needed IO arising out of REDO logging and further help in improving performance. ---------------------------------------------------------------------- chtest validtion needed for pb2 runs: ################################################################################ # innodb_wl6915_debug passed all enabled (_CHECK <> 0) tests # ------------------------------------------------------------------ # Top level file = ./suite/innodb/t/innodb_wl6915_debug.test # Current directory = /opt/projects/innodb/codebase/mysql-tree/mysql-trunk-wl6915/mysql-test # MTR_BINDIR = /opt/projects/innodb/codebase/mysql-tree/mysql-trunk-wl6915/bld # TREE_DIR_CHECK = 1 # Check if the directories used are consistent # GREP_CHECK = 1 # Simple result check with grep # EXTENDED_CHECK = 1 # Extended check testcases # REPEAT_CHECK = 1 # The test must be repeatable # ASSIGNED_DIR_CHECK = 1 # Check with explicit assigned MTR # # options tmpdir and vardir. # OPTION_CHECK = 1 # Check with various server/mysqltest # # options # OPTION_CHECK_EXT = 1 # Extended check of such options # SESSION_CHECK = 1 # Check for disconnect in time etc. # CPU_LOAD_CHECK = 1 # Check with high parallel CPU load # DISK_LOAD_CHECK = 1 # Check with high parallel Disk load ################################################################################ # innodb_wl6915 passed all enabled ( _CHECK <> 0) tests # ------------------------------------------------------------------ # Top level file = ./suite/innodb/t/innodb_wl6915.test # Current directory = /opt/projects/innodb/codebase/mysql-tree/mysql-trunk-wl6915/mysql-test # MTR_BINDIR = /opt/projects/innodb/codebase/mysql-tree/mysql-trunk-wl6915/bld # TREE_DIR_CHECK = 1 # Check if the directories used are consistent # GREP_CHECK = 1 # Simple result check with grep # EXTENDED_CHECK = 1 # Extended check testcases # REPEAT_CHECK = 1 # The test must be repeatable # ASSIGNED_DIR_CHECK = 1 # Check with explicit assigned MTR # # options tmpdir and vardir. # OPTION_CHECK = 1 # Check with various server/mysqltest # # options # OPTION_CHECK_EXT = 1 # Extended check of such options # SESSION_CHECK = 1 # Check for disconnect in time etc. # CPU_LOAD_CHECK = 1 # Check with high parallel CPU load # DISK_LOAD_CHECK = 1 # Check with high parallel Disk load ----------------------------------------------------------------------------
High Level Approach Description: -------------------------------- - Each server instance has 128 rollback segment slots. A transaction is assigned a rollback segment which it uses for UNDO logging. (Segment is expanded to add more pages based on size of UNDO logging). - The plan is to reserve few slots for non-redo rollback segments. UNDO logs written to these rollback segments will not generate any REDO logs. Given the constraint these rollback segments will be used only for objects that don't need to be restored on re-start e.g., temp-tables. - Number of reserved Slots for non-redo rollback segments: 32 Currently the non-redo rollback segments will reside in the temp-tablespace. Eventually other tablespace that satisfy restart condition could be used. =============================================================================== - With 32 slots reserved for non-redo rsegs the number of slots available for transactions that write to normal tables will be reduced from 128 to 96. The impact of this change is explained below: --------------------------------------------------------------------------- Fresh DB scenarios: CASE-1: Default sceanrio (no undo-logs configured): - slot-0: rsegs reside in the system-tablespace. (redo rsegs) - slot-1 .... slot-N: rsegs reside in the temp-tablespace. (non-redo rsegs) - Slot-N+1 .... slot-127: rsegs reside in the system-tablespace (redo rsegs). CASE-2: Undo logs = M (M = 1) - slot-0: rsegs reside in the system-tablespace. (redo rsegs) - slot-1 .... slot-N: rsegs reside in the temp-tablespace. (non-redo rsegs) 1st rollback segment is always created in system-tablespace and so with lower limit (M = 1) no extra rollback segment is created in undo-tablespace. This means all transactions will use system-tablespace residing rollback segment only. CASE-3: Undo logs = M (1 < M <= N) - slot-0: rsegs reside in the system-tablespace. (redo rsegs) - slot-1 .... slot-N: rsegs reside in the temp-tablespace. (non-redo rsegs) (Only 1 slot remain active .... more generic case of above scenario.) CASE-4: Undo logs = M (1 < N < M <= 128) - slot-0: rsegs bounded system-tablespace. (redo rsegs) - slot-1 .... slot-N: rsegs bounded to temp-tablespace. (non-redo rsegs) - slot-N+1 .... slot-M: rsegs bounded to undo-tablespace (redo rsegs) ---------------------------------------------------------------------------- Upgrade DB scenarios: (Note: for upgrade clean-shutdown is must). CASE-1: Default sceanrios (no undo-logs configured) Existing setup: - slots-0 .... slot-127: rsegs bounded to system-tablespace Upgraded setup: - CASE-1 from fresh db scenario CASE-2: Undo logs = M (M = 1) Existing setup - slot-0: rsegs bounded to system-tablespace (redo rsegs) Upgraded setup: - CASE-2 from fresh db scenario CASE-3: Undo logs = M (1 < M <= N) Existing setup: - slot-0: rsegs bounded to system-tablespace. - slot-1...slot-M: rsegs bounded to undo-tablespace. Upgraded setup: - CASE-3 from fresh db scenario CASE-4: Undo logs = M (1 < N < M <= 128) Existing setup: - slot-0: rsegs bounded to system-tablespace. - slot-1....slot-M: rsegs bounded to undo tablespace. Upgraded setup: - CASE-4 from fresh db scenario ------------------------------------------------------------------------- User is notified about active rollback segments and found rollback segments using following messages printed during server startup. Note: there was a generic message before that use to print total rollback segment active which is now revised to print redo/non-redo rollback segment and also info about real active rollback segment that would take part in undo logging.2013-04-09 09:51:11 26965 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active. 2013-04-09 09:51:11 26965 [Note] InnoDB: 32 non-redo rollback segment(s) are active. ------------------------------------------------------------------------- Currently each transaction is assigned a rollback segment that it uses for undo logging. With this new semantics each transaction will be assigned 2 rollback segments. First comes from redo rollback pool and second one from non-redo rollback pool. The non-redo rseg is assigned to a transaction only if the transaction updates a temporary table. Read only transactions are not affected by this change except that their rollback segment is always a non-redo rollback segment because only temp-tables are allowed to be modified under this mode. -------------------------------------------------------------------------- The split of the rsegs between redo logged and non-redo logged rsegs will reduce the upper limit, however it should not have an impact in the real world. For this impact to have an effect we would need 96K transactions writing to normal tables before we run out UNDO slots. Maximum number of active transaction at given time will now be reduced from 128 * 1024 = 128K to 128 - 32 = 96 * 1024 = 96K as one rollback segment can be assigned to 1024 transaction at any given time. (Note: 96K is still a very high limit.) 96K: provided all of them are operating with non-temp-tables. 32K: if all of them are operating with temp-table. ------------------------------------------------------------------------- Benchmarking experiement using sysbench to ensure there is no regression in non-temp table case due to reduced undo log rollback segments. mysql-trunk mysql-trunl-wl6915 threads 32 7209 7214 64 7430 7626 128 6719 6865 256 4652 4841 512 2031 2229 1024 698 764
Temporary table UNDO log changes ================================ The objective is to have a special UNDO segment for transactions that update temporary tables, for both compressed and normal temporary tables. This temporary table specific UNDO segment should not be REDO logged, this is to reduce the cost of IO. The temporary tables will not be recovered during crash recovery therefore we don't care about durability. However, we need the UNDO segments to rollback to savepoint during normal transaction processing. This WL proposes to use the temporary tablespace to store these special UNDO logs. The temporary tablespace update operations are not REDO logged. Assigning an additional rseg to trx_t -------------------------------------- Each trx_t should have a rseg that can be used for undo logging of temporary tables. There are other 3 variables related to UNDO logging that are anchored in trx_t trx_rseg_t* rseg; rollback segment assigned to thetransaction, or NULL if not assigned yet trx_undo_t* insert_undo; pointer to the insert undo log, or NULL if no inserts performed yet trx_undo_t* update_undo; pointer to the update undo log or NULL if not updates performed yet In the new design we will encapsulate these state variables in a new structure. struct trx_undo_ptr_t { fields from above; }; For now there are 2 rollback segments per transaction but in order to keep design generic will encapsulate it in single structure and will maintain its instance in trx_t struct. struct trx_t{ .... existing member variables .... trx_rsegs_t rsegs; /* rollback segments for undo logging */ }; struct trx_rsegs_t { /** undo log ptr bounded to system/undo tablespace used for undo logging of all objects those needs to be recovered on crash. */ trx_undo_ptr_t m_redo; /** undo log ptr bounded to temp tablespace used for undo logging of objects only that don't need to be recovered on crash. Can be NULL if transaction doesn't modify any such table that doesn't need to be recovered. */ trx_undo_ptr_t m_noredo; }; Important note: roll_limit that points to least undo-no up to which we can rollback should be updated irrespective of which rseg is being used. Transaction is assigned redo rollback segment by default (unless it is read-only transaction). Non-Redo rollback segment is assigned only if transaction involves modification of temp-tables. trx_undo_report_row_operation() function is used for undo logging. Modify function to add check for assignment of non-redo rollback segment. Also, explicit turn-off REDO logging if transaction involves use of temp-table. allocation of non-redo rollback segments trx_sys is central structure that holds references to all the active rollback segments. On start, based on configuration N rollback segments are created (open in case server is not running in boot-strapped mode). In default configuration, all of 128 rollback segments resides in system-tablespace. Idea is to reserve 32 slots from these 128 slots (note: we are not increasing number of slots) for non-redo rollback segments and keep remaining slots open for redo logged rollback segments. In boot-strap mode: Create 32 rollback segments residing in temp-tablespace Skip use of these segments while creating redo rollback segments irrespective of whether redo is being created in system or undo tablespace. trx_sys_create_rsegs(): Creates extra rollback segments. Modify interface to accept the reserved limit of non-redo tablespace and then create them before creating other rollback segments. Existing logic will detect presence of these rollback segments and will create only delta rollback segments. trx_rseg_create()/trx_sysf_rseg_find_free(): Searches for free slots and creates and assigns rollback segment under free slot. Logic of find free slots needs a small modification, Why ? Non-Redo rollback segments are re-created on server startup and so even though there is registered entry about it in sys-header (created during boot-strapped) it needs to be ignored if slot is reserved for non-redo rollback segment so that new rollback segment is created and corresponding slot is updated accordingly. Rest of the logic of creating rollback segment in specified space remains unchanged (trx_rseg_mem_create). At the end trx_sys will have N rollback segments with slot-0: redo rollback segment residing in system tablespace slot-1-32: non-redo rollback segment residing in temp-tablespace slot-33-slot-127: redo rollback segment residing in system/undo tablespace. Purging of rollback segments ---------------------------- Once transaction commit we don't need rollback segments and so can be purged (unless there is other view using it in which case purge will not proceed). Given that we now have 2 rollback segments per transaction, both needs to be purged. Logically they are single entity and so they should be added as single block to purge queue with same trx->no. Currently purge queue has facility to accept only one rollback segment. We plan to revise this interface and make it more generic so that purge can accept array of rollback segments for given transaction. rseg_queue_t is now replaced with more generic class name TrxUndoRsegs class TrxUndoRsegs { .... member function mainly getter/setter + constructor + comparator based on trx_no .... private: typedef std::vectortrx_rsegs_t; private: /** Transaction number of a transaction of which rollback segments are part off. */ trx_id_t m_trx_no; /** Rollback segments of a transaction, scheduled for purge. */ trx_rsegs_t m_rsegs; }; Iterator interface over TrxUndoRsegs to access rollback segments. /** Chooses the rollback segment with the smallest trx_no. */ struct TrxUndoRsegsIterator { /** Constructor */ TrxUndoRsegsIterator(trx_purge_t* purge_sys); /** Sets the next rseg to purge in m_purge_sys. @return zip_size if log is for a compressed table, ULINT_UNDEFINED if no rollback segments to purge, 0 for non compressed tables. */ ulint operator++(int); private: // Disable copying TrxUndoRsegsIterator(const TrxUndoRsegsIterator&); TrxUndoRsegsIterator& operator=(const TrxUndoRsegsIterator&); /** The purge system pointer */ trx_purge_t* m_purge_sys; /** The current element to process */ TrxUndoRsegs m_trx_undo_rsegs; /** Track the current element in m_trx_undo_rseg */ TrxUndoRsegs::iterator_t m_iter; /** Sentinel value */ static const TrxUndoRsegs NullElement; }; trx_purge_get_rseg_with_min_trx_id() is now replaced by operator++. purge_sys now holds an iterator to TrxUndoRsegs. Once it is done serving the assigned rollback segment it moves to next rollback segment by incrementing the iterator. Whenever purge_sys request for next rollback segment it is poped from purge_queue and assigned to purge system through purge_sys->rseg. If there are more than one rollback segment assigned to a transaction next request for assigning rollback segment will be served by assigning rollback segment from same transaction rollback segment vector. (New element is not popped from Purge Queue). In order to add redo and non-redo rollback segments as logical one block we need to lock both the rollback segments at same time (which is not possible given the dead-lock sync-rules) and so we have added separate sync-level for redo mutex and non-redo mutex. Sequence enforces acquisition of redo mutex first (as it is always present) and then non-redo (optional if non-redo rollback segment is assigned). #define SYNC_REDO_RSEG 600 #define SYNC_NOREDO_RSEG 599 Effect on rollback with multiple rollback segments -------------------------------------------------- Rollback uses rollback-segments to perform the rollback. Currently it read undo-record from assigned rollback segments and simply apply these undo record. With multiple rollback segments, it needs to read undo records from both the rollback segments and apply them. (On crash, non-redo rollback segments are not restored and so not active during rollback which is fine given that objects pointed by them are not restored too.) Additional related changes --------------------------- - Qualifying temp-tablespace as non-user tablespace: Update function that classify tablespace as user or system based on space-id. /** Determine if (i) is a user tablespace id or not. */ # define fil_is_user_tablespace_id(i) ((i) > srv_undo_tablespaces_open)
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.