WL#7868: InnoDB: Improvements around flushing for proper performance
The flushing activity behavior is important to control throughput of DML oriented workload. The oldest modification LSN age must be controlled in the transaction log file total capacity, not to overwrite the last checkpoint log LSN. The age is increased by new transaction logging the modification. And is decreased by flushing the page which has the oldest modification. So, in long term, flushing the oldest pages is needed to progress new transactions. 1: Implementing improvements to the adaptive flushing algorithm 1-a: should consider the distribution about the oldest LSN of the pages in the end of the flush_list Along with the each oldest LSN for the pages, the number of pages to be flushed for the same LSN age progress are different. If there are many similar oldest LSN aged pages in the end of the flush_list, then all pages should be flushed to progress the LSN age. The factor which considers "How many pages should be flushed to progress the LSN age at the time" should added to the adaptive flushing algorithm. 1-b: should consider flushing balance for each buffer pool instances The adaptive flushing algorithm also should consider dirty page balance between buffer pool instances. Dirty pages in the buffer pool instance which has the most dirty pages should be flushed in prior 2: Setting a thread priority for the page_cleaner (if the platform allows and authorized to set) To keep the proper flushing along with the throughput of the workload, page_cleaner threads should have opportunity to be executed properly. If there are many active user threads and their thread priority are all the same, the page_cleaner has less opportunity to run and might not flush enough dirty pages. Even if it depends on environment and 'best effort', page_cleaner priority should be greater than the others. 3: Proper flush waiting when the max modified LSN age is around max_modified_age_sync In the previous version, if the max modified LSN age reaches max_modified_age_sync the flushing could overreacts (long flushing batch and all user threads should wait for the batch end). Each user thread should wait only for the target LSN age, not for the batch end. They should progress along with the flushing pages and reduced the last modified LSN age. By that throughput around max_modified_age_sync should be much more smooth and proper, and much less performance risk to reach to max_modified_age_sync. 3': should block the checkpoint LSN overwritten By implementing the above item "3:", the user threads have more opportunity to progress than before, and also might have more possibility to overwrite the checkpoint. The each log write should be conscious about the last checkpoint LSN not to overwrite its transaction log.
Functional requirements: F-1: Proper flushing throughput along with setting innodb_io_capacity_max. (User should not set too large innodb_io_capacity_max. [written blocks/s] can reach to system max and can be limited by innodb_io_capacity_max with the number (blocks/s)) F-2: User can detect too large setting of innodb_io_capacity_max by the server error log. (like "page_cleaner: 1000ms intended loop took XXXXms. The settings might not be optimal. (flushed=YYYY and evicted=ZZZZ, during the time.)" At the case the optimal innodb_io_capacity_max might be 'ZZZZ * (1000/XXXX)') F-3: User can get activity detail of the page_cleaner threads by INFORMATION_SCHEMA.INNODB_METRICS table Non-Functional requirements: NF-1: optimize (stabilize and same/higher average) long-term (means might reach to max_modified_age_sync flushing) RW-workload
Changes to the interface specification: I-1: No new files I-2: New syntax I-2-1: new items for INFORMATION_SCHEMA.INNODB_METRICS table buffer_flush_n_to_flush_by_age: "Number of pages target by LSN Age for flushing." buffer_flush_adaptive_avg_time_slot: "Avg time (ms) spent for adaptive flushing recently per slot." buffer_LRU_batch_flush_avg_time_slot: "Avg time (ms) spent for LRU batch flushing recently per slot.", buffer_flush_adaptive_avg_time_thread: "Avg time (ms) spent for adaptive flushing recently per thread.", buffer_LRU_batch_flush_avg_time_thread: "Avg time (ms) spent for LRU batch flushing recently per thread.", buffer_flush_adaptive_avg_time_est: "Estimated time (ms) spent for adaptive flushing recently.", buffer_LRU_batch_flush_avg_time_est: "Estimated time (ms) spent for LRU batch flushing recently.", buffer_flush_avg_time: "Avg time (ms) spent for flushing recently.", buffer_flush_adaptive_avg_pass: "Numner of adaptive flushes passed during the recent Avg period.", buffer_LRU_batch_flush_avg_pass:"Number of LRU batch flushes passed during the recent Avg period.", buffer_flush_avg_pass: "Number of flushes passed during the recent Avg period.", buffer_LRU_get_free_loops: "Total loops in LRU get free.", buffer_LRU_get_free_waits: "Total sleep waits in LRU get free." I-3: No new commands I-4: No new tools I-5: No impact on existing functionality
Copyright (c) 2000, 2023, Oracle Corporation and/or its affiliates. All rights reserved.