InnoDB maintains a buffer pool for
caching data and indexes in memory. InnoDB
manages the pool as a list, using a least recently used (LRU)
algorithm incorporating a midpoint insertion strategy. When room
is needed to add a new block to the pool,
InnoDB evicts the least recently used block
and adds the new block to the middle of the list. The midpoint
insertion strategy in effect causes the list to be treated as
two sublists:
At the head, a sublist of “new” (or “young”) blocks that have been recently used.
At the tail, a sublist of “old” blocks that are less recently used.
As a result of the algorithm, the new sublist contains blocks that are heavily used by queries. The old sublist contains less-used blocks, and candidates for eviction are taken from this sublist.
The LRU algorithm operates as follows by default:
3/8 of the buffer pool is devoted to the old sublist.
The midpoint of the list is the boundary where the tail of the new sublist meets the head of the old sublist.
When
InnoDBreads a block into the buffer pool, it initially inserts it at the midpoint (the head of the old sublist). A block can be read in as a result of two types of read requests: Because it is required (for example, to satisfy query execution), or as part of read-ahead performed in anticipation that it will be required.The first access to a block in the old sublist makes it “young”, causing it to move to the head of the buffer pool (the head of the new sublist). If the block was read in because it was required, the first access occurs immediately and the block is made young. If the block was read in due to read-ahead, the first access does not occur immediately (and might not occur at all before the block is evicted).
As long as no accesses occur for a block in the pool, it “ages” by moving toward the tail of the list. Blocks in both the new and old sublists age as other blocks are made new. Blocks in the old sublist also age as blocks are inserted at the midpoint. Eventually, a block that remains unused for long enough reaches the tail of the old sublist and is evicted.
In the default operation of the buffer pool, a block when read in is loaded at the midpoint and then moved immediately to the head of the new sublist as soon as an access occurs. In the case of a table scan (such as performed for a mysqldump operation), each block read by the scan ends up moving to the head of the new sublist because multiple rows are accessed from each block. This occurs even for a one-time scan, where the blocks are not otherwise used by other queries. Blocks may also be loaded by the read-ahead background thread and then moved to the head of the new sublist by a single access. These effects can be disadvantageous because they push blocks that are in heavy use by other queries out of the new sublist to the old sublist where they become subject to eviction.
Several InnoDB system variables control the
size of the buffer pool and let you tune the LRU algorithm:
Specifies the size of the buffer pool. If your buffer pool is small and you have sufficient memory, making the pool larger can improve performance by reducing the amount of disk I/O needed as queries access
InnoDBtables.Specifies the approximate percentage of the buffer pool that
InnoDBuses for the old block sublist. The range of values is 5 to 95. The default value is 37 (that is, 3/8 of the pool).Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist. The default value is 0: A block inserted into the old sublist moves to the new sublist when Innodb has evicted 1/4 of the inserted block's pages from the buffer pool, no matter how soon after insertion the access occurs. If the value is greater than 0, blocks remain in the old sublist until an access occurs at least that many ms after the first access. For example, a value of 1000 causes blocks to stay in the old sublist for 1 second after the first access before they become eligible to move to the new sublist.
innodb_old_blocks_pct and
innodb_old_blocks_time are
available as of MySQL 5.1.41, but only for InnoDB
Plugin, not the built-in version of
InnoDB.
By setting
innodb_old_blocks_time greater
than 0, you can prevent one-time table scans from flooding the
new sublist with blocks used only for the scan. Rows in a block
read in for a scan are accessed rapidly many times in
succession, but the block is unused after that. If
innodb_old_blocks_time is set
to a value greater than the block scan time, the block is not
moved to the new sublist during the table scan. Instead, it
remains in the old sublist and ages to the tail of the list to
be evicted quickly. This way, blocks used only for a one-time
scan do not act to the detriment of heavily used blocks in the
new sublist.
innodb_old_blocks_time can be
set at runtime, so you can change it temporarily while
performing operations such as table scans and dumps to prevent
them from flooding the new sublist:
SET GLOBAL innodb_old_blocks_time = 1000;
... perform queries that scan tables ...
SET GLOBAL innodb_old_blocks_time = 0;
This strategy does not apply if your intent is to fill the
buffer pool with a table's content. For example, you might
perform a table or index scan at server startup or during
benchmarking or testing specifically to “warm up”
the buffer pool. In this case, leaving
innodb_old_blocks_time set to 0
accomplishes the goal of loading the scanned blocks into the new
sublist.
The output from the InnoDB Standard Monitor contains several
fields in the BUFFER POOL AND MEMORY section
that pertain to operation of the buffer pool LRU algorithm:
Old database pages: The number of pages in the old sublist of the buffer pool.Pages made young, not young: The number of old pages that were moved to the head of the buffer pool (the new sublist), and the number of pages that have remained in the old sublist without being made new.youngs/s non-youngs/s: The number of accesses to old pages that have resulted in making them young or not. This metric differs from that of the previous item in two ways. First, it relates only to old pages. Second, it is based on number of accesses to pages and not the number of pages. (There can be multiple accesses to a given page, all of which are counted.)young-making rate: Hits that cause blocks to move to the head of the buffer pool.not: Hits that do not cause blocks to move to the head of the buffer pool (due to the delay not being met).
The young-making rate and
not rate will not normally add up to the
overall buffer pool hit rate. Hits for blocks in the old sublist
cause them to move to the new sublist, but hits to blocks in the
new sublist cause them to move to the head of the list only if
they are a certain distance from the head.
The preceding information from the Monitor can help you make LRU tuning decisions:
If you see very low
youngs/svalues when you do not have large scans going on, that indicates that you might need to either reduce the delay time, or increase the percentage of the buffer pool used for the old sublist. Increasing the percentage makes the old sublist larger, so blocks in that sublist take longer to move to the tail and be evicted. This increases the likelihood that they will be accessed again and be made young.If you do not see a lot of
non-youngs/swhen you are doing large table scans (and lots ofyoungs/s), you will want to tune your delay value to be larger.
For more information about InnoDB Monitors, see Section 14.6.8, “InnoDB Monitors”.
The MyISAM storage engine also uses an LRU
algorithm, to manage its key cache. See
Section 8.10.1, “The MyISAM Key Cache”.
Actually the opposite may be true in some scenarios. What if I am running Solaris, OpenSolaris or an OS using a technology such as ZFS? With that I want innodb buffer cache to be as small as possible to allow the underlying FS to cache all operations that this process will require. This is also more efficient as ZFS uses a combination of MFU & MRU lists using an adaptive caching policy. Disk writes are handled by txg's (transaction groups) to batch writes together by the FS as well. LRU is less adaptive therefore less effective.