WL#6117: InnoDB: Resize the InnoDB Buffer Pool Online

Status: Complete   —   Priority: Medium

The online resize of the buffer pool is useful for users to adjust effective
hardware resource use more quickly than restarting mysqld.

There are 2 ways to change size of buffer pool.
[1] change each size of instance
[2] add/drop instances

In this wl#, only [1] is implemented.
because the hash function to assign the page is based on the number of instances.
we should reassign the all pages of the buffer pool at once for [2].
(That means the all of instances cannot be free until converted entirly to the
new instances.
 It means, at least "sum of old size" + "sum of new size" memory are needed for
[2] at the same time.)

Currently the buffer pool is configured with two variables:
innodb_buffer_pool_size - The total size of the buffer pool in bytes
innodb_buffer_pool_instances - The number of memory pools to allocate

This wl# makes only innodb_buffer_pool_size dynamically changeable.
So, each instance size should be changed.

Because page assignment for instance is not changed, we can resize instance one
by one.

Increasing instance size:
    (all operations should exclude other threads to use buffer pool)
    - Moving pages to the new area (or adding the chunks(*))
    - Covert hash_table, lists, pointers to the new addresses.
    - Adding increased pages to the free list.

Decreasing instance size:
    - Defragmentaion and withdrawing the decreasing pages.
         (can be done with the other threads online)

    (following operations should exclude other threads to use buffer pool)
    - Moving pages to the new area (or deleting the chunks(*))
    - Covert hash_table, lists, pointers to the new addresses.

(*)
After the actual proof of concept, the phase "Moving pages to the new area" is
found as the most expensive among the other phases. So we should add chunk base
management option for the users who are severe to performance.

If the buffer pool and resizing size is always aligned to the 'chunk unit size',
only adding/deleting chunks is enough instead of the "Moving pages to the new area"
Functional requirements:
F-1 : innodb_buffer_pool_size : is a legacy global variable, is now changed to
'Dynamic Variable.

(The Default value is still 128M. not changed.)

F-2: User shall be able to monitor progress of resizing buffer pool (by global
status 'innodb_buffer_pool_resize_status' or server error log)

F-3 : innodb_buffer_pool_chunk_size : is a new global read-only variable ,
default value is 128M.

      * always innodb_buffer_pool_chunk_size > 0 
          always enable the online resizing of innodb_buffer_pool_size at
'innodb_buffer_pool_chunk_size' granularity.

The 'innodb_buffer_pool_chunk_size' unit is 1M. And the actual buffer pool size
is always aligned to 'innodb_buffer_pool_chunk_size' *
'innodb_buffer_pool_instances'.

When initializing buffer pool, if 'innodb_buffer_pool_chunk_size' *
'innodb_buffer_pool_instances' is larger than the initial
'innodb_buffer_pool_size', 'innodb_buffer_pool_chunk_size' is automatically
truncated to 'innodb_buffer_pool_size'/'innodb_buffer_pool_instances'

Non-Functional requirements:
NF-0: (no performance requirements can be defined. because it must affect
performance much.)
NF-1: Resize shall not to cause any crashes / data lost for any usage
NF-2: All resize shall be guaranteed to finish
NF-3: User shall be able to shutdown mysqld even during resizing buffer pool
NF-4: User shall be able to use InnoDB API (e.g. innodb_memcached) safely. (No
data lost; Returning errors are allowable if no lost data)


----------------------------------------
(raw description and notes)
The user can resize buffer pool online by setting the value to the global
variable 'innodb_buffer_pool_size'.

To optimize the resizing performance (the resizing affects to throughput, so
shorter time is better), the chunk base size management is prepared also. (not
needed copy whole of blocks. just add/delete chunks) The new global variable
'innodb_buffer_pool_chunk_size' is used to control the behavior.

During resizing, the resizing process once waits for all transactions are not
active, and new transaction or operation which might access to buffer pool wait
for finishing the resizing at the start of each transaction/operation. It is
needed for critical phase and this hardly affects mainly for the throughput of
the user's workload.

Especially when shrinking the buffer pool, defragmentation and reducing blocks
of the buffer pool is needed before resizing buffer pool. This phase also might
affect to throughput of the user's workload, because the reducing blocks might
cause shortage of blocks temporally, though concurrent access to buffer pool is
allowed.

------------
(note)
The users must care active transactions using InnoDB. They block resizing
process. All transactions should be finished once to progress resizing.

------------
(note)
InnoDB API user (e.g. innodb_memcached):
Almost same, if the operations are based on active transaction.

But user should not use recursive active user transactions. The sub-transaction
might blocked by the resizing process.

For non transactional API, the operation might be failed during the critical
pahse of resizing buffer pool. Because no way to confirm it is based on
transaction or not.
Changes to the interface specification: 
I-1: No new files
I-2: New syntax:
  I-2-1: global variable 'innodb_buffer_pool_size' is changed to 'Dynamic Variable'
         (* 'innodb_buffer_pool_instances' syntax is not changed)
         (If the following new variable 'innodb_buffer_pool_chunk_size' is 0,
the online set value is ignored and cause warning)

  I-2-2: new global read-only variable 'innodb_buffer_pool_chunk_size'

The default value is 128M. And always larger than 0.

The 'innodb_buffer_pool_chunk_size' unit is 1M. And the actual buffer pool size
is always aligned to 'innodb_buffer_pool_chunk_size' *
'innodb_buffer_pool_instances'.


  I-2-3: new global status 'innodb_buffer_pool_resize_status' shows progress of
resizing by string
     e.g.) mysql> show global status like 'innodb_buffer_pool_resize_status';
           +----------------------------------+-------------+
           | Variable_name                    | Value       |
           +----------------------------------+-------------+
           | Innodb_buffer_pool_resize_status | not started |
           +----------------------------------+-------------+
           1 row in set (0.00 sec)

I-3: No new commands
I-4: No new tools. 
I-5: Impact on existing functionality.
  I-5-1: a load of the buffer pool is aborted by resizing buffer pool, if on going


----------------------------------------
(raw description and notes)

- new option 'innodb_buffer_pool_chunk_size' (Read-only) is added.

           * too huge number of chunks might cause another performance
             problem. though not tested enough, the rough number of chunks
                [innodb_buffer_pool_size]
            --------------------------------- might be good not to exceed 1000.
             [innodb_buffer_pool_chunk_size]

- The actual resizing process is done by background thread. The progress status
can be confirmed by global status value 'innodb_buffer_pool_resize_status'

- during resizing each buffer pool instance, all threads which might use buffer
pool (both user and background) are paused, not to cause deadlock and to replace
pages without waiting. It is also to avoid active transaction existing.

- "load of the buffer pool from a file named @@innodb_buffer_pool_filename" is
aborted while resizing buffer pool. (To pause the dump thread also.)

- The detail of the resizing process is logged to the server error log file as INFO

- Defragmentaion and withdrawing before shrinking size, is only the phase to
allow concurrent activity.
The resizing process as followings.

- Disable adaptive hash index
	It uses blocks from buffer pool directly as heap memory for each record.
	Discarding the blocks and disabling AHI during resizing is required and reasonable.

- Withdraw blocks to shrink instance (decreasing size case only)
	Withdrawing blocks on the shrinking area.
	Relocating the blocks/data on the area to the not-withdrawn area.

	this phase is done by the loop
	- withdraw from free_list
	- LRU_list flushing
	- relocate blocks and page->zip.data

	* in this phase, buf_LRU_get_free_only() for the other threads also withdraw blocks
	* this phase allows the other threads activity, because the withdrawing is in
organic way.

	* If there still are blocks cannot be relocated (because used too hot or held
by long thread, retry this phase after the next phase ("pause the other
threads") again.

- Resize buffer pool instances
	the processes for the each instance
	- add/delete chunks of the innodb_buffer_pool_chunk_size

- Normalize the other components which depend on buf_pool_size (if unsuitable
for the new size)
	- hash_table at lock_sys (block->lock_hash_val should be updated also)
	- hash_table at AHI
	- hash_tables at dict_sys
	- ibuf->max_size

- Enable adaptive hash index, if needed