WL#13795: InnoDB: Disable/Enable redo log globally & dynamically

Affects: Server-8.0   —   Status: Complete

SUMMARY

This worklog implements support for disabling and enabling Innodb redo log dynamically. Disabling redo log (WAL) makes all write operation to DB faster. It also makes the server vulnerable to a crash and the entire instance data could be lost.

USER STORIES

The primary use case for disabling redo log is while loading initial data to a mysqld instance.

A. Disable redo logging

B. Load data

C. Enable redo logging

In this model data loading is fast. However, If node goes down during step [B] all data is lost and one needs to create a fresh instance and start with step [A].

  • It is important that the instance goes back to normal recoverable state immediately after step[C] i.e. it must recover from any crash after [C].

  • When redo logging is disabled, a shutdown must ensure that instance is recoverable. This means innodb_fast_shutdown=2 must be disregarded and data pages must be flushed and synced to disk before shutting down.

  • We have new dynamic privilege - INNODB_REDO_LOG_ENABLE

  • CLONE and MEB log archiving cannot operate when redo logging is disabled and vice versa.

USER INTERFACE

1.Grant privilege INNODB_REDO_LOG_ENABLE to admin loading data.

GRANT INNODB_REDO_LOG_ENABLE ON *.* to 'data_load_admin';

2.Disable/Enable redo logging

ALTER INSTANCE ENABLE|DISABLE INNODB REDO_LOG;

3.Check current redo log state [ON|OFF] from performance_schema global status.

SELECT * FROM performance_schema.global_status WHERE variable_name = 'innodb_redo_log_enabled';
  • The SQL command is synchronous. While enabling redo log, the call returns only after guaranteeing crash safety.
  • MDL lock prevents concurrent clone and other ALTER INSTANCE commands.

Alternative Interface: [Not implemented] Have read-write dynamic configuration

1.Grant privilege "SYSTEM_VARIABLES_ADMIN" [Existing] to admin loading data.

GRANT SYSTEM_VARIABLES_ADMIN ON *.* to 'data_load_admin';

2.Disable/Enable redo logging. Check and wait for operation to complete.

SET GLOBAL INNODB_REDO_LOG = [ON|OFF]
SELECT @@GLOBAL.INNODB_REDO_LOG

4.Check current redo log state [ON|OFF] anytime

SELECT @@GLOBAL.INNODB_REDO_LOG
  • Setting the configuration is asynchronous and would trigger the operation.
  • The operation would run in background and user need to check/wait for completion.
  • No MDL lock protection and concurrency with other operations must be handled otherwise.

Functional

F-1. Must support SQL command to dynamically enable and disable Innodb redo logging.

ALTER INSTANCE [ENABLE|DISABLE] INNODB REDO_LOG;

F-1A: Must have INNODB_REDO_LOG_ENABLE privilege to enable/disable redo logging.

F-1B: Enable/Disable redo logging command must not be logged to binary log.

F-2. Must support viewing the state whether redo log is enabled or disabled.

SELECT * FROM performance_schema.global_status WHERE variable_name = 'innodb_redo_log_enabled';

SHOW GLOBAL STATUS LIKE 'innodb_redo_log_enabled';

F-3. Instance must be recoverable after explicit shutdown when redo logging is disabled.

F-4. Instance should throw error and refuse to start in case of a crash when redo logging is disabled.

F-5. Instance must be recoverable from a crash when redo logging is enabled.

F-6. Clone must throw error and exit if redo logging is disabled at donor instance.

 CLONE INSTANCE|LOCAL ...;

F-6A. Disabling redo log should wait (on BACKUP lock) for existing clone to complete.

F-7. Enabling redo archive must throw error if redo logging is disabled.

 DO innodb_redo_log_archive_start(...);

F7A. Disabling redo log must throw error if redo log archiving is already in progress.

F-8. Concurrent redo log ENABLE|DISABLE command should serialize in server MDL lock.

  • Other concurrent ALTER INSTANCE command (like ROTATE) could serialize with redo log ENABLE|DISABLE.
  • Occasional deadlock error(on server MDL) is expected if too many ALTER INSTANCE is run simultaneously but no hang.

Non-functional

NF-1: Loading data should be faster with redo log disabled.

NF-2: Pages should not be double written when redo logging is disabled.

  • This would reduce IO and help page flush IO to scale better.
  • The global configuration innodb_doublewrite is not affected by redo logging enable/disable. When redo logging is enabled, double write must be done as per the global innodb_doublewrite configuration.

User Interface

  • SQL Command:

    ALTER INSTANCE [ENABLE|DISABLE] INNODB REDO_LOG;
    

    *Takes BACKUP MDL lock - Exclusive

  • Performance schema global status:

    Innodb_redo_log_enabled - Boolean [ON|OFF]
    

https://dev.mysql.com/doc/refman/8.0/en/server-status-variables.html

Scenarios

1. Check redo logging status

mysql> select * from performance_schema.global_status where variable_name like '%redo%';
+-------------------------+----------------+
| VARIABLE_NAME           | VARIABLE_VALUE |
+-------------------------+----------------+
| Innodb_redo_log_enabled | ON             |
+-------------------------+----------------+
1 row in set (0.00 sec)

mysql> alter instance disable innodb redo_log;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from performance_schema.global_status where variable_name like '%redo%';
+-------------------------+----------------+
| VARIABLE_NAME           | VARIABLE_VALUE |
+-------------------------+----------------+
| Innodb_redo_log_enabled | OFF            |
+-------------------------+----------------+
1 row in set (0.01 sec)

2. Kill mysqld server when redo log is disabled and try to restart

2020-02-13T13:14:47.672230Z 1 [ERROR] [MY-013578] [InnoDB] Server was killed when Innodb Redo logging was disabled. Data files could be corrupt. You can try to restart the database with innodb_force_recovery=6
2020-02-13T13:14:47.672390Z 1 [ERROR] [MY-012930] [InnoDB] Plugin initialization aborted at srv0start.cc[2477] with error Generic error.
2020-02-13T13:14:47.673587Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine
2020-02-13T13:14:47.674010Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2020-02-13T13:14:47.674283Z 0 [ERROR] [MY-010119] [Server] Aborting
... 

3. Error with concurrent CLONE and MEB

mysql> alter instance disable innodb redo_log;
Query OK, 0 rows affected (0.00 sec)

CLONE when redo log is disabled

mysql> clone local data directory '/home/debanerj/mysql-src4/bld_install/data_local';
ERROR 3968 (HY000): Cannot perform operation as Innodb redo logging is disabled. Please retry after enabling redo log with ALTER INSTANCE

MEB when redo log is disabled [starts with enabling redo log archiving in Innodb]

mysql> DO innodb_redo_log_archive_start('meb-1', 'meb_arch');
ERROR 3968 (HY000): Cannot perform operation as Innodb redo logging is disabled. Please retry after enabling redo log with ALTER INSTANCE

DISABLE when MEB [redo log archiving] in progress

mysql> alter instance enable innodb redo_log;
Query OK, 0 rows affected (0.05 sec)

mysql> DO innodb_redo_log_archive_start('meb-1', 'meb_arch');
Query OK, 0 rows affected (0.04 sec)

mysql> alter instance disable innodb redo_log;
ERROR 3969 (HY000): Cannot perform operation as Innodb is archiving redo log. Please retry after stopping redo archive by invoking innodb_redo_log_archive_stop()

DISABLE when CLONE in progress

mysql> clone local data directory '/home/debanerj/mysql-src4/bld_install/data_local';
... [running]

mysql> alter instance disable innodb redo_log;

... [running | blocked]

mysql> show processlist;

+----+-----------------+-----------+------+---------+------+-------------------------+-------------------------------------------------------------------------------+
| Id | User            | Host      | db   | Command | Time | State                   | Info                                                                          |
+----+-----------------+-----------+------+---------+------+-------------------------+-------------------------------------------------------------------------------+
|  5 | event_scheduler | localhost | NULL | Daemon  |   77 | Waiting on empty queue  | NULL                                                                          |
|  9 | root            | localhost | NULL | Query   |   45 | starting                | clone local data directory '/home/debanerj/mysql-src4/bld_install/data_local' |
| 10 | root            | localhost | NULL | Query   |   12 | Waiting for backup lock | alter instance disable innodb redo_log                                         |
| 11 | root            | localhost | NULL | Query   |    0 | starting                | show processlist                                                              |
+----+-----------------+-----------+------+---------+------+-------------------------+-------------------------------------------------------------------------------+

Internal SE | handlerton interface

/**
@brief
Enable or Disable SE write ahead logging.

@param[in] thd    server thread handle
@param[in] enable enable/disable redo logging

@return true iff failed.
*/
typedef bool (*redo_log_set_state_t)(THD *thd, bool enable);

struct handlerton {
...
redo_log_set_state_t redo_log_set_state;
...
}

File Format:

Redo log header:

In first block of redo log header, we add a 4 bytes flag at end. It is fully backward compatible as the default value for the flag is zero and in all old version the memory is cleared before write.

/** 32 BITs flag */
constexpr uint32_t LOG_HEADER_FLAGS = LOG_HEADER_CREATOR_END;

/** Flag at BIT-1 to indicate if redo logging is disabled or not. */
constexpr uint32_t LOG_HEADER_FLAG_NO_LOGGING = 1;

/** Flag at BIT-2 to indicate if server is not recoverable on crash. This
is set only when redo logging is disabled and unset on slow shutdown after
all pages are flushed and synced to disk. */
constexpr uint32_t LOG_HEADER_FLAG_CRASH_UNSAFE = 2;

Data structures/methods

Log System:

We note the persistent flags so that we write the correct value each time we overwrite the header during rotation.

struct log_t {
  ...
  /** true if redo logging is disabled. Read and write with writer_mutex  */
  bool m_disable;

  /** true, if server is not recoverable. Read and write with writer_mutex */
  bool m_crash_unsafe;
}

Mini Transaction:

No logging is driven by mini-transaction system. We decide whether to log or not log while starting a mini transaction.

/** mtr global logging */
mtr_t::Logging {

  /** mtr global redo logging state.
  Enable Logging  :
  [ENABLED] -> [ENABLED_RESTRICT] -> [DISABLED]

  Disable Logging : 
  [DISABLED] -> [ENABLED_RESTRICT] -> [ENABLED_DBLWR] [ENABLED] */

  enum State : uint32_t {
    /* Redo Logging is enabled */
    ENABLED,
    /* Redo logging is enabled. All non-logging mtr are finished with the
    pages flushed and synced to disk. Double write is enabled. Some pages
    could be still getting written to disk without double-write. Not safe
    to crash. */
    ENABLED_DBLWR,
    /* Redo logging is enabled but there could be some mtrs still running
    in no logging mode. Redo archiving and clone are not allowed to start. 
    No double-write. */
    ENABLED_RESTRICT,
    /* Redo logging is disabled and all new mtrs would not generate any redo.
    Redo archiving and clone are not allowed. No double-write. */
    DISABLED
  };

  /** Disable mtr redo logging. Server is crash unsafe without logging.
  @param[in]  thd     server connection THD
  @return mysql error code. */
  int disable(THD *thd);

  /** Enable mtr redo logging. Ensure that the server is crash safe
  before returning.
  @param[in]  thd     server connection THD
  @return mysql error code. */
  int enable(THD *thd);
  ...

}

Disable REDOLOG Flow:

1. Disallow archiving to start
    ENABLED -> ENABLED_RESTRICT

2. Check if redo log archiving is active.
    ER_INNODB_REDO_ARCHIVING_ENABLED

3. Mark that it is unsafe to crash going forward.
    Set LOG_HEADER_FLAG_CRASH_UNSAFE, LOG_HEADER_FLAG_NO_LOGGING

4. Complete Disable redo logging
    ENABLED_RESTRICT -> DISABLED

Enable REDOLOG Flow:

1. Allow mtrs to generate redo log. 
    DISABLED -> ENABLED_RESTRICT

2. Wait for all no-log mtrs to finish and add dirty pages to disk.

3. Wait for dirty pages to flush and sync by forcing checkpoint at current LSN.
    - Write dummy mtr to generate redo log
    - Take latest checkpoint
    - Ensures all redo only changes including auto increment (dd buffer table) are persisted. 

4. Enable double write
     ENABLED_RESTRICT -> ENABLED_DBLWR

5. Request another checkpoint

6. Mark that it is safe to recover from crash
     Reset LOG_HEADER_FLAG_CRASH_UNSAFE, LOG_HEADER_FLAG_NO_LOGGING

7. Complete enabling redo log
    ENABLED_DBLWR -> ENABLED
  • How checkpoint ensures Redo only changes are persisted ?

    1. Checkpoint at any LSN must ensure that all previous operations that generated redo logs before the LSN are persisted otherwise - because redo logs are going to be discarded
    2. Checkpoint never refers to existing redo logs.

So, by 1 & 2 we can claim any redo-only operations that skipped redo log while redo-logging was disabled must have been persisted once the checkpoint is complete.