WL#9286: InnoDB: Support Transparent Data Encryption for Shared Tablespaces

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Transparent Data Encryption, for InnoDB file-per-table tablesapces, was added to MySQL 5.7 (doc link here).

This worklog intends to extend tablespace encryption to General (Shared) Tablespaces. It would be possible to provide encryption information during CREATE/ALTER of General Tablespace with following SQL Syntax :

CREATE TABLESPACE my_tablespace ENCRYPTION='Y'/'N'; 
ALTER TABLESPACE my_tablespace ENCRYPTION='Y'/'N';

NOTE: If no encryption flag is provided during create, General Tablespaces will be created as Unencrypted.

Above encryption works on page level. Therefore,

  • Any table being created in an Encrypted (or Unencrypted) shared tablespace will have its data Encrypted (or Unencrypted).
  • All existing tables in shared tablespace will have its data Encrypted (or Unencrypted) if shared tablespace is ALTERed to Encrypted (or Unencrypted).
  • FR1: By default, CREATE TABLESPACE will create unencrypted tablespaces.
  • FR2: Valid values for ENCRYPTION attribute are Y/y, N/n. If anything else is provided errors should be thrown as follow :
 mysql> create tablespace encrypt_ts add datafile 'encrypt_ts.ibd' engine=INNODB encryption=TRUE;
 ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'TRUE' at line 1
 mysql> create tablespace encrypt_ts add datafile 'encrypt_ts.ibd' engine=INNODB encryption='R';
 ERROR 3184 (HY000): Invalid encryption option.
 mysql> alter tablespace encrypt_ts encryption='A';
 ERROR 3184 (HY000): Invalid encryption option.
 mysql> alter tablespace encrypt_ts encryption=; ERROR 3184 (HY000): Invalid encryption option.
 
  • FR3 : Following error not to be thrown anymore while creating/altering general shared tablespace with encryption option now .
  ERROR 3183 (HY000): This tablespace can't be encrypted.
  • FR4 : ENCRYPTION is not a table property but a tablespace property. Thus tables wouldn't have any metadata related to ENCRYPTION attached to them.
    • FR4.1 : CREATE/ALTER TABLE t1 ... ENCRYPTION=Y/N is allowed syntax (legacy behavior) and it will be used for file-per-table tablespace.

NOTE:This applies to partitioned table as well.

    • FR4.2 : CREATE/ALTER TABLE t1 ... TABLESPACE TS ENCRYPTION=Y/N is not allowed. An error will be thrown :
 ERROR 1478 (HY000) : With general/shared tablespace specified, ENCRYPTION is not accepted syntax for CREATE/ALTER table.
    • FR4.3 : Following query shows the ENCRYPTION option used only when a file-per-table tablespace is used during CREATE/ALTER TABLE ... as only for file-per-table tablespace this option is allowed. For tables using general tablespace, this column will be blank.
 mysql> SELECT TABLE_SCHEMA, TABLE_NAME, CREATE_OPTIONS FROM INFORMATION_SCHEMA.TABLES WHERE CREATE_OPTIONS LIKE '%ENCRYPTION="Y"%';
   +--------------+------------+----------------+
   | TABLE_SCHEMA | TABLE_NAME | CREATE_OPTIONS |
   +--------------+------------+----------------+
   | test         | t1         | ENCRYPTION="Y" |
   +--------------+------------+----------------+

NOTE: Above will help for dump/restore scenario for file-per-table tablespace as in this case, tablespace is not created separately.

  • FR5 : A new Column 'ENCRYPTION' is added with value Y and N to indicate if a tablespace is Encrypted or Unencrypted.
mysql> select SPACE, NAME, ENCRYPTION from information_schema.innodb_tablespaces;
   +-------+------------+------------+
   | SPACE | NAME       | ENCRYPTION |
   +-------+------------+------------+
   ...
   |     4 | test/t1    | N          |
   |     5 | encrypt_ts | Y          |
   +-------+------------+------------+
   1 row in set (0.00 sec)
  • FR6 : Modifying a general tablespace encryption doesn't affect tables' property in it, as encryption is not table property. Thus no changes in tables' metadata will be done.
  • FR7 : Shared tablespaces are not allowed in table Partitions i.e. all table partitions are to be file-per-table tablespace always. If attempted, following error thrown:
 ERROR 1478 (HY000) : InnoDB : Shared tablespace is not allowed in partitioned table.

NOTE : It is to be deprecated in 5.7.

  • FR8 : While moving a table from one tablespace to another tablespace, we gives following error if movement is from STRONG->WEAK. Opposite is allowed.
 ERROR ???? (?????) : Source tablespace is encrypted but target tablespace is not encrypted.
  • FR9 : During (Un)encrypted operation
    • FR9.1 : Concurrent DMLs should be allowed on tablespace tables.
    • FR9.2 : Concurrent DDLs should NOT be allowed on tablespace tables.
  • FR10 : ALTER TABLESPACE ... ENCRYPTION='Y'/'N' command returns when (un)enryption operation finishes i.e. (un)encryption happens in foreground thread synchronously.
  • FR11 : In case of crash during (un)encryption,
    • FR11.1 : After recovery, operation should be roll-forward i.e. should resume and finish.
    • FR11.2 : Above operation to be done in a background thread asynchronously.
    • FR11.3 : Server should continue to start.
  • FR12 : Progress monitoring information should be provided (via a new PFS stage instrument).
mysql> select * from performance_schema.events_stages_current\G
*************************** 1. row ***************************
         THREAD_ID: 40
          EVENT_ID: 14
      END_EVENT_ID: NULL
        EVENT_NAME: stage/innodb/alter tablespace (encryption)
            SOURCE: ut0stage.h:537
       TIMER_START: 25222469814000
         TIMER_END: 34781673863000
        TIMER_WAIT: 9559204049000
    WORK_COMPLETED: 10
    WORK_ESTIMATED: 575
  NESTING_EVENT_ID: 11
NESTING_EVENT_TYPE: STATEMENT
1 row in set (0.00 sec)

Where

WORK_ESTIMATED : Total number of pages in tablespace
WORK_COMPLETED : Total number of pages processed in tablespace
  • FR13 : A status information should be provided in INFORMATION_SCHEMA.INNODB_TABLESPACES in Col SPACE_STATE (Being added in WL#9508).
  • FR14 : Import/Export encrypted general tablespace. As of now, import/export for general tablespace is not supported. wl#9761 aims to do that. Once wl#9761 is implemented, Import/Export for encrypted general tablespace could be provided.
  • FR15 : Key rotation. (recovery after crash in key rotation)
  • FR16 : All possible scenarios of CREATE/ALTER TABLE/TABLESPACE with ENCRYPTION=Y/N, mentioned in table below should hold valid.[look at the file attachment ScenariosTable.pdf]
  • FR17 : Dump and restore using mysqldump/pump should work for all tables. NOTE: mysqldump doesn't dump "CREATE TABLESPACE ..." as of now. Thus for tables in general tablespaces, restore gives error "Tablespace doesn't exists".

Non-Functional requirements

Contents


(Un)Encryption Process

  • (Un)Encryption process is done synchronously in normal scenario. Only in crash/recovery scenario, its done in separate background thread asynchronously.
  • Also, (un)encryption process moves in sequence of page number. i.e. pageN is (un)encrypted before pageM, iff N

Well populated st_alter_tablespace* and dd::Tablespace* (old an new) is to be provided by runtime team to InnoDB.

/** This API handles CREATE, ALTER & DROP commands for InnoDB tablespaces.
@param[in]      hton            Handlerton of InnoDB
@param[in]      thd             Connection
@param[in]      alter_info      How to do the command
@param[in]      old_ts_def      Old version of dd::Tablespace object for the tablespace
@param[in,out]  new_ts_def      New version of dd::Tablespace object for the tablespace. Can be adjusted by SE. Changes will be persisted in the data-dictionary at statement commit.
@return MySQL error code*/
static
int
innobase_alter_tablespace(
        handlerton*             hton,
        THD*                    thd,
        st_alter_tablespace*    alter_info,
        const dd::Tablespace*   old_ts_def,
        dd::Tablespace*         new_ts_def)
{
...

...
}
/** Alter Encrypt/Unencrypt a tablespace.
@param[in]      thd             Connection
@param[in]      alter_info      How to do the command
@param[in]      old_dd_space    Old Tablespace metadata
@param[in]      new_dd_space    New Tablespace metadata
@param[in]      to_encrypt      true to encrypt, false to unencrypt
@return 0 on success, MySQL error code on failure*/
static
int
innobase_alter_encrypt_tablespace(
        THD*                    thd,
        st_alter_tablespace*    alter_info,
        const dd::Tablespace*   old_dd_space,
        dd::Tablespace*         new_dd_space,
        bool                    to_encrypt)
{
  - Do some initial checks.
  - Call fsp_encrypt_or_unencrypt_tablespace() which does actual (un)encryption.
  - Return result.
}

Following is the main method which is responsible for doing actual (un)encryption operation. Its called from innobase_alter_encrypt_tablespace() (in normal scenario) and from fsp_resume_encryption_unencryption() (in crash/recovery scenario).

/** Encrypt/Unencrypt a tablespace.
@param[in]      space_id        Tablespace id
@param[in]      from_page       page id from where operation to be done
@param[in]      to_encrypt      true if to encrypt, false if to unencrypt
@return 0 for success, otherwise error code */
dberr_t
fsp_encrypt_or_unencrypt_tablespace(
        space_id_t      space_id,
        page_no_t       from_page,
        bool            to_encrypt)
{
  /* Do actual (un)encryption as follow : Pseudo code */
  if (to_encrypt) {
        1. Prepare Encryption Info.
        2. Set encryption Info.
        3. Read page 0 and write encryption info.
        4. Update tablespace flag.
  } else {
        1. Reset encryption Info.
        2. Update tablespace flags
  }
  Write Operation Type and Progress (0 now) on page 0.
  While (All pages are read) {
        1. mark page dirty
        2. If ('n' pages are marked dirty) {
                 Update Progress info on page 0.
           }
  }
  Reset Operation Type and Progress info on page 0.
  If (UnEncryption was in progress) {
          Erase Encryption info from page 0.
}

(Un)Encryption Progress Information

For any general tablesapce on which (un)encryption process is going on, page 0 is used to keep some progress information.

Page 0

After SDI information, 5 Bytes are used in page 0 as follow:

  • 1 byte for 'Operation type'
  • 4 bytes for 'Progress'
|__________________________|||___________|
                                                 page 0

Operation Type (1 byte)

* Encryption      (0000 0001)
* Unencryprion    (0000 0010)

NOTE : only 2 bits are used. 6 bits are left for future used (if any).

Progress (4 bytes)

Its MaxPageNo till which (Un)Encryption process is completed successfully.

* Till page 10 operation is done         (00000000 00000000 00000000 00001010)

This information is used after recovery to resume the process (operation type) from the MaxPageNo stored in Progress Information.

DDL LOG Usage

At start of encryption, an entry is made in DDL LOG to indicate 'for which' tablespace (un)encryption process is going on. At successful completion of (un)encryption process, above entry is deleted from DDL LOG.

Normal Scenario

Following 'control flow diagram' depicts usage of DDL LOG in normal (NO CRASH) scenario:

T1 (session thread, alter encryption starts)
|
| (un)encryption) starts
| |--update page0 with operation/progress
| |--start internal trx
| |  |-- insert entry in ddl_log
| |--commit internal trx
| |--mark all pages dirty
| |--delete entry from ddl_log
| (un)encryption ends
|
T1 continues

Crash Scenario

If there is a crash in between when (un)encryption operation was going on. The recovery happens in a background thread i.e. server continues to start. At recover, ddl_log are read and entry for tablespace is found. And then ddl_log are 'replayed'. During this, entry is read and deleted from ddl_log. Following 'control flow diagram' depicts this behavior

T1 (bootstrap thread)
|--read ddl_log
|--replay all ddl_log
|  |-- read all tablespace entries and keep them in in-memory list.
|
|----------ASYNCH---------------------BG Thread which would resume (un)encryption operation.
| server continues to start           |--loop starts
|                                     |---for each tablespace in list, 
|                                     |----read page0 to get operation/progress.
|                                     |----resume (un)encryption operation onwards.
| server ready to accept connections  |--loop ends
|                                     |--delete ddl_log entry
|                                     BG Thread finishes.

NOTE : Need to make sure this thread is also instrumented in PFS.

Progress Monitor

A new instrument stage/innodb/alter tablespace ((un)encryption) is to be added.

mysql> select * from performance_schema.setup_instruments where name like "%alter tablespace%";
+------------------------------------------------------+---------+-------+------------+------------+---------------+
| NAME                                                 | ENABLED | TIMED | PROPERTIES | VOLATILITY | DOCUMENTATION |
+------------------------------------------------------+---------+-------+------------+------------+---------------+
| stage/innodb/alter tablespace ((un)encryption)       | YES     | YES   | progress   |          0 | NULL          |
+------------------------------------------------------+---------+-------+------------+------------+---------------+

Status Monitor

In (new) column SPACE_STATE, tablespace should be sown as 'BUSY'.

mysql> select * from information_schema.INNODB_TABLESPACES;
+------------+------------------+-------+ ... +----------------+-------------+
| SPACE      | NAME             | FLAG  |     | ALLOCATED_SIZE | SPACE_STATE |
+------------+------------------+-------+ ... +----------------+-------------+
|         56 | my_ts            | 26624 |     |          98304 | BUSY        |
+------------+------------------+-------+ ... +----------------+-------------+

Crash Recovery

During recovery after crash, unfinished (un)encryption operation is roll-forward. NOTE, this roll-forward is done in a background thread and recovery doesn't wait for this but it continues to start the server. So it is possible that server is started an ready to accept connection but (un)encryption process is being done in background.

During recovery, DDL logs are read to get the tablespaces for which (un)encryption process id unfinished. With this information, page 0 for each tablespace is read to make sure :

  • Which operation was goping on (Encryption or Unencryption)
  • Till what page # operation was successful.

And after that, respective operation is resumed for each tablespace. Details below:

Following datastructure is used to keep the tablespace, operation and progress :

struct recv_encryption_ts_record {
        /** tablespace id */
        space_id_t      space_id;

        /** Operation type Encryption/Unencryption */
        byte            operation;

        /** Progress, Max number of pages operation is done with */
        uint            progress;
};

/* A list where each element represents (Un)encryption progress for a tablespace */
extern std::list recv_encr_ts_list;

With above list (recv_encr_ts_list) is populated, we have information of all tablespaces with unfinished (Un)Encryption operation. Following method is to be executed in a separate thread to resume the operation.

/** Resume Encrypt/Unencrypt for tablespace(s) post recovery.
@param[in]      record_list     list (Tablespace, operation and progress)
@return 0 for success, otherwise error code */
dberr_t
fsp_resume_encryption_unencryption(
        std::list record_list)
{
  while (auto it : record_list)
  {
     fsp_encrypt_or_unencrypt_tablespace (it.space_id, it.progress + 1, it.operation)
  }
}

NOTE : There is a single background thread, doing this resume operation, so tablespaces (un)encryption resume operation will be done in sequentially. Which should be Ok as this is not very frequent scenario.