In my previous post (here) I talked about “InnoDB Transparent tablespace encryption” which was introduced in MySQL 5.7.11. That encryption was limited to file-per-table tablespaces. In MySQL 8.0.13, encryption for general tablespace is introduced. This blog post aims to give a peek into this new feature.
External Details:
New Syntax :
New syntax is added for CREATE/ALTER general tablespace for encryption property:
1 2 |
CREATE TABLESPACE my_tablespace ENCRYPTION='Y'/'N'; ALTER TABLESPACE my_tablespace ENCRYPTION='Y'/'N'; |
NOTE: If no encryption flag is provided during creation of the tablespace, it is created unencrypted (current behavior).
With the new data dictionary in 8.0 we can now have tablespace attributes in the server. In previous versions this was not possible to the .frm file base data dictionary. The server did not support tablespaces. Therefore we had a kludge that mapped files (tables) in the Server to InnoDB tablespaces. If you now create an encrypted tablespace or alter the tablespace attributes and encrypt it then all the tables in that tablespace will be encrypted automatically.
If a table is moved in between tablespaces, an UNENCRYPTED tablespace to an ENCRYPTED tablespace is allowed however the reverse operation will signal an error. This is to avoid unintentional data decryption.
INPLACE encryption:
Transparent Data Encryption (TDE), introduced in 5.7.11, uses method ‘COPY’ to do tablespace encryption i.e. a new (encrypted) copy of table is created and then data from old table is copied into new table. The general table-space encryption introduced in 8.0.13 uses method ‘INPLACE’.
NOTE: DML operations are allowed on the tables in tablespace while it is being encrypted. DDLs however are blocked for the duration of the encryption operation.
Observability :
We are constantly trying to improve the observability of long running operations and internal state. For this purpose a new column is added in Information Schema for information_schema.innodb_tablespaces to indicate whether the tablespace is encrypted or not.
1 2 3 4 5 6 7 8 9 |
mysql> select SPACE, NAME, ENCRYPTION from information_schema.innodb_tablespaces; +-------+------------+------------+ | SPACE | NAME | ENCRYPTION | +-------+------------+------------+ ... | 4 | test/t1 | N | | 5 | encrypt_ts | Y | +-------+------------+------------+ 1 row in set (0.00 sec) |
Progress Monitoring:
With the help of a new instrument ‘stage/innodb/alter tablespace (encryption)’, progress of ALTER ENCRYPTION can be monitored in real time. Eg.,
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
mysql> select * from performance_schema.events_stages_current\G *************************** 1. row *************************** THREAD_ID: 40 EVENT_ID: 14 END_EVENT_ID: NULL EVENT_NAME: stage/innodb/alter tablespace (encryption) SOURCE: ut0stage.h:537 TIMER_START: 25222469814000 TIMER_END: 34781673863000 TIMER_WAIT: 9559204049000 WORK_COMPLETED: 10 WORK_ESTIMATED: 575 NESTING_EVENT_ID: 11 NESTING_EVENT_TYPE: STATEMENT 1 row in set (0.00 sec) |
1 2 3 |
Where WORK_ESTIMATED : Total number of pages in tablespace WORK_COMPLETED : Total number of pages processed in tablespace |
Internal Details
I/O layer i is responsible for the actual encryption/decryption processing. The entry/decrypt operation is done “transparently” during a read/write operation of a page from/to disk.
Reading from Disk :
If page is marked as encrypted, decrypt it using tablespace encryption information after reading it from the disk.
Writing to Disk :
if a tablespace is marked encrypted, encrypt page using tablespace encryption information before writing it to the disk.
This tablespace encryption information is maintained in page 0 (Header page) as described here.
So algorithm to encrypt an un–encrypted tablespace could be described as,
- Store encryption information (like tablespace key, IV etc) on page 0 (header page)
- Mark tablespace to be encrypted (in memory information)
- Read all tablespace pages one by one and mark them dirty
- Now when dirtied pages are flushed to disk,the I/O layer will know that these pages are to be encrypted and thus will encrypt them before they are written on disk.
Crash Recovery:
There is always a possibility of crash during encryption process is going on and it has to be taken care of. ENCRYPTION operation is always ROLL-FORWARD and never ROLLBACK.
We have two options here:
- During server restart after crash, find out what all tablespaces were being encrypted and RESTART the process for them.
- During encryption, keep track of the maximum page number (MaxPageNo) which have been encrypted and then during restart after crash, RESUME encryption process from next page onward.
Consider a case of very large tablespace. If crash happens just before the last page is processed, then using option 1 above would be significantly expensive and would be doing a lot of rework. So we opted for option 2 to avoid this unnecessary rework.
To make option 2 above work, we need to persist following two information :
- operation being performed (encryption or unencryption)
- progress of operation (till which page operation is completed).
5 bytes on page 0 (header page) is used to keep this information as described below:
- 1 byte for ‘Operation type’
- 4 bytes for ‘Progress’
1 |
|...|<operation type + progress info>|...| |
Operation Type (1 byte)
1 2 |
Encryption (0000 0001) Unencryprion (0000 0010) |
Progress (4 bytes)
Its MaxPageNo till which (Un)Encryption process is completed successfully.
1 |
Till page 10 its done (00000000 00000000 00000000 00001010) |
So after crash, during recovery, (un)encryption operation will resume from page 10.
DDL LOG :
When encryption of a tablespace is being altered, this DDL is logged in DDL LOG. DDL LOGs are read and applied first in case of crash/recover.
REDO LOG :
When encryption operation starts, page 0 is update with the operation type. When a page in tablespace is marked dirty, this page modification is REDO logged. Then, page 0 (header page) is modified with progress (i.e. update MaxPageNo).
CRASH-RECOVERY and RESUME-OPERATION
So crash/recovery for tablespace encryption happens as follow:
- DDL LOG recovery : Get, for what all tablespaces (un)encryption was in progress.
- REDO LOG recovery :
- Read/Apply changes to all dirty pages
- page 0 (header page) to find out WHAT operation to be resumed and FROM WHICH page operation is to be resume.
- From the page number (found in 2 above), resume operation (found in 2 above).
Now, step 3 above, could have been done in bootstrap thread i.e. main thread which is doing recovery. But consider a case, where there were encryption for multiple and large tablespaces was in progress before crash. It would have caused significant delay in server start if RESUME is done in main thread. So it was decided to have this RESUME operation to be done in background thread and let the main thread go ahead and continue to start the server.
NOTE: User can connect to server and start running queries while the background thread is busy doing (un)encryption. A possibility of crash during RESUME operation can’t be denied and is also taken care of i.e. RESUME will start again in next server start.
SUMMARY
- Encryption-at-rest is extended to general tablespaces so tables, which share tablespaces, can have data encrypted. NOTE: system tablespace is not covered in this feature.
- User might have specific requirement to have tables in general tablespace (For ex: to limit number of file handlers which would be more in case of file-per-table tablespace). With this feature, user can now have encryption-at-rest for that scenario as well.
- DMLs in parallel are allowed to ensure minimal impact on response to user.
- More insight of tablespace properties with extra column ENCRYPTION in information_schema.innodb_tablespaces.
- Encryption process is INPLACE and is always ROLLFORWARD.
- Monitoring of encryption process with performance_schema instrument.
For more information, click (here).