WL#13775: InnoDB: Encrypt DBLWR files
Currently, all pages in the double write (dblwr) files are unencrypted, even pages that belong to encrypted tablespaces. The goal of this worklog is to improve the security of the double write files by ensuring that pages belonging to encrypted tablespaces are also encrypted there. Pages in the dblwr files that belong to unencrypted tablespaces remain unencrypted. There is no separate encryption key for the double write files: each page is encrypted with the key of its own tablespace, and the same encrypted page that is written to the tablespace data file is also written to the double write file. The high-level steps are:
- If transparent compression is enabled for a tablespace, do compression.
- If encryption is enabled for a tablespace, do encryption using the tablespace key.
- Flush the compressed+encrypted page to the dblwr file.
- Flush the same compressed+encrypted page to the data file of tablespace.
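The steps above can be sketched as follows. This is only an illustration under stated assumptions: `Tablespace`, `toy_compress`, `toy_encrypt`, `prepare_page` and `flush_page` are hypothetical names, the run-length encoding and XOR are stand-ins for the real compression and cipher, and the "files" are plain in-memory buffers. The point it shows is that the page is transformed once and the identical bytes go to both destinations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical tablespace descriptor (not InnoDB's real structure).
struct Tablespace {
  bool compressed;
  bool encrypted;
  std::uint8_t key;  // toy one-byte key standing in for the tablespace key
};

// Toy run-length encoding standing in for transparent page compression.
Bytes toy_compress(const Bytes &in) {
  Bytes out;
  for (std::size_t i = 0; i < in.size();) {
    std::size_t run = 1;
    while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
    out.push_back(static_cast<std::uint8_t>(run));
    out.push_back(in[i]);
    i += run;
  }
  return out;
}

// Toy XOR standing in for the real cipher.
Bytes toy_encrypt(const Bytes &in, std::uint8_t key) {
  Bytes out(in);
  for (auto &b : out) b ^= key;
  return out;
}

// Steps 1 and 2: compress first, then encrypt, each at most once.
Bytes prepare_page(const Tablespace &ts, const Bytes &frame) {
  Bytes page = frame;
  if (ts.compressed) page = toy_compress(page);
  if (ts.encrypted) page = toy_encrypt(page, ts.key);
  return page;
}

// Steps 3 and 4: the same prepared bytes go to the dblwr file and then
// to the data file (both modeled here as in-memory buffers).
void flush_page(const Tablespace &ts, const Bytes &frame, Bytes &dblwr_file,
                Bytes &data_file) {
  const Bytes page = prepare_page(ts, frame);
  dblwr_file = page;
  data_file = page;
}
```

Because `prepare_page` is called exactly once per flush, the cost of compression and encryption is paid a single time regardless of how many destinations receive the page.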
The double write files now contain different kinds of pages depending on the tablespace: some are unencrypted, some are encrypted with their respective tablespace keys, and the rest are compressed and then encrypted with their respective tablespace keys.
Functional Requirements:
F1 - If a tablespace is encrypted, then its pages in the doublewrite file will also be encrypted using the same tablespace encryption key. The same copy of the page will be written to both the dblwr file and the data file.
F2 - If a tablespace is unencrypted, then its pages in the doublewrite file will also be unencrypted.
Non Functional Requirements:
NF1 - Since the same encrypted page is used for both the data file and the dblwr file, this WL should introduce no additional overhead.
- For an encrypted tablespace, a data page is encrypted once and is used for both the data file and the dblwr file.
- During recovery, if the dblwr page is encrypted, it is decrypted using the tablespace encryption key and checked for corruption.
Introduction:
When doublewrite is disabled, transparent encryption and compression are done at the I/O layer as before. When doublewrite is enabled, we had to consider the following:
1. When a tablespace is encrypted, its buffer pages are encrypted only once and the same encrypted copy is used for both the doublewrite buffer/file and the data file.
2. When a tablespace is compressed, its buffer pages are compressed only once and the same compressed copy is used for both the dblwr buffer/file and the data file.
3. When a tablespace is compressed and encrypted, compress first and then encrypt, and perform each operation only once. The result is then used for both the dblwr file and the data file.
This means that the transparent compression and encryption must happen at the double write layer itself. I have chosen the function dblwr::write() to do this compression and encryption.
Compressed and/or Encrypted Frame:
The uncompressed buffer frame is contained in the buf_page_t structure itself. Once it is compressed and/or encrypted, it is held in a separate file::Block structure. There was already a cache of these file::Block objects used for transparent compression and encryption. The same cache is reused here.
The file::Block* object for the corresponding buf_page_t structure is then passed around as an extra argument. The length of the compressed data is also passed around as an extra argument.
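A minimal sketch of this arrangement, under loud assumptions: `Block`, `BlockCache`, `Page`, `WriteSource` and `pick_source` below are hypothetical stand-ins and do not mirror the real file::Block or buf_page_t definitions. It only illustrates the idea of a small reusable block cache, plus a write path that takes the block pointer and the data length as extra arguments and falls back to the raw frame when no block was produced.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPageSize = 64;  // toy page size, not the InnoDB default

// Hypothetical stand-in for a block holding a compressed/encrypted frame.
struct Block {
  std::array<std::uint8_t, kPageSize> buf{};
  bool in_use = false;
};

// Hypothetical fixed-size cache of reusable blocks.
class BlockCache {
 public:
  // Returns a free block, or nullptr when all blocks are taken.
  Block *acquire() {
    for (auto &b : blocks_) {
      if (!b.in_use) {
        b.in_use = true;
        return &b;
      }
    }
    return nullptr;
  }
  void release(Block *b) {
    if (b != nullptr) b->in_use = false;
  }

 private:
  std::array<Block, 4> blocks_{};
};

// Hypothetical stand-in for a buffer page with its uncompressed frame.
struct Page {
  std::array<std::uint8_t, kPageSize> frame{};
};

struct WriteSource {
  const std::uint8_t *data;
  std::size_t len;
};

// The block and its length travel with the page as extra arguments.
// If a block exists, its contents are written; otherwise the raw frame is.
WriteSource pick_source(const Page &page, const Block *e_block,
                        std::size_t e_len) {
  if (e_block != nullptr) return {e_block->buf.data(), e_len};
  return {page.frame.data(), kPageSize};
}
```

Passing the block alongside the page, rather than storing it in the page structure, keeps the page structure unchanged; whether the actual patch makes the same trade-off is not stated here.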
Recovery:
Previously, the double write buffers were never encrypted, so checksum verification could be done directly. Now that double write file pages may be encrypted, they must first be decrypted before checksum verification can be done. This is done in the dblwr_recover_page() function.
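The ordering matters because a checksum computed over ciphertext will not match the checksum stored in the page. The sketch below illustrates decrypt-then-verify only; it is not the body of dblwr_recover_page(). The page layout (byte 0 holds an additive checksum of the remaining bytes), the XOR "cipher", and the names `toy_checksum`, `make_page`, `toy_xor`, `checksum_ok` and `recover_page` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy layout: byte 0 stores the checksum, the remaining bytes are payload.
std::uint8_t toy_checksum(const Bytes &page) {
  std::uint8_t sum = 0;
  for (std::size_t i = 1; i < page.size(); ++i) {
    sum = static_cast<std::uint8_t>(sum + page[i]);
  }
  return sum;
}

// Builds a plaintext page with a valid checksum in byte 0.
Bytes make_page(const Bytes &payload) {
  Bytes page;
  page.push_back(0);
  page.insert(page.end(), payload.begin(), payload.end());
  page[0] = toy_checksum(page);
  return page;
}

// Toy XOR standing in for the real cipher; it is its own inverse.
Bytes toy_xor(const Bytes &in, std::uint8_t key) {
  Bytes out(in);
  for (auto &b : out) b ^= key;
  return out;
}

bool checksum_ok(const Bytes &page) {
  return !page.empty() && page[0] == toy_checksum(page);
}

// Decrypt first (if needed), then verify the checksum on the plaintext.
// Returns true and fills `out` only when the page is usable for recovery.
bool recover_page(const Bytes &dblwr_page, bool encrypted, std::uint8_t key,
                  Bytes &out) {
  const Bytes page = encrypted ? toy_xor(dblwr_page, key) : dblwr_page;
  if (!checksum_ok(page)) return false;
  out = page;
  return true;
}
```

In this toy model, running `checksum_ok` directly on the encrypted bytes fails even though the page is intact, which is the situation the recovery change addresses.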
Background Notes:
The number of double write (dblwr) files is given by dblwr::n_files. Each of these files is divided into two parts - the first part is for batch writing and the second part is for single page writing. Single page writing is used during sync page flushes. These parts are referred to as batch segments and single segments in our code base.
In a batch segment, we combine dblwr::n_pages pages and flush them together. In a single segment, each page is flushed independently. Before this WL, all these pages were written unencrypted, even for an encrypted tablespace. In this WL, we will use the encrypted page for those pages belonging to an encrypted tablespace.
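The difference between the two segment types can be illustrated with a small model. This is a sketch under assumptions, not the InnoDB implementation: `BatchSegment`, `SingleSegment` and `Stats` are hypothetical classes, the constructor argument plays the role of dblwr::n_pages, and a "flush" is just a counter increment. The pages handed to either segment are assumed to be already compressed and/or encrypted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

struct Stats {
  std::size_t flushes = 0;  // number of flush operations issued
  std::size_t pages = 0;    // number of pages written in total
};

// Hypothetical batch segment: collects n_pages pages, then flushes once.
class BatchSegment {
 public:
  explicit BatchSegment(std::size_t n_pages) : n_pages_(n_pages) {}

  // Queues a prepared page; returns true if this call triggered a flush.
  bool add(Bytes page) {
    pending_.push_back(std::move(page));
    if (pending_.size() >= n_pages_) {
      flush();
      return true;
    }
    return false;
  }

  void flush() {
    if (pending_.empty()) return;
    ++stats_.flushes;
    stats_.pages += pending_.size();
    pending_.clear();
  }

  std::size_t pending() const { return pending_.size(); }
  const Stats &stats() const { return stats_; }

 private:
  std::size_t n_pages_;
  std::vector<Bytes> pending_;
  Stats stats_;
};

// Hypothetical single segment: every page is flushed on its own.
class SingleSegment {
 public:
  void write(const Bytes &page) {
    (void)page;  // the bytes would be written to the file here
    ++stats_.flushes;
    ++stats_.pages;
  }
  const Stats &stats() const { return stats_; }

 private:
  Stats stats_;
};
```

With a batch size of 3, writing three pages costs one flush in the batch segment and three flushes in the single segment; in both cases the page contents are whatever the preparation step produced.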