WL#10957: Binary log encryption at rest

Affects: Server-8.0   —   Status: Complete   —   Priority: Low

This worklog implements the ability to encrypt binary and relay log files online. Encryption is optional. Users can decide whether the files should be encrypted or not.

User stories

  • As a MySQL DBA/operator/instance owner, I want to enable replication logs encryption so that no data shall be leaked to an operating system user having access to the file system where the MySQL server instance stores binary and relay log files.

  • As a Security Administrator, I want both binary log and relay log to be encrypted to follow the security compliance requirements.

  • As a backup tool I shall be able to truncate a given encrypted binary log file after copying it in order to store the copy aligned with the server's master positions of the snapshot without having to rewrite contents on the encrypted file.

Scope

  • Binary log cache encryption is not considered in this worklog.

Functional requirements

F-1. MySQL shall provide a way to generate encrypted binary and relay log files.

F-2. Binary and relay logs encryption shall be monolithic. When encryption is on, both binary and relay log files shall be encrypted. When encryption is off, both binary and relay log files shall not be encrypted.

Note: In the FRs below, 'binary log' is used to describe both binary and relay logs.

F-3: Only new files shall be encrypted after the user enables binlog encryption.

F-4: Enabling or disabling binlog encryption shall rotate binary logs immediately, so the new configuration takes effect.

F-5: The encryption algorithm used to encrypt binary log files shall not be configurable.

F-6: The binlog master key management (creation/removal/retrieval) shall be handled by keyring service for MySQL.

F-7: A replica (including mysqlbinlog client) shall always receive replicated stream unencrypted, regardless of binlog encryption settings on the source server.

F-8: It shall be possible to truncate an encrypted binary log file without losing access to the contents stored before the truncated position.

F-9: It shall be possible to rotate the binlog master key.

Non-functional requirements

NF-1: It shall be possible to seek efficiently in the binlog stream in order to skip transactions or start at a given position.

NF-2: The feature shall support MySQL upgrades.

Interface Specification

I-1: binlog_encryption

It is a new global variable for enabling or disabling binary log encryption. It should be persistable and settable on command line and in configuration files.

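  | Item                                     | Detail                                      |
  |------------------------------------------|---------------------------------------------|
  | Scope                                    | Global                                      |
  | Dynamic                                  | Yes                                         |
  | Type                                     | Boolean                                     |
  | Required privileges to change it on-line | SUPER or BINLOG_ENCRYPTION_ADMIN (see I-18) |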

It controls the encryption behavior of the files below:

  • binary log files of this server, when binary logging is enabled.
  • relay log files of all channels (including GR channels) on this server.

Note: binary and relay log index files are not encrypted regardless of the encryption behavior.

I-2: Encrypted binary log file format

It is a new binary log file format used only when binlog_encryption option is on. Old plain binary log file format shall still be used when binlog_encryption is OFF. See details in "High Level Specification" section.

I-3: No new files

I-4: No changes in existing syntax

I-5: No new commands

I-6: No new tools

I-7: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_FAILED_TO_FETCH_KEY "Failed to fetch key from keyring, please check if keyring plugin is loaded."

I-8: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_KEY_NOT_FOUND "Can't find key from keyring, please check in the server log if a keyring plugin is loaded and initialized successfully."

I-9: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_KEYRING_INVALID_KEY "Fetched an invalid key from keyring."

I-10: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_HEADER_ERROR "Error reading a replication log encryption header: %s."

I-11: New "[Warning]" message:

ER[_SERVER]_RPL_ENCRYPTION_FAILED_TO_ROTATE_LOGS "Failed to rotate some logs after changing binlog encryption settings. Please fix the problem and rotate the logs manually."

I-12: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_KEY_EXISTS_UNEXPECTED "Key %s exists unexpected."

I-13: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_FAILED_TO_GENERATE_KEY "Failed to generate key, please check if keyring plugin is loaded."

I-14: New "[ERROR]" message:

ER[_SERVER]_RPL_ENCRYPTION_FAILED_TO_STORE_KEY "Failed to store key, please check if keyring plugin is loaded."

I-15: New "[ERROR|Warning]" message:

ER[_SERVER]_RPL_ENCRYPTION_FAILED_TO_REMOVE_KEY "Failed to remove key, please check if keyring plugin is loaded."

I-16: New "[ERROR]" message:

ER_RPL_ENCRYPTION_UNABLE_TO_CHANGE_OPTION "Failed to change replication_log_encrypt value. %-.80s."

I-17: New "[ERROR]" message:

ER_SERVER_RPL_ENCRYPTION_UNABLE_TO_INITIALIZE "Failed to initialize binlog encryption, please check if keyring plugin is loaded."

I-18: New dynamic privilege: "BINLOG_ENCRYPTION_ADMIN"

The new privilege shall be required for enabling and disabling the binlog_encryption option.

I-19: New column "Encrypted" at "SHOW BINARY LOGS" output

The SHOW BINARY LOGS statement lists existing binary log files and their actual file size. A new column Encrypted shall provide boolean information:

  • NO: file is not encrypted;
  • YES: file is encrypted.

Note: for encrypted binary log files, the file size will not match the plain binary log events stream size because of the encryption header.

I-20: binlog_rotate_encryption_master_key_at_startup

It is a new global read only variable for rotating the binlog encryption master key at server startup. It should be "persistable only" and settable on command line and in configuration files.

  | Item    | Detail  |
  |---------|---------|
  | Scope   | Global  |
  | Dynamic | No      |
  | Type    | Boolean |

When enabled with binlog_encryption disabled, the server shall throw a warning (see I-21).

When enabled but the server fails to prepare the keyring for the master key rotation, the server shall throw an error and abort (see I-22).

I-21: New "[Warning]" message:

ER_SERVER_RPL_ENCRYPTION_IGNORE_ROTATE_MASTER_KEY_AT_STARTUP "Ignoring binlog_rotate_encryption_master_key_at_startup because binlog_encryption option is disabled."

I-22: New "[ERROR]" message:

ER_SERVER_RPL_ENCRYPTION_UNABLE_TO_ROTATE_MASTER_KEY_AT_STARTUP "Failed to rotate binlog encryption master key at startup, please check if keyring plugin is loaded."

High Level Specification

The replication logs encryption feature introduces a new binary log file format (I-2) to be used by encrypted binary and relay log files.

The new format shall only be used by encrypted log files. Unencrypted binary and relay log files shall keep using standard plain binary log format. This means that once the binlog_encryption variable is changed, the server shall rotate immediately the binary and all relay logs to start new binlog/relay log files respecting the new configuration (generating either encrypted or plain files).

Two tiers encryption

The replication logs encryption is designed to use two tiers: a per file password, used to encrypt/decrypt binary log file content and a replication encryption key, used to encrypt/decrypt sensitive data (the file password) in the encrypted binary log file header.

  | 2nd tier                    | 1st tier                   |
  |-----------------------------|----------------------------|

  +----------------+  protects  +---------------+  protects  +-----------+
  |   Replication  | ---------> | File Password | ---------> | File Data |
  | Encryption Key |            +---------------+            +-----------+
  +----------------+

A given replication encryption key may be used to encrypt/decrypt many binary and relay log files passwords, while a file password is intended to encrypt/decrypt a single binary or relay log file.

A set of binary and relay log files may rely on a set of replication encryption keys, but the use of multiple replication master keys is not in the scope of this worklog (even though the general design shall support it).

  REK = Replication Encryption Key
  FP = File Password

        protects
  REK1 -----------+---+---+---+-------+
                  |   |   |   |       |
                  v   v   v   v       v
                 FP1 FP2 FP3 FP4 ... FPx            --> rely on REK1

        protects
  REK2 ------------+---+-----+-----+---------+
                   |   |     |     |         |
                   v   v     v     v         v
                 FPm FPm+1 FPm+2 FPm+3 ... FPm+y    --> rely on REK2

   .
   .
   .

        protects
  REKj ------------+---+-----+-----+---------+
                   |   |     |     |         |
                   v   v     v     v         v
                 FPn FPn+1 FPn+2 FPn+3 ... FPn+z    --> rely on REKj

From a design point of view, it shall be possible to replace an encrypted binary log file header with a new one, encrypting the file password with a new replication encryption key. So, if a given replication encryption key (REK2 for example) is suspected to be compromised, it shall be possible for a server to generate a replacement replication encryption key (REKj+1) and then iterate over all encrypted log files to "re-encrypt" their passwords (from FPn+z to FPm, last to first) by overwriting the encrypted file header with one relying on the new key.

Replication encryption keys

Replication encryption keys shall be created/stored into/retrieved from a keyring using a keyring service for MySQL. It shall be used to protect sensitive information in the new binary log encrypted file header.

They shall be identified in the keyring using the following format:

  • MySQLReplicationKey_{UUID}_{SEQ_NO}

Where:

  • MySQLReplicationKey is a prefix;
  • {UUID} is the MySQL server's UUID that generated the key;
  • {SEQ_NO} is the global replication master key sequence number.

For the purpose of this worklog, the sequence number will always be 1. Future worklogs shall handle the generation of replication encryption keys with sequence numbers greater than 1.
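The key ID format above can be illustrated with a minimal sketch. The helper names (build_key_id, parse_key_id) are hypothetical and not part of the server code; the sketch also assumes the UUID is stored in its canonical lowercase 36-character form.

```python
import re
import uuid
from typing import Tuple

KEY_PREFIX = "MySQLReplicationKey"

# Matches PREFIX_{UUID}_{SEQ_NO}; the UUID is assumed to be canonical lowercase.
_KEY_ID_RE = re.compile(rf"^{KEY_PREFIX}_([0-9a-f-]{{36}})_(\d+)$")


def build_key_id(server_uuid: str, seq_no: int) -> str:
    """Format a replication encryption key ID (hypothetical helper)."""
    if seq_no < 1:
        raise ValueError("sequence numbers start at 1")
    # uuid.UUID validates the input and normalizes it to lowercase.
    canonical = str(uuid.UUID(server_uuid))
    return f"{KEY_PREFIX}_{canonical}_{seq_no}"


def parse_key_id(key_id: str) -> Tuple[str, int]:
    """Split a key ID back into (server UUID, sequence number)."""
    match = _KEY_ID_RE.match(key_id)
    if match is None:
        raise ValueError(f"not a replication encryption key ID: {key_id!r}")
    return match.group(1), int(match.group(2))
```

In this worklog, build_key_id(server_uuid, 1) would be the only key ID a server generates.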

The last replication encryption key generated by a server is the key that shall be used to protect new binary and relay log files passwords. This key will be referenced in this worklog design as replication master key.

File password

Each per file password shall be randomly generated (by using my_rand_buffer) on file creation and stored encrypted using the replication master key into the encrypted binary log file header.

Conceptually, knowing its file password shall be enough to decrypt a given encrypted binary log file. But, as each file password is stored encrypted (in the encrypted binary log file header) using a replication encryption key, an encrypted binary log file can only be decrypted when knowing the replication encryption key that encrypted its file password.

Backup and recovery procedures shall always take into consideration the backup and recovery of the keyring when binary log encryption is (or was) enabled. Losing replication encryption keys still required for server operation (i.e. keys that protect existing binary or relay log file passwords) may make it impossible to start up the server at all.

Backup and recovery of file passwords are not a topic to be concerned about as they are stored in the encrypted binary and relay log files headers (and backed up/recovered with the files).

Encrypted binary log file format

The new encrypted binary log file format is composed of two parts: the encrypted binary log file header and the encrypted data:

  +---------------------+
  |  Encryption Header  |
  +---------------------+
  |   Encrypted Data    |
  +---------------------+

Encrypted binary log file header

The header of an encrypted binary log file includes information to be used by the server when decrypting the encrypted data.

It has fixed length of 512 bytes including some reserved space. 512 bytes is write atomic (when writing or updating the header, either the whole header is written into disk successfully or nothing is written into disk).

It also gives us the ability to add more content to the header without having to shift the encrypted data.

  +------------------------+----------------------------------------------+
  | MAGIC HEADER (4 bytes) | Replication logs encryption version (1 byte) |
  +------------------------+----------------------------------------------+
  |             Replication Encryption Key ID (60 to 69 bytes)            |
  +-----------------------------------------------------------------------+
  |                   Encrypted File Password (33 bytes)                  |
  +-----------------------------------------------------------------------+
  |               IV For Encrypting File Password (17 bytes)              |
  +-----------------------------------------------------------------------+
  |                       Padding (388 to 397 bytes)                      |
  +-----------------------------------------------------------------------+
              Encrypted binary log file header format version 1

Magic header (4 bytes):

Fixed 4 bytes content: 0xFD62696E. It is similar to Binlog Magic Header (which is: 0xFE62696E).

Reading the magic header shall be enough to distinguish between encrypted and unencrypted binary log files.
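As an illustration only, the check could look like the sketch below. The function name classify_log_file is hypothetical; the two magic values are the ones given above.

```python
ENCRYPTED_MAGIC = bytes.fromhex("FD62696E")  # 0xFD followed by ASCII "bin"
PLAIN_MAGIC = bytes.fromhex("FE62696E")      # 0xFE followed by ASCII "bin"


def classify_log_file(first_bytes: bytes) -> str:
    """Classify a log file from its first four bytes (hypothetical helper)."""
    magic = first_bytes[:4]
    if magic == ENCRYPTED_MAGIC:
        return "encrypted"
    if magic == PLAIN_MAGIC:
        return "plain"
    return "unknown"
```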

Replication logs encryption version (1 byte):

It is 1 in this worklog design.

The version 1 defines the following:

  • 512 bytes for the encrypted file header;
  • Replication encryption key is of AES type with 32 bytes;
  • IV for encrypting file password is a random string with 16 bytes;
  • File password with 32 bytes and encrypted using aes-cbc-256 (stored in the header as encrypted file password);
  • Data stream is encrypted/decrypted using aes-ctr-256;
  • File key and nonce to encrypt/decrypt data stream is generated from file password with 1 round SHA512 hash with no salt;
  • The IV used to encrypt/decrypt data stream is composed by the upper 64 bits of the nonce in the IV's upper 64 bits and the block counter in the IV's lower 64 bits;

The version shall be bumped in the future if any of above mentioned definitions are changed.

Replication encryption key ID:

Key ID of the replication master key that encrypted the file password. It has variable length.

The key ID will be used when dealing with keyring operations.

It is stored in the header using TLV format (see below).

  Length:
   TLV type                (1 byte)  +
   String length           (1 byte)  +
   "MySQLReplicationKey_" (20 bytes) +
   Server UUID            (36 bytes) +
   "_"                     (1 byte)  +
   Sequence Number         (1 byte minimum | 10 bytes maximum)
                         --------------------------------------
                           60 (minimum)    | 69 (maximum)
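
The length arithmetic above, and the padding range it implies, can be checked with a short sketch. The constant and function names are hypothetical, and the UUID is replaced by a 36-character placeholder because only its length matters here.

```python
HEADER_SIZE = 512
MAGIC_SIZE = 4
VERSION_SIZE = 1
PASSWORD_FIELD_SIZE = 1 + 32  # TV: type byte + encrypted file password
IV_FIELD_SIZE = 1 + 16        # TV: type byte + IV


def key_id_field_size(seq_no: int) -> int:
    """Size of the TLV-encoded key ID field for a given sequence number."""
    value = "MySQLReplicationKey_" + "0" * 36 + "_" + str(seq_no)
    # One type byte plus one length byte (the value is shorter than 251 bytes).
    return 1 + 1 + len(value)


def padding_size(seq_no: int) -> int:
    """Unused bytes left in the 512-byte header."""
    used = (MAGIC_SIZE + VERSION_SIZE + key_id_field_size(seq_no)
            + PASSWORD_FIELD_SIZE + IV_FIELD_SIZE)
    return HEADER_SIZE - used
```

With sequence number 1 the key ID field takes 60 bytes and leaves 397 bytes of padding; with a 10-digit sequence number it takes 69 bytes and leaves 388.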

Encrypted file password (32 bytes + 1 for type):

A random string encrypted with the replication master key and IV. It is used to generate the key for encrypting/decrypting the binary log file content.

It is stored in the header using TV format (see below).

It has fixed length of 32 bytes.

IV For Encrypting File Password (16 bytes + 1 for type):

It is used with the replication master key to encrypt the file password.

It is stored in the header using TV format (see below).

It has fixed length of 16 bytes.

Padding:

Unused header space (filled with 0).

TV Format:

  offset: 0        1
         +--------+---------+
         |  Type  |  Value  |
         +--------+---------+

Type uses 1 byte. Value has a fixed length.

TLV Format:

  offset: 0        1          2 to 10
         +--------+----------+---------+
         |  Type  |  Length  |  Value  |
         +--------+----------+---------+

Type uses 1 byte. Length uses 1 to 9 bytes (see net_field_length()). Value has a variable length.

Version 1 field types:

  | Type | Description                   |
  |------|-------------------------------|
  | 1    | Replication encryption key ID |
  | 2    | Encrypted file password       |
  | 3    | Initialization Vector (IV)    |
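
A minimal sketch of how a version 1 header could be serialized and read back is shown below. It is not the server implementation. The function names are hypothetical, the field order follows the header diagram, and treating a type byte of 0 as the start of the padding is an assumption made for this sketch.

```python
from typing import Dict, Tuple

MAGIC = bytes.fromhex("FD62696E")
VERSION = 1
HEADER_SIZE = 512
TYPE_KEY_ID, TYPE_PASSWORD, TYPE_IV = 1, 2, 3


def build_header(key_id: str, encrypted_password: bytes, iv: bytes) -> bytes:
    """Serialize a version 1 encryption header (illustrative only)."""
    key_bytes = key_id.encode("ascii")
    if len(key_bytes) > 250:
        raise ValueError("key ID too long for a 1-byte length")
    if len(encrypted_password) != 32 or len(iv) != 16:
        raise ValueError("unexpected password or IV length")
    body = MAGIC + bytes([VERSION])
    body += bytes([TYPE_KEY_ID, len(key_bytes)]) + key_bytes  # TLV
    body += bytes([TYPE_PASSWORD]) + encrypted_password       # TV
    body += bytes([TYPE_IV]) + iv                             # TV
    return body.ljust(HEADER_SIZE, b"\x00")                   # zero padding


def parse_header(header: bytes) -> Tuple[int, Dict[str, object]]:
    """Read back the fields written by build_header."""
    if len(header) != HEADER_SIZE or header[:4] != MAGIC:
        raise ValueError("not an encrypted log file header")
    version, pos, fields = header[4], 5, {}
    # Assumption: a type byte of 0 marks the start of the padding.
    while pos < HEADER_SIZE and header[pos] != 0:
        field_type = header[pos]
        pos += 1
        if field_type == TYPE_KEY_ID:
            length = header[pos]
            pos += 1
            fields["key_id"] = header[pos:pos + length].decode("ascii")
            pos += length
        elif field_type == TYPE_PASSWORD:
            fields["encrypted_password"] = header[pos:pos + 32]
            pos += 32
        elif field_type == TYPE_IV:
            fields["iv"] = header[pos:pos + 16]
            pos += 16
        else:
            raise ValueError(f"unknown field type {field_type}")
    return version, fields
```

Because each field carries its type, a reader written this way does not depend on the fields appearing in a fixed order.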

Encrypted data:

It is the encrypted binary log file starting from the binlog magic header.

Note: Users should be able to decrypt the encrypted data by using openssl command. We use openssl command to verify the encryption is correct in this worklog.

File Password

It is 32 bytes random data generated by using my_rand_buffer.

File encryption

  • File key and nonce are used to encrypt file data. They are generated by using file password with 1 round SHA512 hash. SHA512 digest has 512 bits (64 bytes). The first 32 bytes of the digest are used as file key. And the following 8 bytes are used as the nonce.
  • AES uses 16 bytes IV. The IV we use is composed of 8 bytes from nonce and 8 bytes from a block counter (IV = nonce | counter).
  • As each AES block contains 16 bytes (128 bits), the ciphers are able to address up to 16 * 2^(8*8) (262144 petabytes) of encrypted data.
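
The derivation described in the list above can be sketched as follows. The function names are hypothetical. The sketch assumes the block counter is stored big-endian (the usual convention for AES-CTR counters) and that offsets are counted from the start of the encrypted data, after the 512-byte header.

```python
import hashlib
from typing import Tuple

AES_BLOCK_SIZE = 16


def derive_file_key_and_nonce(file_password: bytes) -> Tuple[bytes, bytes]:
    """One round of SHA512, no salt: 32-byte file key and 8-byte nonce."""
    digest = hashlib.sha512(file_password).digest()  # 64 bytes
    return digest[:32], digest[32:40]


def iv_for_offset(nonce: bytes, offset: int) -> bytes:
    """IV = nonce | counter, for the AES block containing a stream offset."""
    block_counter = offset // AES_BLOCK_SIZE
    # Assumption: the 64-bit counter is big-endian.
    return nonce + block_counter.to_bytes(8, "big")


# 2^64 blocks of 16 bytes each, expressed in units of 2^50 bytes.
ADDRESSABLE_PETABYTES = (AES_BLOCK_SIZE * 2**64) // 2**50
```

Computing the IV directly from a stream offset is what makes it possible to start decrypting at an arbitrary position (NF-1) without processing the preceding data.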

Replication encryption keys sequence number

Each replication encryption key has its own sequence number.

We use a sequence number to allow a server to rely on multiple replication encryption keys (even from the same server UUID).

Let's consider some properties:

P-1: A server only generates/removes its own keys

A server shall never generate or remove keys on behalf of another server instance, as this may cause other server instances to lose access to encrypted files when they share a keyring service.

P-2: A server shall be able to determine the replication master key

A server shall always be able to determine the replication master key it shall use to encrypt the file password of new binary and relay log files if needed (i.e. if the binlog_encryption option is ON).

P-3: On automatic replication master key generation

A server shall only generate a new replication master key automatically when binlog_encryption option is set to ON and the server is not aware of any other previously generated replication encryption key.

Setting binlog_encryption to ON in this case shall generate a replication encryption key ID with sequence number = 1 to be used as replication master key. If the replication key with sequence number = 1 already exists, the automatic key generation shall increase the sequence number and try again until finding a sequence number with no respective replication key in the keyring.

P-4: A server shall avoid losing keys

A server shall never overwrite a key in the keyring. Overwriting a key when files still rely on it will lead to loss of access to the files contents.

A server shall never remove a key from the keyring without being sure that the key is not needed anymore. When there are issues in either binary or relay log initialization (i.e. a file listed in an index file is not readable), a server shall not take actions that could make inaccessible a key that may be required after the binary or relay log issues are fixed.

P-5: On replication master key sequence number generation

The sequence number used to identify the replication master key shall increase by 1 once the server is requested to rotate the replication master key.

P-6: A server shall be able to cleanup unused keys from keyring

Keeping old replication encryption keys (not needed to decrypt existing files) on the keyring may lead to issues in the keyring access (see BUG#22607137).

P-7: Replication master key rotation shall trigger logs rotation

Once a new replication master key is generated, the server shall flush the binary log (first, if enabled) and the relay logs of all channels to force their rotation and start using the newly generated key.

Note: Group replication channels are not rotated when the logs are requested to be flushed. So the encryption setting will not be effective on group replication channels until their logs are rotated. This is not a big issue because GR also requires binary logging to be enabled on the server, and the binary log can be used as the main source for the replication master key.

P-8: Re-encryption of already encrypted files

Re-encryption of already encrypted files (changing the replication encryption key) shall traverse the file index, moving from the last to the first elements.

P-9: Replication master key rotation shall be an atomic operation.

When stopped in the middle of a replication master key rotation process, the server shall be able to continue the process once restarted.

P-10: Replication logs re-encryption shall be an atomic operation.

When stopped in the middle of a re-encryption process, the server shall be able to continue the process once restarted.

Master key identification, rotation and keyring cleanup

The replication encryption infrastructure will rely on the keyring for storing the sequence number of the replication master key, the sequence number of the last replication encryption key that was purged from keyring and also for storing transient states of the replication master key rotation.

These are the keys that can be generated/retrieved/removed from keyring:

  | Key ID                                 | What is stored in keyring                                                                            | Type of what is stored in keyring                        |
  |----------------------------------------|------------------------------------------------------------------------------------------------------|----------------------------------------------------------|
  | MySQLReplicationKey_{UUID}_{SEQNO}     | An ordinary replication encryption key                                                               | A key (i.e. AES key) to encrypt/decrypt files passwords  |
  | MySQLReplicationKey_{UUID}             | The sequence number of the replication master key                                                    | A sequence number (uint32_t)                             |
  | MySQLReplicationKey_{UUID}_last_purged | The sequence number of the last purged replication encryption key                                    | A sequence number (uint32_t)                             |
  | MySQLReplicationKey_{UUID}_old (*)     | The sequence number of the master key when the replication master key rotation started               | A sequence number (uint32_t)                             |
  | MySQLReplicationKey_{UUID}_new         | The sequence number of the new master key to be used after replication master key rotation finishes | A sequence number (uint32_t)                             |

The keys storing sequence numbers shall use AES key type with 16 bytes key length. See https://dev.mysql.com/doc/mysql-security-excerpt/5.7/en/keyring-key-types.html for additional info.

A-1: Replication master key rotation

Replication master key rotation shall only be possible when binlog_encryption is on.

Requests to rotate the replication master key when binlog_encryption is off shall return an error (I-13).

If this is the server's first master key rotation, the server will abort startup, reporting I-17.

A-1.1: Set that the master key rotation has started

Store a key "MySQLReplicationKey_{UUID}_old" with the master key SEQNO as its value on keyring.

This is not needed on WL#10957 as it shall only have a particular case of master key rotation (it only generates the replication encryption key with sequence number = 1).

A-1.2: Determine the next SEQNO to be used by a new master key

Do: master key SEQNO = master key SEQNO + 1 (by P-5). Then, check if a replication encryption key exists on keyring with the new master key SEQNO.

If the key exists the server shall throw an error (I-12).

In order to be compliant with P-4, the server shall repeat this process until reaching a master key SEQNO without a corresponding replication encryption key on keyring.

Store a key "MySQLReplicationKey_{UUID}_new" with the master key SEQNO as its value on keyring. If unable to store the key, the server shall throw an error (I-14).
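
The SEQNO search in A-1.2 can be sketched as below. The function name and the key_exists callback are hypothetical stand-ins for the keyring lookup. The skipped sequence numbers are returned so that the caller can report I-12 for each of them.

```python
from typing import Callable, List, Tuple


def next_master_key_seqno(
    current_seqno: int,
    key_exists: Callable[[int], bool],
) -> Tuple[int, List[int]]:
    """Return the first free SEQNO after current_seqno and the SEQNOs skipped.

    key_exists(seqno) is a hypothetical callback that tells whether
    MySQLReplicationKey_{UUID}_{seqno} is already present on the keyring.
    """
    skipped = []
    candidate = current_seqno + 1   # P-5: increase by 1
    while key_exists(candidate):    # P-4: never overwrite an existing key
        skipped.append(candidate)   # caller would report I-12 for each one
        candidate += 1
    return candidate, skipped
```

For example, with keys 1, 2 and 4 already on the keyring and a current master key SEQNO of 1, the sketch skips 2 and settles on 3.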

A-1.3: Generate the new replication master key

Request the keyring to generate a new "MySQLReplicationKey_{UUID}_{SEQNO}" key using master key SEQNO as SEQNO.

If the server cannot generate the new key, it will throw an error (I-13).

A-1.4: Remove old master key "index" from keyring

Remove the "MySQLReplicationKey_{UUID}" key from keyring (if it exists). If unable to remove the key, the server shall throw an error (I-15).

A-1.5: Store new master key "index" on keyring

Store "MySQLReplicationKey_{UUID}" on keyring with the new master key SEQNO. This will allow the server to be compliant with P-2. If unable to store the key, the server shall throw an error (I-14).

A-1.6: Rotate replication logs

This is needed to be compliant with P-7.

Server may throw a warning message (I-11) when it fails to rotate the binary log or any replication channel relay log.

A-1.6.1: Rotate the binary log

When binary logging is enabled, rotate the binary log to create a new binary log file encrypted with new master key.

A-1.6.1.1: Get minimum SEQNO from binary logs

Starting from the first entry on the index file, iterating up to the last one:

  • If the file is encrypted, set "minimum SEQNO" as the SEQNO from the encryption key ID in the encryption header and exit the loop.
  • If unable to open the file, set "minimum SEQNO" as 0 to avoid key cleanup and exit the loop.

A-1.6.2: Rotate the relay logs

For each existing replication channel, rotate the relay log (except for GR applier) to create a new relay log file encrypted with new master key.

A-1.6.2.1: Get minimum SEQNO from relay logs

Starting from the first entry on the index file, iterating up to the last one:

  • If the file is encrypted, set "minimum SEQNO" as the SEQNO from the encryption key ID in the encryption header and exit the loop.
  • If unable to open the file, set "minimum SEQNO" as 0 to avoid key cleanup and exit the loop.

A-1.6.3: Evaluate minimum SEQNO used

The minimum SEQNO used shall be the lowest of the minimum SEQNOs obtained from the binary and relay logs. If the server is not able to evaluate the minimum SEQNO used by binary logs (when enabled) or by any of the replication channels, minimum SEQNO used shall be set to 0 and a warning shall be thrown.
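
A sketch of the evaluation in A-1.6.1.1, A-1.6.2.1 and A-1.6.3 follows. The function names and the read_seqno callback are hypothetical. The worklog does not say what happens when no file is encrypted at all; returning 0 (no cleanup) in that case is a conservative assumption of this sketch.

```python
from typing import Callable, Iterable, Optional


def min_seqno_of_log(
    index_entries: Iterable[str],
    read_seqno: Callable[[str], Optional[int]],
) -> Optional[int]:
    """Minimum SEQNO for one binary or relay log index.

    read_seqno(file) is a hypothetical callback: it returns the SEQNO from
    the encryption header, None for a plain file, and raises OSError when
    the file cannot be opened.
    """
    for name in index_entries:
        try:
            seqno = read_seqno(name)
        except OSError:
            return 0       # unreadable file: avoid key cleanup
        if seqno is not None:
            return seqno   # first encrypted file found: exit the loop
    return None            # no encrypted file listed in this index


def min_seqno_used(per_log_minimums: Iterable[Optional[int]]) -> int:
    """Combine the per-log minimums; 0 means no keys may be purged."""
    values = [v for v in per_log_minimums if v is not None]
    # Assumption: with no encrypted file anywhere, skip the cleanup.
    return min(values) if values else 0
```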

At this point, the replication encryption infrastructure shall have two sequence numbers to drive the replication encryption keys cleanup from keyring:

  • The last purged SEQNO: retrieved from keyring's "MySQLReplicationKey_{UUID}_last_purged" or 0 when the key is not present on keyring;
  • The minimum SEQNO used: retrieved from binary and relay log encrypted files.

A-1.7: Purge unused old replication encryption keys from keyring

Set purge start = last purged SEQNO. If purge start is 0, make purge start = 1. Set purge end = minimum SEQNO used - 1.

Loop from purge start to purge end, remove all the keys like "MySQLReplicationKey_{UUID}_{SEQNO}" from the keyring.

If purge end > 0 and purge end != last purged SEQNO: make last purged SEQNO = purge end, remove "MySQLReplicationKey_{UUID}_last_purged" key from keyring and store "MySQLReplicationKey_{UUID}_last_purged" key into keyring with the new last purged SEQNO.

Remove "MySQLReplicationKey_{UUID}_old" key from keyring.
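
The purge range computed in A-1.7 can be sketched as below. The function name plan_purge is hypothetical, and the loop is assumed to include both purge start and purge end.

```python
from typing import List, Tuple


def plan_purge(last_purged: int, min_seqno_used: int) -> Tuple[List[int], int]:
    """Return the SEQNOs whose keys may be removed and the new last purged SEQNO."""
    purge_start = last_purged if last_purged != 0 else 1
    purge_end = min_seqno_used - 1
    # Assumption: the range is inclusive on both ends.
    to_remove = list(range(purge_start, purge_end + 1))
    if purge_end > 0 and purge_end != last_purged:
        last_purged = purge_end
    return to_remove, last_purged
```

With last purged SEQNO = 2 and minimum SEQNO used = 4, the sketch proposes removing keys 2 and 3 (key 2 is normally already gone, so removing it again is a no-op) and records 3 as the new last purged SEQNO. With minimum SEQNO used = 0, nothing is removed.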

A-1.8: Finalize replication master key rotation

Remove "MySQLReplicationKey_{UUID}_new" key from keyring. If unable to remove the key, the server shall throw a warning (I-15).

Replication master key rotation procedure is complete.

A-2: Replication encryption infrastructure initialization

When binlog_encryption is on, a keyring service must be available. When binlog_encryption is off, a keyring service might be available to read encrypted binary and relay log files.

When binlog_encryption is on, call A-3.

When setting binlog_encryption to on fails, or when binlog_encryption is off, call A-4.

A-3: Enable binlog encryption

Make last purged SEQNO = 0, master key SEQNO = 0, master key = <empty>, old master key SEQNO = -1, new master key SEQNO = -1.

Try to retrieve "MySQLReplicationKey_{UUID}" from keyring. Store its content into master key SEQNO.

Try to retrieve "MySQLReplicationKey_{UUID}_last_purged" from keyring. Store its content into last purged SEQNO.

Try to retrieve "MySQLReplicationKey_{UUID}_old" from keyring. Store its content into old master key SEQNO.

Try to retrieve "MySQLReplicationKey_{UUID}_new" from keyring. Store its content into new master key SEQNO.

Take action according to the following table:

  | master key SEQNO | old master key SEQNO | new master key SEQNO | new master key | Action                                                                                 |
  |------------------|----------------------|----------------------|----------------|----------------------------------------------------------------------------------------|
  | doesn't exist    | doesn't exist        | doesn't exist        | N/A            | This is the first time the option is enabled. Rotate the master key SEQNO to '1' (*).  |
  | n > 0            | doesn't exist        | doesn't exist        | N/A            | Ordinary server startup with known replication master key.                             |
  | n > 0            | n                    | doesn't exist        | N/A            | Continue replication master key rotation from A-1.2.                                   |
  | n > 0            | n                    | m > n                | doesn't exist  | Continue replication master key rotation from A-1.3.                                   |
  | n > 0            | n                    | m > n                | exists         | Continue replication master key rotation from A-1.4.                                   |
  | doesn't exist    | n > 0                | m > n                | exists         | Continue replication master key rotation from A-1.5.                                   |
  | m > 0            | n                    | m > n                | exists         | Continue replication master key rotation from A-1.6.                                   |
  | m > 0            | doesn't exist        | m                    | exists         | Continue replication master key rotation from A-1.8.                                   |

Any other combination from the table shall be reported as an error failing to set the binlog_encryption to ON and (TODO: abort?).

Set binlog_encryption to on.

(*) Or to the first sequence number without an existing respective replication key in the keyring.
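
The decision table in A-3 can be read as a recovery dispatch, sketched below. The function name resume_action is hypothetical, and None stands for "doesn't exist" (the text above uses 0 and -1 as initial values instead). Any combination not listed in the table raises an error.

```python
from typing import Optional


def resume_action(
    master: Optional[int],
    old: Optional[int],
    new: Optional[int],
    new_key_exists: bool,
) -> str:
    """Map the SEQNOs found on the keyring to the step to continue from."""
    if master is None and old is None and new is None:
        return "first enable"                 # rotate to the first free SEQNO
    if master is not None and master > 0:
        if old is None and new is None:
            return "ordinary startup"
        if old == master and new is None:
            return "A-1.2"
        if old == master and new is not None and new > old:
            return "A-1.4" if new_key_exists else "A-1.3"
        if old is not None and new == master and new > old and new_key_exists:
            return "A-1.6"
        if old is None and new == master and new_key_exists:
            return "A-1.8"
    if (master is None and old is not None and old > 0
            and new is not None and new > old and new_key_exists):
        return "A-1.5"
    raise ValueError("unexpected keyring state for binlog encryption")
```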

A-4: Disable replication logs encryption

Make last purged SEQNO = 0, master key SEQNO = 0, master key = <empty>.

When binary logging is enabled, rotate the binary log to create a new binary log file with plain binary log file format.

For each existing replication channel, rotate the relay log (except for GR applier) to create a new relay log file with plain binary log file format.

Use cases

Consider the following diagram with events and states about replication logs encryption feature and server operations:

                       (E-1)
   .-----------.   Enable Option    .-----------.
  | Server OFF  | ---------------> | Server OFF  |
  | Option OFF  | <--------------- |  Option ON  |
   '-----------'   Disable Option   '-----------'
        ^ |            (E-2)             ^ |
        | |                              | |
  (E-3) | | (E-4)                  (E-7) | | (E-8)
   Stop | | Start                   Stop | | Start
        | |                              | |
        | V            (E-5)             | V
   .-----------.   Enable Option    .-----------.
  |  Server ON  | ---------------> |  Server ON  |
  | Option OFF  | <--------------- |  Option ON  |
   '-----------'   Disable Option   '-----------'
      ^     |          (E-6)           ^     |
      |     |                          |     |
       '---'                            '---'
   Key Rotation                     Key Rotation
       (E-9)                            (E-10)


     +---------------------------------------+
     |   Option = binlog_encryption option   |
     +---------------------------------------+

Consider also:

  • option: binlog_encryption option;
  • binlog: the binary log;
  • chN: a replication channel;
  • key: a replication encryption key;
  • master key: the replication master key;

Let's now describe some use cases to allow us to define the sequence number recovery logic and its limitations.

UC-01: Encryption off (never on)

A new master key is generated in E-5 or E-8 (by P-3). Disabling and re-enabling the option shall not generate new master keys as there is already one generated and defined as master key (by P-2).

UC-02: Binary logging off and no channels

This server configuration does not benefit from the replication logs encryption feature yet, but it is able to enable the option.

If a new master key is generated in E-5 or E-8 (by P-3), it will be stored in the keyring and this information shall not be lost upon server restart (P-2).

If any chN is created after enabling the option, it will generate encrypted relay log files relying on the master key.

UC-03: Binary logging on, no channels, encryption off (on at least once)

On E-4 and E-8 (by P-2) the server shall recover the master key.

UC-04: A given server instance is cloned having binlog_encryption on

Consider a server with UUID = UUID_A and binlog_encryption set to on.

After some time operating it shall have all its binary and relay logs containing UUID_A + SEQ_NO = 1 on the encrypted binary log header.

This server is then shut down and its files are copied to clone the server (the configuration file with server UUID is removed so the replica will generate a new UUID).

The replica is started (E-8) and generates a new UUID (UUID_B).

By P-2, the server shall figure out that it never generated a replication encryption key before, even though all binary and relay logs are already encrypted, because all replication encryption keys used were generated by another server instance (UUID_A). The server shall then follow P-3 and generate a new replication master key with SEQNO = 1.

UC-05: A server unable to initialize the binary log properly

Upon server startup, when binary logging is enabled and binary log files already exist, the server initialization will open and read at least the format description event of the last binary log file in order to decide whether the server crashed (and needs to be recovered) or not.

When the last binary log file cannot be read, the server aborts the startup (regardless of the --binlog-error-action option value) with the following error messages (example):

  [ERROR] ... [Server] Binlog has bad magic number;  It's not a binary log
                       file that can be used by this version of MySQL.
  [ERROR] ... [Server] Can't init tc log
  [ERROR] ... [Server] Aborting

So, a server with the binlog_encryption option enabled that is unable to initialize the binary log will never be able to generate or remove replication encryption keys, because it will not start up at all.

UC-06: A server was unable to initialize a given replication channel properly

Suppose:

File                  Encrypted  SEQNO
relay-bin-ch1.000070  Yes        3
relay-bin-ch1.000071  Yes        3
relay-bin-ch1.000072  Yes        4
relay-bin-ch2.000090  Yes        3
relay-bin-ch2.000091  Yes        3
relay-bin-ch2.000092  Yes        4

master key SEQNO = 4. last purged SEQNO = 2.

Suppose that server was restarted and was unable to retrieve replication encryption keys information from ch1 (the channel failed to initialize).

File                  Encrypted  SEQNO  Initialized
relay-bin-ch1.000070  Yes        3      No
relay-bin-ch1.000071  Yes        3      No
relay-bin-ch1.000072  Yes        4      No
relay-bin-ch2.000090  Yes        3      Yes
relay-bin-ch2.000091  Yes        3      Yes
relay-bin-ch2.000092  Yes        4      Yes
relay-bin-ch2.000093  Yes        4      Yes

master key SEQNO = 4. last purged SEQNO = 2. ch1: Not initialized. ch2: Initialized.

The replication threads start for ch2 and purge relay-bin-ch2.000090, create a relay-bin-ch2.000094 and then purge relay-bin-ch2.000091.

File                  Encrypted  SEQNO  Initialized
relay-bin-ch1.000070  Yes        3      No
relay-bin-ch1.000071  Yes        3      No
relay-bin-ch1.000072  Yes        4      No
relay-bin-ch2.000092  Yes        4      Yes
relay-bin-ch2.000093  Yes        4      Yes
relay-bin-ch2.000094  Yes        4      Yes

master key SEQNO = 4. last purged SEQNO = 2. ch1: Not initialized. ch2: Initialized.

The DBA/operator requests a replication master key rotation.

File                  Encrypted  SEQNO  Initialized
relay-bin-ch1.000070  Yes        3      No
relay-bin-ch1.000071  Yes        3      No
relay-bin-ch1.000072  Yes        4      No
relay-bin-ch2.000092  Yes        4      Yes
relay-bin-ch2.000093  Yes        4      Yes
relay-bin-ch2.000094  Yes        4      Yes
relay-bin-ch2.000095  Yes        5      Yes

master key SEQNO = 5. last purged SEQNO = 2. ch1: Not initialized. ch2: Initialized.

Because of ch1, no keys shall be removed after the rotation.

The key with SEQNO = 3 appears to be no longer needed, but the server shall not remove it from the keyring because (by P-4) there are channels that were not initialized, so the server cannot be sure that the key is no longer needed.

Allowing replication master key rotation while there are uninitialized channels is fine. Using the example above, if ch1 is recovered after the replication master key has been rotated twice, we may have the following situation:

File                  Encrypted  SEQNO  Initialized
relay-bin-ch1.000070  Yes        3      Yes
relay-bin-ch1.000071  Yes        3      Yes
relay-bin-ch1.000072  Yes        4      Yes
relay-bin-ch1.000073  Yes        6      Yes
relay-bin-ch2.000098  Yes        6      Yes
relay-bin-ch2.000099  Yes        6      Yes

master key SEQNO = 6. last purged SEQNO = 2. ch1: Initialized. ch2: Initialized.

After purging relay-bin-ch1.000070 and relay-bin-ch1.000071, a master key rotation shall trigger the removal of key 3, bumping the last purged SEQNO to 3 too.

After purging relay-bin-ch1.000072, a master key rotation shall trigger the removal of key 4, bumping the last purged SEQNO to 4 too.
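
The removal guard illustrated by UC-06 can be sketched as follows. This is only a minimal illustration, not the server's implementation: the Log_file and Channel types and the can_remove_key function are hypothetical names, and the binary log's own files are assumed to be modeled as just another entry in the channel list.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical bookkeeping types; the names are illustrative only.
struct Log_file {
  std::string name;
  uint32_t seqno;  // SEQNO of the key recorded in the file header
};

struct Channel {
  bool initialized;             // false when the channel failed to initialize
  std::vector<Log_file> files;  // only known when initialized is true
};

// Returns true when the key with the given SEQNO may be removed from the
// keyring: it is older than the master key, every channel was initialized,
// and no remaining file references it.
bool can_remove_key(uint32_t seqno, uint32_t master_seqno,
                    const std::vector<Channel> &channels) {
  assert(seqno >= 1);                       // sequence numbers start from 1
  if (seqno >= master_seqno) return false;  // never remove the master key
  for (const Channel &ch : channels) {
    // An uninitialized channel may still hold files that need this key.
    if (!ch.initialized) return false;
    for (const Log_file &f : ch.files)
      if (f.seqno == seqno) return false;
  }
  return true;
}
```

With ch1 uninitialized, the guard refuses to remove key 3 even though no initialized channel references it. Once ch1 is recovered and relay-bin-ch1.000070 and relay-bin-ch1.000071 are purged, key 3 becomes removable while key 4 is still held by relay-bin-ch1.000072.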

Removal of replication encryption keys from keyring

This worklog doesn't remove the replication master key from the keyring, even after RESET MASTER and RESET SLAVE are called.

Limitations

L-1: Encrypted binary log file header format

L-1.1: Global key ID must be at most 255 bytes and use ascii7.
L-1.2: Version must be 1.
L-1.3: Padding must be all zeros.

If the reader finds that any of these conditions does not hold, it shall generate the error defined in I-10 (see Interface Specification).
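
The three L-1 checks can be sketched as a single validation function. This is a hedged illustration: Parsed_header and header_is_valid are hypothetical names, the header is assumed to have been parsed into separate fields already, and "ascii7" is read as 7-bit ASCII (every byte at most 0x7F).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, already-parsed view of the header fields named in L-1.
struct Parsed_header {
  uint8_t version;
  std::string key_id;
  std::vector<unsigned char> padding;
};

// Returns true when the header satisfies L-1.1, L-1.2 and L-1.3.
bool header_is_valid(const Parsed_header &h) {
  if (h.version != 1) return false;         // L-1.2: version must be 1
  if (h.key_id.size() > 255) return false;  // L-1.1: at most 255 bytes
  for (unsigned char c : h.key_id)
    if (c > 0x7F) return false;             // L-1.1: 7-bit ASCII only
  for (unsigned char b : h.padding)
    if (b != 0) return false;               // L-1.3: padding all zeros
  return true;
}
```

A reader would raise the I-10 error whenever this function returns false.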

L-2: Operations

L-2.1: The server does not convert existing plain files to encrypted.
L-2.2: The server does not convert existing encrypted files to plain.

References

Cipher for binary log file encryption

Binary log files are encrypted using aes-ctr-256 mode.

For aes-ctr-256 details, please check:

  • https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#CTR
  • https://tools.ietf.org/html/rfc3686

Cipher for file password encryption

The file passwords are encrypted using aes-cbc-256 mode.

Replication encryption keys are aes-256 keys. Each file password is encrypted with a replication encryption key and a random IV. Encrypted file passwords are stored in the encrypted binary log file header.

Steps Planning

Step 1: Add code to read encrypted binary/relay log files

An encrypted binary/relay log file shall be readable regardless of replication encryption option.

Step 2: Introduce the option to enable/disable replication encryption

Introduce the binlog_encryption option to enable replication logs encryption.

Step 3: mysqlbinlog changes

Ensure that mysqlbinlog will not crash and will throw appropriate error messages.

Low-Level Design

Replication encryption key ID

The sequence number of key IDs follows these rules:

  • The sequence number starts from 1.
  • It increases by 1 each time a new replication key is generated. Generating new keys is not part of this worklog, however.
  • The sequence number is defined as an unsigned int, so it can never exceed UINT32_MAX. When the sequence number reaches UINT32_MAX, generating a new replication encryption key shall fail.
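
The overflow rule above can be sketched as follows. The function name next_sequence_number is hypothetical, and so are two conventions it relies on: a current value of 0 means that no key has been generated yet, and the function returns true on failure.

```cpp
#include <cassert>
#include <cstdint>

// Computes the sequence number for the next replication encryption key.
// Returns false on success and true on failure (sequence exhausted); this
// error convention is an assumption made for the sketch.
bool next_sequence_number(uint32_t current, uint32_t *next) {
  assert(next != nullptr);
  if (current == UINT32_MAX) return true;  // cannot go past UINT32_MAX
  *next = current + 1;                     // first key gets 1 when current is 0
  return false;
}
```

Checking before incrementing avoids the silent wrap-around to 0 that unsigned arithmetic would otherwise produce.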

Store master key sequence number into keyring

To make it possible to recover the replication master key ID sequence number, the current sequence number is stored in the keyring.

Sequence number recovery

When starting the server, there is a process to get the master key sequence number and the last purged sequence number from the keyring.

Since the maximum sequence number will be used to encrypt newly generated replication logs when binary log encryption is ON, the server needs to get the master key sequence number before opening any replication log.

So we need to change the initialization process of replication logs.

The original process looks like:

  1. Initialize binary log index. Create a new index file if it doesn't exist.
  2. Generate a new binary log and open it.
  3. Read channels information from repositories (either tables or files).
  4. For each channel:
     4.1. Initialize index file.
     4.2. Generate a new relay log and open it.

The new process looks like:

  1. Initialize master key sequence number and its corresponding replication encryption key from keyring.
  2. Initialize binary log index. Create a new index file if it doesn't exist.
  3. Generate a new binary log and open it.
  4. Read channels information from repositories (either tables or files).
  5. For each channel:
     5.1. Initialize index file.
     5.2. Generate a new relay log and open it.

Binary log file encryption (V1)

File Password

It is 32 bytes of random data generated by my_rand_buffer.

File encryption

  • The file key and nonce are used to encrypt file data. They are derived from the file password with one round of SHA512 hashing. The SHA512 digest has 512 bits (64 bytes). The first 32 bytes of the digest are used as the file key, and the following 8 bytes are used as the nonce.

  • AES uses a 16-byte IV. The IV is the 8-byte nonce followed by an 8-byte counter.

IV = nonce | counter.

The counter is stored in big-endian order.

  • Salt is not used.
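
The layout described above can be sketched as follows. The names split_digest, make_iv and File_key_material are hypothetical, and computing the SHA512 digest itself is left out: the sketch only shows how a 64-byte digest is assumed to be split and how the IV is assembled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDigestSize = 64;  // SHA512 digest length in bytes
constexpr std::size_t kKeySize = 32;     // file key length
constexpr std::size_t kNonceSize = 8;    // nonce length
constexpr std::size_t kIvSize = 16;      // AES IV length

struct File_key_material {
  std::array<unsigned char, kKeySize> key;
  std::array<unsigned char, kNonceSize> nonce;
};

// Splits the digest of the file password: bytes 0..31 become the file key
// and bytes 32..39 become the nonce. The remaining bytes are unused.
File_key_material split_digest(
    const std::array<unsigned char, kDigestSize> &digest) {
  File_key_material m{};
  for (std::size_t i = 0; i < kKeySize; ++i) m.key[i] = digest[i];
  for (std::size_t i = 0; i < kNonceSize; ++i)
    m.nonce[i] = digest[kKeySize + i];
  return m;
}

// Builds IV = nonce | counter, with the counter stored in big-endian order.
std::array<unsigned char, kIvSize> make_iv(
    const std::array<unsigned char, kNonceSize> &nonce, uint64_t counter) {
  std::array<unsigned char, kIvSize> iv{};
  for (std::size_t i = 0; i < kNonceSize; ++i) iv[i] = nonce[i];
  for (std::size_t i = 0; i < 8; ++i)
    iv[kNonceSize + i] = static_cast<unsigned char>(counter >> (8 * (7 - i)));
  return iv;
}
```

For example, a counter of 1 yields an IV whose last byte is 0x01 and whose preceding seven counter bytes are zero.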

The encryption implementation can be verified with the openssl command-line tool:

  openssl enc -d -aes-256-ctr \
    -K <KEY> -iv <IV> -nosalt \
    -in <encrypted file> -out <decrypted file>

class Rpl_encryption

The class holds the logic of binary log encryption.

Enable/disable encryption

  • bool enable(THD* thd);
  • bool disable(THD* thd);
  • bool is_enabled();

Manage global keys

Generate keys, fetch keys from keyring, purge keys.

  typedef std::basic_string<unsigned char> Key_string;
  struct Global_key {
    std::string m_id;
    Key_string m_key;
  };

Key_string get_key(const std::string &key_id);

Return the key with given key_id to caller.

Global_key get_current_key();

Return current key to caller.

m_seq_num_max:

The maximum replication encryption key sequence number used in binary or relay log files. It is used to generate a new sequence number.

m_seq_num_max is initialized at server startup.

m_keys:

It is a map for caching keys, which avoids connecting to the keyring often.
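
A cache of this kind can be sketched as below. This is not the Rpl_encryption code: Key_cache, Key_bytes and the Fetcher callback are hypothetical names, the keyring lookup is replaced by a caller-supplied function, and std::vector<unsigned char> stands in for Key_string to keep the sketch portable.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Key_bytes = std::vector<unsigned char>;  // stand-in for Key_string

class Key_cache {
 public:
  // Fetches a key from the keyring; returns true on error (assumed
  // convention for this sketch).
  using Fetcher = std::function<bool(const std::string &, Key_bytes *)>;

  explicit Key_cache(Fetcher fetch) : m_fetch(std::move(fetch)) {}

  // Returns false and fills *out on success, true on error. The keyring
  // is only consulted on the first request for a given key_id.
  bool get_key(const std::string &key_id, Key_bytes *out) {
    assert(out != nullptr);
    auto it = m_keys.find(key_id);
    if (it == m_keys.end()) {
      Key_bytes fetched;
      if (m_fetch(key_id, &fetched)) return true;  // keyring lookup failed
      it = m_keys.emplace(key_id, std::move(fetched)).first;
    }
    *out = it->second;
    return false;
  }

 private:
  Fetcher m_fetch;
  std::map<std::string, Key_bytes> m_keys;  // key ID -> cached key
};
```

Failed lookups are deliberately not cached in this sketch, so a key that becomes available later can still be fetched.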

uint32_t get_sequence_number():

Return the maximum sequence number.

rpl_encryption

The global instance of Rpl_encryption.

class Aes_ctr_cipher

It implements CTR encryption and decryption.

bool open(const Key_string &password);

Open a cipher. It generates the CTR key and IV from the password, sets the counter to 0, and then initializes the cipher context.

void close();

Release cipher context.

bool encrypt(unsigned char *dest, const unsigned char *src, int length);

Encrypt data.

bool decrypt(unsigned char *dest, const unsigned char *src, int length);

Decrypt data.

bool set_stream_offset(uint64_t offset);

Supports seeking, so that encryption or decryption can start at an arbitrary position in the stream.
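
Seeking in CTR mode amounts to computing which block counter covers the requested offset. The sketch below assumes that counter 0 corresponds to stream offset 0 (as set by open()) and that the offset is measured within the encrypted payload rather than from the start of the file; Ctr_position and locate are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kAesBlockSize = 16;  // AES block size in bytes

struct Ctr_position {
  uint64_t counter;   // block counter to place in the IV
  unsigned int skip;  // keystream bytes to discard inside that block
};

// Maps a stream offset to the CTR block counter and the offset inside
// that block.
Ctr_position locate(uint64_t offset) {
  return Ctr_position{offset / kAesBlockSize,
                      static_cast<unsigned int>(offset % kAesBlockSize)};
}
```

For instance, offset 37 falls in block 2 (bytes 32 to 47), five bytes into the block, so the cipher would be re-initialized with counter 2 and the first five keystream bytes discarded.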

class Rpl_encryption_ostream

It adds encryption to an output stream.

It derives from Truncatable_ostream. Features include:

  • Initialize file password and cipher;
  • Add encryption header into binlog file;
  • Encrypt the plain data with AES-CTR-256 cipher;

class Rpl_encryption_istream

It adds decryption to an input stream.

It derives from Basic_seekable_istream. Features include:

  • Read encryption header to initialize file password and cipher;
  • Decrypt data;