WL#13239: Enable/disable primary key checks on slaves

Affects: Server-8.0   —   Status: Complete

EXECUTIVE SUMMARY

After this worklog, the user or operator shall be able to prevent the applier threads to: (a) unconditionally disallow creating or altering tables that end up without a primary key; (b) relax the privilege handling to be able replay incoming DDLs in the replication stream that set the SESSION scope of the system variable sql_require_primary_key.

USER STORIES

U1. As a replication user using a fan-in topology to aggregate data from several sources: - I want to prevent unwanted changes to the existing table key structure.

FUNCTIONAL REQUIREMENTS

FR1. With CHANGE MASTER option REQUIRE_TABLE_PRIMARY_KEY_CHECK=ON, the user shall be allowed to create a channel with @@session.sql_require_primary_key=1 that is not overriden by the master. This channel shall forbid the replication of operations that create or alter tables that end up without a primary key.

FR2. With CHANGE MASTER option REQUIRE_TABLE_PRIMARY_KEY_CHECK=OFF, the user shall be allowed to create a channel with @@session.sql_require_primary_key=0 that is not overriden by the master. This channel shall always allow the replication of operations that create or alter tables that end up without a primary key.

FR3. With the values ON or OFF for the CHANGE MASTER option REQUIRE_TABLE_PRIMARY_KEY_CHECK, when using a privileged user in the replication applier, it shall not be necessary to give it admin privileges like SESSION_VARIABLES_ADMIN.

FR4. For compatibility with existing deployments where an admin blocks any schema change by users not having SESSION_VARIABLES_ADMIN privileges, disabling the @@session.sql_require_primary_key will still require at least SESSION_VARIABLES_ADMIN when REQUIRE_TABLE_PRIMARY_KEY_CHECK is set to STREAM, being that the default value.

NON FUNCTIONAL REQUIREMENTS

NF1. This worklog shall have no performance penalty on replication.

1. Background

With the introduction of privilege checks on replication channels, the database administrator now has a much finer control over what happens on slave servers. This is critical when replicating from outside a security boundary where servers might have unknown settings, data or insecure access policies among other issues.

One of the current limitations however pertains with the privileges needed by some replicated options being the one focused here the server option: sql_require_primary_key

sql_require_primary_key : Whether statements that create new tables or alter the structure of existing tables enforce the requirement that tables have a primary key.

The fact that all CREATE TABLE and ALTER TABLE replicated transactions include this variable raises some issues in a secured channel.

  • In order to replicate CREATE TABLE and ALTER TABLE queries the privileged checks user associated to a replication channel must have SESSION_VARIABLES_ADMIN privileges.

  • Even if the user sets @@global.sql_require_primary_key = 1 for the slave, the master can still create keyless tables as this option is always replicated. Users can't secure a slave server against it.

  • As SESSION_VARIABLES_ADMIN is needed to disable or enable sql_require_primary_key it is not possible to secure the slave server by simply revoking this privilege for the privileged checks user in order to solve the above issue.

In summary the issues are:

  • We want to avoid giving an high level privilege like SESSION_VARIABLES_ADMIN to a replication privileged checks user in order to in order to set sql_require_primary_key in the replication channel session, when replicating table schema changes from the primary/master.

  • As a secondary goal, we want to be able to override the value that is sent from the master completely for replication channels, so that we enforce a channel defined value and not the one from the master.

And again, the creation of keyless tables in this context means an explicit table creation or the removal of a table primary keys.

2. BASICS

We can now proceed to present the decisions made in the worklog and explain the reasoning behind them.

2.1. Requiring table primary keys in the slave

Before introducing the new option, we first want to look at the current behavior. When manipulating the requirements for table primary keys in both the master and slave the matrix of interactions for sql_require_primary_key is:

Case Master Slave Assessment
A 1 0 OK
B 0 1 Problem
C 0 0 OK
D 1 1 OK

Looking at it we can analyze each case

  • A : If the master requires tables to have primary keys but the slave doesn't care, there are no issues on replication and so the slave does not need to change sql_require_primary_key.

  • B: This is the true problematic case. If the master doesn't require primary keys but the slave has this restrictions, then the slave must adapt to master queries (by setting the variable on its session) or abort the transaction.

  • C and D: If the policy is the same on both master and slave there is no need to change sql_require_primary_key when replicating.

The conclusion here is that only when a master wants to replicate the creation or alteration of tables with no primary key restrictions into a slave that has them do we see a policy clash between the two servers.

It also means that if the value for sql_require_primary_key is globally disabled in the slave, any changes replicated by the master in the channel context, on a normal scenario, should not go against the user defined restrictions for the server.

2.2 Introducing REQUIRE_TABLE_PRIMARY_KEY_CHECK

To have a more explicit control of how we enforce or not sql_require_primary_key in the slave we now introduce

CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM | ON | OFF

To the user the behavior is

  • ON: The slave requires primary keys. It sets session.sql_require_primary_key=1 and we never change this value for the slave context.

  • OFF: The slave does not require primary keys on tables. It sets session.sql_require_primary_key=0 and we never change this value for the slave context.

  • STREAM (With no privileged user): The slave follows the legacy behavior. It sets whatever value comes from the master.

  • STREAM (With a privileged user): Same as above. The difference is that user must have SESSION_VARIABLES_ADMIN privileges for this operation.

We should explain here the reasoning behind STREAM.

The first reason is that, it being the default, will respect whatever setup users have deployed before upgrade. The ON value is more restrictive than the old default so it is not a viable choice. The OFF value, while similar to STREAM, would allow the replication of table creation and alter queries when the database admin configured a privileged user with no SESSION_VARIABLES_ADMIN privileges to prevent it .

Second reason is to make the behavior more explicit to the end user without the need of presenting behavior matrixes.

2.3 Privileges and sql_require_primary_key

Following the requirement to remove the need to give the slave applier privileged users the privilege SESSION_VARIABLES_ADMIN we now evaluate the current proposal from that perspective.

The first point to make is that this decision is per channel, so whatever is configured, it will only restrict the operation of that unit.

For value STREAM we follow the current server/slave behavior, meaning the user associated to a channel will need SESSION_VARIABLES_ADMIN privileges to set sql_require_primary_key. Hence there is no changes to the current security policies here.

For values ON and OFF we set for the session the value of sql_require_primary_key when we execute the CHANGE MASTER command. With this option the user can now restrict or not the behavior for this channel at the configuration stage meaning no more privileges are needed during execution thus fulfilling this worklog objective.

Also, we can compare this behavior to the one of a DBA creating a client session in the slave and disabling sql_require_primary_key for that said session. Tables with no primary keys are now allowed in that session.

To execute a CHANGE MASTER command the user needs REPLICATION_SLAVE_ADMIN or SUPER, privileges with which a replication channel can be created now and in the past with no restrictions when setting sql_require_primary_key.

In conclusion we are not making the slave any less secure, being the opposite true as there is now a more fine control over execution.

3. USER INTERFACE

3.1. New syntax for CHANGE MASTER TO

This section describes new syntax for CHANGE MASTER

There is one new SQL statement clause:

CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK = [STREAM|ON|OFF]

The behavior is:

  1. When a new replication channel is created using CHANGE MASTER, and no REQUIRE_TABLE_PRIMARY_KEY_CHECK clause is specified, then the channel shall be configured with REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM.

  2. When RESET SLAVE is used, it shall not affect REQUIRE_TABLE_PRIMARY_KEY_CHECK.

  3. When RESET SLAVE ALL is used in such a way that all replication channels are removed and a new default channel is created, then the new default channel shall have REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM.

  4. This setting can only be set when the slave applier is stopped (SQL threads).

  5. This option is also available for use in group replication channels.

4. PERSISTENT CONFIGURATION

File

When @@global.relay_log_info_repository=FILE, the file shall contain one extra line:

  • the flag REQUIRE_TABLE_PRIMARY_KEY_CHECK

Table

When @@global.relay_log_info_repository=TABLE, the mysql.slave_relay_log_info table shall contain one extra column:

 REQUIRE_TABLE_PRIMARY_KEY_CHECK
 ENUM('STREAM','ON','OFF')
 NOT NULL
 DEFAULT 'STREAM'
 COMMENT 'See section 6.2 : D1'

Clone

This flag is cloned between donor and receiver. When the server restarts after clone it shall either try to recover the present relay info information or reset it in case of a group replication channel. Either way, the info is preserved.

5. UPGRADES

  1. When the server is upgraded from a version that does not have this worklog, to a version that has it, all existing replication channels shall have REQUIRE_TABLE_PRIMARY_KEY_CHECK = 'STREAM'.

6.OBSERVABILITY

6.1 Performance_schema

The performance_schema.replication_applier_configuration table shall have the following new column:

  REQUIRE_TABLE_PRIMARY_KEY_CHECK
  ENUM('STREAM','ON','OFF')
  COMMENT 'See section 6.2 : D1'

This shall show for the current channel what is value for REQUIRE_TABLE_PRIMARY_KEY_CHECK.

6.2 Error and description messages

Invalid REQUIRE_PRIMARY_KEYS value is passed to CHANGE MASTER

We use the standard error:

ER_PARSE_ERROR

"You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ... "

Errors related to sql_require_primary_key

No changes are made to the existing errors

  • ER_SPECIFIC_ACCESS_DENIED_ERROR

Used on the slave when it checks if the a user can set sql_require_primary_key

  • ER_TABLE_WITHOUT_PK

Propagated as a slave applier error, it is thrown when some replicated query violates the restriction of sql_require_primary_key.

D1. Description of REQUIRE_TABLE_PRIMARY_KEY_CHECK column on tables

"Indicates what is the channel policy regarding tables having primary keys on create and alter table queries"

Development Plan

1.1 Add fields to relay log info

Base: Add a new member in Relay_log_info to store *REQUIRE_TABLE_PRIMARY_KEY_CHECK (File and Table).
Add the field to the relay log class.
Make the constructors initialize it.

Tests:

  • Verify compilation.

Depends on LLD:

  • None

1.2 Add info to performance schema

Observability: Implement REQUIRE_TABLE_PRIMARY_KEY_CHECK column in performance_schema.replication_applier_configuration.

Tests:

  • Verify that the column contains 'STREAM' by default
  • On a further phase test that the value is correct (Depends on phase 1.3)

Depends on LLD:

  • Step 1.1

1.3 Add the syntax for CHANGE MASTER

Syntax: Add syntax for CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK.
Make CHANGE MASTER update the Relay_log_info members.

Tests:

  • Verify that the CHANGE MASTER syntax is accepted.
  • Verify that the CHANGE MASTER has observable effects on the p_s table.
  • Verify that CHANGE MASTER fails when the channel SQL thread is running.
  • Verify that CHANGE MASTER is accepted for group replication channels with this OPTION.

Depends on LLD:

  • Step 1.2 (hence also 1.1)

1.4 Persistent configuration: Save the configuration in replication repositories.

Tests:

  • REQUIRE_TABLE_PRIMARY_KEY_CHECK configuration exists in table/file.
  • REQUIRE_TABLE_PRIMARY_KEY_CHECK configuration survives a server restart.
  • Verify that RESET SLAVE has no observable effects for REQUIRE_TABLE_PRIMARY_KEY_CHECK.
  • Verify that RESET SLAVE ALL has observable effects for REQUIRE_TABLE_PRIMARY_KEY_CHECK.

Depends on LLD:

  • Step 1.3 (hence also 1.1, 1.2)

1.5 Upgrades: Add the column to mysql.slave_relay_log_info during upgrade.

The value of the new column shall be dependent on the data of the table, depending if the channel is a group replication channel. The same rules apply to this field when reading from old relay log info files.

Tests:

  • TC1: Verify that the column is added when doing an upgrade from 5.7
  • TC2: Verify that the column is added when doing an upgrade from 8.0
  • TC3: Verify that the column value is 'STREAM' for standard channels.
  • TC4: Verify that TC3 is true when upgrading from Table repositories
  • TC5: Verify that TC3 is true when upgrading from File repositories

Depends on LLD:

  • Steps 1.4

1.6 Make the SQL thread set session values for sql_require_primary_key

Base: On the handle_slave_sql method make the thread set the values for sql_require_primary_key according to the values of REQUIRE_TABLE_PRIMARY_KEY_CHECK Same for MTS workers.

If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM : Don't change the session value for sql_require_primary_key

If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON : Set the session value sql_require_primary_key to 1.

If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF : Set the session value sql_require_primary_key to 0.

Depends on LLD:

  • Depends on 1.1

Tests:

  • See 1.7

1.7 Change how Query_log_event handles sql_require_primary_key

Base: Change the way we handle sql_require_primary_key on Query_log_event::do_apply_event.

If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Set the session value of sql_require_primary_key with the replicated value but only only allow it if the privileged user has SESSION_VARIABLES_ADMIN privileges

If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON or OFF: Ignore the replicated value for sql_require_primary_key.

Tests:

  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Verify that you can't replicate a table creation or alteration query when the channel privileged user does not have SESSION_VARIABLES_ADMIN or a superior privilege.
  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Verify that you can replicate a table creation or alteration query (with or without a primary key) when the channel privileged user has SESSION_VARIABLES_ADMIN or a superior privilege.
  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON: Verify that you cannot replicate a table creation or alteration query that results on a table with no primary key being the master value for sql_require_primary_key 0 or 1.
  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON: Verify that you can replicate a table creation or alteration query that results on a table with a primary key.
  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF: Verify that you can replicate a table creation or alteration query that results on a table with no primary key being the master value for sql_require_primary_key 0 or 1.
  • If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF: Verify that you can replicate a table creation or alteration query that results on a table with a primary key.

Depends on LLD:

  • Depends on 1.6