WL#13239: Enable/disable primary key checks on slaves
EXECUTIVE SUMMARY
After this worklog, the user or operator shall be able to
prevent the applier threads to: (a) unconditionally disallow
creating or altering tables that end up without a primary key;
(b) relax the privilege handling to be able replay incoming
DDLs in the replication stream that set the SESSION scope of
the system variable sql_require_primary_key
.
USER STORIES
U1. As a replication user using a fan-in topology to aggregate data from several sources: - I want to prevent unwanted changes to the existing table key structure.
FUNCTIONAL REQUIREMENTS
FR1. With CHANGE MASTER
option REQUIRE_TABLE_PRIMARY_KEY_CHECK=ON
,
the user shall be allowed to create a channel with
@@session.sql_require_primary_key=1
that is not overriden by the
master. This channel shall forbid the replication of operations that
create or alter tables that end up without a primary key.
FR2. With CHANGE MASTER
option
REQUIRE_TABLE_PRIMARY_KEY_CHECK=OFF
, the user shall be allowed to
create a channel with @@session.sql_require_primary_key=0
that is
not overriden by the master. This channel shall always allow the
replication of operations that create or alter tables that end up
without a primary key.
FR3. With the values ON or OFF for the CHANGE MASTER
option
REQUIRE_TABLE_PRIMARY_KEY_CHECK
, when using a privileged user in the
replication applier, it shall not be necessary to give it admin
privileges like SESSION_VARIABLES_ADMIN
.
FR4. For compatibility with existing deployments where an admin blocks
any schema change by users not having SESSION_VARIABLES_ADMIN
privileges, disabling the @@session.sql_require_primary_key
will
still require at least SESSION_VARIABLES_ADMIN
when
REQUIRE_TABLE_PRIMARY_KEY_CHECK is set to STREAM, being that the
default value.
NON FUNCTIONAL REQUIREMENTS
NF1. This worklog shall have no performance penalty on replication.
1. Background
With the introduction of privilege checks on replication channels, the database administrator now has a much finer control over what happens on slave servers. This is critical when replicating from outside a security boundary where servers might have unknown settings, data or insecure access policies among other issues.
One of the current limitations however pertains with the privileges
needed by some replicated options being the one focused here the
server option: sql_require_primary_key
sql_require_primary_key : Whether statements that create new tables or alter the structure of existing tables enforce the requirement that tables have a primary key.
The fact that all CREATE TABLE
and ALTER TABLE
replicated transactions
include this variable raises some issues in a secured channel.
In order to replicate
CREATE TABLE
andALTER TABLE
queries the privileged checks user associated to a replication channel must haveSESSION_VARIABLES_ADMIN privileges
.Even if the user sets
@@global.sql_require_primary_key = 1
for the slave, the master can still create keyless tables as this option is always replicated. Users can't secure a slave server against it.As
SESSION_VARIABLES_ADMIN
is needed to disable or enablesql_require_primary_key
it is not possible to secure the slave server by simply revoking this privilege for the privileged checks user in order to solve the above issue.
In summary the issues are:
We want to avoid giving an high level privilege like
SESSION_VARIABLES_ADMIN
to a replication privileged checks user in order to in order to setsql_require_primary_key
in the replication channel session, when replicating table schema changes from the primary/master.As a secondary goal, we want to be able to override the value that is sent from the master completely for replication channels, so that we enforce a channel defined value and not the one from the master.
And again, the creation of keyless tables in this context means an explicit table creation or the removal of a table primary keys.
2. BASICS
We can now proceed to present the decisions made in the worklog and explain the reasoning behind them.
2.1. Requiring table primary keys in the slave
Before introducing the new option, we first want to look at the
current behavior. When manipulating the requirements for table
primary keys in both the master and slave the matrix of interactions
for sql_require_primary_key
is:
Case | Master | Slave | Assessment |
---|---|---|---|
A | 1 | 0 | OK |
B | 0 | 1 | Problem |
C | 0 | 0 | OK |
D | 1 | 1 | OK |
Looking at it we can analyze each case
A : If the master requires tables to have primary keys but the slave doesn't care, there are no issues on replication and so the slave does not need to change
sql_require_primary_key
.B: This is the true problematic case. If the master doesn't require primary keys but the slave has this restrictions, then the slave must adapt to master queries (by setting the variable on its session) or abort the transaction.
C and D: If the policy is the same on both master and slave there is no need to change
sql_require_primary_key
when replicating.
The conclusion here is that only when a master wants to replicate the creation or alteration of tables with no primary key restrictions into a slave that has them do we see a policy clash between the two servers.
It also means that if the value for sql_require_primary_key
is
globally disabled in the slave, any changes replicated by the master in
the channel context, on a normal scenario, should not go against the
user defined restrictions for the server.
2.2 Introducing REQUIRE_TABLE_PRIMARY_KEY_CHECK
To have a more explicit control of how we enforce or not
sql_require_primary_key
in the slave we now introduce
CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM | ON | OFF
To the user the behavior is
ON: The slave requires primary keys. It sets
session.sql_require_primary_key=1
and we never change this value for the slave context.OFF: The slave does not require primary keys on tables. It sets
session.sql_require_primary_key=0
and we never change this value for the slave context.STREAM (With no privileged user): The slave follows the legacy behavior. It sets whatever value comes from the master.
STREAM (With a privileged user): Same as above. The difference is that user must have
SESSION_VARIABLES_ADMIN
privileges for this operation.
We should explain here the reasoning behind STREAM.
The first reason is that, it being the default, will respect whatever
setup users have deployed before upgrade. The ON value is more
restrictive than the old default so it is not a viable choice. The
OFF value, while similar to STREAM, would allow the
replication of table creation and alter queries when the database
admin configured a privileged user with no SESSION_VARIABLES_ADMIN
privileges to prevent it .
Second reason is to make the behavior more explicit to the end user without the need of presenting behavior matrixes.
2.3 Privileges and sql_require_primary_key
Following the requirement to remove the need to give the slave applier
privileged users the privilege SESSION_VARIABLES_ADMIN
we now
evaluate the current proposal from that perspective.
The first point to make is that this decision is per channel, so whatever is configured, it will only restrict the operation of that unit.
For value STREAM we follow the current server/slave behavior,
meaning the user associated to a channel will need
SESSION_VARIABLES_ADMIN
privileges to set sql_require_primary_key
.
Hence there is no changes to the current security policies here.
For values ON and OFF we set for the session the value of
sql_require_primary_key
when we execute the CHANGE MASTER
command.
With this option the user can now restrict or not the behavior for
this channel at the configuration stage meaning no more privileges are
needed during execution thus fulfilling this worklog objective.
Also, we can compare this behavior to the one of a DBA creating a
client session in the slave and disabling sql_require_primary_key
for that said session. Tables with no primary keys are now allowed in
that session.
To execute a CHANGE MASTER
command the user needs
REPLICATION_SLAVE_ADMIN
or SUPER
, privileges with which a
replication channel can be created now and in the past with no
restrictions when setting sql_require_primary_key
.
In conclusion we are not making the slave any less secure, being the opposite true as there is now a more fine control over execution.
3. USER INTERFACE
3.1. New syntax for CHANGE MASTER TO
This section describes new syntax for CHANGE MASTER
There is one new SQL statement clause:
CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK = [STREAM|ON|OFF]
The behavior is:
When a new replication channel is created using CHANGE MASTER, and no REQUIRE_TABLE_PRIMARY_KEY_CHECK clause is specified, then the channel shall be configured with REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM.
When
RESET SLAVE
is used, it shall not affect REQUIRE_TABLE_PRIMARY_KEY_CHECK.When
RESET SLAVE ALL
is used in such a way that all replication channels are removed and a new default channel is created, then the new default channel shall have REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM.This setting can only be set when the slave applier is stopped (SQL threads).
This option is also available for use in group replication channels.
4. PERSISTENT CONFIGURATION
File
When @@global.relay_log_info_repository=FILE
, the file shall
contain one extra line:
- the flag REQUIRE_TABLE_PRIMARY_KEY_CHECK
Table
When @@global.relay_log_info_repository=TABLE
, the mysql.slave_relay_log_info
table shall contain one extra
column:
REQUIRE_TABLE_PRIMARY_KEY_CHECK
ENUM('STREAM','ON','OFF')
NOT NULL
DEFAULT 'STREAM'
COMMENT 'See section 6.2 : D1'
Clone
This flag is cloned between donor and receiver. When the server restarts after clone it shall either try to recover the present relay info information or reset it in case of a group replication channel. Either way, the info is preserved.
5. UPGRADES
- When the server is upgraded from a version that does not have this worklog, to a version that has it, all existing replication channels shall have REQUIRE_TABLE_PRIMARY_KEY_CHECK = 'STREAM'.
6.OBSERVABILITY
6.1 Performance_schema
The performance_schema.replication_applier_configuration
table
shall have the following new column:
REQUIRE_TABLE_PRIMARY_KEY_CHECK
ENUM('STREAM','ON','OFF')
COMMENT 'See section 6.2 : D1'
This shall show for the current channel what is value for REQUIRE_TABLE_PRIMARY_KEY_CHECK.
6.2 Error and description messages
Invalid REQUIRE_PRIMARY_KEYS value is passed to CHANGE MASTER
We use the standard error:
ER_PARSE_ERROR
"You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ... "
Errors related to sql_require_primary_key
No changes are made to the existing errors
- ER_SPECIFIC_ACCESS_DENIED_ERROR
Used on the slave when it checks if the a user can set sql_require_primary_key
- ER_TABLE_WITHOUT_PK
Propagated as a slave applier error, it is thrown when some replicated
query violates the restriction of sql_require_primary_key
.
D1. Description of REQUIRE_TABLE_PRIMARY_KEY_CHECK column on tables
"Indicates what is the channel policy regarding tables having primary keys on create and alter table queries"
Development Plan
1.1 Add fields to relay log info
Base: Add a new member in Relay_log_info
to store *REQUIRE_TABLE_PRIMARY_KEY_CHECK (File and Table).
Add the field to the relay log class.
Make the constructors initialize it.
Tests:
- Verify compilation.
Depends on LLD:
- None
1.2 Add info to performance schema
Observability: Implement REQUIRE_TABLE_PRIMARY_KEY_CHECK column
in performance_schema.replication_applier_configuration
.
Tests:
- Verify that the column contains
'STREAM'
by default - On a further phase test that the value is correct (Depends on phase 1.3)
Depends on LLD:
- Step 1.1
1.3 Add the syntax for CHANGE MASTER
Syntax: Add syntax for CHANGE MASTER TO REQUIRE_TABLE_PRIMARY_KEY_CHECK
.
Make CHANGE MASTER
update the Relay_log_info
members.
Tests:
- Verify that the CHANGE MASTER syntax is accepted.
- Verify that the CHANGE MASTER has observable effects on the p_s table.
- Verify that CHANGE MASTER fails when the channel SQL thread is running.
- Verify that CHANGE MASTER is accepted for group replication channels with this OPTION.
Depends on LLD:
- Step 1.2 (hence also 1.1)
1.4 Persistent configuration: Save the configuration in replication repositories.
Tests:
- REQUIRE_TABLE_PRIMARY_KEY_CHECK configuration exists in table/file.
- REQUIRE_TABLE_PRIMARY_KEY_CHECK configuration survives a server restart.
- Verify that RESET SLAVE has no observable effects for
REQUIRE_TABLE_PRIMARY_KEY_CHECK
. - Verify that RESET SLAVE ALL has observable effects for
REQUIRE_TABLE_PRIMARY_KEY_CHECK
.
Depends on LLD:
- Step 1.3 (hence also 1.1, 1.2)
1.5 Upgrades: Add the column to mysql.slave_relay_log_info during upgrade.
The value of the new column shall be dependent on the data of the table, depending if the channel is a group replication channel. The same rules apply to this field when reading from old relay log info files.
Tests:
- TC1: Verify that the column is added when doing an upgrade from 5.7
- TC2: Verify that the column is added when doing an upgrade from 8.0
- TC3: Verify that the column value is 'STREAM' for standard channels.
- TC4: Verify that TC3 is true when upgrading from Table repositories
- TC5: Verify that TC3 is true when upgrading from File repositories
Depends on LLD:
- Steps 1.4
1.6 Make the SQL thread set session values for sql_require_primary_key
Base: On the handle_slave_sql
method make the thread set the values for
sql_require_primary_key
according to the values of
REQUIRE_TABLE_PRIMARY_KEY_CHECK
Same for MTS workers.
If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM : Don't change the session
value for sql_require_primary_key
If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON : Set the session value
sql_require_primary_key
to 1.
If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF : Set the session value
sql_require_primary_key
to 0.
Depends on LLD:
- Depends on 1.1
Tests:
- See 1.7
1.7 Change how Query_log_event handles sql_require_primary_key
Base: Change the way we handle sql_require_primary_key
on
Query_log_event::do_apply_event
.
If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Set the session value of
sql_require_primary_key
with the replicated value but only only
allow it if the privileged user has SESSION_VARIABLES_ADMIN
privileges
If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON or OFF: Ignore the replicated
value for sql_require_primary_key
.
Tests:
- If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Verify that you can't
replicate a table creation or alteration query when the channel
privileged user does not have
SESSION_VARIABLES_ADMIN
or a superior privilege. - If REQUIRE_TABLE_PRIMARY_KEY_CHECK = STREAM: Verify that you can
replicate a table creation or alteration query (with or without
a primary key) when the channel privileged user has
SESSION_VARIABLES_ADMIN
or a superior privilege. - If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON: Verify that you cannot replicate
a table creation or alteration query that results on a table with no
primary key being the master value for
sql_require_primary_key
0 or 1. - If REQUIRE_TABLE_PRIMARY_KEY_CHECK = ON: Verify that you can replicate a table creation or alteration query that results on a table with a primary key.
- If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF: Verify that you can replicate a
table creation or alteration query that results on a table with no
primary key being the master value for
sql_require_primary_key
0 or 1. - If REQUIRE_TABLE_PRIMARY_KEY_CHECK = OFF: Verify that you can replicate a table creation or alteration query that results on a table with a primary key.
Depends on LLD:
- Depends on 1.6