WL#12007: Refactor the ACL queries rewrite APIs
Affects: Server-8.0 — Status: Complete
Intent ------ Improve the SQL query rewrite APIs so that it become easy to extend and maintain query rewriting in future. Motivation ---------- We find it extremely difficult to make changes with new requirements/features especially for the ACL DDLs. It happens due to following reasons : 1. There is no general framework for query rewriting. There are multiple ways rewrite APIs are called. 2. Some of the ACL DDL APIs are rigid as they do not adhere to single responsibility principle. A simple change causes a cascade of subsequent changes. 3. More the changes required means more chances of breakage in many places. This worklog requires to skip the REPLACE
clause from the binlog but it has to be written in other types of logs. It turns out that cost of refactoring the current APIs outweighs the cost of making adjustments in the current design, finding and fixing bugs later on. In fact we discovered bug with current design. Objective --------- Following changes are keeping in the mind the ACL DDLs but they are applicable to rewriting of any SQL statement. -- Re-factor the rewrite APIs so that it becomes easy to extend and maintain them. -- Add the rewritten SQL statement consistently across the same target type (i.e. General, SQL query and audit log). It is especially needed keeping in mind the ACL DDLs but it is applicable to any other SQL statement as well.
FR-1: There must not be any impact on current rewriting behavior of SQL statements except the following: a) CREATE USER b) ALTER USER FR-2: The output of the query rewrite may vary according to the target type. For instance : a) Plain text password is converted to literal value
in the audit log. b) Hash of plain text password is added to the binlog. FR-4: Password specified in the SQL statement must be replaced by the literal value in case of general, audit and slow query log. FR-4.1: The literal must be added without any quotation marks so the statement would fail if the user were to cut & paste it without filling in the real password. FR-5: There must not be any changes in the following behaviors: FR-5.1 : If password is not specified in the SQL statement then literal value must not be added in the rewritten SQL statement for audit, general and slow query logs. FR-5.2 : If the password is specified in the SQL statement then hash of the same should be added in the rewritten SQL statement for the binlog. FR-6: The rewritten statement should be as close to original SQL statement as possible. FR-6.1 : If the SQL statement doesn't contain the plain text password then output in the same consumer types may not be consistently same. FR-7: SHOW CREATE USER must print the current values of the clauses which are not specified in the statement. For instance: mysql> create user test; mysql> show create user test\G *************************** 1. row *************************** CREATE USER for test@%: CREATE USER 'test'@'%' IDENTIFIED WITH 'caching_sha2_password' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK PASSWORD HISTORY DEFAULT PASSWORD REUSE INTERVAL DEFAULT FR-8: The default value of unspecified clauses in the CREATE|ALTER USER statements must not be added in the audit,general and slow query logs. FR-8.1 : The rewritten statement in the TEXTLOG contains the PASSWORD HISTORY and PASSWORD REUSE clauses if they are specified in the SQL statement. FR-8.2 : The rewritten statement in the TEXTLOG contains the auth plugin information if IDENTIFIED WITH clause is specified in the SQL statement. FR-9: There must not be any impact on firewall plugin even after upgrading the server due to this work.
High-Level Specification ======================== Background: ----------- We rewrite a SQL statement for following reasons: a) Obfuscate the password as following We replace the password with its hash where we have the latter, otherwise we replace the password by a literal "
", with *no* quotation marks so the statement would fail if the user were to cut & paste it without filling in the real password. Replacing the plain text password with the hash value is primarily done to write the statement safely to the binlogs. It is not necessary to write the same in the in the usual log files. b) Skip some clauses from ACL DDLs before adding the statement to binlog. Prior to WL#6595, the notion had been that once the SQL statement was rewritten for binlog, the same was good to go in the subsequent logs. Now, the default PASSWORD HISTORY and PASSWORD REUSE clauses are added across all the logs always but added to the binlog only if they are specified in the SQL statement. c) Needs to re-synthesize some of the SQL statements. There are following three mediums where we add the rewritten SQL statement. a) Usual log files that includes General, slow query and audit log. b) binlog files c) Standard output to print the result of queries like SHOW CREATE USER. The statement is added to logs at different point in time. It is added to the general log just before parsing of the statement, once the statement is executed successfully it is written to the binlog followed by audit log/slow log/performance schema etc. Problems -------- 1) A SQL statement which contains a plain text password is rewritten in two different type in the different logs. By the time the statement is written to the general log, it contains a plain text password which is replaced by the literal value. Once the statement is processed the plain text password is converted into hash. Now, the query is rewritten for binlog that uses the hash value over the literal value to obfuscate the password. This point onward the query written with literal value in the general log, is written with the hashes in the subsequent logs(i.e.audit/slow-query log). Above behavior is *ambiguous* and *inconsistent* because log files convey the same message even though the amount of information present in them may vary. Hence they must have same SQL statement. 2) Current rewrite design is difficult to maintain and extend. For instance - Refer Bug#24911117 (Sakila) fix in RB#19217. There are two problems with the fix due the limitation of current design: a. A simple flag value is trickle down from the high level APIs to the low level API. b. Though the change is only for SHOW CREATE USER statement but it affects other statements as well(Unless a condition check is added). 3) As described in the background paragraph above, if some of the clauses are not specified in the SQL statement then their default value is added to the statement rewritten for the general log but not for the audit, slow query and binlogs. It happens because once the query is rewritten for the binlog the same is used for subsequent logs(Refer Bug#27967905). We have decided not to add the default value of unspecified clauses in the SQL statement for the log files. This will ensure that same rewritten SQL statement is added across the audit,general and slow query logs. 4) The behavior in point#3 might not have got enough attention so far but it does now due to upcoming WL#11544. We need to skip the REPLACE clause from the binlog but it has to be present in all other log files. Due to obscure design of current rewrite APIs it has become tedious to do any further changes. This worklog proposes an alternate design to fix the current problems, and make it easier to extend the rewrite functionality if so desired in future. Interface Specification ======================= IS-1: Move the rewrite API implementation into the class hierarchy. IS-2: Define an interface to rewrite any SQL statement. IS-2.1: A concrete class must provide the implementation of particular statement. IS-3: All the rewrite requests must go through a wrapper method. Clients must not create the concrete objects directly. IS-4: Define the types of target for which query will be rewritten. For instance : LOG - General, slow and audit logs BINLOG - Binary log CONSOLE - Standard output IS-4.1: Specify the target type for which a query is rewritten. IS-5: Move the common operation across concrete implementations to a separate util class.
1) Added following class hierarchy. There is an abstract base class Rewriter. All the concrete classes have to implement the method rewrite(). +-----------+ | Rewriter | +-----------+ ^ | | +-----------------+---------------+------------+-----+------------so on | | | | | | | | | | Rewriter_user Rewriter_set Rewriter_slave_start | Rewriter_create_server ^ ^ | | | Rewriter_change_master | Rewriter_set_password | +-------------+---------------+ | | | | | | | | Rewriter_show_create_user | | | Rewriter_alter_user | Rewriter_create user 2) Added a Rewriter_util class to wrap the common utility methods used across the classes. 3) Some of the ACL DDLs rewrite additional parameters so as much like earlier, there are two interface functions to rewrite the query. The clients must not create the rewriter objects directly, instead they should call following functions as desired. void mysql_rewrite_query() void mysql_rewrite_acl_ddl()
Copyright (c) 2000, 2020, Oracle Corporation and/or its affiliates. All rights reserved.