WL#12007: Refactor the ACL queries rewrite APIs

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Intent
------
Improve the SQL query rewrite APIs so that it become easy to extend and
maintain query rewriting in future.

Motivation
---------- 
We find it extremely difficult to make changes with new requirements/features
especially for the ACL DDLs. It happens due to following reasons :

1. There is no general framework for query rewriting. There are multiple ways
   rewrite APIs are called.
2. Some of the ACL DDL APIs are rigid as they do not adhere to single
   responsibility principle. A simple change causes a cascade of subsequent
   changes. 
3. More the changes required means more chances of breakage in many places.
      
This worklog requires to skip the REPLACE  <password> clause 
from the binlog but it has to be written in other types of
logs. It turns out that cost of refactoring the current APIs outweighs the cost
of making adjustments in the current design, finding  and fixing bugs later on.
In fact we discovered bug with current design.

Objective
---------
Following changes are keeping in the mind the ACL DDLs but they are applicable 
to rewriting of any SQL statement.

-- Re-factor the rewrite APIs so that it becomes easy to extend and maintain
   them.
   
-- Add the rewritten SQL statement consistently across the same target type
   (i.e. General, SQL query and audit log). It is especially needed keeping 
   in mind the ACL DDLs but it is applicable to any other SQL statement as
   well.
FR-1: There must not be any impact on current rewriting behavior of
      SQL statements except the following:
      a) CREATE USER
      b) ALTER USER

FR-2: The output of the query rewrite may vary according to the target type.
      For instance : 
      a) Plain text password is converted to literal value <secret> in the
         audit log.
      b) Hash of plain text password is added to the binlog.
   

FR-4: Password specified in the SQL statement must be replaced by the literal
      value <secret> in case of general, audit and slow query log.

      FR-4.1: The literal <secret> must be added without any quotation marks so
      the statement would fail if the user were to cut & paste it without 
      filling in  the real password. 

FR-5: There must not be any changes in the following behaviors:

      FR-5.1 : If password is not specified in the SQL statement then literal 
               value <secret> must not be added in the rewritten SQL 
               statement for audit, general and slow query logs.

      FR-5.2 : If the password is specified in the SQL statement then hash of
               the same should be added in the rewritten SQL statement for
               the binlog.

FR-6: The rewritten statement should be as close to original SQL statement as
      possible.
      
      FR-6.1 : If the SQL statement doesn't contain the plain text password
      then output in the same consumer types may not be consistently same.

FR-7: SHOW CREATE USER must print the current values of the clauses which
      are not specified in the statement.
      For instance:
      mysql> create user test;
      mysql> show create user test\G
             *************************** 1. row ***************************
             CREATE USER for test@%: CREATE USER 'test'@'%' IDENTIFIED WITH
             'caching_sha2_password' REQUIRE NONE PASSWORD EXPIRE DEFAULT
             ACCOUNT UNLOCK PASSWORD HISTORY DEFAULT PASSWORD REUSE INTERVAL
             DEFAULT 

FR-8: The default value of unspecified clauses in the CREATE|ALTER USER 
      statements must not be added in the audit,general and slow query logs.

      FR-8.1 : The rewritten statement in the TEXTLOG contains the PASSWORD 
      HISTORY and PASSWORD REUSE clauses if they are specified in the SQL 
      statement.
      
      FR-8.2 : The rewritten statement in the TEXTLOG contains the auth plugin 
      information if IDENTIFIED WITH clause is specified in the SQL statement.

FR-9: There must not be any impact on firewall plugin even after upgrading
      the server due to this work.

High-Level Specification
========================

Background:
-----------

We rewrite a SQL statement for following reasons:

a) Obfuscate the password as following

   We replace the password with its hash where we have the latter,
   otherwise we replace the password by a literal "<secret>", with *no*
   quotation marks so the statement would fail if the user were to cut & paste
   it without filling in the real password.
   Replacing the plain text password with the hash value is primarily done to
   write the statement safely to the binlogs. It is not necessary to write the
   same in the in the usual log files.

b) Skip some clauses from ACL DDLs before adding the statement to binlog.

   Prior to WL#6595, the notion had been that once the SQL statement was
   rewritten for binlog, the same was good to go in the subsequent logs.
   Now, the default PASSWORD HISTORY and PASSWORD REUSE clauses are added 
   across all the logs always but added to the binlog only if they are
   specified in the SQL statement.

c) Needs to re-synthesize some of the SQL statements.

There are following three mediums where we add the rewritten SQL statement.

a) Usual log files that includes General, slow query and audit log.
b) binlog files
c) Standard output to print the result of queries like SHOW CREATE USER.

The statement is added to logs at different point in time.
It is added to the general log just before parsing of the statement, 
once the statement is executed successfully it is written to the binlog
followed by audit log/slow log/performance schema etc.

Problems
--------

1) A SQL statement which contains a plain text password is rewritten in two
   different type in the different logs. By the time the statement is written
   to the general log, it contains a plain text password which is replaced by
   the literal value. Once the statement is processed the plain text password
   is converted into hash. Now, the query is rewritten for binlog that uses
   the hash value over the literal value to obfuscate the password.
   This point onward the query written with literal value in the general log, 
   is written with the hashes in the subsequent logs(i.e.audit/slow-query log). 
   
Above behavior is *ambiguous* and *inconsistent* because log files convey the
same message even though the amount of information present in them may vary. 
Hence they must have same SQL statement.
     
2) Current rewrite design is difficult to maintain and extend.
   For instance - Refer Bug#24911117 (Sakila) fix in RB#19217.
   There are two problems with the fix due the limitation of current design:
   a. A simple flag value is trickle down from the high level APIs to the
      low level API.
   b. Though the change is only for SHOW CREATE USER statement but it affects
      other statements as well(Unless a condition check is added).

3) As described in the background paragraph above, if some of the clauses 
   are not specified in the SQL statement then their default value is added 
   to the statement rewritten for the general log but not for the audit,
   slow query and binlogs. It happens because once the query is rewritten for
   the binlog the same is used for subsequent logs(Refer Bug#27967905).
   
   We have decided not to add the default value of unspecified clauses 
   in the SQL statement for the log files. This will ensure that same rewritten
   SQL statement is added across the audit,general and slow query logs.

4) The behavior in point#3 might not have got enough attention so far but it
   does now due to upcoming WL#11544. We need to skip the REPLACE <password>
   clause from the binlog but it has to be present in all other log files.
   Due to obscure design of current rewrite APIs it has become tedious to do
   any further changes.
   
This worklog proposes an alternate design to fix the current problems,
and make it easier to extend the rewrite functionality if so desired in future.

Interface Specification
=======================

IS-1: Move the rewrite API implementation into the class hierarchy. 

IS-2: Define an interface to rewrite any SQL statement. 
      
      IS-2.1: A concrete class must provide the implementation of
              particular statement.

IS-3: All the rewrite requests must go through a wrapper method. Clients must
      not create the concrete objects directly.

IS-4: Define the types of target for which query will be rewritten.
      For instance :
        LOG - General, slow and audit logs
        BINLOG - Binary log
        CONSOLE - Standard output
      
      IS-4.1: Specify the target type for which a query is rewritten.

IS-5: Move the common operation across concrete implementations to a separate
      util class.
1) Added following class hierarchy.
   There is an abstract base class Rewriter. All the concrete classes have to
   implement the method rewrite().
   
                              +-----------+
                              |  Rewriter |
                              +-----------+
                                    ^
                                    |
                                    |
        +-----------------+---------------+------------+-----+------------so on
        |                 |               |            |     |
        |                 |               |            |     |
  Rewriter_user      Rewriter_set Rewriter_slave_start | Rewriter_create_server
        ^                 ^                            |     
        |                 |                     Rewriter_change_master    
        |        Rewriter_set_password
        |
+-------------+---------------+
|             |               |
|             |               |
|             |   Rewriter_show_create_user
|             |
|     Rewriter_alter_user
|
Rewriter_create user
                    

2) Added a Rewriter_util class to wrap the common utility methods used
   across the classes.
   
3) Some of the ACL DDLs rewrite additional parameters so as much like earlier, 
   there are two interface functions to rewrite the query. The clients must
   not create the rewriter objects directly, instead they should call following
   functions as desired.
   
   void mysql_rewrite_query()
   void mysql_rewrite_acl_ddl()