WL#12364: Kill administration for system users

Affects: Server-8.0 — Status: Complete

WL#12098 introduced a new dynamic privilege 'SYSTEM_USER'. The user who is
granted this privilege can be treated as power user. In other words, a user
can modify another user with SYSTEM_USER privilege only if former too has 
SYSTEM_USER privilege. 

This also means that a user who is not granted 'SYSTEM_USER' privilege should
not be able to kill another user's session/query who is granted 'SYSTEM_USER'
privilege (i.e. a power user).

Currently one can use the KILL command to kill other user's session/query only if
they have the CONNECTION_ADMIN/SUPER privilege. There's no discrimination of what
users they can use it on.

The goal of this worklog is to disallow the KILL command terminating the power
user's session/query from another user's session who is granted 
CONNECTION_ADMIN/SUPER but not SYSTEM_USER privilege. Only one power user should 
be able to kill session/query of another power user.

Definitions
===========
Power_user 
----------
A user who is granted at least SYSTEM_USER privilege through GRANT statement.

Regular_user
------------
A user who is granted at least CONNECTION_ADMIN privilege but not the
SYSTEM_USER privilege.

Power_session
-------------
A user session that gets SYSTEM_USER privilege through role activation. 
                        OR 
User already had SYSTEM_USER privilege before the session was established.
                  
Regular_session
---------------
A user session that is not power_session.

Functional Requirements
=======================

FR-01: Regular_user MUST NOT be able to KILL the power_session.
        
    FR-01.1: Regular_user MAY be able to KILL the EXISTING connections of a
             user who is just elevated as power_user through GRANT statement.

    FR-01.2: Regular_user MAY NOT be able to KILL the EXISTING connections of
             a user who is just demoted from power_user through REVOKE
             statement.
             
FR-02: Regular user MUST NOT be able to KILL the queries running in
       power_session.
      
FR-03: A regular_session MUST become power_session by activating the role
       that has SYSTEM_USER privilege granted.
       
FR-04: A power_session MUST become regular_session after deactivating the
       role has SYSTEM_USER privilege granted.
       
FR-05: A regular_session MAY NOT become power_session by simply granting
       SYSTEM_USER privilege to the user unless connection is re-established
       or a role is activated/deactivated.
       
FR-06: A power_session MAY NOT become regular_session by simply revoking
       SYSTEM_USER privilege from user unless connection is re-established or
       a role is activated/deactivated.
       
FR-07: Running a SQL statement as definer who has SYSTEM_USER privilege MUST
       NOT elevate the regular_session as power_session even temporarily.

Interface Specification
=======================
No changes required.

High-Level Specification
========================

S-1: Design Considerations:

We evaluated the following design approaches.

S-1.1: Probe, the security context of current session and the security context
of the session asked to be killed, for the SYSTEM_USER privilege.
If former does not have SYSTEM_USER privilege but latter has then former
throws error.
This seems easiest and simplest approach but it would have a potential
synchronization issues. Security_context may change temporarily while
executing SQL statements.
Imagine, there are two sessions t1 and t2; t2 wants to kill t1. t1 is
in the middle of executing a query that has changed the
Security_context temporarily. The changed security context does not have
SYSTEM_USER privilege granted. When t2 probes the Security_context of t1
it sees that t1 doesn't have SYSTEM_USER privilege hence, it can kill
the t1 session which is not correct.

S-1.2 THD maintains the original Security_context as m_main_security_ctx
member variable. we could consider probing the original Security_context
instead of the current security context. But we identified at least
following circumstances when the original security_context may be changed
as well.
- The client connection may change the current user by issuing the
COM_CHANGE_USER command.
- SET ROLE statement may change the original security_context
- Plugin/component may also change the original security_context through
security_context services.

Hence, this approach would not solve the synchronization issue identified
with the S-1.1 completely.

S-1.3 Add an atomic flag in THD class. Turn ON the flag when user's session
is authenticated or a role that has SYSTEM_USER privilege is activated.
Turn OFF the flag when a role that has SYSTEM_USER privilege is
deactivated.
This approach solves the synchronization issues identified in the above
two approaches. But, it has following two limitations:

- The existing sessions of a user are not updated if user is granted
SYSTEM_USER privilege through other user. It could mean that
connections, before SYSTEM_USER privilege was granted to an user,
could still be killed by the user who does not have SYSTEM_USER
privilege.

- There could be small window of race conditions in between of
SQL statement, changing the security_context and updating the atomic
flag in the THD. For instance:

mysql_set_active_role() {
Security_context *sctx = thd->security_context();
ret = sctx->activate_role(role->user, role->host, true);

/*
Here, session could still be killed just before the user got
system_user privilege but flag in the THD not yet updated
*/

thd->set_system_user_flag();
}

Either of S-1.1 or S-1.2 would require to take the lock (i.e. Lock_thd_data)
when the security_context is accessed through THD::security_context() API.
This may have potential performance impact. S-1.3 has two limitations but
We can live them since these are uncommon scenarios.

After carefully evaluating above three approaches, we decided to go with S-1.3.

S-2: Rationale behind FR-03,04 is that role activation/de-activation is a
session specific task. Therefore, SET ROLE statement can promote a session
from regular to power or vice-versa. We probe the user's security_context
for SYSTEM_USER privilege before executing the SET ROLE statement.

S-3 Rationale behind Fr-05,06 is that usually privileges are granted/revoked
to a user through another user's session. It would be very expensive
operation to reflect the change in user's privilege to its existing
connections. Therefore, we update the system_user flag in the connection
either when connection establishes or roles activated/deactivated.

S-4: Rationale behind FR-07 is that running a SQL statement with power user as
definer does not affect the user's session in anyways. Instead, it might
be required to execute the SQL statements. Therefore, regular session
might get killed even if it is running a SQL statement with definer who
has SYSTEM_USER privilege.

S-5: Impact on replication :- None

S-6: Impact on upgrade :- None

S-7: Threads not created by the users are categorized as system threads in
server. That raises an interesting question

"Can regular user kill the system threads ?".

To answer above question. We prepare following matrix threads created by
server. Objective of the matrix is to categorize the threads in four
quadrant so that:

- we can identify which system_threads that gets the SYSTEM_USER privilege
by the virtue of skip-grants flag. Hence, they fall into to (1) quadrant
and it is safe to not allow end-users to kill them. Since end-users
cannot them spawn them anyways.

- We can identify which system threads are safe for users in (4) quadrant
to kill.

- We can identify system threads in (3) quadrant that needs to be moved to
1st quadrant so that users in (4) quadrant cannot kill them.

- The remaining system threads in the 3rd quadrant will probably will be
safe to be killed by users in 4th quadrant. Threads added in (3)
quadrant need to be evaluated for the fact if they can be run by
SYSTEM_USER as definer.

Internal threads External threads
┌──────────────────────────────────────┬─────────────────────────────┐
│ (1) │ (2) │
With │ SYSTEM_THREAD_SLAVE_IO │ │
system │ SYSTEM_THREAD_SLAVE_SQL │ User connections with │
_user │ SYSTEM_THREAD_EVENT_SCHEDULER │ SYSTEM_USER privilege │
│ SYSTEM_THREAD_NDBCLUSTER_BINLOG │ │
│ SYSTEM_THREAD_SLAVE_WORKER │ │
│ SYSTEM_THREAD_COMPRESS_GTID_TABLE │ │
│ │ │
│ │ │
├──────────────────────────────────────┼─────────────────────────────┤
│ (3) │ (4) │
│ │ │
│ SYSTEM_THREAD_EVENT_WORKER │ User connections without │
without │ │ SYSTEM_USER privilege │
system │ │ │
_user │ Threads that will be moved to (1) │ │
│ SYSTEM_THREAD_INFO_REPOSITORY │ │
│ SYSTEM_THREAD_DD_INITIALIZE │ │
│ SYSTEM_THREAD_DD_RESTART │ │
│ SYSTEM_THREAD_SERVER_INITIALIZE │ │
│ SYSTEM_THREAD_INIT_FILE │ │
│ SYSTEM_THREAD_SERVER_UPGRADE │ │
│ SYSTEM_THREAD_BACKGROUND │ │
│ │ │
└──────────────────────────────────────┴─────────────────────────────┘

- SYSTEM_THREAD_EVENT_WORKER
Is not granted any privilege. It runs with the definer's security
context at the time of executing the event. It is safe for them to be
killed by users in 4th quadrant if definer does not have SYSTEM_USER
privilege. Hence, we can keep them in the 3rd quadrant.

We could grant SYSTEM_USER privilege to the following threads even though
most of them are bootstrap threads so end-users will not see them.

- SYSTEM_THREAD_INFO_REPOSITORY
Created during bootstrap process or mysql startup. Used to read rpl_info
table

- SYSTEM_THREAD_DD_INITIALIZE
Bootstrap thread. Runs during server initialization. No privileges
granted.

- SYSTEM_THREAD_DD_RESTART
Bootstrap thread. Runs during server initialization. No privileges
granted.

- SYSTEM_THREAD_SERVER_INITIALIZE
Runs during server initialization. No privileges granted.

- SYSTEM_THREAD_INIT_FILE
Runs during server initialization. No privileges granted.

- SYSTEM_THREAD_SERVER_UPGRADE
Bootstrap thread. Runs during server initialization. No privileges
granted.

- SYSTEM_THREAD_BACKGROUND
Do not list processlist_id in the performance_schema.threads table.
It means they cannot be killed by KILL command.
Hence, we can keep them in the 3rd quadrant.