WL#3984: Revise locking of mysql.general_log and mysql.slow_log

Affects: Server-5.1   —   Status: Complete

This Work Log entry details changes in the architecture / design implemented
with the fix for BUG#25422 (Hang with log tables).

This is not a new feature, but rework on an existing feature (log tables),
introduced by WL#1019.

Problem:

The LOGGER class in 5.1 introduced fake THD objects like:
- Log_to_csv_event_handler::general_log_thd
- Log_to_csv_event_handler::slow_log_thd
which are not associated to a real thread running in the server,
and implement a special logic for locking mysql.general_log and mysql.slow_log.

This in general cause many integration / architectural issues with the
implementation of other features in the server, and must be revisited.

In particular, see:
- BUG#25422 (Hang with log tables)
- BUG#29129 (Resetting general_log while the GLOBAL READ LOCK is set causes
  a deadlock)
A "log table" is defined as either mysql.general_log or mysql.slow_log.

1)

In general, log tables are considered PERFORMANCE_SCHEMA tables,
meaning that the primary purpose of these tables is to provide an
interface for the user to *observe* the runtime execution of the system.
These tables are not intended to provide a way for a user to
*interfere* with the runtime execution of the system.

Rationale:
This was identified in WL#1019, and repeated here to emphasise later design
choices related to locking.
PERFORMANCE_SCHEMA tables are for runtime execution what INFORMATION_SCHEMA
tables are for the system meta-data.

[Discussion between Marc and Peter, Uppsala, June 2007]

2)

The global read lock has no effect on a log table.

Rationale:
This is both:
- the correct result: the system should run -- and write logs -- even when
executing with
a global read lock in effect. A query that only reads data (select 1) should
not be blocked when printing to the general_log or the slow log.
- preserving compatibility with 5.0: When logs are printed to *files* instead
of tables, logging is available even with a global read lock in effect.
Users have the expectation that, changing *how* a log is recorded (table or
file) should not impact *when* a log can be recorded.

3)

The global READ_ONLY state has no effect on a log table.

Rationale:
Same as for the global read lock in 2)
The system should still be able to print in the general_log and the slow_log,
even in a read only state.

4)

TRUNCATE TABLE is a valid operation on a log table.

Rationale:
There must be a way to prevent the content of the log to grow forever.
This is needed because logs are *persistent*, and the size of a log is not
artificially constrained (to N entries or the age of the entry)

Historically, system administrators on unix platforms would perform:
- a /usr/bin/rm of a log *file*
- a FLUSH LOGS to force the system to close a file handle and reopen the file,
to effectively truncate a log file.

Performing this type of operations unfortunately does not work on windows
platforms (cant remove a file while it's opened).

Providing a TRUNCATE TABLE  is a cleaner way to perform a truncate,
that works in all platforms.

[Discussion between Marc and Trudy, Uppsala, June 2007]

Note:
This is already implemented, documenting existing features.

5)

RENAME TABLE is a valid operation on a log table,
and can be used atomically for log rotation, as in:
- create table general_log_new like general_log;
- rename table general_log TO renamed_general_log, general_log_new TO general_log;

Rationale:
There must be a way to capture / archive / purge logs, which is when "log
rotation" techniques comes handy.
Historically, system administrators on unix platforms would perform:
- /usr/bin/mv of a log *file* to another file name,
- /usr/bin/touch of the log *file*
- a FLUSH LOGS to force the system to close a file handle and reopen the file,
to effectively cause a log rotation.

Performing this type of operations unfortunately does not work on windows
platforms (cant rename a file while it's opened).

Providing a RENAME TABLE  is a cleaner way to perform this rotation,
that works in all platforms.

[Discussion between Marc and Trudy, Uppsala, June 2007]

Note:
This is already implemented, documenting existing features.

6)

CREATE TABLE, ALTER TABLE, DROP TABLE are valid operations on a log table.

Rationale:
The issue here is not related to usage patterns, but to the implementation of
the server. The server code does not embed the full table definition of tables
used by the system (like log tables), but relies on the installation scripts
to create / alter / drop these tables in --bootstrap mode.
For the installation script to work, CREATE / DROP / ALTER have to be supported
on log tables, since these are presented as queries.

Note that this existing "feature" has already been abused by some to perform
ALTER TABLES against log tables to enhance the system in unforeseen ways (like
adding auto increment columns, or changing the storage engine and not use CSV)

7)

FLUSH LOGS

It's unclear if FLUSH LOGS should flush only log files, and/or log tables.

Documenting the proposed implementation choice:
FLUSH LOGS does flush the log *files*, but does not flush the log *tables*.

Note:
This is a change compared to the current implementation.

8)

FLUSH TABLES

It's unclear if FLUSH TABLES on a log table should flush the table or not,
and how it can be redundant with FLUSH LOGS.

Documenting the proposed implementation choice:
FLUSH TABLES does flush all the tables, including log tables.

Note:
This is a change compared to the current implementation,
which silently ignores log tables.

Rationale for 7) and 8):
These commands are seen as a way to interfere directly with the server
implementation. The "FLUSH LOGS" command was first introduced when logs
were stored in files, and should be understood as "FLUSH LOG FILES".
The fact that the exact behavior is driven by implementation considerations
is not a concern, just a matter of documentation.
Having "FLUSH TABLES" actually flush all the tables, including the log tables,
seems a natural choice.

9)

LOCK TABLES can *not* be used on log tables.

Rationale:
Holding a lock in a user session on a performance schema table is a very
potent deny of service attack.
Since:
- new connections are logged into the general log
- new queries are logged into the general log, *before* being executed,
this can lead to dead locks where:
- the server does not respond to new connections,
- the server dead locks on "UNLOCK TABLES" queries ... humm.

The previous implementation:
- would allow: lock tables mysql.slow_log READ LOCAL
- would deny: lock tables mysql.general_log READ
to prevent some deadlocks already.

Because of different combinations with:
- SET GLOBAL general_log = ON
- TRUNCATE TABLE mysql.general_log
even holding a READ LOCAL lock can dead lock the system, and should be
prevented.

This is a change compared to the current implementation.

10)

INSERT / UPDATE / DELETE can *not* be used on log tables.

Rationale:
This is a direct consequence of point 1), since log tables are
considered to be PERFORMANCE_SCHEMA tables.
Obviously, there exist a mechanism to insert data in the logs,
but this mechanism is private to the server implementation and
not exposed to a user via SQL queries, even SUPER.

Note from monty: You should be able to INSERT into the general log tables if the
logging is not enabled. (If not, you will have a problem restoring a mysqldump o
f the log tables).
Implementation of WL#3984 changed the design as follows:

1)

bool TABLE_SHARE::system_table and bool TABLE_SHARE::log_table are related.
Instead of multiplying various "is_X" "is_Y" boolean flags, the attribute
TABLE_CATEGORY TABLE_SHARE::table_category has been introduced, that better
describe the nature of the table.

2)

The enum TABLE_CATEGORY has been introduced, see the comments there.

Special exceptions regarding FLUSH TABLES WITH READ LOCK or
SET GLOBAL READ_ONLY=ON, in lock.cc, are now implemented with more clarity
with the use of honor_global_locks().

In general, helpers like honor_XYZ() are encouraged, as this breaks the
correlation between
- how a table behaves with regards to 1 property (global read lock, ...)
- what the table is used for (user, system, ...)

3)

my_bool TABLE_SHARE::locked_by_logger has been removed, this was an artificial
hack.

4)

The previous implementation confused:
- 'system table' with ENGINE = MYISAM
- 'log table' with ENGINE = CSV
so that logic related to enforcing constraints for what the SQL layer
considers 'system' or 'log' tables was not implemented in the SQL layer but
in the storage engines.

The methods:
- handler::check_if_locking_is_allowed
- handler::check_if_log_table_locking_is_allowed
- ha_tina::check_if_locking_is_allowed
- ha_myisam::check_if_locking_is_allowed
have been removed, and the logic implemented in lock.cc instead.

5)

The fake threads, Log_to_csv_event_handler::general_log_thd and
Log_to_csv_event_handler::slow_log_thd, are removed

6)

In general, a lot of code in log.h / log.cc dealing with opening / closing
tables has been removed.

7)

Writing to the log tables is still a privileged operation.
Before, it was enforced by testing that the 'current thread is the logger
thread'.
Now, it's enforced by the use either :
- CF_WRITE_LOGS_COMMAND, marking commands that are allowed to write logs,
- MYSQL_LOCK_PERF_SCHEMA, marking calls from the LOGGER

8)

TRUNCATE TABLE does not interfere with the logger implementation any more,
all the code used to 'shutdown the logger, truncate, then restart the logger'
is removed, and the operation is online.

9)

RENAME TABLE does not interfere with the logger implementation any more,
regular table locking rules should apply even with log tables.

10)

LOGGER.lock() does not serialize all calls using a mutex, which creates
a performance bottleneck, but uses a rw_lock instead.

11)

Writing to log files is still serialized, since the code in
MYSQL_QUERY_LOG::write() is not thread safe.

12)

Writing to log tables is done in parallel.

13)

LOGGER.lock_shared() and LOGGER.lock_exclusive() locks are only used when
changing where logs are printed to.
All the occurences of LOGGER.lock() introduced in the various implementation
of sql statements have been removed.

14)

open_ltable() now takes an explicit lock_flags parameter.