WL#3984: Revise locking of mysql.general_log and mysql.slow_log
Affects: Server-5.1 — Status: Complete — Priority: Medium
This Work Log entry details changes in the architecture / design implemented with the fix for BUG#25422 (Hang with log tables). This is not a new feature, but rework on an existing feature (log tables), introduced by WL#1019. Problem: The LOGGER class in 5.1 introduced fake THD objects like: - Log_to_csv_event_handler::general_log_thd - Log_to_csv_event_handler::slow_log_thd which are not associated to a real thread running in the server, and implement a special logic for locking mysql.general_log and mysql.slow_log. This in general cause many integration / architectural issues with the implementation of other features in the server, and must be revisited. In particular, see: - BUG#25422 (Hang with log tables) - BUG#29129 (Resetting general_log while the GLOBAL READ LOCK is set causes a deadlock)
A "log table" is defined as either mysql.general_log or mysql.slow_log. 1) In general, log tables are considered PERFORMANCE_SCHEMA tables, meaning that the primary purpose of these tables is to provide an interface for the user to *observe* the runtime execution of the system. These tables are not intended to provide a way for a user to *interfere* with the runtime execution of the system. Rationale: This was identified in WL#1019, and repeated here to emphasise later design choices related to locking. PERFORMANCE_SCHEMA tables are for runtime execution what INFORMATION_SCHEMA tables are for the system meta-data. [Discussion between Marc and Peter, Uppsala, June 2007] 2) The global read lock has no effect on a log table. Rationale: This is both: - the correct result: the system should run -- and write logs -- even when executing with a global read lock in effect. A query that only reads data (select 1) should not be blocked when printing to the general_log or the slow log. - preserving compatibility with 5.0: When logs are printed to *files* instead of tables, logging is available even with a global read lock in effect. Users have the expectation that, changing *how* a log is recorded (table or file) should not impact *when* a log can be recorded. 3) The global READ_ONLY state has no effect on a log table. Rationale: Same as for the global read lock in 2) The system should still be able to print in the general_log and the slow_log, even in a read only state. 4) TRUNCATE TABLE is a valid operation on a log table. Rationale: There must be a way to prevent the content of the log to grow forever. This is needed because logs are *persistent*, and the size of a log is not artificially constrained (to N entries or the age of the entry) Historically, system administrators on unix platforms would perform: - a /usr/bin/rm of a log *file* - a FLUSH LOGS to force the system to close a file handle and reopen the file, to effectively truncate a log file. Performing this type of operations unfortunately does not work on windows platforms (cant remove a file while it's opened). Providing a TRUNCATE TABLE <log table> is a cleaner way to perform a truncate, that works in all platforms. [Discussion between Marc and Trudy, Uppsala, June 2007] Note: This is already implemented, documenting existing features. 5) RENAME TABLE is a valid operation on a log table, and can be used atomically for log rotation, as in: - create table general_log_new like general_log; - rename table general_log TO renamed_general_log, general_log_new TO general_log; Rationale: There must be a way to capture / archive / purge logs, which is when "log rotation" techniques comes handy. Historically, system administrators on unix platforms would perform: - /usr/bin/mv of a log *file* to another file name, - /usr/bin/touch of the log *file* - a FLUSH LOGS to force the system to close a file handle and reopen the file, to effectively cause a log rotation. Performing this type of operations unfortunately does not work on windows platforms (cant rename a file while it's opened). Providing a RENAME TABLE <log table> is a cleaner way to perform this rotation, that works in all platforms. [Discussion between Marc and Trudy, Uppsala, June 2007] Note: This is already implemented, documenting existing features. 6) CREATE TABLE, ALTER TABLE, DROP TABLE are valid operations on a log table. Rationale: The issue here is not related to usage patterns, but to the implementation of the server. The server code does not embed the full table definition of tables used by the system (like log tables), but relies on the installation scripts to create / alter / drop these tables in --bootstrap mode. For the installation script to work, CREATE / DROP / ALTER have to be supported on log tables, since these are presented as queries. Note that this existing "feature" has already been abused by some to perform ALTER TABLES against log tables to enhance the system in unforeseen ways (like adding auto increment columns, or changing the storage engine and not use CSV) 7) FLUSH LOGS It's unclear if FLUSH LOGS should flush only log files, and/or log tables. Documenting the proposed implementation choice: FLUSH LOGS does flush the log *files*, but does not flush the log *tables*. Note: This is a change compared to the current implementation. 8) FLUSH TABLES It's unclear if FLUSH TABLES on a log table should flush the table or not, and how it can be redundant with FLUSH LOGS. Documenting the proposed implementation choice: FLUSH TABLES does flush all the tables, including log tables. Note: This is a change compared to the current implementation, which silently ignores log tables. Rationale for 7) and 8): These commands are seen as a way to interfere directly with the server implementation. The "FLUSH LOGS" command was first introduced when logs were stored in files, and should be understood as "FLUSH LOG FILES". The fact that the exact behavior is driven by implementation considerations is not a concern, just a matter of documentation. Having "FLUSH TABLES" actually flush all the tables, including the log tables, seems a natural choice. 9) LOCK TABLES can *not* be used on log tables. Rationale: Holding a lock in a user session on a performance schema table is a very potent deny of service attack. Since: - new connections are logged into the general log - new queries are logged into the general log, *before* being executed, this can lead to dead locks where: - the server does not respond to new connections, - the server dead locks on "UNLOCK TABLES" queries ... humm. The previous implementation: - would allow: lock tables mysql.slow_log READ LOCAL - would deny: lock tables mysql.general_log READ to prevent some deadlocks already. Because of different combinations with: - SET GLOBAL general_log = ON - TRUNCATE TABLE mysql.general_log even holding a READ LOCAL lock can dead lock the system, and should be prevented. This is a change compared to the current implementation. 10) INSERT / UPDATE / DELETE can *not* be used on log tables. Rationale: This is a direct consequence of point 1), since log tables are considered to be PERFORMANCE_SCHEMA tables. Obviously, there exist a mechanism to insert data in the logs, but this mechanism is private to the server implementation and not exposed to a user via SQL queries, even SUPER. Note from monty: You should be able to INSERT into the general log tables if the logging is not enabled. (If not, you will have a problem restoring a mysqldump o f the log tables).
Implementation of WL#3984 changed the design as follows: 1) bool TABLE_SHARE::system_table and bool TABLE_SHARE::log_table are related. Instead of multiplying various "is_X" "is_Y" boolean flags, the attribute TABLE_CATEGORY TABLE_SHARE::table_category has been introduced, that better describe the nature of the table. 2) The enum TABLE_CATEGORY has been introduced, see the comments there. Special exceptions regarding FLUSH TABLES WITH READ LOCK or SET GLOBAL READ_ONLY=ON, in lock.cc, are now implemented with more clarity with the use of honor_global_locks(). In general, helpers like honor_XYZ() are encouraged, as this breaks the correlation between - how a table behaves with regards to 1 property (global read lock, ...) - what the table is used for (user, system, ...) 3) my_bool TABLE_SHARE::locked_by_logger has been removed, this was an artificial hack. 4) The previous implementation confused: - 'system table' with ENGINE = MYISAM - 'log table' with ENGINE = CSV so that logic related to enforcing constraints for what the SQL layer considers 'system' or 'log' tables was not implemented in the SQL layer but in the storage engines. The methods: - handler::check_if_locking_is_allowed - handler::check_if_log_table_locking_is_allowed - ha_tina::check_if_locking_is_allowed - ha_myisam::check_if_locking_is_allowed have been removed, and the logic implemented in lock.cc instead. 5) The fake threads, Log_to_csv_event_handler::general_log_thd and Log_to_csv_event_handler::slow_log_thd, are removed 6) In general, a lot of code in log.h / log.cc dealing with opening / closing tables has been removed. 7) Writing to the log tables is still a privileged operation. Before, it was enforced by testing that the 'current thread is the logger thread'. Now, it's enforced by the use either : - CF_WRITE_LOGS_COMMAND, marking commands that are allowed to write logs, - MYSQL_LOCK_PERF_SCHEMA, marking calls from the LOGGER 8) TRUNCATE TABLE does not interfere with the logger implementation any more, all the code used to 'shutdown the logger, truncate, then restart the logger' is removed, and the operation is online. 9) RENAME TABLE does not interfere with the logger implementation any more, regular table locking rules should apply even with log tables. 10) LOGGER.lock() does not serialize all calls using a mutex, which creates a performance bottleneck, but uses a rw_lock instead. 11) Writing to log files is still serialized, since the code in MYSQL_QUERY_LOG::write() is not thread safe. 12) Writing to log tables is done in parallel. 13) LOGGER.lock_shared() and LOGGER.lock_exclusive() locks are only used when changing where logs are printed to. All the occurences of LOGGER.lock() introduced in the various implementation of sql statements have been removed. 14) open_ltable() now takes an explicit lock_flags parameter.
Copyright (c) 2000, 2015, Oracle Corporation and/or its affiliates. All rights reserved.