WL#6407: Code reorganization to avoid race condition between server main thread and the kill server thread
Affects: Server-5.7
Status: Complete
Many issues related to server shutdown and the ready_to_exit flag have been fixed in the past. The flag is also believed to be the root cause of some unexplained spurious failures in automated tests. Refer to Bug#11763896 for more information.

Currently the Server Main Thread and the Kill Server Thread are synchronized using the ready_to_exit flag. Once this flag is set, both threads perform cleanup operations concurrently, which sometimes leads to a double free of resources.

Currently, resource cleanup (clean_up() and mysqld_exit()) is common code for abort and shutdown. Cleanup is performed from the main thread, the signal thread or the kill server thread, and the work is distributed among these threads, which makes it vulnerable to sporadic failures.

This worklog comprises the following activities:

a) Reorganize the code so that cleanup of resources is performed only by the main thread, instead of the kill server thread.

b) In storage/perfschema/pfs_lock.h, the assert for double free is relaxed (refer to function allocated_to_free()). Remove the ready_to_exit flag check in this function. One goal of this worklog is to enable the currently disabled code in shutdown_performance_schema and cleanup_performance_schema, and to remove all the PFS-related valgrind suppression patterns.

c) Many mutexes, read-write locks and condition variables are defined in global scope (in mysqld.cc), and mutex acquire/release calls are spread throughout the code. This leads to multiple mutexes guarding the same global variable, wrong acquisition order leading to deadlock, etc. To avoid these issues, these pthread primitives should be encapsulated with the global object they guard, rather than being acquired/released all over the code.

User Documentation
==================
This is internal work only. No user docs required.
Existing Signal Handling Mechanism
----------------------------------
Signals such as SIGINT, SIGTERM and SIGQUIT are handled by a separate thread, the signal handling thread (signal_thread), whereas SIGSEGV, SIGABRT, SIGILL and SIGFPE are handled using the old-style signal handling mechanism, by registering a signal handler named handle_fatal_signal().

Signals are blocked by the mysqld main thread, and all threads created by the mysqld main thread, including the client handling service threads, inherit this signal mask. The exception is the signal handling thread, which explicitly waits on these signals using the sigwait() function.

Existing shutdown Mechanism
---------------------------
The signal handling thread is created as a detached thread (PTHREAD_CREATE_DETACHED), and synchronization between the signal handling thread, the kill server thread and the mysqld main thread for cleanup is done using the ready_to_exit flag.

On receipt of a termination signal, the signal handling thread creates the kill server thread (also detached), which is responsible for performing cleanup of resources and shutdown of ancillary daemon threads. The clean_up() function sets the ready_to_exit flag so that the mysqld main thread continues with the rest of the cleanup operations and exits.

If a shutdown request is sent from a client (mysqladmin), the kill_mysql() function is called, which in turn sends a termination signal to the signal handling thread to initiate shutdown.

poll() and select() are called in blocking mode, so the abort_loop flag (set by the signal handling thread or the kill_mysql() function during server shutdown) is not checked while they block, and setting it does not break the accept loop in the handle_connections_sockets() function. In order to exit from the loop, either the signal handling thread or the kill server thread closes/shuts down the server socket in the close_connections() function.

unireg_abort() is called from the kill server thread for signals other than SIGINT and SIGTERM received from the signal handling thread.
The unireg_abort() function is designed to be called only from the main thread, in case of an error during server startup. If called from any other thread, it may end up executing the mysqld_exit() function concurrently with the main thread, leading to a double free of mutexes and other resources handled by the mysqld_exit() function.

In MySQL 5.1, kill_server() is registered as the signal handler for all signals, hence the probability of this race condition is higher in the 5.1 source base than in 5.5 and trunk.

Changes to Shutdown Mechanism
------------------------------
The ready_to_exit flag is removed to avoid cleanup operations being performed by more than one thread. This is ensured by creating the signal handling thread as a joinable thread (PTHREAD_CREATE_JOINABLE) and joining it from the mysqld main thread before the main thread performs cleanup operations.

On Windows, the shutdown handler thread is likewise created as a joinable thread and joined by the mysqld main thread before the main thread performs cleanup operations.

The clean_up() function, which was executed from the signal handling thread or the kill server thread, is moved to the server main thread, so that the thread which allocates resources also performs the cleanup, rather than the signal handler thread or the kill server thread.

The kill server thread is no longer required and will be removed, as close_connections() will be called by the signal thread on receipt of a termination signal. On Windows, it will be called by the shutdown handler thread.

After the changes, the high level thread flow and functionality would be as follows:

 mysqld main thread
         |
         |
 init_resources(mutex,etc)
         |
         | - - - - - - - - - - - - - ->create signal thread
         |                                     |
         |                              Wait for signal
 create_handlers/slaves/etc                    |
         |                              close_connections()
         |                                     |
 join signal thread<- - - - - - - - - - - exit thread
         |
 cleanup_resources()
         |
 mysqld_exit()

After the signal handling thread exits, the handlers for fatal signals such as SIGSEGV are not deregistered, so they remain available for debugging issues during shutdown.
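The flow in the diagram can be illustrated with a small POSIX sketch. It is not the server code: the globals, signal_thread_main() and run_shutdown_flow() are hypothetical names, the flags merely stand in for close_connections() and clean_up(), and the termination signal is simulated with pthread_kill() so that the example is self-contained.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Hypothetical stand-ins for server state; the names are illustrative only.
static int g_signal_seen = 0;              // written by the signal thread
static bool g_connections_closed = false;  // stands in for close_connections()
static bool g_resources_cleaned = false;   // stands in for clean_up()

// Signal thread: waits synchronously for a termination signal, stops
// accepting work, then exits. It performs no resource cleanup itself.
static void *signal_thread_main(void *arg)
{
  sigset_t *set = static_cast<sigset_t *>(arg);
  int sig = 0;
  if (sigwait(set, &sig) == 0)
  {
    g_signal_seen = sig;
    g_connections_closed = true;
  }
  return nullptr;
}

// Main-thread flow: block the signal, start a joinable signal thread,
// join it, and only then clean up. Returns the signal that was received.
int run_shutdown_flow()
{
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, SIGTERM);
  // Threads created after this call inherit the blocked mask.
  pthread_sigmask(SIG_BLOCK, &set, nullptr);

  pthread_attr_t attr;
  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

  pthread_t signal_thread;
  int rc = pthread_create(&signal_thread, &attr, signal_thread_main, &set);
  pthread_attr_destroy(&attr);
  if (rc != 0)
    return -1;

  // Simulate "kill -TERM" by directing the signal at the signal thread.
  // If it arrives before sigwait() is called, it simply stays pending.
  pthread_kill(signal_thread, SIGTERM);

  // After the join, the signal thread's writes are visible and the main
  // thread is the only one left to run cleanup.
  pthread_join(signal_thread, nullptr);
  g_resources_cleaned = g_connections_closed;
  return g_signal_seen;
}
```

Because the main thread joins the signal thread before touching any shared resource, no flag such as ready_to_exit is needed to order the two threads.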
Other Modifications
--------------------
The MySQL implementation of pthread_join() for Windows does not work when the thread to be joined has already finished. Two new functions will be added, which redirect to the pthread functions on non-Windows operating systems and emulate the following behavior on Windows:

mythread_create() creates a thread using _beginthreadex and returns the handle.
mythread_join() takes the handle as input, waits for the thread to finish its work and joins it.

These two functions will be used only for creating/joining the signal handling thread.

The assert in storage/perfschema/pfs_lock.h, which relaxed the double free check when the ready_to_exit flag was set, is modified. After the changes, a double free will trigger the assert.

The commented-out cleanup code in the performance schema (functions shutdown_performance_schema() and cleanup_performance_schema()) is uncommented.

The kill_in_progress and shutdown_in_progress flags are removed.

The killed_threads status variable is removed, as it is not displayed in SHOW STATUS.

Impact
------
The following modules can be affected:
a) operating system related code
b) performance schema
c) embedded mode

Regression testing needs to be done on the above modules for all operating systems.
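The mythread_create()/mythread_join() pair described under Other Modifications might look roughly like the sketch below. The signatures, the mythread_handle and mythread_func typedefs and the trampoline are assumptions made for illustration; the worklog only states that the Windows variant uses _beginthreadex and joins through the returned handle. The Windows branch has not been exercised here.

```cpp
#include <cassert>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#include <process.h>
typedef HANDLE mythread_handle;     // hypothetical typedef
#else
#include <pthread.h>
typedef pthread_t mythread_handle;  // hypothetical typedef
#endif

typedef void *(*mythread_func)(void *);

#ifdef _WIN32
// _beginthreadex expects "unsigned __stdcall f(void *)", so the
// pthread-style start routine is adapted through a small trampoline.
struct Start_info { mythread_func func; void *arg; };

static unsigned __stdcall mythread_trampoline(void *p)
{
  Start_info info = *static_cast<Start_info *>(p);
  delete static_cast<Start_info *>(p);
  info.func(info.arg);
  return 0;
}
#endif

// Returns 0 on success, non-zero on failure.
int mythread_create(mythread_handle *handle, mythread_func func, void *arg)
{
#ifdef _WIN32
  Start_info *info = new Start_info{func, arg};
  uintptr_t h = _beginthreadex(nullptr, 0, mythread_trampoline, info, 0, nullptr);
  if (h == 0)
  {
    delete info;
    return 1;
  }
  *handle = reinterpret_cast<HANDLE>(h);
  return 0;
#else
  return pthread_create(handle, nullptr, func, arg);
#endif
}

// Returns 0 on success, non-zero on failure.
int mythread_join(mythread_handle *handle)
{
#ifdef _WIN32
  // The handle stays valid after the thread ends, until CloseHandle(),
  // so waiting on it also works for a thread that has already finished.
  DWORD rc = WaitForSingleObject(*handle, INFINITE);
  CloseHandle(*handle);
  return rc == WAIT_OBJECT_0 ? 0 : 1;
#else
  return pthread_join(*handle, nullptr);
#endif
}

// Example start routine: sets the int flag it is given.
static void *set_flag(void *arg)
{
  *static_cast<int *>(arg) = 1;
  return nullptr;
}

// Creates a thread, joins it and returns the flag it set (-1 on error).
int join_demo()
{
  int flag = 0;
  mythread_handle h;
  if (mythread_create(&h, set_flag, &flag) != 0)
    return -1;
  if (mythread_join(&h) != 0)
    return -1;
  return flag;
}
```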
Code Reorganization
-------------------
The following two classes, Global_THD_manager and Blocked_thread_manager, will be added as part of the code reorganization.

Global_THD_manager
------------------
Global_THD_manager is a singleton and encapsulates access to the global THD list and the associated mutexes and condition variables. It maintains the set of all registered threads (global_THD_list) and provides mutators for inserting (add_thd()) and removing (remove_thd()) an element. It also provides functions to count THDs, find a THD and perform some action on all THDs: get_thd_count(), find_thd() and do_for_all_thd().

The LOCK_THD_count mutex, which guards the THD list object, is encapsulated within this class and is acquired/released when the corresponding accessors or mutators are called. COND_THD_count, which is used to notify/wait when threads are registered/deregistered, is also moved to this class.

Thread level statistics such as thread_created and num_thread_running are moved from mysqld.cc to the Global_THD_manager class. The mutex and read/write lock (LOCK_THD_count, thread_running_lock) which guard these statistics are encapsulated in this class.

A) enum_thd_lock_type
---------------------
It is used in the Global_THD_manager::add_thd() and remove_thd() functions to specify whether LOCK_THD_count should be acquired during the operation. It is defined as below:

enum enum_thd_lock_type
{
  THD_NO_LOCK=0,  // do not acquire LOCK_THD_count mutex
  THD_LOCK        // acquire LOCK_THD_count mutex
};

B) Do_THD and Do_THD_Impl
--------------------------
These two classes help in implementing the Global_THD_manager::do_for_all_thd() method. To perform some function on all THDs in the global thread list, the user needs to subclass Do_THD_Impl and override operator().
class Do_THD_Impl
{
public:
  virtual ~Do_THD_Impl() {}
  virtual void operator()(THD*) = 0;
};

class Do_THD : public std::unary_function<THD*, void>
{
public:
  explicit Do_THD(Do_THD_Impl *impl) : m_impl(impl) {}
  void operator()(THD* thd)
  {
    m_impl->operator()(thd);
  }
private:
  Do_THD_Impl *m_impl;
};

In the current code, the following places perform actions on all THDs in the global thread list. They are rewritten to use the do_for_all_thd() method.

a) Adjust the offset of the binary log file for slaves (binlog.cc).
b) Count the threads which are using the binary log file (binlog.cc).
c) Count the number of worker threads in the event scheduler (event_scheduler.cc).
d) Set the KILL_CONNECTION flag on all THDs (mysqld.cc).
e) Close the vio connection for all THDs (mysqld.cc).
f) List client process information (sql_show.cc).
g) I_S on client processes (sql_show.cc).
h) Collect the status of all threads (sql_show.cc).

C) Find_THD and Find_THD_Impl
-----------------------------
These two classes help in implementing the Global_THD_manager::find_thd() method. To find a matching THD in the global thread list, the user needs to subclass Find_THD_Impl and override operator() to embed the matching logic.

class Find_THD_Impl
{
public:
  virtual ~Find_THD_Impl() {}
  virtual bool operator()(THD*) = 0;
};

class Find_THD : public std::unary_function<THD*, bool>
{
public:
  explicit Find_THD(Find_THD_Impl *impl) : m_impl(impl) {}
  bool operator()(THD* thd)
  {
    return m_impl->operator()(thd);
  }
private:
  Find_THD_Impl *m_impl;
};

Global_THD_manager::find_thd() acquires the LOCK_THD_count mutex before the overridden operator() is called. To avoid a race condition, the LOCK_THD_data mutex of the matching THD must be acquired while LOCK_THD_count is still held, so a class which subclasses Find_THD_Impl should acquire it inside its operator() override. Also note that the caller of Global_THD_manager::find_thd() needs to release the LOCK_THD_data mutex of the returned THD. Please refer to the sample implementation given in Section E for more information.
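As a rough illustration of the Do_THD_Impl callback pattern from Section B, the sketch below counts the threads that use the binary log. The THD struct, the Thd_registry class, the uses_binlog field and Count_binlog_users are hypothetical stand-ins so that the example compiles on its own; std::mutex replaces the mysql_mutex_t wrappers, and the std::unary_function base is omitted because it no longer exists in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <set>

// Minimal stand-in for the server's THD; the fields are illustrative only.
struct THD
{
  unsigned long thread_id;
  bool uses_binlog;
};

class Do_THD_Impl
{
public:
  virtual ~Do_THD_Impl() {}
  virtual void operator()(THD *) = 0;
};

// Adapter: std::for_each copies its functor, so the state lives in the
// Do_THD_Impl object and only the pointer is copied.
class Do_THD
{
public:
  explicit Do_THD(Do_THD_Impl *impl) : m_impl(impl) {}
  void operator()(THD *thd) { m_impl->operator()(thd); }
private:
  Do_THD_Impl *m_impl;
};

// Hypothetical, much-reduced manager: one mutex guarding one set.
class Thd_registry
{
public:
  void add_thd(THD *thd)
  {
    std::lock_guard<std::mutex> guard(m_lock);
    m_thds.insert(thd);
  }
  void remove_thd(THD *thd)
  {
    std::lock_guard<std::mutex> guard(m_lock);
    m_thds.erase(thd);
  }
  // Calls func for every registered THD while holding the list mutex.
  void do_for_all_thd(Do_THD_Impl *func)
  {
    std::lock_guard<std::mutex> guard(m_lock);
    std::for_each(m_thds.begin(), m_thds.end(), Do_THD(func));
  }
private:
  std::mutex m_lock;
  std::set<THD *> m_thds;
};

// Callback that counts the THDs flagged as using the binary log.
class Count_binlog_users : public Do_THD_Impl
{
public:
  Count_binlog_users() : m_count(0) {}
  virtual void operator()(THD *thd)
  {
    if (thd->uses_binlog)
      ++m_count;
  }
  int count() const { return m_count; }
private:
  int m_count;
};

// Registers three THDs, removes one, and counts the remaining binlog users.
int count_binlog_users_demo(bool remove_one)
{
  THD a = {1, true}, b = {2, false}, c = {3, true};
  Thd_registry registry;
  registry.add_thd(&a);
  registry.add_thd(&b);
  registry.add_thd(&c);
  if (remove_one)
    registry.remove_thd(&c);
  Count_binlog_users counter;
  registry.do_for_all_thd(&counter);
  return counter.count();
}
```

The caller never touches the mutex; locking stays inside the registry, which is the encapsulation this worklog aims for.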
In the current code, the following places search for a THD in the global thread list. They are rewritten to use the find_thd() method.

a) Find the zombie dump thread in the global THD list (rpl_master.cc).
b) Find a THD based on the thread id for the kill_one_thread() function (sql_parse.cc).

D) Global_THD_manager class
---------------------------
class Global_THD_manager
{
public:
  static Global_THD_manager* get_instance();
  static void destroy_instance();

  /*
    Call func function for all thds in global thd list after taking
    local copy of global thd list. Acquires LOCK_thd_remove to prevent
    removal from global_thd_list.
  */
  int do_for_all_thd_copy(Do_THD_Impl *func);

  // Call func function for all thds in global thd list.
  int do_for_all_thd(Do_THD_Impl *func);

  /*
    This function calls func() for all thds in global thd list to find
    matching thd specified in func(). Returns NULL if no thd matches.
    Note: A class which subclasses Find_THD_Impl and overrides operator()
    should acquire the LOCK_THD_data mutex, as this mutex should be
    acquired while holding the LOCK_THD_count mutex to avoid a race
    condition. The caller of this function needs to release the
    LOCK_THD_data mutex.
  */
  THD* find_thd(Find_THD_Impl *func);

  /*
    Add THD to global THD list. If lock_type is THD_NO_LOCK it assumes
    that caller already holds LOCK_THD_count mutex.
  */
  void add_thd(THD *thd, enum_thd_lock_type lock_type);

  /*
    Remove THD from global THD list. If lock_type is THD_NO_LOCK it
    assumes that caller already holds LOCK_THD_count mutex.
  */
  void remove_thd(THD *thd, enum_thd_lock_type lock_type);

  // Accessors/mutators for status variable thread_running.
  uint get_num_thread_running() { return num_thread_running; }
  void inc_thread_running();
  void dec_thread_running();

  // Accessors/mutators for status variable thread_created.
  void inc_thread_created();
  ulonglong get_num_thread_created();

  // Acquire LOCK_THD_count mutex.
  void acquire_thd_lock();

  // Release LOCK_THD_count mutex.
  void release_thd_lock();

  /*
    Assert used in functions to validate LOCK_THD_count mutex
    is [not] held by caller.
  */
  void assert_if_not_mutex_owner();
  void assert_if_mutex_owner();

  // Wait on COND_THD_count
  void wait_thd();

  /*
    Perform timed wait on COND_THD_count. Returns zero on success, EINTR
    when interrupted, ETIMEDOUT if the absolute time specified by abstime
    passes before the condition is signaled or broadcast.
  */
  int timed_wait_thd(struct timespec *abstime);

  // Sends broadcast to all threads waiting on COND_THD_count.
  void notify_all_thd();

  // Returns the count of items in global_THD_list.
  uint get_thd_count();

private:
  Global_THD_manager(); //singleton
  ~Global_THD_manager();

  // Initializes condition variables and mutex.
  void init();
  void deinit();

  // Singleton instance.
  static Global_THD_manager *thd_manager;

  std::set<THD*> *global_thd_list;
  uint global_thd_count;

  mysql_cond_t COND_thd_count;

  // Mutex to guard global_THD_list.
  mysql_mutex_t LOCK_thd_count;

  // Mutex used to guard removal of elements from global_thd_list.
  mysql_mutex_t LOCK_thd_remove;

  // Guards thread_running statistics.
  my_atomic_rwlock_t thread_running_lock;

  // Count of active threads which are running queries in the system.
  uint num_thread_running;

  // Cumulative number of threads created by mysqld daemon.
  ulonglong thread_created;
};

E) Find_thd_with_id class
-------------------------
Sample code to find a thd in the global thd list by implementing the Find_THD_Impl interface and calling the find_thd() method is given below:

/*
  Callback function used by kill_one_thread to find thd based on the thread id.
*/
class Find_thd_with_id: public Find_THD_Impl
{
public:
  Find_thd_with_id(ulong value): m_id(value) {}
  virtual bool operator()(THD *thd)
  {
    if (thd->get_command() == COM_DAEMON)
      return false;
    if (thd->thread_id == m_id)
    {
      // Taken while LOCK_THD_count is held; released by the caller of find_thd().
      mysql_mutex_lock(&thd->LOCK_thd_data);
      return true;
    }
    return false;
  }
private:
  ulong m_id;
};

The code snippet below can be used to get the matching thd:

// id represents the thread id to be searched for
Global_THD_manager *thd_manager= Global_THD_manager::get_instance();
Find_thd_with_id find_thd_with_id(id);
THD* tmp= thd_manager->find_thd(&find_thd_with_id);
if (tmp)
{
  // Perform operations using tmp
  mysql_mutex_unlock(&tmp->LOCK_thd_data);
}

Note that LOCK_thd_data is released by the caller after find_thd() returns.

F) Blocked_thread_manager class
--------------------------------
Blocked_thread_manager is a singleton and is responsible for managing the thread cache. Idle threads are added to the thread cache and reused when work arrives from clients. After servicing a client, a thread calls the block_thread() method to register itself in the cache and waits on the condition variable COND_thread_cache until the mysqld main thread calls wakeup_thread() to service a new incoming client connection.

waiting_thd_list, statistics such as blocked_thread_count and max_cached_threads, and condition variables such as COND_thread_cache and COND_flush_thread_cache are moved to this class from mysqld.cc.

class Blocked_thread_manager
{
public:
  static Blocked_thread_manager* get_instance();
  static void destroy_instance();

  // Kill all threads which are in cache.
  // Called during shutdown.
  void kill_cached_threads();

  // Block idle thread to wait till work arrives
  bool block_thread();

  // Wakeup idle thread to perform some task for thd
  bool wakeup_thread(THD *thd);

  // Returns the total number of blocked threads
  uint get_num_blocked_threads() { return blocked_thread_count; }

private:
  Blocked_thread_manager(); //singleton
  ~Blocked_thread_manager();

  // Singleton instance
  static Blocked_thread_manager *blocked_thread_manager;

  // Holds THD which will be picked up by blocked thread on receipt of
  // COND_thread_cache signal
  std::list<THD*> *waiting_thd_list;

  // Condition variable on which a blocked thread waits
  mysql_cond_t COND_thread_cache;

  // Condition variable used during shutdown to stop all cached threads
  mysql_cond_t COND_flush_thread_cache;

  // Set during shutdown to stop all blocked threads
  bool kill_blocked_threads_flag;

  // Represents the number of threads to be woken up
  uint wake_thread;

  // Represents the total number of threads blocked
  uint blocked_thread_count;

  // Represents system variable thread_cache_size
  uint max_cached_threads;
};
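The block_thread()/wakeup_thread() handshake can be sketched with standard C++ primitives. This is only an approximation under stated assumptions: Thread_cache_sketch, the THD stand-in and thread_cache_demo() are hypothetical, a single std::condition_variable replaces both COND_thread_cache and COND_flush_thread_cache, and block_thread() returns the THD directly instead of a bool so that the example stays self-contained.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>

// Minimal stand-in for the server's THD; illustrative only.
struct THD { unsigned long thread_id; };

class Thread_cache_sketch
{
public:
  explicit Thread_cache_sketch(unsigned max_cached)
    : m_kill(false), m_wake(0), m_blocked(0), m_max_cached(max_cached) {}

  // Called by an idle worker. Returns the THD to serve next, or nullptr
  // if the cache is full or is being flushed for shutdown.
  THD *block_thread()
  {
    std::unique_lock<std::mutex> lock(m_lock);
    if (m_kill || m_blocked >= m_max_cached)
      return nullptr;
    ++m_blocked;
    m_cond.wait(lock, [this] { return m_wake > 0 || m_kill; });
    --m_blocked;
    THD *thd = nullptr;
    if (m_wake > 0)
    {
      --m_wake;
      thd = m_waiting.front();
      m_waiting.pop_front();
    }
    return thd;
  }

  // Called by the accepting thread. Returns false when no idle worker is
  // available, in which case the caller would create a new thread.
  bool wakeup_thread(THD *thd)
  {
    std::lock_guard<std::mutex> lock(m_lock);
    if (m_kill || m_blocked <= m_wake)
      return false;
    m_waiting.push_back(thd);
    ++m_wake;
    m_cond.notify_one();
    return true;
  }

  // Called during shutdown: releases every blocked worker.
  void kill_cached_threads()
  {
    std::lock_guard<std::mutex> lock(m_lock);
    m_kill = true;
    m_cond.notify_all();
  }

  unsigned get_num_blocked_threads()
  {
    std::lock_guard<std::mutex> lock(m_lock);
    return m_blocked;
  }

private:
  std::mutex m_lock;
  std::condition_variable m_cond;
  std::list<THD *> m_waiting;  // THDs handed over but not yet picked up
  bool m_kill;                 // set during shutdown
  unsigned m_wake;             // number of workers still to be woken
  unsigned m_blocked;          // number of workers currently blocked
  unsigned m_max_cached;       // plays the role of thread_cache_size
};

// Parks one worker, hands it a THD, and returns the id the worker saw.
unsigned long thread_cache_demo()
{
  Thread_cache_sketch cache(4);
  unsigned long served = 0;
  std::thread worker([&] {
    if (THD *thd = cache.block_thread())
      served = thd->thread_id;
  });
  // Wait until the worker has registered itself in the cache.
  while (cache.get_num_blocked_threads() == 0)
    std::this_thread::yield();
  THD incoming = {42};
  bool handed_over = cache.wakeup_thread(&incoming);
  worker.join();
  return handed_over ? served : 0;
}
```

Keeping the counters, the waiting list and the condition variable behind one mutex inside the class is what lets callers use block_thread() and wakeup_thread() without knowing which lock guards which field.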
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.