WL#4601: Remove fastmutex from the server sources
Affects: Server-5.7 — Status: Complete — Priority: Medium
The MySQL SQL layer (i.e. not InnoDB) has a custom spin-lock mutex implementation called "fast mutex". This implementation is enabled by default on release builds (-DBUILD_CONFIG=mysql_release) for Linux. MySQL "fast" mutexes have a series of shortcomings and should be removed.

1) Spin-wait loops are hard to get right

Due to the superscalar nature of modern processors, busy waiting can incur significant costs if not done properly, as the processor generally needs to enforce certain constraints. How to properly delay the loop varies a lot between specific processors, and is a burden that should be handled by the system-provided implementation and not by us. For example, Intel suggests the use of the PAUSE instruction as a hint for the processor and highlights that it is very important for hyperthreaded CPUs. There are also specific delay strategies on ppc, s390, ia64, etc. More details in subsection 2.1:
http://www.intel.com/cd/ids/developer/asmo-na/eng/17689.htm

The point is that system-provided implementations know better how to deal with this, which may eventually lead to better performance and lower power consumption.

2) Useless on single processor systems

Busy waiting on single processor systems is completely useless, as the thread will spin on the CPU until its time quantum expires. System implementations of spinlocks/mutexes usually know when the code is built for, or is running on, a uniprocessor system and will skip the busy waiting.

3) Floating-point arithmetic

The "fast" mutex implementation relies on a PRNG to produce values for the spin delay. The formula for calculating the spin count includes a floating-point division operation that can be a bottleneck for multiprocessor systems that only have a single floating-point unit (FPU) (e.g. Sun SPARC Enterprise T5240).
4) Adaptive Mutex

Nowadays most mainstream systems implement adaptive locks that balance the benefits and disadvantages of busy waiting by spinning on a lock only for a limited period, and by not spinning at all if the lock is held by a thread that is not currently running. Solaris mutexes are adaptive by default, Linux provides an attribute to make mutexes adaptive, etc. Another advantage of relying on system-provided adaptive locks is that they usually offer environment variables or other means by which one can control the spin count, number of spinners, etc.

5) Starvation

Constantly polling the mutex with pthread_mutex_trylock without eventually blocking until the mutex becomes available may lead to resource starvation if there is high demand for the lock. Theoretically this may happen if the thread polling the mutex is the victim of unfavorable scheduling.

6) Offers no measurable advantage

Various user reports and benchmarks have shown that there is no measurable performance advantage when MySQL is compiled with "fast" mutex.

7) Naïveté

Pretending that we can implement faster mutexes with such simple code is just naive, and labeling them as "fast mutex" without any evidence is misleading to our users.

Associated bugs:
BUG#38941: fast mutexes in MySQL 5.1 have mutex contention when calling random()
BUG#37703: MySQL performance with and without Fast Mutexes using Sysbench Workload
BUG#72805: mutex_delay() creating excess memory traffic, GCC mem barrier needed
BUG#72806: mutex_delay() missing x86 pause instruction optimization
BUG#72807: Set thread priority in my_pthread_fastmutex_lock

User Documentation
==================
http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-8.html
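To make points 1, 2 and 5 concrete, the following is a minimal sketch of the kind of trylock polling loop a hand-rolled "fast" mutex has to maintain. It is not the removed MySQL code: the names cpu_relax, spin_then_block_lock, max_spins and spins_used are hypothetical, the spin bound is a fixed count rather than a PRNG-derived value, and the PAUSE hint is only wired up for x86 with GCC-compatible compilers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: processor hint for the body of a spin loop.
   Only x86 gets a real delay instruction (PAUSE); other GCC-compatible
   targets get a plain compiler barrier. Choosing the right hint per
   architecture is the maintenance burden described in point 1. */
static void cpu_relax(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause" ::: "memory");
#elif defined(__GNUC__)
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical helper: poll the mutex up to max_spins times, then fall
   back to a blocking pthread_mutex_lock(). The fallback is what keeps the
   loop from polling forever (point 5). On a uniprocessor every failed
   iteration is wasted work (point 2), and this code cannot tell.
   Stores the number of failed polls in *spins_used and returns 0 on
   success or the pthread error code from the blocking lock. */
static int spin_then_block_lock(pthread_mutex_t *m, unsigned max_spins,
                                unsigned *spins_used)
{
    unsigned i;

    assert(m != NULL && spins_used != NULL);

    for (i = 0; i < max_spins; i++) {
        if (pthread_mutex_trylock(m) == 0) {
            *spins_used = i;   /* acquired after i failed polls */
            return 0;
        }
        cpu_relax();
    }

    *spins_used = max_spins;   /* gave up spinning, block instead */
    return pthread_mutex_lock(m);
}
```

Even in this small form the loop has to pick a spin bound and a delay hint itself and has no knowledge of whether the lock holder is currently running, which is the information a system-provided adaptive mutex can use.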
NF-1: Removing FAST_MUTEX support should have no negative performance implications.
I-1: The WL will remove the WITH_FAST_MUTEXES CMake build option. Since this option was enabled by default for release builds on Linux, release builds will now use the default OS mutexes.
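As an illustration of what "default OS mutexes" means at the pthread level, and of the Linux attribute mentioned in point 4, here is a minimal sketch. The helper init_mutex and its parameters are hypothetical and are not part of the server sources; the sketch assumes glibc, where the adaptive type is exposed as PTHREAD_MUTEX_ADAPTIVE_NP, and silently falls back to the default type elsewhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* make the glibc declarations visible under strict -std= modes */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialize *m either with the platform defaults
   (want_adaptive == 0) or, where glibc is available, as an adaptive mutex
   that spins briefly before blocking. *got_adaptive is set to 1 only if
   the adaptive type was actually applied. Returns 0 or a pthread error. */
static int init_mutex(pthread_mutex_t *m, int want_adaptive, int *got_adaptive)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL && got_adaptive != NULL);
    *got_adaptive = 0;

    if (!want_adaptive)
        return pthread_mutex_init(m, NULL);   /* default OS mutex */

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

#if defined(__linux__) && defined(__GLIBC__)
    /* PTHREAD_MUTEX_ADAPTIVE_NP is an enum constant in glibc, not a macro,
       so it cannot be detected with #ifdef on the name itself. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (rc == 0)
        *got_adaptive = 1;
#endif

    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

Callers lock and unlock both variants with the same pthread_mutex_lock and pthread_mutex_unlock calls; only the initialization differs, and the spinning policy stays with the system library.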