WL#12616: InnoDB: Make number of PAUSES in spin loops configurable

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

When spinning in a loop to check whether a mutex or rw-lock has become free,
we usually pick a small random `delay` (say, a dice roll), multiply it by 50
and perform that many PAUSE instructions.
The comment for ut_delay() states that this procedure was calibrated on a
100 MHz Pentium. Things have changed since then.
In particular, Skylake processors have a much slower (~15x) PAUSE instruction
than other machines.
The purpose of this WL is to let the end user configure this algorithm, so
that it is better suited to the particular processor.

FR1. By default, the patch should not affect the system in any way (in
particular, the default value of the new sys var should be chosen so as to
reproduce the old behaviour), so that upgrading will not accidentally change
how the system behaves.

FR2. It must be possible to configure InnoDB running on a system on which a
PAUSE instruction takes 10x as long as on a "regular" system in such a way
that the observed duration of the performed chains of PAUSES is the same as it
would be on a regular system. In other words, if on a "regular" system the
observed duration follows the unif{0,6}*50*Regular_pause_duration
distribution, then we want a similar distribution on a system where a single
PAUSE takes Regular_pause_duration*10, so that we can support processor
architectures with much slower PAUSES.
(NOTE: arguably this is difficult to test using "external" tools, as opposed
to debugging or reading the code. Ideas on how to test FR2 are welcome.)
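
The arithmetic behind FR2 can be checked with a small C++ sketch. This is not
InnoDB code: spin_round_cost() and matches_regular_system() are hypothetical
helpers introduced only for illustration, and all durations are expressed in
units of Regular_pause_duration. Under those assumptions, it suggests that a
multiplier of 50/10 = 5 on the 10x slower system gives the same duration for
every possible `delay` as the constant 50 does on a regular system.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of InnoDB): duration of one chain of PAUSES,
// in units of Regular_pause_duration.
//   delay      - the random value drawn from 0..6
//   multiplier - number of PAUSES performed per unit of delay (50 today)
//   pause_cost - how many times slower a single PAUSE is than on a
//                "regular" system (1 for regular, 10 for the FR2 scenario)
std::uint64_t spin_round_cost(std::uint64_t delay, std::uint64_t multiplier,
                              std::uint64_t pause_cost) {
  return delay * multiplier * pause_cost;
}

// Returns true when, for every delay in 0..max_delay, a system with the given
// pause_cost and tuned_multiplier spends as long as a regular system does
// with the original constant 50.
bool matches_regular_system(std::uint64_t max_delay,
                            std::uint64_t tuned_multiplier,
                            std::uint64_t pause_cost) {
  assert(pause_cost > 0);
  for (std::uint64_t delay = 0; delay <= max_delay; ++delay) {
    if (spin_round_cost(delay, 50, 1) !=
        spin_round_cost(delay, tuned_multiplier, pause_cost)) {
      return false;
    }
  }
  return true;
}
```

For example, matches_regular_system(6, 5, 10) holds, while leaving the
multiplier at 50 on the slower system does not. A check of this kind only
covers the arithmetic; it does not measure real PAUSE latency.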


FR3. The end user should be able to fine-tune the system (w.r.t. the number of
PAUSES per spin) in real time on a running InnoDB instance, so that they can
take a more end-to-end view of performance, as it is not clear that either the
original value of 50 or the suggested value of 50/10 is the actual optimum.

When spinning in a loop to check whether a mutex or rw-lock has become free,
we usually pick a number `delay` from the range 0..@@innodb_spin_wait_delay
uniformly at random, and pass it to ut_delay(delay), which performs 50*delay
calls to UT_RELAX_CPU() (which translates to PAUSE).
The comment for ut_delay() states that it was calibrated on a 100 MHz Pentium.

Things have changed since then. 
In particular, Skylake processors have a much slower (~15x) PAUSE instruction
than other machines, which means that to achieve a similar delay we would need
to reduce @@innodb_spin_wait_delay 15 times, but this is:

a) not possible, as its default value is 6, and thus 6/15 after rounding is
either 0 or 1,
b) not equivalent, as the goal of randomization seems to be to decorrelate
multiple waiters, and if we shrink the range from which the random `delay` is
picked, then the probability of a collision (picking the same number as
another thread) is higher, and
c) not fine-grained enough, as the granularity of "50" is unaffected, so one
cannot get values smaller than 50 but larger than 0.
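
Points a) and b) can be made concrete with a short C++ sketch. The helpers
scaled_delay_range() and collision_probability() are hypothetical and exist
only for this illustration; the sketch assumes that `delay` is drawn uniformly
from the inclusive range 0..max_delay and that two waiters draw independently.

```cpp
#include <cassert>

// Hypothetical helper: the largest delay left after dividing the configured
// range by the PAUSE slowdown factor, using truncating integer division.
unsigned scaled_delay_range(unsigned max_delay, unsigned slowdown) {
  assert(slowdown > 0);
  return max_delay / slowdown;
}

// Hypothetical helper: probability that two independent waiters pick the
// same delay when each draws uniformly from 0..max_delay (inclusive),
// i.e. from max_delay + 1 equally likely values.
double collision_probability(unsigned max_delay) {
  return 1.0 / (static_cast<double>(max_delay) + 1.0);
}
```

With the default range 0..6 there are seven possible values, so two waiters
collide with probability 1/7. Shrinking the range to 0..1 raises that to 1/2,
and truncating 6/15 to 0 leaves no randomization at all.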

The constant 50 should instead be platform-dependent (auto-tuning) or
configurable (a new dynamic sys var). This would allow running our software on
a variety of processors that differ in their implementation of the PAUSE
instruction. It would also let the user fine-tune the value based on
real-world data (say, by observing the lock contention, or the time wasted in
ut_delay() while the server performs typical transactions).


New system variables introduced:
================================
@@innodb_spin_wait_pause_multiplier 
- a global, dynamic, system variable, ranging from 0 to 100, with default 
equal to the backward-compatible value of 50
- can be accessed like this:
  SET GLOBAL innodb_spin_wait_pause_multiplier = 10;
  SELECT @@global.innodb_spin_wait_pause_multiplier;

This WL will modify the `ulint ut_delay(ulint delay)` function, which is used
to perform PAUSE instructions.
Currently it performs delay*50 PAUSES.
The new implementation will replace the constant 50 with a dynamic system
variable whose default value is 50.

So, there are two changes needed:
1. introduction of a new dynamic innodb-specific system variable
2. modification of the waiting loop so that the variable's current value is used 
instead of 50

Some care must be taken so that reading the current value of the sys var does
not itself unintentionally affect the performance of the loop. A sufficient
solution is to use a non-atomic read, and to do it only once, before the loop.
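
A minimal C++ sketch of the intended loop shape follows. It is not the actual
InnoDB implementation: ut_delay_sketch(), relax_cpu_sketch() and
spin_wait_pause_multiplier_sketch are hypothetical names, the global stands in
for the new sys var, and a compiler-only fence stands in for UT_RELAX_CPU()
so that the sketch stays portable. The point it illustrates is the single
plain read of the multiplier before the loop.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for @@innodb_spin_wait_pause_multiplier.
// The default of 50 mirrors the old hard-coded constant.
unsigned long spin_wait_pause_multiplier_sketch = 50;

// Hypothetical stand-in for UT_RELAX_CPU(). Real code would issue a PAUSE
// instruction here; this sketch only emits a compiler-level fence.
inline void relax_cpu_sketch() {
  std::atomic_signal_fence(std::memory_order_seq_cst);
}

// Performs delay * multiplier relax steps and returns how many were done.
unsigned long ut_delay_sketch(unsigned long delay) {
  // Plain (non-atomic) read, done exactly once before the loop, so the loop
  // body never touches the shared variable.
  const unsigned long multiplier = spin_wait_pause_multiplier_sketch;
  assert(multiplier <= 100);  // documented range of the variable is 0..100

  const unsigned long iterations = delay * multiplier;
  unsigned long performed = 0;
  for (unsigned long i = 0; i < iterations; ++i) {
    relax_cpu_sketch();
    ++performed;
  }
  return performed;
}
```

Because the value is copied into a local before the loop, a concurrent
SET GLOBAL takes effect only on the next call, which seems acceptable for a
tuning knob. With the default of 50, ut_delay_sketch(6) performs 300 steps;
after setting the stand-in variable to 5 it performs 30.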