WL#4439: Multi Range Read implementation for Falcon

Affects: Server-9.x   —   Status: Un-Assigned   —   Priority: Medium

Implement Multi Range Read (MRR) for the Falcon storage engine. Falcon
already does something similar to DS-MRR [*] internally, but it is limited
to individual ranges/index lookups.

We could extend this to work across multiple ranges.

[*] See WL#2475 

1. Why this is needed and existing MRR roots in Falcon
2. Falcon/MRR implementation
2.1 Use Falcon's own memory
2.2 No scan interruption/continuation
2.3 Range association done by SQL layer
2.4 Falcon/MRR scan may return out-of-range records


1. Why this is needed and existing MRR roots in Falcon

When MySQL asks falcon to scan a range by calling


Falcon will read records in record number order - it will first
collect record numbers in a sparse bitmap and then retrieve the records.
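
The two-phase read described above can be sketched as follows. This is a minimal illustration, not Falcon's code: the in-memory `index` and `table` structures and the `sweep_read` function are hypothetical, and a Python set stands in for the sparse bitmap.

```python
# Hypothetical in-memory model; these are not Falcon's data structures.
# index: list of (key, record_number) pairs; table: record_number -> row.

def sweep_read(index, table, low, high):
    """Scan one key range and return rows in record-number order."""
    # Phase 1: walk the index and collect matching record numbers.
    # A set stands in for the sparse bitmap.
    bitmap = set()
    for key, recno in index:
        if low <= key <= high:
            bitmap.add(recno)
    # Phase 2: fetch the records in ascending record-number order.
    return [table[recno] for recno in sorted(bitmap)]


table = {1: "Orly", 2: "Logan", 3: "Vnukovo", 4: "Charles De Gaulle"}
index = [("Boston", 2), ("Moscow", 3), ("Paris", 4), ("Paris", 1)]
rows = sweep_read(index, table, "Paris", "Paris")
```

Here the index yields record 4 before record 1, but the rows come back in record-number order (1, then 4).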

When one runs a join

SELECT * FROM City, Airport WHERE City.ID = Airport.City

the handler calls will be as follows:

  city->rnd_next() = 'Boston';
    airport->index_read('Boston') = 'Logan'
    airport->index_next_same() = EOF
  city->rnd_next() = 'Moscow';
    airport->index_read('Moscow') = 'Sheremetyevo'
    airport->index_next_same() = 'Vnukovo'
    airport->index_next_same() = 'Domodedovo'
    airport->index_next_same() = EOF

  city->rnd_next() = 'Paris';
    airport->index_read('Paris') = 'Charles De Gaulle'
    airport->index_next_same() = 'Orly'
    airport->index_next_same() = EOF

  city->rnd_next() = 'Danvers';
    airport->index_read('Danvers') = EOF
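
The call pattern above is what a plain nested-loop join produces: one equality lookup on the inner table per outer row. A minimal sketch, assuming a hypothetical dictionary in place of the Airport index (none of these names come from the server code):

```python
def nested_loop_join(cities, airport_index):
    """Return (lookups, rows); one equality lookup is issued per outer row."""
    lookups, rows = [], []
    for city in cities:                               # city->rnd_next()
        lookups.append(city)                          # airport->index_read(city)
        for airport in airport_index.get(city, []):   # index_next_same() until EOF
            rows.append((city, airport))
    return lookups, rows


cities = ["Boston", "Moscow", "Paris", "Danvers"]
airport_index = {
    "Boston": ["Logan"],
    "Moscow": ["Sheremetyevo", "Vnukovo", "Domodedovo"],
    "Paris": ["Charles De Gaulle", "Orly"],
}
lookups, rows = nested_loop_join(cities, airport_index)
```

Four outer rows produce four separate single-key ranges, each returning at most three records.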

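Lots of ranges with a small number of records in each (typical for joins)
will break Falcon's sweep-read strategy: each range gets its own
collect-and-retrieve pass, so records are no longer retrieved in record
number order across the scan as a whole.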

/* Btw, customers do file CSC cases with *huge* ranges, e.g.
   SELECT * FROM tbl WHERE key IN (val1, ......, val10K) */

2. Falcon/MRR implementation
The SQL layer part is ready: the MRR interface (WL#2474) and Batched
Key Access (WL#2771) already exist. We just need to extend Falcon's sweep-read
functionality to span across ranges.
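
To illustrate what spanning across ranges changes, the sketch below compares the order in which record numbers are visited when each range gets its own sweep versus when one bitmap is shared by all ranges. The data and function names are hypothetical, and a set again stands in for the sparse bitmap.

```python
def collect(index, low, high):
    """Record numbers of index entries whose key lies in [low, high]."""
    return {recno for key, recno in index if low <= key <= high}

def per_range_order(index, ranges):
    """Visit order when every range is swept separately."""
    order = []
    for low, high in ranges:
        order.extend(sorted(collect(index, low, high)))
    return order

def multi_range_order(index, ranges):
    """Visit order when a single bitmap accumulates all ranges first."""
    bitmap = set()
    for low, high in ranges:
        bitmap |= collect(index, low, high)
    return sorted(bitmap)


index = [("Boston", 7), ("Moscow", 2), ("Moscow", 9), ("Moscow", 4),
         ("Paris", 8), ("Paris", 1)]
ranges = [("Boston", "Boston"), ("Moscow", "Moscow"),
          ("Paris", "Paris"), ("Danvers", "Danvers")]
per_range = per_range_order(index, ranges)   # [7, 2, 4, 9, 1, 8]
multi = multi_range_order(index, ranges)     # [1, 2, 4, 7, 8, 9]
```

With per-range sweeps the record numbers jump back and forth; with a shared bitmap they are visited in one ascending pass.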

2.1 Use Falcon's own memory
The MRR interface is designed with centralized buffer space management in
mind - the SQL layer is supposed to allocate a single MRR buffer and then
share it with all involved storage engines and tables (this is a step
towards being able to control memory usage).

Decision made at the Manchester, MA meeting: Falcon will still use its own
internal memory buffers, like it does for regular range/ref scans. This is
more in line with Falcon's architecture. The case when there is not enough
memory for the record bitmap is considered extreme; we intentionally do not
handle it (see all arguments re 64-bit architectures, Linux's overcommit, etc.).

2.2 No scan interruption/continuation
Following the direction of the previous decision: since we assume there is
enough memory for the record buffer, the entire Falcon/MRR scan can be done 
in one "collect rowids, retrieve records" iteration.

This makes the Falcon/MRR code much simpler, as Falcon has no natural way to
interrupt an index scan and then continue it (Sergey P tried to add this and
it proved to be non-trivial; Ann has confirmed that it is non-trivial).

2.3 Range association done by SQL layer
Falcon accumulates rowids (Falcon's term is "record numbers") in a sparse
bitmap, which means that at the record retrieval stage it won't be able to
tell which range the record is expected to be in.

The obvious resolution is to add code that uses a hash table to determine
which range a retrieved record falls into (a hash table is enough because
we're primarily interested in equality ranges).

This will be done at the SQL layer. Falcon/MRR will only set the
MRR_NO_ASSOCIATION flag to signal that it is not able to do range association.
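
A minimal sketch of hash-based range association for equality ranges follows. It is illustrative only: `build_range_map`, `associate` and the record layout are hypothetical and are not part of the MRR interface. Each key maps to a list of range positions because the same key value may be requested by more than one range.

```python
from collections import defaultdict

def build_range_map(keys):
    """Map each equality-range key to the positions of the ranges using it."""
    range_map = defaultdict(list)
    for range_id, key in enumerate(keys):
        range_map[key].append(range_id)
    return range_map

def associate(records, range_map, key_of):
    """Yield (range_id, record) pairs for records arriving in any order."""
    for record in records:
        # .get() does not create entries for keys that are missing.
        for range_id in range_map.get(key_of(record), ()):
            yield range_id, record


keys = ["Boston", "Moscow", "Paris", "Danvers"]
range_map = build_range_map(keys)
# Records in the order an engine without association might return them.
records = [("Orly", "Paris"), ("Logan", "Boston"), ("Vnukovo", "Moscow")]
pairs = list(associate(records, range_map, key_of=lambda r: r[1]))
```

Each record is matched back to its range by one hash lookup on its key value, regardless of the order in which the engine returned it.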

2.4 Falcon/MRR scan may return out-of-range records
See previous section. In Falcon, it might turn out that the record has been
altered by another transaction and is now outside of the scanned range.

For single range scans, Falcon code checks if the record falls within the
range before returning it. In an MRR scan Falcon will not be able to do that
because it would not know which of the ranges the record is supposed to be in.

Since range association is done at the SQL layer, the check that a returned
record falls within one of the scanned ranges will be done there as well.

That is, Falcon/MRR scan may return records that do not fall into any range.
The caller is expected to filter them out.
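
The caller-side filtering could look like the sketch below, assuming equality ranges only. The function name and record layout are hypothetical; the "Tegel" row stands for a record whose key another transaction changed after the index scan collected its record number.

```python
def filter_in_range(records, range_keys, key_of):
    """Split records into those matching some scanned key and the rest."""
    wanted = set(range_keys)
    kept, dropped = [], []
    for record in records:
        (kept if key_of(record) in wanted else dropped).append(record)
    return kept, dropped


range_keys = ["Boston", "Moscow", "Paris", "Danvers"]
# The second record no longer matches any scanned range.
records = [("Logan", "Boston"), ("Tegel", "Berlin"), ("Orly", "Paris")]
kept, dropped = filter_in_range(records, range_keys, key_of=lambda r: r[1])
```

Only the records whose current key value belongs to a scanned range are passed on; the altered record is discarded.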

None needed as HLS is detailed enough.