WL#3599: Benchmark demonstrating the performance of loose index scan for aggregate functions with distinct

Affects: Server-6.0   —   Status: Complete   —   Priority: Low

Currently MySQL executes COUNT(DISTINCT expr1, ..., exprN) by executing
a table or index scan that fills a temporary data structure (table or
tree) with all distinct values, and then returning the count of the
distinct values in that temporary data structure.

In cases when there are applicable indexes it is possible to execute
such queries much faster via the "loose index scan" technique employed
for execution of GROUP BY and DISTINCT queries, as implemented by
WL#1724.

The goals of this task are :

1. Measure the performance benefits realized by the implementation of 
   WL#3220 "Loose index scan for aggregate functions"
2. Present these measurements in a ready to use graphical form.
Benchmark task; see LLD for details.
No further HLS needed.
1. Introduction

When executing AGGREGATE_FUNC(DISTINCT ...) MySQL may use index scan
to go over the input rows.
This benchmark compares performing full index scan to retrieve the 
input rows of AGGREGATE_FUNC(DISTINCT ...) with "loose index scan" (1).
Loose index scan uses an index to "jump" on the start of each matching
group of rows. Then it goes over the qualifying subset of rows, skips 
the rest of the rows and jumps on the next group. The index used may 
include additional columns after the grouping columns.
In the worst case (when there are no rows to skip) loose index scan 
will simply go through all the rows in the index, witch is a
equivalent of a full index scan.
In both cases either an in-memory structure or a disk based
structure can be used to accumulate the distinct rows. 

We measure how query execution times change when using loose index 
scan compared to using full index scan. 
This change (percent, %) is measured as a function of the following 
parameters :
 - Result set size - rows
 - Distinct values in the result set - percent (0-100)

 - Location of the DISTINCT table (memory or disk)

2. Experimental setup

2.1 Test Database

We use the following MyISAM tables :
create table t11 (a char(100), KEY (a));
create table t12 (a char(100), KEY (a));
create table t13 (a char(100), KEY (a));
create table t14 (a char(100), KEY (a));
create table t21 (a char(100), KEY (a));
create table t22 (a char(100), KEY (a));
create table t23 (a char(100), KEY (a));
create table t24 (a char(100), KEY (a));
create table t31 (a char(100), KEY (a));
create table t32 (a char(100), KEY (a));
create table t33 (a char(100), KEY (a));
create table t34 (a char(100), KEY (a));
create table t41 (a char(100), KEY (a));
create table t42 (a char(100), KEY (a));
create table t43 (a char(100), KEY (a));
create table t44 (a char(100), KEY (a));

We populate the tables according to the following table :
-----------------------------
|Rows   |distinct values, % |
|power  |-------------------|
|of two |  10|  35|  60|  85| 
|-------|----|----|----|----|
|   20  | t11| t12| t13| t14|
|   21  | t21| t22| t23| t24|   
|   22  | t31| t32| t33| t34|
|   23  | t41| t42| t43| t44|
-----------------------------

(e.g. we fill 7 million rows in t43 and we use 5 033 164 
distinct values evenly distributed : 5 033 164 is 60% of 
8 388 608 (2^23) rows).
This makes the index of the tables to occupy approximately 
from 100MB (for t11) to 800M (for t44).

2.2 Server Configuration

We make sure the whole index will fit into memory cache 
(key_buffer_size > index size). We set "key_buffer_size=900M" 
to be on the safe side.

We examine cases where all the distinct values can fit into
memory and cases where they must be written to disk. 

We use 10%-85% (step 25%) distinct values (in uniform groups) 
for values of column 'a'. This makes for distinct values sizes
from 10M (for t11) to 680M (for t44).


we set for the in-memory graphic:
max_heap_table_size=700M

and for the disk graphic:
max_heap_table_size=9M


2.3 Test Queries

The optimization applies to AVG(DISTINCT), SUM(DISTINCT) and 
COUNT(DISTINCT).
We use the following two sets of queries. We must measure
both AVG and COUNT because of the different implementation of 
AVG/SUM(DISTINCT) and COUNT(DISTINCT) : COUNT only uses the 
number of elements from the DISTINCT calculation structure 
and does not iterate over the distinct rows, whereas AVG/SUM 
iterates.


Loose index scan:
SELECT COUNT(DISTINCT a) FROM t1;
SELECT AVG(DISTINCT a) FROM t1;


Full index scan:
SELECT COUNT(DISTINCT a + 1) FROM t1;
SELECT AVG(DISTINCT a + 1) FROM t1;


2.4 Hardware Requirements

The test must be run on a box that has at least 1.8G of physical 
memory available to mysqld (as we use 900M for key cache and 
max. 680M of memory for distinct rows) so we don't go to OS swap.

3. Benchmark implementation/execution

The benchmark will consist of the following steps in a perl 
script with two command line options (on by default):
 --load : execute 3.1
 --run  : execute 3.2

3.1 Fill the tables. We use queries like e.g:
  INSERT INTO t43 VALUES (5033164 * RAND());
  
  INSERT INTO t43 SELECT 5033164 * RAND();
  (repeat 22 times until we get 8 388 608 rows)
 
3.2 Perform the measurement (pseudo-code) :
  
 For each table <tij> where i = 1..4 and j = 1..4) do
   set num_rows :=  from table in 2.1.
   set distinct_pct:= from table in 2.1.
  
   execute "FLUSH QUERY CACHE"  -- free the cache
   execute "LOAD INDEX INTO CACHE tij"
   For <heap_table_size> = 9M,700M do

     set <out_file> = if (<heap_table_size> == 9M, 
                          "disk", "memory")
     execute "set @@max_heap_table_size = <heap_table_size>"

     -- warm up the cache (loose)
     execute "SELECT COUNT(DISTINCT a) FROM tij;"

     -- measure loose index scan
     start_time := now();
     execute "SELECT COUNT(DISTINCT a) FROM tij;"
     duration_loose_count:= now() - start_time(); 
     output to <out_file>"_count_loose.csv" (
        num_rows, distinct_pct, duration_loose_count)

     -- warm up the cache (loose)
     execute "SELECT AVG(DISTINCT a) FROM tij;"
     -- measure loose index scan
     start_time := now();
     execute "SELECT AVG(DISTINCT a) FROM tij;"
     duration_loose_avg:= now() - start_time();
     output to <out_file>"_avg_loose.csv" (
        num_rows, distinct_pct, duration_loose_avg)
      
     -- warm up the cache (full)
     execute "SELECT COUNT(DISTINCT a + 1) FROM tij;"
 
     -- measure full index scan
     start_time := now();
     execute "SELECT COUNT(DISTINCT a + 1) FROM tij;"
     duration_full_count := now() - start_time();
     output to <out_file>"_count_full.csv" (
        num_rows, distinct_pct, duration_full_count)

     -- warm up the cache (full)
     execute "SELECT AVG(DISTINCT a + 1) FROM tij;"
     -- measure full index scan
     start_time := now();
     execute "SELECT AVG(DISTINCT a + 1) FROM tij;"
     duration_full_avg := now() - start_time();
     output to <out_file>"_avg_full.csv" (
        num_rows, distinct_pct, duration_full_avg)

     -- output summary
     output to <out_file>"_count_summary.csv" (
              num_rows, distinct_pct, 
              (<duration_full_count> - <duration_loose_count>) / 
              <duration_full_count> * 100)  
     output to <out_file>"_avg_summary.csv" (
              num_rows, distinct_pct, 
              (<duration_full_avg> - <duration_loose_avg>) / 
              <duration_full_avg> * 100)  
   Next <heap_table_size>
 Next <table>

Two sets of six files in csv format will be produced by the 
script for further processing by MS Excel / oCalc : one for 
each polygon.

 

4. Outcome

We make 3D polygons where we have :
 
X - Number of processed rows
Y - % of distinct rows in X
Z - query execution time/difference in times, %

X and Y depend on the table being used. see the table in 2.1 for 
values.

We measure 4 points for each of the X and Y axis, making 16 points 
in total per polygon.

For each aggregate function we have 6 output files (in two groups). 
First group is contains absolute data (i.e. times for the Z axis): 
 - disk_loose.csv : loose index scan with DISTINCT on disk
 - disk_full.csv : full index scan with DISTINCT on disk
 - memory_loose.csv : loose index scan with DISTINCT in memory
 - memory_full.csv : full index scan with DISTINCT in memory

The second group contains time gains (loose vs. full index scan)
 in percents for Z :
 - memory_summary.csv : DISTINCT in memory
 - disk_summary.csv : DISTINCT on disk  

4.1 For each aggregate function two 3D plots will be made : 
 "Disk" (for disk based DISTINCT) and "Memory" (for memory based 
  DISTINCT). Each plot will have 2 polygons : for loose and full 
  index scan. 

For "Disk" we draw 2 polygons out of files "disk_loose.csv" and 
"disk_full.csv".

For "Memory" we draw 2 polygons out of files "memory_loose.csv" 
and "memory_loose.csv".

4.2 For each aggregate function a third "Summary" plot will be made. 
The third plot will also have two polygons : for disk and memory 
based DISTINCT.

We use "disk_summary.csv" and "memory_summary.csv" to draw the two
polygons.

References:
1. http://dev.mysql.com/doc/refman/5.1/en/loose-index-scan.html