WL#11322: SUPPORT LOOSE INDEX RANGE SCANS FOR LOW CARDINALITY

Affects: Server-8.0   —   Status: Complete

Based on the patch provided by Facebook.

This WL improves the optimizer range scans for certain queries. Suppose we
have the following:

CREATE TABLE t1 (f1 INT NOT NULL, f2 INT NOT NULL, PRIMARY KEY(f1, f2));
INSERT INTO t1 VALUES
(1,1), (1,2), (1,3), (1,4), (1,5),
(2,1), (2,2), (2,3), (2,4), (2,5);
INSERT INTO t1 SELECT f1, f2 + 5 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 10 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 20 FROM t1;
INSERT INTO t1 SELECT f1, f2 + 40 FROM t1;
ANALYZE TABLE t1;

EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;

MySQL currently chooses index scan to fetch all the rows and later filters the
rows in server using predicate f2 > 40. A range scan cannot be used for this 
case because there is no predicate on the first key-part (f1) of the index. 
However optimizer can be smarter than this. It can use the current loose index
scan to scan for distinct values of the first key_part (f1) and do a sub-range
scan on this distinct prefix for f2 > 40." 

Simplified algorithm for the case above:

1. Get distinct value of the first key part(f1 = 1).
2. Construct the range based on the first and
   second key part(f1 = 1 AND f2 > 40)
3. Do range scan.
4. Get next distinct value of the first key part(f1 = 2).
5. Construct the range based on the first and
   second key part(f1 = 2 AND f2 > 40)
4. Do range scan.

In this case number of accessed rows are decreased as MySQL skips the records
which do not qualify for  the constructed range.

Example:

Without skip scan algorithm:
---
set optimizer_switch='skip_scan=off';
EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	t1	NULL	index	NULL	PRIMARY	8	NULL	160	33.33	Using where; Using index
Warnings:
Note	1003	/* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`f2` AS `f2` from `test`.`t1` where (`test`.`t1`.`f2` > 40)

SHOW STATUS LIKE 'handler_read%';
Variable_name	Value
Handler_read_first	1
Handler_read_key	1
Handler_read_last	0
Handler_read_next	160
Handler_read_prev	0
Handler_read_rnd	0
Handler_read_rnd_next	0
---

With skip scan algorithm:
---
set optimizer_switch='skip_scan=on';
EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	t1	NULL	range	PRIMARY	PRIMARY	8	NULL	53	100.00	Using where; Using index for skip scan
Warnings:
Note	1003	/* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`f2` AS `f2` from `test`.`t1` where (`test`.`t1`.`f2` > 40)

SHOW STATUS LIKE 'handler_read%';
Variable_name	Value
Handler_read_first	1
Handler_read_key	5
Handler_read_last	0
Handler_read_next	80
Handler_read_prev	0
Handler_read_rnd	0
Handler_read_rnd_next	0
---

See also See BUG#88103.