WL#13377: Add support for hash outer, anti and semi join

Affects: Server-8.0   —   Status: Complete

WL#2241 added support for hash inner join as a replacement for block nested-
loop. This worklog aims to implement the remaining types of joins supported in 
mysql; outer-, anti- and semijoin. As with WL#2241, this worklog will simply 
replace block nested-loop with hash join. The optimizer will still generate 
execution plans thinking it will execute a block nested-loop.

In WL#2241, we chose not to replace block nested-loop with hash join if the only 
join condition was a non equi-join. We will lift this restriction in this 
worklog. This means that all queries using BNL will execute using hash join 
given that the query does not contain anything that the iterator executor does 
not yet support (i.e. BKA outer join).

Example plans

Inner non equi-join

EXPLAIN FORMAT=tree SELECT * FROM t1 JOIN t2 ON t1.col1 < t2.col1;
-> Filter: (t1.col1 < t2.col1)  (cost=0.70 rows=1)
    -> Inner hash join  (cost=0.70 rows=1)
        -> Table scan on t2  (cost=0.35 rows=1)
        -> Hash
            -> Table scan on t1  (cost=0.35 rows=1)

Semijoin

EXPLAIN FORMAT=tree SELECT * FROM t1 WHERE (t1.i) IN (SELECT t2.i FROM t2);
-> Hash semijoin (t2.i = t1.i)  (cost=0.83 rows=2)
    -> Table scan on t1  (cost=0.45 rows=2)
    -> Hash
        -> Table scan on t2  (cost=0.18 rows=2)

Antijoin

EXPLAIN FORMAT=tree
  SELECT * FROM t2 WHERE NOT EXISTS (SELECT * FROM t1 WHERE t1.col1 = t2.col1);
-> Hash anti-join (t1.col1 = t2.col1)  (cost=1.15 rows=6)
    -> Table scan on t2  (cost=0.55 rows=3)
    -> Hash
        -> Table scan on t1  (cost=0.30 rows=2)

Left outer join

EXPLAIN FORMAT=tree SELECT * FROM t1 LEFT JOIN t2 ON t1.col1 = t2.col1;
-> Left hash join (t2.col1 = t1.col1)  (cost=0.63 rows=2)
    -> Table scan on t1  (cost=0.45 rows=2)
    -> Hash
        -> Table scan on t2  (cost=0.18 rows=1)

Right outer join
Note that MySQL rewrites all right outer joins to left outer joins.

EXPLAIN FORMAT=tree SELECT * FROM t1 LEFT JOIN t2 ON t1.col1 = t2.col1;
 -> Left hash join (t1.col1 = t2.col1)  (cost=0.70 rows=1)
    -> Table scan on t2  (cost=0.35 rows=1)
    -> Hash
        -> Table scan on t1  (cost=0.35 rows=1)