Table expressions in the FROM
clause of a
query are simplified in many cases.
At the parser stage, queries with right outer join operations are converted to equivalent queries containing only left join operations. In the general case, the conversion is performed such that this right join:
(T1, ...) RIGHT JOIN (T2, ...) ON P(T1, ..., T2, ...)
Becomes this equivalent left join:
(T2, ...) LEFT JOIN (T1, ...) ON P(T1, ..., T2, ...)
All inner join expressions of the form T1 INNER JOIN
T2 ON P(T1,T2)
are replaced by the list
T1,T2
, P(T1,T2)
being
joined as a conjunct to the WHERE
condition
(or to the join condition of the embedding join, if there is
any).
When the optimizer evaluates plans for outer join operations, it takes into consideration only plans where, for each such operation, the outer tables are accessed before the inner tables. The optimizer choices are limited because only such plans enable outer joins to be executed using the nested-loop algorithm.
Consider a query of this form, where R(T2)
greatly narrows the number of matching rows from table
T2
:
SELECT * T1 FROM T1
LEFT JOIN T2 ON P1(T1,T2)
WHERE P(T1,T2) AND R(T2)
If the query is executed as written, the optimizer has no
choice but to access the less-restricted table
T1
before the more-restricted table
T2
, which may produce a very inefficient
execution plan.
Instead, MySQL converts the query to a query with no outer
join operation if the WHERE
condition is
null-rejected. (That is, it converts the outer join to an
inner join.) A condition is said to be null-rejected for an
outer join operation if it evaluates to
FALSE
or UNKNOWN
for any
NULL
-complemented row generated for the
operation.
Thus, for this outer join:
T1 LEFT JOIN T2 ON T1.A=T2.A
Conditions such as these are null-rejected because they cannot
be true for any NULL
-complemented row (with
T2
columns set to NULL
):
T2.B IS NOT NULL
T2.B > 3
T2.C <= T1.C
T2.B < 2 OR T2.C > 1
Conditions such as these are not null-rejected because they
might be true for a NULL
-complemented row:
T2.B IS NULL
T1.B < 3 OR T2.B IS NOT NULL
T1.B < 3 OR T2.B > 3
The general rules for checking whether a condition is null-rejected for an outer join operation are simple:
It is of the form
A IS NOT NULL
, whereA
is an attribute of any of the inner tablesIt is a predicate containing a reference to an inner table that evaluates to
UNKNOWN
when one of its arguments isNULL
It is a conjunction containing a null-rejected condition as a conjunct
It is a disjunction of null-rejected conditions
A condition can be null-rejected for one outer join operation
in a query and not null-rejected for another. In this query,
the WHERE
condition is null-rejected for
the second outer join operation but is not null-rejected for
the first one:
SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
LEFT JOIN T3 ON T3.B=T1.B
WHERE T3.C > 0
If the WHERE
condition is null-rejected for
an outer join operation in a query, the outer join operation
is replaced by an inner join operation.
For example, in the preceding query, the second outer join is null-rejected and can be replaced by an inner join:
SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
INNER JOIN T3 ON T3.B=T1.B
WHERE T3.C > 0
For the original query, the optimizer evaluates only plans
compatible with the single table-access order
T1,T2,T3
. For the rewritten query, it
additionally considers the access order
T3,T1,T2
.
A conversion of one outer join operation may trigger a conversion of another. Thus, the query:
SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
LEFT JOIN T3 ON T3.B=T2.B
WHERE T3.C > 0
Is first converted to the query:
SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
INNER JOIN T3 ON T3.B=T2.B
WHERE T3.C > 0
Which is equivalent to the query:
SELECT * FROM (T1 LEFT JOIN T2 ON T2.A=T1.A), T3
WHERE T3.C > 0 AND T3.B=T2.B
The remaining outer join operation can also be replaced by an
inner join because the condition T3.B=T2.B
is null-rejected. This results in a query with no outer joins
at all:
SELECT * FROM (T1 INNER JOIN T2 ON T2.A=T1.A), T3
WHERE T3.C > 0 AND T3.B=T2.B
Sometimes the optimizer succeeds in replacing an embedded outer join operation, but cannot convert the embedding outer join. The following query:
SELECT * FROM T1 LEFT JOIN
(T2 LEFT JOIN T3 ON T3.B=T2.B)
ON T2.A=T1.A
WHERE T3.C > 0
Is converted to:
SELECT * FROM T1 LEFT JOIN
(T2 INNER JOIN T3 ON T3.B=T2.B)
ON T2.A=T1.A
WHERE T3.C > 0
That can be rewritten only to the form still containing the embedding outer join operation:
SELECT * FROM T1 LEFT JOIN
(T2,T3)
ON (T2.A=T1.A AND T3.B=T2.B)
WHERE T3.C > 0
Any attempt to convert an embedded outer join operation in a
query must take into account the join condition for the
embedding outer join together with the
WHERE
condition. In this query, the
WHERE
condition is not null-rejected for
the embedded outer join, but the join condition of the
embedding outer join T2.A=T1.A AND
T3.C=T1.C
is null-rejected:
SELECT * FROM T1 LEFT JOIN
(T2 LEFT JOIN T3 ON T3.B=T2.B)
ON T2.A=T1.A AND T3.C=T1.C
WHERE T3.D > 0 OR T1.D > 0
Consequently, the query can be converted to:
SELECT * FROM T1 LEFT JOIN
(T2, T3)
ON T2.A=T1.A AND T3.C=T1.C AND T3.B=T2.B
WHERE T3.D > 0 OR T1.D > 0