Documentation Home
MySQL 8.0 Reference Manual
Related Documentation Download this Manual
PDF (US Ltr) - 43.1Mb
PDF (A4) - 43.2Mb
Man Pages (TGZ) - 295.4Kb
Man Pages (Zip) - 400.6Kb
Info (Gzip) - 4.3Mb
Info (Zip) - 4.3Mb
Excerpts from this Manual

10.2.1.10 Outer Join Simplification

Table expressions in the FROM clause of a query are simplified in many cases.

At the parser stage, queries with right outer join operations are converted to equivalent queries containing only left join operations. In the general case, the conversion is performed such that this right join:

(T1, ...) RIGHT JOIN (T2, ...) ON P(T1, ..., T2, ...)

Becomes this equivalent left join:

(T2, ...) LEFT JOIN (T1, ...) ON P(T1, ..., T2, ...)

All inner join expressions of the form T1 INNER JOIN T2 ON P(T1,T2) are replaced by the list T1,T2, P(T1,T2) being joined as a conjunct to the WHERE condition (or to the join condition of the embedding join, if there is any).

When the optimizer evaluates plans for outer join operations, it takes into consideration only plans where, for each such operation, the outer tables are accessed before the inner tables. The optimizer choices are limited because only such plans enable outer joins to be executed using the nested-loop algorithm.

Consider a query of this form, where R(T2) greatly narrows the number of matching rows from table T2:

SELECT * T1 FROM T1
  LEFT JOIN T2 ON P1(T1,T2)
  WHERE P(T1,T2) AND R(T2)

If the query is executed as written, the optimizer has no choice but to access the less-restricted table T1 before the more-restricted table T2, which may produce a very inefficient execution plan.

Instead, MySQL converts the query to a query with no outer join operation if the WHERE condition is null-rejected. (That is, it converts the outer join to an inner join.) A condition is said to be null-rejected for an outer join operation if it evaluates to FALSE or UNKNOWN for any NULL-complemented row generated for the operation.

Thus, for this outer join:

T1 LEFT JOIN T2 ON T1.A=T2.A

Conditions such as these are null-rejected because they cannot be true for any NULL-complemented row (with T2 columns set to NULL):

T2.B IS NOT NULL
T2.B > 3
T2.C <= T1.C
T2.B < 2 OR T2.C > 1

Conditions such as these are not null-rejected because they might be true for a NULL-complemented row:

T2.B IS NULL
T1.B < 3 OR T2.B IS NOT NULL
T1.B < 3 OR T2.B > 3

The general rules for checking whether a condition is null-rejected for an outer join operation are simple:

  • It is of the form A IS NOT NULL, where A is an attribute of any of the inner tables

  • It is a predicate containing a reference to an inner table that evaluates to UNKNOWN when one of its arguments is NULL

  • It is a conjunction containing a null-rejected condition as a conjunct

  • It is a disjunction of null-rejected conditions

A condition can be null-rejected for one outer join operation in a query and not null-rejected for another. In this query, the WHERE condition is null-rejected for the second outer join operation but is not null-rejected for the first one:

SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
                 LEFT JOIN T3 ON T3.B=T1.B
  WHERE T3.C > 0

If the WHERE condition is null-rejected for an outer join operation in a query, the outer join operation is replaced by an inner join operation.

For example, in the preceding query, the second outer join is null-rejected and can be replaced by an inner join:

SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
                 INNER JOIN T3 ON T3.B=T1.B
  WHERE T3.C > 0

For the original query, the optimizer evaluates only plans compatible with the single table-access order T1,T2,T3. For the rewritten query, it additionally considers the access order T3,T1,T2.

A conversion of one outer join operation may trigger a conversion of another. Thus, the query:

SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
                 LEFT JOIN T3 ON T3.B=T2.B
  WHERE T3.C > 0

Is first converted to the query:

SELECT * FROM T1 LEFT JOIN T2 ON T2.A=T1.A
                 INNER JOIN T3 ON T3.B=T2.B
  WHERE T3.C > 0

Which is equivalent to the query:

SELECT * FROM (T1 LEFT JOIN T2 ON T2.A=T1.A), T3
  WHERE T3.C > 0 AND T3.B=T2.B

The remaining outer join operation can also be replaced by an inner join because the condition T3.B=T2.B is null-rejected. This results in a query with no outer joins at all:

SELECT * FROM (T1 INNER JOIN T2 ON T2.A=T1.A), T3
  WHERE T3.C > 0 AND T3.B=T2.B

Sometimes the optimizer succeeds in replacing an embedded outer join operation, but cannot convert the embedding outer join. The following query:

SELECT * FROM T1 LEFT JOIN
              (T2 LEFT JOIN T3 ON T3.B=T2.B)
              ON T2.A=T1.A
  WHERE T3.C > 0

Is converted to:

SELECT * FROM T1 LEFT JOIN
              (T2 INNER JOIN T3 ON T3.B=T2.B)
              ON T2.A=T1.A
  WHERE T3.C > 0

That can be rewritten only to the form still containing the embedding outer join operation:

SELECT * FROM T1 LEFT JOIN
              (T2,T3)
              ON (T2.A=T1.A AND T3.B=T2.B)
  WHERE T3.C > 0

Any attempt to convert an embedded outer join operation in a query must take into account the join condition for the embedding outer join together with the WHERE condition. In this query, the WHERE condition is not null-rejected for the embedded outer join, but the join condition of the embedding outer join T2.A=T1.A AND T3.C=T1.C is null-rejected:

SELECT * FROM T1 LEFT JOIN
              (T2 LEFT JOIN T3 ON T3.B=T2.B)
              ON T2.A=T1.A AND T3.C=T1.C
  WHERE T3.D > 0 OR T1.D > 0

Consequently, the query can be converted to:

SELECT * FROM T1 LEFT JOIN
              (T2, T3)
              ON T2.A=T1.A AND T3.C=T1.C AND T3.B=T2.B
  WHERE T3.D > 0 OR T1.D > 0