WL#3634: Recursive WITH (Common Table Expression)

Affects: Server-8.0 — Status: Complete

Description
Requirements
Dependent Tasks
High Level Architecture
Low Level Design

WHAT?
=====

Support recursive CTE. Example:

Numbers from 1 to 10:

with recursive qn as
 (select 1 as a union distinct select 1+a from qn where a<10)
select * from qn;
prints 1 to 10.

Fibonacci numbers:

with recursive qn as
 (select 1 as n, 1 as un, 1 as unp1 union all
  select 1+n, unp1, un+unp1 from qn where n<10)
select * from qn;
n	un	unp1
1	1	1
2	1	2
3	2	3
4	3	5
5	5	8
6	8	13
7	13	21
8	21	34
9	34	55
10	55	89 # 89 is the 10th Fibonacci number

Hierarchy traversal like parts explosions or ancestors-descendants; here's a
company's org chart:

CREATE TABLE EMPLOYEES (
ID INT PRIMARY KEY,
NAME VARCHAR(100),
MANAGER_ID INT,
INDEX (MANAGER_ID),
FOREIGN KEY (MANAGER_ID) REFERENCES EMPLOYEES(ID)
);
INSERT INTO EMPLOYEES VALUES
(333, "Yasmina", NULL),  # Yasmina is the CEO (as her manager_id is NULL)
(198, "John", 333),      # John is 198 and reports to 333 (=Yasmina)
(692, "Tarek", 333),
(29, "Pedro", 198),
(4610, "Sarah", 29),
(72, "Pierre", 29),
(123, "Adil", 692);
# Give me the org chart with the management chain for every employee (= the path
from CEO to employee):

WITH RECURSIVE EMPLOYEES_EXTENDED(ID, NAME, PATH)
AS
(
SELECT ID, NAME, CAST(ID AS CHAR(200))
FROM EMPLOYEES
WHERE MANAGER_ID IS NULL
UNION ALL
SELECT S.ID, S.NAME, CONCAT(M.PATH, ",", S.ID)
FROM EMPLOYEES_EXTENDED M JOIN EMPLOYEES S ON M.ID=S.MANAGER_ID
)
SELECT * FROM EMPLOYEES_EXTENDED ORDER BY PATH;
ID	NAME	PATH
333	Yasmina	333
198	John	333,198
29	Pedro	333,198,29
4610	Sarah	333,198,29,4610 # Sarah->Pedro->John->Yasmina
72	Pierre	333,198,29,72
692	Tarek	333,692
123	Adil	333,692,123

Except by creating a large stored procedure, it is not possible to achieve this
result without this feature.
It is similar in idea to Oracle's CONNECT BY.

DO OTHERS HAVE IT?
==================

yes: standard SQL99, DB2, SQL Server, Oracle, SQLite, PostgreSQL.

MISC
====

Slides 40-59 of
http://fr.slideshare.net/MarkusWinand/modern-sql

In SQL2011 it is optional feature T131/132 described in paragraph .

It is possible to emulate this with a generic SP:
http://guilhembichot.blogspot.co.uk/2013/11/with-recursive-and-mysql.html
but this SP has to create persistent tables, requiring privileges, requiring
proper cleanup, etc.

F-1 support the WITH RECURSIVE syntax

F-2 Recursive query blocks should have a "recursive" mark in EXPLAIN

F-3 Same reference practice as for non-recursive CTEs, with the only addition
that a recursive CTE is allowed to reference itself. For example, mutually
recursive CTEs are not supported (t1 referencing t2 referencing t1).

WL#883: Non-recursive WITH clause (Common Table Expression)

See WL#883 for necessary background.

Syntax
======

Coverage in this description
will only be about the necessary addition for WITH
that enables recursion. This is for a minimal
requirement -- recursive queries can be very
complex.

A WITH clause is, as in WL#883, a list of CTEs.
One or more of those CTEs may reference itself in its subquery: it's a
"recursive CTE". A WITH clause may define a mix of non-recursive
CTEs and recursive CTEs; if it has recursive ones it must have the
RECURSIVE keyword.

The elementary description, with one single (recursive) CTE, is:

WITH RECURSIVE cte AS
(SELECT ... FROM table_name              /* "seed" SELECT */
 UNION [ALL | DISTINCT]
 SELECT ... FROM cte,table_name)  /* "recursive" SELECT */
SELECT ... FROM cte;

The parenthesized "(SELECT ... UNION ALL SELECT ...)"
has two query blocks. The first is the "non-recursive" (also known as "anchor"
or "seed" in some other products), which is
what one "starts with". The second is the "recursive" one,
which explains how the DBMS navigates from the seed's result. The "recursive"
one references the CTE. The "non-recursive" doesn't.
Execution does like this:
- non-recursive query block is evaluated, result goes into an internal tmp table
- if no rows, exit
- (A): recursive query block is evaluated over the tmp table's lastly inserted
rows, and it produces new rows which are appended to the tmp table (if UNION
ALL; only distinct not-already-there rows if UNION DISTINCT)
- if the last step didn't produce new rows, exit
- goto (A)

This algorithm is good to traverse a hierarchy: the non-recursive query block
returns the top parent,
recursive query block's first run returns children of top parent, second run
returns children of those children, etc, until no more offspring.

The following constraints are from the SQL standard, or exist in important
DBMSs; to simplify our implementation some constraints are added and
marked with star-ed notes:
- the CTE's subquery is a chain of unioned query blocks, which can be split in
two parts: a head which contains no reference to the CTE, and a tail
which has references to the CTE. The head's query blocks are named 
"non-recursive query blocks"; the tail's query blocks are named "recursive query
blocks".
- It is forbidden to have RQBi UNION DISTINCT RQBj UNION ALL RQBk where RQBx are
consecutive recursive query blocks (*)
- If there is more than one recursive query block, in one
iteration they all operate on the data produced by the previous iteration, and
their results are concatenated.
- a recursive query block cannot have aggregate functions, GROUP BY
- a recursive query block cannot have ORDER BY, LIMIT, SELECT DISTINCT (**)
- the query expression of the CTE cannot have a global ORDER BY, LIMIT (***)
- a recursive query block must reference the CTE only once, this reference
must be in the FROM clause, not in any subquery or derived table. For a
recursive query block, the execution plan should access the CTE first, so the
reference to the CTE should neither be on the right (left) side of a LEFT
(RIGHT) JOIN, nor forced by join hints (STRAIGHT_JOIN) to be non-first.
- types of the CTE's result columns are inferred from the types of the
union of non-recursive query blocks only (recursive query blocks are ignored),
and they are all NULLable.
- Notes:
(*) Note that Oracle/DB2/MSSQL don't allow UNION DISTINCT between recursive
query blocks, which is a stronger limitation.
(**) I could neither see the meaning of those in a recursive query
block, nor find examples on the web.
(***) In examples on the web, the ordering is usually done in the outer query.
LIMITing in the subquery could have the intention to stop the recursion early,
but it seems to be instead done by having an increasing "depth level" column in
the query. If needed, a workaround is to wrap in a second CTE:
WITH RECURSIVE qn AS
  (SELECT nonrec1 UNION ALL SELECT rec ORDER BY x LIMIT y)
SELECT * FROM qn;
->
WITH RECURSIVE qn_no_order AS
  (SELECT nonrec1 UNION ALL SELECT rec),
               qn AS
  (SELECT * FROM qn_no_order ORDER BY x LIMIT y)
SELECT * FROM qn;

More Syntax
===========

Optional column name list: in the standard it's mandatory for WITH RECURSIVE
(not mandatory for non-recursive WITH). We
won't require it, as it seems reasonable if the user instead uses AS alias
clauses in his first non-recursive query block:
with recursive qn(a) as
 (select 1 union all select 1+a from qn where a<10)
select * from qn;
can be written as
with recursive qn as
 (select 1 as a union all select 1+a from qn where a<10)
select * from qn;
In the list of column names, there must be no duplication.

Materialization
===============

As any derived table containing UNION, a recursive CTE is always materialized.
Even if referenced multiple times, it's materialized once.

Unsupported queries
===================

First see the starred notes in "Syntax" section.

In SQL2011 mutual recursion between N CTEs is possible. It is complex.
Oracle and PG don't allow it, we won't. We haven't found any example of someone
using this, on the Web.
SQL2011 allows a CTE to be dependent on a CTE coming after it in the WITH
RECURSIVE clause:
with cte1 as (select * from cte2), cte2 as (...)...
we don't allow this. cte2 is allowed to depend on cte1 (we thus apply the
same rule as for non-recursive WITH).

: in SQL2011  can be added at
the end of the WITH clause. It allows
1) to order the CTE's result rows depth-first or breadth-first in the
query expression having the WITH clause
2) to mark starts of cycles (to stop infinite recursion).
Oracle has this clause.
PG doesn't; they give tips in the docs how to emulate that in the query,
http://www.postgresql.org/docs/current/static/queries-with.html ;
we will do like them.
SQL Server only has a MAXRECURSION hint (a number):
https://technet.microsoft.com/en-us/library/ms175972%28v=sql.105%29.aspx
which is 100 by default. We will not have it: if a query takes too long, the
existing max_execution_time will kill it, or a manual KILL QUERY will. It's also
possible for the user to stop recursion by having a counter in the recursive
query. It's for these reasons that PG devs have refused such hint.

Automatic cycle detection: Oracle automatically detects cycles: it throws an
error when it sees a row which is the same as a row already present in the
result; it does not detect infinite-loop queries not due to a cycle, like those
generating an infinite sequence of numbers). PG doesn't detect anything.
We won't detect anything.

Recursive VIEW: In SQL99 A view can be created with CREATE RECURSIVE VIEW:
create recursive view v as select 1 as a union select a+1 from v where a<10;
is equivalent to
create view v as
  with recursive qn (select 1 as a union select a+1 from qn where a<10)
  select * from qn;
RECURSIVE view is not merge-able in the outer query and cannot have CHECK OPTION
(See 11.32 ).
We will not support it; workaround: use the equivalent CREATE VIEW above; it
will also be non-mergeable (because based on UNION), and even though it will
support CHECK OPTION it will not be insertable/updatable (due to UNION again) so
CHECK OPTION will in fact be ineffective.

Requirements on storage engine used for materializing the CTE
=============================================================

These are in addition to those of WL#883.
The table may have indexes.
During materialization of the recursive CTE, the readers do table scan only
(rnd_init(), then rnd_next() or rnd_pos()+rnd_next()). The writer does:
- write_row()
- no update
- no delete
- index_read_map()/index_next_same() (only if UNION DISTINCT with hash_field).
It is critical that a reader cursor:
(a) is not disturbed by the other readers' rnd_init/rnd_next()
(b) is not disturbed by the writer's write_row() or the writer's index lookups.
(c) has visibility of rows which the writer has previously written: reader must
be able to catch up continuously, and must receive rows in the order
they were inserted (otherwise it will hopelessly miss rows
or process rows twice). As an exception, when the reader reports
EOF, it is not required to see later-written rows.
(d) is not disturbed by a conversion from too-large in-memory to on-disk
(create_ondisk_from_heap() - thus the on-disk engine must offer a way to
re-position the reader's scan on the on-disk row matching the in-mem row where
the reader was before the conversion). To help this, the Server is able to tell
the number of the in-mem row in insertion order.
(e) is able to start a scan both from the first row, and from a certain row
indicated with rnd_pos(), then continue with rnd_next().
(f) An implication of (e) is that positions reported by position() must be
stable: even if a write happens later, the position of the row mustn't be changed.

Modifications to MEMORY and Innodb storage engines
==================================================

As the CTE is materialized either in MEMORY or in InnoDB "intrinsic tables", for
those two engines to support the requirements above, small code changes are
done in the engines. When later MEMORY is replaced by InnoDB "in-memory tables"
some changes to the latter may be needed. MEMORY doesn't support (e) of the
previous paragraph if a hash_field is used (which happens in practice when there
is UNION DISTINCT and the concatenation of columns is >500 bytes); we then use
InnoDB.

Modifications to the optimizer trace
====================================

The existing flag 'repeated_subselect' in @@optimizer_trace_features is already
used to disable tracing of the non-first executions of a subquery; we also make
it disable tracing of the non-first executions of a recursive query block.

CONNECT BY
==========

Equivalence of WITH with Oracle's non-standard CONNECT BY:
http://rajeshwaranbtech.blogspot.co.uk/2012/12/recursive-with-clause-to-implement.html

References
==========

http://docs.oracle.com/database/121/SQLRF/statements_10002.htm#i2077142
https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0059217.html
https://msdn.microsoft.com/en-us/library/ms175972.aspx
https://wiki.postgresql.org/wiki/CTEReadme
https://books.google.co.in/books?id=wyhXvU0Eyg0C&pg=PA352&hl=en#v=onepage&q&f=false
sums up SQL99 quite well.

see LLD of WL#883.