This section contains information about JSON functions that
convert JSON data to tabular data. MySQL 8.4 supports
one such function, JSON_TABLE()
.
JSON_TABLE(
expr
,
path
COLUMNS
(column_list
) [AS]
alias
)
Extracts data from a JSON document and returns it as a relational table having the specified columns. The complete syntax for this function is shown here:
JSON_TABLE(
expr,
path COLUMNS (column_list)
) [AS] alias
column_list:
column[, column][, ...]
column:
name FOR ORDINALITY
| name type PATH string path [on_empty] [on_error]
| name type EXISTS PATH string path
| NESTED [PATH] path COLUMNS (column_list)
on_empty:
{NULL | DEFAULT json_string | ERROR} ON EMPTY
on_error:
{NULL | DEFAULT json_string | ERROR} ON ERROR
expr
: This is an expression that
returns JSON data. This can be a constant
('{"a":1}'
), a column
(t1.json_data
, given table
t1
specified prior to
JSON_TABLE()
in the FROM
clause), or a function call
(JSON_EXTRACT(t1.json_data,'$.post.comments')
).
path
: A JSON path expression, which is
applied to the data source. We refer to the JSON value matching
the path as the row source; this is used to
generate a row of relational data. The COLUMNS
clause evaluates the row source, finds specific JSON values within
the row source, and returns those JSON values as SQL values in
individual columns of a row of relational data.
The alias
is required. The usual rules
for table aliases apply (see Section 11.2, “Schema Object Names”).
This function compares column names in case-insensitive fashion.
JSON_TABLE()
supports four types of columns,
described in the following list:
: This type enumerates rows in thename
FOR ORDINALITYCOLUMNS
clause; the column namedname
is a counter whose type isUNSIGNED INT
, and whose initial value is 1. This is equivalent to specifying a column asAUTO_INCREMENT
in aCREATE TABLE
statement, and can be used to distinguish parent rows with the same value for multiple rows generated by aNESTED [PATH]
clause.
: Columns of this type are used to extract values specified byname
type
PATHstring_path
[on_empty
] [on_error
]string_path
.type
is a MySQL scalar data type (that is, it cannot be an object or array).JSON_TABLE()
extracts data as JSON then coerces it to the column type, using the regular automatic type conversion applying to JSON data in MySQL. A missing value triggers theon_empty
clause. Saving an object or array triggers the optionalon error
clause; this also occurs when an error takes place during coercion from the value saved as JSON to the table column, such as trying to save the string'asd'
to an integer column.
: This column returns 1 if any data is present at the location specified byname
type
EXISTS PATHpath
path
, and 0 otherwise.type
can be any valid MySQL data type, but should normally be specified as some variety ofINT
.NESTED [PATH]
: This flattens nested objects or arrays in JSON data into a single row along with the JSON values from the parent object or array. Using multiplepath
COLUMNS (column_list
)PATH
options allows projection of JSON values from multiple levels of nesting into a single row.The
path
is relative to the parent path row path ofJSON_TABLE()
, or the path of the parentNESTED [PATH]
clause in the event of nested paths.
on empty
, if specified, determines what
JSON_TABLE()
does in the event that data is
missing (depending on type). This clause is also triggered on a
column in a NESTED PATH
clause when the latter
has no match and a NULL
complemented row is
produced for it. on empty
takes one of
the following values:
NULL ON EMPTY
: The column is set toNULL
; this is the default behavior.DEFAULT
: the providedjson_string
ON EMPTYjson_string
is parsed as JSON, as long as it is valid, and stored instead of the missing value. Column type rules also apply to the default value.ERROR ON EMPTY
: An error is thrown.
If used, on_error
takes one of the
following values with the corresponding result as shown here:
NULL ON ERROR
: The column is set toNULL
; this is the default behavior.DEFAULT
: Thejson string
ON ERRORjson_string
is parsed as JSON (provided that it is valid) and stored instead of the object or array.ERROR ON ERROR
: An error is thrown.
Specifying ON ERROR
before ON
EMPTY
is nonstandard and deprecated in MySQL; trying to
do so causes the server to issue a warning. Expect support for the
nonstandard syntax to be removed in a future version of MySQL.
When a value saved to a column is truncated, such as saving
3.14159 in a DECIMAL(10,1)
column,
a warning is issued independently of any ON
ERROR
option. When multiple values are truncated in a
single statement, the warning is issued only once.
When the expression and path passed to this function resolve to
JSON null, JSON_TABLE()
returns SQL
NULL
, in accordance with the SQL standard, as
shown here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[ {"c1": null} ]',
-> '$[*]' COLUMNS( c1 INT PATH '$.c1' ERROR ON ERROR )
-> ) as jt;
+------+
| c1 |
+------+
| NULL |
+------+
1 row in set (0.00 sec)
The following query demonstrates the use of ON
EMPTY
and ON ERROR
. The row
corresponding to {"b":1}
is empty for the path
"$.a"
, and attempting to save
[1,2]
as a scalar produces an error; these rows
are highlighted in the output shown.
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[{"a":"3"},{"a":2},{"b":1},{"a":0},{"a":[1,2]}]',
-> "$[*]"
-> COLUMNS(
-> rowid FOR ORDINALITY,
-> ac VARCHAR(100) PATH "$.a" DEFAULT '111' ON EMPTY DEFAULT '999' ON ERROR,
-> aj JSON PATH "$.a" DEFAULT '{"x": 333}' ON EMPTY,
-> bx INT EXISTS PATH "$.b"
-> )
-> ) AS tt;
+-------+------+------------+------+
| rowid | ac | aj | bx |
+-------+------+------------+------+
| 1 | 3 | "3" | 0 |
| 2 | 2 | 2 | 0 |
| 3 | 111 | {"x": 333} | 1 |
| 4 | 0 | 0 | 0 |
| 5 | 999 | [1, 2] | 0 |
+-------+------+------------+------+
5 rows in set (0.00 sec)
Column names are subject to the usual rules and limitations governing table column names. See Section 11.2, “Schema Object Names”.
All JSON and JSON path expressions are checked for validity; an invalid expression of either type causes an error.
Each match for the path
preceding the
COLUMNS
keyword maps to an individual row in
the result table. For example, the following query gives the
result shown here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
-> "$[*]" COLUMNS(
-> xval VARCHAR(100) PATH "$.x",
-> yval VARCHAR(100) PATH "$.y"
-> )
-> ) AS jt1;
+------+------+
| xval | yval |
+------+------+
| 2 | 8 |
| 3 | 7 |
| 4 | 6 |
+------+------+
The expression "$[*]"
matches each element of
the array. You can filter the rows in the result by modifying the
path. For example, using "$[1]"
limits
extraction to the second element of the JSON array used as the
source, as shown here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[{"x":2,"y":"8"},{"x":"3","y":"7"},{"x":"4","y":6}]',
-> "$[1]" COLUMNS(
-> xval VARCHAR(100) PATH "$.x",
-> yval VARCHAR(100) PATH "$.y"
-> )
-> ) AS jt1;
+------+------+
| xval | yval |
+------+------+
| 3 | 7 |
+------+------+
Within a column definition, "$"
passes the
entire match to the column; "$.x"
and
"$.y"
pass only the values corresponding to the
keys x
and y
, respectively,
within that match. For more information, see
JSON Path Syntax.
NESTED PATH
(or simply
NESTED
; PATH
is optional)
produces a set of records for each match in the
COLUMNS
clause to which it belongs. If there is
no match, all columns of the nested path are set to
NULL
. This implements an outer join between the
topmost clause and NESTED [PATH]
. An inner join
can be emulated by applying a suitable condition in the
WHERE
clause, as shown here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[ {"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}, {"a":3}]',
-> '$[*]' COLUMNS(
-> a INT PATH '$.a',
-> NESTED PATH '$.b[*]' COLUMNS (b INT PATH '$')
-> )
-> ) AS jt
-> WHERE b IS NOT NULL;
+------+------+
| a | b |
+------+------+
| 1 | 11 |
| 1 | 111 |
| 2 | 22 |
| 2 | 222 |
+------+------+
Sibling nested paths—that is, two or more instances of
NESTED [PATH]
in the same
COLUMNS
clause—are processed one after
another, one at a time. While one nested path is producing
records, columns of any sibling nested path expressions are set to
NULL
. This means that the total number of
records for a single match within a single containing
COLUMNS
clause is the sum and not the product
of all records produced by NESTED [PATH]
modifiers, as shown here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[{"a": 1, "b": [11,111]}, {"a": 2, "b": [22,222]}]',
-> '$[*]' COLUMNS(
-> a INT PATH '$.a',
-> NESTED PATH '$.b[*]' COLUMNS (b1 INT PATH '$'),
-> NESTED PATH '$.b[*]' COLUMNS (b2 INT PATH '$')
-> )
-> ) AS jt;
+------+------+------+
| a | b1 | b2 |
+------+------+------+
| 1 | 11 | NULL |
| 1 | 111 | NULL |
| 1 | NULL | 11 |
| 1 | NULL | 111 |
| 2 | 22 | NULL |
| 2 | 222 | NULL |
| 2 | NULL | 22 |
| 2 | NULL | 222 |
+------+------+------+
A FOR ORDINALITY
column enumerates records
produced by the COLUMNS
clause, and can be used
to distinguish parent records of a nested path, especially if
values in parent records are the same, as can be seen here:
mysql> SELECT *
-> FROM
-> JSON_TABLE(
-> '[{"a": "a_val",
'> "b": [{"c": "c_val", "l": [1,2]}]},
'> {"a": "a_val",
'> "b": [{"c": "c_val","l": [11]}, {"c": "c_val", "l": [22]}]}]',
-> '$[*]' COLUMNS(
-> top_ord FOR ORDINALITY,
-> apath VARCHAR(10) PATH '$.a',
-> NESTED PATH '$.b[*]' COLUMNS (
-> bpath VARCHAR(10) PATH '$.c',
-> ord FOR ORDINALITY,
-> NESTED PATH '$.l[*]' COLUMNS (lpath varchar(10) PATH '$')
-> )
-> )
-> ) as jt;
+---------+---------+---------+------+-------+
| top_ord | apath | bpath | ord | lpath |
+---------+---------+---------+------+-------+
| 1 | a_val | c_val | 1 | 1 |
| 1 | a_val | c_val | 1 | 2 |
| 2 | a_val | c_val | 1 | 11 |
| 2 | a_val | c_val | 2 | 22 |
+---------+---------+---------+------+-------+
The source document contains an array of two elements; each of
these elements produces two rows. The values of
apath
and bpath
are the same
over the entire result set; this means that they cannot be used to
determine whether lpath
values came from the
same or different parents. The value of the ord
column remains the same as the set of records having
top_ord
equal to 1, so these two values are
from a single object. The remaining two values are from different
objects, since they have different values in the
ord
column.
Normally, you cannot join a derived table which depends on columns
of preceding tables in the same FROM
clause.
MySQL, per the SQL standard, makes an exception for table
functions; these are considered lateral derived tables. This is
implicit, and for this reason is not allowed before
JSON_TABLE()
, also according to the standard.
Suppose you have a table t1
created and
populated using the statements shown here:
CREATE TABLE t1 (c1 INT, c2 CHAR(1), c3 JSON);
INSERT INTO t1 () VALUES
ROW(1, 'z', JSON_OBJECT('a', 23, 'b', 27, 'c', 1)),
ROW(1, 'y', JSON_OBJECT('a', 44, 'b', 22, 'c', 11)),
ROW(2, 'x', JSON_OBJECT('b', 1, 'c', 15)),
ROW(3, 'w', JSON_OBJECT('a', 5, 'b', 6, 'c', 7)),
ROW(5, 'v', JSON_OBJECT('a', 123, 'c', 1111))
;
You can then execute joins, such as this one, in which
JSON_TABLE()
acts as a derived table while at
the same time it refers to a column in a previously referenced
table:
SELECT c1, c2, JSON_EXTRACT(c3, '$.*')
FROM t1 AS m
JOIN
JSON_TABLE(
m.c3,
'$.*'
COLUMNS(
at VARCHAR(10) PATH '$.a' DEFAULT '1' ON EMPTY,
bt VARCHAR(10) PATH '$.b' DEFAULT '2' ON EMPTY,
ct VARCHAR(10) PATH '$.c' DEFAULT '3' ON EMPTY
)
) AS tt
ON m.c1 > tt.at;
Attempting to use the LATERAL
keyword with this
query raises ER_PARSE_ERROR
.