The MySQL 5.7.7 JSON Lab release introduces a native JSON datatype. In part 1 of this blog post series, Rick Hillegas introduced the new functions for creating and manipulating JSON documents using the new native JSON data type. In this blog post we will be using some of the same sample tables and JSON documents as in part 1, so it will be helpful to read that blog post now, if you haven’t already.
Here we look closer at the functions provided to search through and peer into JSON documents to find values inside of them, and we’ll also cover a few utility functions as we go along. The functions described here are also summarized at the end of this blog for ease of reference.
Please note that the exact set of functions and their semantics may evolve before the final 5.7 GA release, so your feedback will be very valuable as we settle upon the best set of functions!
At the end of part 1, we have a set of thermostat readings inside our JSON column within the ‘THERMOSTAT_READING’ table. Here are the contents of that table where we left off in part 1:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select * from thermostat_reading; +-----------------------------------------------------------------------------------------------------+-----------+ | reading | device_id | +-----------------------------------------------------------------------------------------------------+-----------+ | {"spread": 6.95, "setting": 69, "unixtime": 1428462061, "device_id": 3001, "current_temp": 62.05} | 3001 | | {"spread": -6.25, "setting": 64, "unixtime": 1428483661, "device_id": 3001, "current_temp": 70.25} | 3001 | | {"spread": 6.95, "setting": 68, "unixtime": 1428462061, "device_id": 3002, "current_temp": 61.05} | 3002 | | {"spread": -9.25, "setting": 62, "unixtime": 1428483661, "device_id": 3002, "current_temp": 71.25} | 3002 | | {"spread": 6.95, "setting": 69, "unixtime": 1428548461, "device_id": 3001, "current_temp": 62.05} | 3001 | | {"spread": -11.25, "setting": 64, "unixtime": 1428570061, "device_id": 3001, "current_temp": 75.25} | 3001 | | {"spread": 6.95, "setting": 68, "unixtime": 1428548461, "device_id": 3002, "current_temp": 61.05} | 3002 | | {"spread": -14.25, "setting": 62, "unixtime": 1428570061, "device_id": 3002, "current_temp": 76.25} | 3002 | | {"spread": 3.95, "setting": 69, "unixtime": 1428634861, "device_id": 3001, "current_temp": 65.05} | 3001 | | {"spread": -12.25, "setting": 64, "unixtime": 1428656461, "device_id": 3001, "current_temp": 76.25} | 3001 | | {"spread": 1.95, "setting": 68, "unixtime": 1428634861, "device_id": 3002, "current_temp": 66.05} | 3002 | | {"spread": -15.25, "setting": 62, "unixtime": 1428656461, "device_id": 3002, "current_temp": 77.25} | 3002 | +-----------------------------------------------------------------------------------------------------+-----------+ 12 rows in set (0.00 sec) |
We can inquire what the length of the documents are, that is, how many elements they have, using the JSN_LENGTH()
function:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_length(reading) from thermostat_reading; +---------------------+ | jsn_length(reading) | +---------------------+ | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | | 5 | +---------------------+ 12 rows in set (0.00 sec) |
There is also a companion function called JSN_DEPTH()
that can be used to inquire about the maximum depth in a document:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_depth(reading) from thermostat_reading; +--------------------+ | jsn_depth(reading) | +--------------------+ | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | | 2 | +--------------------+ 12 rows in set (0.00 sec) |
In this case it reports 2 since the JSON document consists of a JSON object with only scalar values, that is has no nested objects. The depth value of 1 is reserved for a JSON scalar value.
Now, in part 1 we modified the set of reading data, by adding and removing keys. We might also want to verify or check the set of keys in the readings we have by using the the JSN_KEYS()
function. This will return rows of JSON arrays containing the keys of our thermostat reading objects:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_keys(reading) from thermostat_reading; +----------------------------------------------------------------+ | jsn_keys(reading) | +----------------------------------------------------------------+ | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | | ["spread", "setting", "unixtime", "device_id", "current_temp"] | +----------------------------------------------------------------+ 12 rows in set (0.01 sec) |
Now let’s look at another interesting function, JSN_CONTAINS_PATH()
. This can be used to determine if a specific element is present in a document, for example the ‘spread’ element:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_contains_path(reading, 'all', '$.spread') from thermostat_reading; +-----------------------------------------------+ | jsn_contains_path(reading, 'all', '$.spread') | +-----------------------------------------------+ | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | +-----------------------------------------------+ 12 rows in set (0.00 sec) |
So what is the meaning of the second parameter used there: ‘all’? While not useful in this particular case, the JSN_CONTAINS_PATH()
function can inquire about the presence of several elements at once, e.g. both the ‘spread’ and the the ‘on’ elements:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_contains_path(reading, 'all', '$.spread', '$.on') from thermostat_reading; +-------------------------------------------------------+ | jsn_contains_path(reading, 'all', '$.spread', '$.on') | +-------------------------------------------------------+ | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | | 0 | +-------------------------------------------------------+ 12 rows in set (0.00 sec) |
This time it returns 0 since we had previously removed the ‘on’ element in part 1. Passing in a value of ‘one’ instead of ‘all’ would be asking if just one of the two (OR instead of AND) elements were present. Thus would give the result of 1 in our case since they have the ‘spread’ element:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_contains_path(reading, 'one', '$.spread', '$.on') from thermostat_reading; +-------------------------------------------------------+ | jsn_contains_path(reading, 'one', '$.spread', '$.on') | +-------------------------------------------------------+ | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | +-------------------------------------------------------+ 12 rows in set (0.00 sec) |
Now, we might want to inquire whether a specific element and value are present instead. For this we can use the JSN_CONTAINS()
function. Let’s first construct a JSON object containing the element we are interested in using the JSN_OBJECT()
constructor function in that was introduced in part 1:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_contains(reading, jsn_object('spread', 6.95)) from thermostat_reading; +---------------------------------------------------+ | jsn_contains(reading, jsn_object('spread', 6.95)) | +---------------------------------------------------+ | 1 | | 0 | | 1 | | 0 | | 1 | | 0 | | 1 | | 0 | | 0 | | 0 | | 0 | | 0 | +---------------------------------------------------+ 12 rows in set (0.00 sec) |
In this case, we could have achieved this by simply using JSN_EXTRACT()
and selecting the field value from our table instead:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
mysql> select jsn_extract(reading, '$.spread') = 6.95 from thermostat_reading; +-----------------------------------------+ | jsn_extract(reading, '$.spread') = 6.95 | +-----------------------------------------+ | 1 | | 0 | | 1 | | 0 | | 1 | | 0 | | 1 | | 0 | | 0 | | 0 | | 0 | | 0 | +-----------------------------------------+ 12 rows in set (0.00 sec) |
But … JSN_CONTAINS()
can also handle arrays and nested objects! So let’s say that we want to check if the model has the two capabilities: “fan” and “ac”. We can check that this way:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
mysql> select * from thermostat_model; +----------+-------------------------------------------------------------------------------------------------------------+ | model_id | capabilities | +----------+-------------------------------------------------------------------------------------------------------------+ | abc123 | ["fan", "furnace"] | | xyzzy | ["programmable", "fan", "ac", "furnace", "smart_fan", "https", "password_protected", "intrusion_detection"] | +----------+-------------------------------------------------------------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> select jsn_contains(capabilities, jsn_array('ac', 'fan')) from thermostat_model; +----------------------------------------------------+ | jsn_contains(capabilities, jsn_array('ac', 'fan')) | +----------------------------------------------------+ | 0 | | 1 | +----------------------------------------------------+ 2 rows in set (0.00 sec) |
Note that the order of the capabilities is not significant. If one of the thermostat model’s features has a complex description, e.g. say the fan capability was described by a tuple:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
mysql> insert into thermostat_model values -> ( "Acme basic", '[{"rpm": 2000, "voltage": 110, "capability": "fan"}, "furnace"]' ), -> ( "Acme super", '["programmable", {"rpm": 3000, "voltage": 220, "capability": "fan"}, "ac", "furnace"]' ); Query OK, 2 rows affected (0.05 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> select * from thermostat_model; +------------+-------------------------------------------------------------------------------------------------------------+ | model_id | capabilities | +------------+-------------------------------------------------------------------------------------------------------------+ | abc123 | ["fan", "furnace"] | | Acme basic | [{"rpm": 2000, "voltage": 110, "capability": "fan"}, "furnace"] | | Acme super | ["programmable", {"rpm": 3000, "voltage": 220, "capability": "fan"}, "ac", "furnace"] | | xyzzy | ["programmable", "fan", "ac", "furnace", "smart_fan", "https", "password_protected", "intrusion_detection"] | +------------+-------------------------------------------------------------------------------------------------------------+ 4 rows in set (0.00 sec) |
We could then query for models that had features of “furnace” and “voltage”=110 in this way:
1
2
3
4
5
6
7
8
9
10
|
mysql> select jsn_contains(capabilities, jsn_array('furnace', jsn_object('voltage', 110))) from thermostat_model; +------------------------------------------------------------------------------+ | jsn_contains(capabilities, jsn_array('furnace', jsn_object('voltage', 110))) | +------------------------------------------------------------------------------+ | 0 | | 1 | | 0 | | 0 | +------------------------------------------------------------------------------+ 4 rows in set (0.00 sec) |
As you can see the query checks for the presence of a “slice” or fully contained part of the document. The contained object we searched for is a subset of both the outer array and the nested object value.
Another interesting function is the JSN_SEARCH()
function. This can be used to locate scalar values inside of a document. Currently we can search for JSON string values. If found, the function will return a JSON path to the found occurrence. We can get one or all occurrences of the string depending on what we ask for (‘one’ below).
For example, let’s say we want to know where, if present, the string “furnace” can be found in the thermostat_model table, we could issue this query:
1
2
3
4
5
6
7
8
9
10
|
mysql> select jsn_search(capabilities, 'one', 'furnace') from thermostat_model; +--------------------------------------------+ | jsn_search(capabilities, 'one', 'furnace') | +--------------------------------------------+ | "$[1]" | | "$[1]" | | "$[3]" | | "$[3]" | +--------------------------------------------+ 4 rows in set (0.00 sec) |
The returned values here are MySQL JSON paths into a top level array at index 1 and 3. The location of that capability differs between the two “Acme” models. The search argument can also contain SQL wild cards (default ‘%’) similar to the LIKE
operator.
JSON values are typed in JSON, and MySQL has a JSN_TYPE()
function to inquire about the type of a JSON value. This function returns a MySQL string. Let’s check some types of the thermostat_model data:
1
2
3
4
5
6
7
8
9
10
|
mysql> select jsn_type(capabilities), jsn_type(jsn_extract(capabilities, '$[0]')) from thermostat_model; +------------------------+---------------------------------------------+ | jsn_type(capabilities) | jsn_type(jsn_extract(capabilities, '$[0]')) | +------------------------+---------------------------------------------+ | ARRAY | STRING | | ARRAY | OBJECT | | ARRAY | STRING | | ARRAY | STRING | +------------------------+---------------------------------------------+ 4 rows in set (0.01 sec) |
This makes it possible to perform an action on JSON values depending on their type, using for example a CASE
test:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
mysql> select case( jsn_type( jsn_extract(capabilities, '$[0]'))) -> when "OBJECT" then jsn_extract( jsn_extract(capabilities, '$[0]'), '$.capability') -> else jsn_extract(capabilities, '$[0]') -> end as cap from thermostat_model; +----------------+ | cap | +----------------+ | "fan" | | "fan" | | "programmable" | | "programmable" | +----------------+ 4 rows in set (0.00 sec) |
The current set of JSON types in MySQL is a super set of the types in the ECMA standard for JSON. We currently support NULL, DECIMAL, INTEGER, UNSIGNED INTEGER, DOUBLE, STRING, OBJECT, ARRAY, BOOLEAN, DATE, TIME, DATETIME, TIMESTAMP
and OPAQUE
(raw bits).
When encountering some data, presumably a JSON document, we can check if the document is a valid JSON document by using the provided JSN_VALID()
function. For example, the following check yields 0 because the text uses an unquoted string value of “fan” as a value of the ‘capability’ element:
1
2
3
4
5
6
7
|
mysql> select jsn_valid('{"capability": fan, "rpm": 3000, "voltage": 220}'); +---------------------------------------------------------------+ | jsn_valid('{"capability": fan, "rpm": 3000, "voltage": 220}') | +---------------------------------------------------------------+ | 0 | +---------------------------------------------------------------+ 1 row in set (0.03 sec) |
That’s also something that would be evident if we tried to parse that is a JSON document:
1
2
|
mysql> select cast('{"capability": fan, "rpm": 3000, "voltage": 220}' as json); ERROR 3131 (22032): Invalid JSON text in argument 1 to function cast_as_json: "Invalid value" at position 17 in '{"capability": fan, "rpm": 3000, "voltage": 220}'. |
If we add text quotes around “fan”, however, then it will work as expected:
1
2
3
4
5
6
7
|
mysql> select jsn_valid('{"capability": "fan", "rpm": 3000, "voltage": 220}'); +-----------------------------------------------------------------+ | jsn_valid('{"capability": "fan", "rpm": 3000, "voltage": 220}') | +-----------------------------------------------------------------+ | 1 | +-----------------------------------------------------------------+ 1 row in set (0.00 sec) |
Finally, to wrap things up, let us look at two small but useful functions: JSN_QUOTE()
and JSN_UNQUOTE()
. These can provide the “glue” needed to handle the impedance mismatch between MySQL and JSON types.
When inserting data into a JSON column, we have seen that we can just embed the JSON text inside a MySQL string. As we just saw above, we can also explicitly CAST
text to a JSON type. Sometimes, however, we have a MySQL string that we do not want to parse as a JSON document, but we instead want to embed it as a JSON string “as is”. Let’s compare the difference between the two JSON values inserted below to demonstrate the point:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
mysql> create table t(doc1 json, doc2 json); Query OK, 0 rows affected (0.20 sec) mysql> insert into t values('[1,2,3]', jsn_quote('[1,2,3]')); Query OK, 1 row affected (0.03 sec) mysql> select doc1, jsn_type(doc1), doc2, jsn_type(doc2) from t; +-----------+----------------+-----------+----------------+ | doc1 | jsn_type(doc1) | doc2 | jsn_type(doc2) | +-----------+----------------+-----------+----------------+ | [1, 2, 3] | ARRAY | "[1,2,3]" | STRING | +-----------+----------------+-----------+----------------+ 1 row in set (0.00 sec) |
Notice the difference here: the first value is parsed as a JSON document with structure (an array), the second is inserted as a simple JSON string scalar. Notice that the function does more than wrap text quotes around the argument: it will ensure that the resulting MySQL string is acceptable to a JSON parser by performing any needed internal quoting. Also, an embedded text quote is not an issue in an SQL string since single quotes are used as a string delimiter in SQL.
Conversely, JSN_UNQUOTE()
will strip away surrounding quotes. This function can be used to turn a valid JSON string literal into an SQL string literal.
This is usually not necessary though, since MySQL tries to be helpful when doing comparisons:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
mysql> select jsn_extract('{"userName": "fred"}', '$.userName'); +---------------------------------------------------+ | jsn_extract('{"userName": "fred"}', '$.userName') | +---------------------------------------------------+ | "fred" | +---------------------------------------------------+ 1 row in set (0.00 sec) mysql> select jsn_extract('{"userName":"fred"}', '$.userName') = 'fred'; +-----------------------------------------------------------+ | jsn_extract('{"userName":"fred"}', '$.userName') = 'fred' | +-----------------------------------------------------------+ | 1 | +-----------------------------------------------------------+ 1 row in set (0.00 sec) |
Here the comparison happens in the JSON context, and the MySQL string “fred” is automatically converted to a JSON string before the comparison. We could though instead use the JSN_UNQUOTE()
function to perform the comparison in the MySQL string domain in this way:
1
2
3
4
5
6
7
|
mysql> select jsn_unquote(jsn_extract('{"userName": "fred" }', '$.userName')) = 'fred'; +--------------------------------------------------------------------------+ | jsn_unquote(jsn_extract('{"userName": "fred" }', '$.userName')) = 'fred' | +--------------------------------------------------------------------------+ | 1 | +--------------------------------------------------------------------------+ 1 row in set (0.00 sec) |
But the former is obviously easier and the latter should usually be unnecessary.
In summary, here were the functions covered:
- JSN_SEARCH()
- JSN_CONTAINS()
- JSN_CONTAINS_PATH()
- JSN_VALID()
- JSN_TYPE()
- JSN_KEYS()
- JSN_LENGTH()
- JSN_DEPTH()
- JSN_UNQUOTE()
- JSN_QUOTE()
This concludes this brief introduction to the query related JSON functions available in this lab release! More details can be found in the worklog entry: WL#7909.
We expect to further refine and polish the JSON functions before 5.7 GA, so please keep those comments coming, and please start playing with them! Your input and contributions will no doubt improve the final features. In the meantime, have fun, and please let us know what you think of these new JSON features! We’d love to hear your feedback on what else you’d like to see related to our wider JSON support. If you encounter any problems with these new features, please let us know here in the comments, open a bug report at bugs.mysql.com, or open a support ticket.
Thank you for using MySQL!