WL#7987: JSON aggregation functions
Affects: Server-8.0
—
Status: Complete
Add aggregation functions to generate JSON arrays and objects. This makes it possible to combine JSON documents in multiple rows into a JSON array or a JSON object. Examples: CREATE TABLE t1(id INT, grp INT, jsoncol JSON); INSERT INTO t1 VALUES(1, 1, '{"key1":"value1","key2":"value2"}'); INSERT INTO t1 VALUES(2, 1, '{"keyA":"valueA","keyB":"valueB"}'); INSERT INTO t1 VALUES(3, 2, '{"keyX":"valueX","keyY":"valueY"}'); SELECT JSON_ARRAYAGG(jsoncol) AS json FROM t1; Result: [{"key1":"value1","key2":"value2"}, {"keyA":"valueA","keyB":"valueB"}, {"keyX":"valueX","keyY":"valueY"}] SELECT JSON_ARRAYAGG(jsoncol) AS json FROM t1 GROUP BY grp; Result: 1 | [{"key1":"value1","key2":"value2"}, {"keyA":"valueA","keyB":"valueB"}] 2 | [{"keyX":"valueX","keyY":"valueY"}] SELECT JSON_OBJECTAGG(id, jsoncol) AS json FROM t1; Result: {"1":{"key1":"value1","key2":"value2"}, "2":{"keyA":"valueA","keyB":"valueB"}, "3":{"keyX":"valueX","keyY":"valueY"}} SELECT JSON_OBJECTAGG(id, jsoncol) AS json FROM t1 GROUP BY grp; Result: 1 | {"1":{"key1":"value1","key2":"value2"}, "2":{"keyA":"valueA","keyB":"valueB"}} 2 | {"3":{"keyX":"valueX","keyY":"valueY"}}
F1 - Provide a function called JSON_ARRAYAGG that aggregates data into JSON array. F2 - The JSON_ARRAYAGG function should return either a valid JSON array, NULL in case there are no rows or error. F3 - Provide a function JSON_OBJECTAGG that aggregates data into JSON object. F4 - The JSON_OBJECTAGG function should return either a valid JSON array, NULL in case there are no rows or error.
The functions are defined based on the current specifications found in the 2015 SQL draft standard. The standard syntax for JSON_ARRAYAGG:::= JSON_ARRAYAGG [ ] [ ] [ ] - value expression: values to be used to create the array. - order by clause: Not supported. This means that the order of the elements in the array is undefined. No known workaround. - null clause: Not supported. The implementation behaves as NULL ON NULL. The standard says that ABSENT ON NULL is implicit. Workaround: can be emulated by adding "WHERE x IS NOT NULL" to the query - output clause: Not supported. We return JSON, so it's implicitly RETURNING JSON FORMAT. Standard says it's an implementation-defined string type, so there's no problem. The standard syntax for JSON_OBJECTAGG: ::= JSON_OBJECTAGG [ ] [ ] - name: Key to be used for creating new property inside the object. - value expression: Value to be used as the property's value. - null clause: Not supported. Standard says implicit NULL ON NULL the same as it is implemented. - output clause: Not supported. We have implicit JSON return type, as for JSON_ARRAYAGG. The standard expects for exception in case the is null and the implementation returns error. The standard also expects exception for duplicate (aka key) but we do not comply with this and instead skip the insert for the duplicate (JSON DOM implementation). New Item_sum* classes will be introduced: - Item_sum_json - Item_sum_json_array - Item_sum_json_object And the hierarchy will look like this: Item ^ | Item_result_field ^ | Item_sum ^ | Item_sum_json ^ | ---------------------- ^ ^ | | Item_sum_json_array Item_sum_json_object The implementation will take advantage of the existing Item_sum aggregation framework. Item_sum_json will: - take care of the conversions(all the val_ ... functions) - hold the container(either a json_array or a json_object) inside m_wrapper - grouping mechanisms(reset_field, update_field) - take care of the details regarding the temporary result_field creation and output type( using field_type(), sum_func()- type, result_type()) Item_sum_json_array and Item_sum_json_object will take care of the specifics: - implement the clear() function which will create the container(Json_array or Json_object) - implement the add() function which will know how to append to the container.
The implementation will use the Item_sum aggregation framework and it will work like this: When a query will call JSON_ARRAYAGG or JSON_OBJECTAGG without GROUP BY: - init_sum_functions will be called which will call reset_and_add which will call Item_sum's reset_and_add - inside reset_and_add, Item_sum_json_array::clear and Item_sum_json_array::add will be called which will create the container and then append the first element - update_sum_func will call Item_sum_json_array::add which will for each of the rows and it will append the elements to the container - when all the rows in the table will have been processed and the send_result_set_row is called, the Item_sum_json::val_str will be called and the container will be converted to string and returned. When a query will call JSON_ARRAYAGG or JSON_OBJECTAGG with GROUP BY: - a Field_json will be created to hold each group's value - depending if the row is from an existing group or from a new group, either Item_sum_json::reset_field or Item_sum_json::update_field will be called: (1) Item_sum_json::reset_field() will: (*) call Item_sum_json_array::clear to create a container(Json_array) and reference it to m_wrapper (*) call Item_sum_json_array::add and append the first element of the group to the container(m_wrapper) (*) serialize and store the current value of the container(m_wrapper) inside the result_field (2) Item_sum_json::update_field() will: (*) de-serialize result_field's value to the container(m_wrapper) calling the val_json function over result_field (*) call Item_sum_json_array::add and append the current element to the container (*) serialize and store the current value inside result_field(calling the store_json function with the m_wrapper as argument) - when the results will be send to the client the result will already be saved in result_field so val_str function will be called to get the string value. The presence of JSON aggregates in the query blocks makes the optimizer change the plan from using tmp table + index and update to always do grouping with filesort. This was done by introducing with_json_agg boolean member which will be set to true when the group by contains a JSON aggregate function and it will force the use of filesort.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.