WL#10797: Mysqlx.Crud.Update: MERGE_PATCH

Affects: Server-8.0   —   Status: Complete   —   Priority: Medium

Support for modifying matching documents using the JSON_MERGE_PATCH() function.

Currently, a MERGE operation is supported for the Modify CRUD command. However the merge algorithm used by the underlying JSON_MERGE() function is uncommon and does not provide the behavior most users 
would expect.

On the other hand, the new JSON_MERGE_PATCH() function aims to provide a merge 
function with a more well defined and interesting merge algorithm, which is 
suitable for performing incremental document modifications through an easier to 
use API.

The MERGE_PATCH allows performing the following types of changes in a single 
operation, using a JSON document as input:

- delete fields from document
- insert fields to document
- pass-through existing fields
- rename fields
- move fields
- modify field values

For example:

    original document:
      "_id": "1138794022",
      "name": {
        "first": "Alice",
        "last": "Cooper"
      "country": "USA",
      "city": "New York",
      "zip": "1234567",
      "birthdate": "1950-01-01"

    patch document/expression:
      "_id": null,
      "name" : concat($.name.last, ',', $.name.first),
      "age": datediff(current_date(), $.birthdate)/365,
      "address": {
        "city": $.city,
        "zip": $.zip
      "greeting": "'hello!!!'"

    resulting document:
      "_id": "1138794022",
      "name": "Cooper, Alice",
      "country": "USA",
      "address": {
        "city": "New York",
        "zip": "1234567"
      "age": 67,
      "greeting": "hello!!!"

* F1 - Must accept changes to be applied to matching documents of a collection
in the form of a "patch" object, similar to a JSON object
* F2 - Must allow renaming of a field
* F3 - Must allow deletion of a field
* F4 - Must allow setting the value of a field to a literal value
* F5 - Must allow setting the value of a field to the result of an expression
* F6 - Must allow above operations to be performed on nested documents
* F7 - Must apply patch on all matching documents
* F8 - Must work for collections
* F9 - When working on collections, must preserve the value of the _id field and ignore any attempts to change it
* F10 - Must work for JSON document fields in tables
* F11 - Patching a NULL column value in a table is equivalent to patching an empty {} object.
* F12 - When working on JSON document fields of a table, must produce an error when attempting to patch a column that is not an Object
X Protocol

The Mysqlx.Crud.Update protocol message needs to be updated with new operation type (MERGE_PATCH):

    --- a/rapid/plugin/x/protocol/mysqlx_crud.proto
    +++ b/rapid/plugin/x/protocol/mysqlx_crud.proto
    @@ -94,6 +94,7 @@ message UpdateOperation {
         ITEM_MERGE = 5;     // source and value must be documents
         ARRAY_INSERT = 6;   // insert the value in the array at the index identified in the source path
         ARRAY_APPEND = 7;   // append the value on the array at the identified path
    +    MERGE_PATCH = 8;    // merge JSON object value with the provided patch expression
       required Mysqlx.Expr.ColumnIdentifier source = 1;
       required UpdateType operation = 2;

* The interpretation of the value expression is as described in the Document Patch Expressions section.

* For DOCUMENT requests, the source field must be empty or contain reference to the document.

* For TABLE requests, the source field contains the name of the column to be modified.

Document Patch Expressions

A document patch is similar to a JSON document, with the key difference that document field values can be nested expressions in addition to literal values. 

* The patch contains instructions of how the source document is to be modified producing a derived document. 
* By default, all fields from the source document are copied to the resulting document
* If patch sets a field to NULL, the field of that name in the source is skipped from the result
* Expressions, identifiers and function calls are evaluated against the original source document
* Arrays expressions are evaluated normally
* Object expressions recursively evaluate against the matching nested object in the source

Field values are interpreted as described in the following table:
| Field value in patch    | Field value in source | Resulting field value      | Description                                         |
| field: any'             | missing/invalid       | field: any'                | See note 1                                          |
|                         | field: any"           | field: any"                | Missing field in patch means NO-OP, or pass-through |
| field: LITERAL/NULL     | field: any"           |                            | null/NULL patch value means delete field            |
| field: LITERAL/scalar'  | field: any"           | field: scalar'             | Set value to literal scalar                         |
| field: LITERAL/JSON'    | field: any"           | field: cast(JSON' as JSON) | Cast the JSON string literal into JSON              |
| field: IDENTIFIER       | field: any"           | field: eval()              | Value of the identifier (e.g. value from source)    |
| field: VARIABLE         | field: any"           | ERROR                      | Variable not allowed                                |
| field: FUNC_CALL        | field: any"           | field: eval()              | Evaluate function call and use its result           |
| field: OPERATOR         | field: any"           | field: eval()              | Evaluate the operation and use its result           |
| field: PLACEHOLDER      | field: any"           | field: placeholder         | Set field to value of placeholder                   |
| field: OBJECT           | field: object"        | field: eval()              | Evaluate the sub-patch                              |
| field: ARRAY            | field: any"           | field: eval()              | Evaluate the array                                  |

Note 1: If the source field referred to by the patch does not exist, either because it's missing from the source document or because the field is not an object in the source, it will act as a pass-through to the patch value.

Protocol Representation

Document patches are represented as Mysqlx.Expr.Expr objects, as already defined in the X protocol. Specifically, the feature where Objects can represent JSON Object like structures is used to allow nestable key/expression-value pairs.
When operation is done on collection the result of patch must be an object, thus root node of the Mysqlx.Expr.Expr message must be:

* Mysqlx.Expr.object
* literal that represents JSON object
* any expression that returns JSON object

In case of tables, there aren't any limitations concerning patch expressions.

SQL Mapping of MERGE_PATCH operation

SQL mapping takes advantage of the JSON_MERGE_PATCH() function, which has the desired semantics for merging a "patch" document against another JSON document.
In short, the mapping takes the form of:

  @result = JSON_MERGE_PATCH(source, @patch_expr)

where @patch_expr is the expression generated for the patch object.



      "a": "literal",
      "b": 1 + sqrt($.x),
      "c": {
        "d": [1, 2, null],
        "e": "1+1"
      "f": [$.a]

Generated SQL Expression:

      JSON_OBJECT('a', 'literal',
                  'b', 1 + sqrt(doc->>'$.x'),
                  'c', JSON_OBJECT(
                    'd', JSON_ARRAY(1, 2, null),
                    'e', '1+1'
                  'f', JSON_ARRAY(doc->>'$.a')))

SQL Mapping of ITEM_MERGE operation

"ITEM_MERGE" operation maps to "JSON_MERGE" SQL function, which currently is deprecated and generates a warning. X Plugin must map the operation to "JSON_MERGE_PRESERVE" which works the same as "JSON_MERGE" but a warning is not generated.

Error and Warnings

When collection is updated with value that is not a JSON object (result of patching) it must return following error:

|Property    |Value                                         |
|Name        |ER_X_BAD_UPDATE_DATA                          |
|Error code  |5050                                          |
|New         |No                                            |
|Error text  |Invalid data for update operation on document |
|            |collection table                              |

When collection is updated using MERGE_PATCH and an "source" field of the operation is neither not set nor empty string path, then following error must be returned:

|Property    |Value                                         |
|Name        |ER_X_BAD_UPDATE_DATA                          |
|Error code  |5050                                          |
|New         |No                                            |
|Error text  |Unexpected source for ITEM_MERGE operation    |