WL#11434: X Protocol Crud.Insert with upsert, behavior redefined

Affects: Server-8.0   —   Status: Complete

Motivation

WL#9807 introduces an "upsert" (insert-or-update) functionality for Documents.

Following functional requirement of WL#9807:

FR2 - When an upsert leads to an Update, the primary key of the original item must be preserved

which leads to the upserted document to loose its document-id:

Given collection contains unique-index on field "name"
  And collection contains a document with { _id = "a", name = "foo", city = "Rome" }
 When user sends insert-or-update { _id = "b", name = "foo", city = "Rio" }
 Then ...

Expected result:

Insert document "b" and Error (unique-key violation on "name")

Current result:

Update document "a": _id = "a" (or "b"), name = "foo", city = "Rio"

Goal

Redefine the functional requirements in a way that upsert triggered by secondary key causes error when primary keys of both documents doesn't match.

FR1 - When a document inserted, if another item already exists with the same unique key value (that is, inserting it would cause a duplicate key error), the document should replace the original duplicate item.

FR2 - When an upsert leads to an Update, the primary key of the original item must match the new primary key from the new document.

FR3 - Upsert operations on tables should be rejected with an Error.

FR4 - If an upsert on collections with one or more unique keys in addition to the PK results in multiple matches/duplicate keys on the same row, the operation must succeed.

FR5 - If an upsert on collections with one or more unique keys in addition to the PK results in multiple matches/duplicate keys on different rows, the operation must fail with an error.

Notes

Most requirements are reused from WL#9807. Specification parts that were changed are bloded.

Protocol

There in no change in definition X Protocol messaged that triggers upsert, only behavior triggered by it, is impacted.

Behavior

The following table hashes out the expected action for different combinations of duplicate primary key (PK) and duplicate UNIQUE secondary key cases when there are 0, 1 and 2+ unique key conflicts.

PK SK Note Action
O N/A No matches Normal INSERT
O O No matches, SK OK Normal INSERT
O X SK matches, SK OK Unique key error
X N/A PK matches UPDATE on the matching PK
X O PK matches, SK OK UPDATE on the matching PK
X X PK and SK match same row UPDATE on the matching PK
X Y PK and SK match diff rows Unique key error

Legend:

  • N/A - No key
  • O - OK, no conflict, the value is unique
  • X - Conflict
  • Y - Conflict with different rows

If there are multiple secondary keys, the case where they all match the same row or none at all is treated the same as having a single secondary key match (X X). If the secondary keys match different rows, it is treated as matches on different rows (X Y).

SQL Mapping

If the upsert flag is true, the INSERT statement shall generate with following " ON DUPLICATE KEY UPDATE" clause:

INSERT INTO collection (doc) VALUES (val1)
   ON DUPLICATE KEY UPDATE
     doc = IF(JSON_EXTRACT(doc, '$._id') = JSON_EXTRACT(VALUES(doc), 
                                                           '$._id'),
              VALUES(doc),
              mysqlx_error(127))");

The "ON DUPLICATE" must fail when "_id" columns of old and new documents, doesn't match.

Notes

Specification parts that were changed are in bold.

SQL

X Plugin builds a SQL from "Crud.Insert" message and passes it to MySQL Server. There is a need to fail the SQL when "ON DUPLICATE" clause is executed with not matching "_id" fields.

X Plugin can realize it three ways:

  • assign to "doc" column a value that isn't a json object (when SQL fails, the reason of failure must be guessed by X Plugin).
  • create a SQL function using "CREATE FUNCTION" statement which executes "SIGNAL" command (function need to be applied through upgrade scripts, it is associated with schema).
  • create an UDF function

UDF function is the clearest way of achieving required functionality. It must be defined as following:

  • mysqlx_error(INTERNAL_ERROR_CODE) - function capable of rising an error from SELECT statement, its defined only for internal use in X Plugin.