WL#9807: X Protocol Crud.Insert with upsert
Affects: Server-8.0
—
Status: Complete
The UPSERT command is a very useful operation that allows performing an insert or update, depending on whether the record already exists or not, in a single step, atomically. MySQL supports the ON DUPLICATE KEY UPDATE extension to INSERT, which has this behaviour, and should be used for implementation. Limitations ----------- The intended behaviour is: * if records exists, replace it with our the values presented, other side create Building a ON DUPLICATE KEY UPDATE for collections is straight-forward: INSERT INTO collection VALUES (key, value) ON DUPLICATE KEY UPDATE SET _doc=value; as the structure of the underlying table is known (value goes into _doc) and unique keys are based on generated columns and aren't updated directly. For UPSERTS into tables the behaviour is different: INSERT INTO collection VALUES ("a", "b", "c") ON DUPLICATE KEY UPDATE SET ??="f", ??="b", ??="c"; The column names and keys are not known, unless they come with the query or they're queried from MySQL. The 2nd introduces the need for metadata locking, caching and additional queries, which would be a performance issue. Therefore this worklog limits the feature to collections only.
FR1 - When a document inserted, if another item already exists with the same unique key value (that is, inserting it would cause a duplicate key error), the document should replace the original duplicate item. FR2 - When an upsert leads to an Update, the primary key of the original item must be preserved FR3 - Upsert operations on tables should be rejected with an Error. FR4 - If an upsert on collections with one or more unique keys in addition to the PK results in multiple matches/duplicate keys on the same row, the operation must succeed. FR5 - If an upsert on collections with one or more unique keys in addition to the PK results in multiple matches/duplicate keys on different rows, the operation must fail with an error.
Notes ===== - For collections, it is only possible to replace the entire document. Partial changes/updates are not supported for Upserts at the moment, for the sake of simplicity. - Upsert for tables is not supported. The reason is that for upsert to work, we need to know the structure of the table beforehand, at the time that the SQL translation is done. For collections, we know the structure, but for tables, the column names and keys are not known, unless they come with the query or they're queried from MySQL. The 2nd introduces the need for metadata locking, caching and additional queries, which would be a performance issue. In the end, the effort needed to make it work and/or the usability of the feature would outweight the potential benefits. When a key conflict occurs during insert, the following behavior will take place: - For collections with no additional unique keys other than the primary key, the entire document will be replaced. In such cases, the _id is the only column that can conflict, so its value will be preserved anyway. - For collections with a unique key in addition to the _id primary key: - If the conflict is on the _id, the document must be replaced while keeping the _id (same case as if there was a primary key only) - If the conflict is on a key other than _id, the operation must fail with an error, to preserve the immutability of the document _id Summary ------- The following table hashes out the expected action for different combinations of duplicate primary key (PK) and duplicate UNIQUE secondary key cases when there are 0, 1 and 2+ unique key conflicts. Values mean: - No key O OK, no conflict, the value is unique X Conflict Y Conflict with different rows (eg PK matches a row and SK matches another) PK SK Note Action -------------------------------------------------------------- O - No matches Normal INSERT O O No matches, SK OK Normal INSERT O X SK matches, PK OK UPDATE on the matching SK, maintain old PK X - PK matches UPDATE on the matching PK X O PK matches, SK OK UPDATE on the matching PK X X PK and SK match same row UPDATE on the matching PK X Y PK and SK match diff rows Unique key error If there are multiple secondary keys, the case where they all match the same row or none at all is treated the same as having a single secondary key match (X X). If the secondary keys match different rows, it is treated as matches on different rows (X Y). Protocol ======== diff --git a/rapid/plugin/x/protocol/mysqlx_crud.proto b/rapid/plugin/x/protocol/mysqlx_crud.proto index e046f2c..b5c4992 100644 --- a/rapid/plugin/x/protocol/mysqlx_crud.proto +++ b/rapid/plugin/x/protocol/mysqlx_crud.proto @@ -137,6 +137,7 @@ message Find { // :param projection: name of the columns to insert data into (empty if data_model is DOCUMENT) // :param row: set of rows to insert into the collection/table (a single expression with a JSON document literal or an OBJECT expression) // :param args: values for parameters used in row expressions +// :param upsert: true if this should be treated as an Upsert (that is, update on duplicate key) // :Returns: :protobuf:msg:`Mysqlx.Resultset::` message Insert { required Collection collection = 1; @@ -148,7 +149,8 @@ message Insert { repeated Mysqlx.Expr.Expr field = 1; }; repeated TypedRow row = 4; repeated Mysqlx.Datatypes.Scalar args = 5; + optional bool upsert = 6; }; // Update documents/rows in a collection/table SQL Mapping =========== If the upsert flag is true, the INSERT statement shall generate. INSERT INTO collection (doc) VALUES (val1) ON DUPLICATE KEY UPDATE doc = JSON_SET(VALUES(doc), '$._id', JSON_EXTRACT(doc, '$._id')); For Tables, an error should be generated in case a upsert is requested. Compatibility ============= Compatibility assurance lies on client side. Clients that use new features (in this case XMessage - field) must ensure backward compatibility by follow the procedure described under: X Protocol expectations for supported protobuf fields
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.