WL#8963: Support for partial update of JSON in the optimizer

Affects: Server-8.0   —   Status: Complete

The main goal of this worklog is to speed up update of JSON documents, which are stored as BLOBs. If only some parts of a JSON document are updated, we want to give information to the handler about what was changed, so that the storage engine and replication don't need to write the full document.

This worklog is for the optimizer's part only. Storage engines (at least InnoDB) and replication will be changed in other worklogs to take advantage of the changes made in this worklog.

How?

The executor will recognize that some updates of JSON values can be performed in-place instead of regenerating and overwriting the full document. When it detects that such an update is possible, it will collect a list of binary diffs that describe the changes made to the document. The binary diffs include information about which columns are updated, the offset of the changed data, the length of the changed data, and a pointer to the new data. The list of binary diffs is made available to the handler::ha_update_row() function via the TABLE object.

Which statements will be affected?

UPDATE statements that use JSON_SET or JSON_REPLACE to update JSON columns are candidates for partial update. We will attempt to perform a partial update of a column if the column's type is JSON and all updates of the column are expressed as col = JSON_SET(col, ...) or col = JSON_REPLACE(col, ...). The input column (col here) and the target column must be the same for partial update to be considered.

Partial update can also be performed if the first argument of the JSON_SET or JSON_REPLACE call is itself a call to JSON_SET or JSON_REPLACE, as long as the input column matches the target column. For example: col = JSON_SET(JSON_REPLACE(col, ...), ...).

The update is performed as a partial update only if all the changes are replacing an existing value with a new one (not adding new elements to the parent object or array), and there is a "hole" big enough to write the new value where the replaced value was located. This typically means that the replaced value must be at least as big as the replacement value. It could be possible to insert a bigger value if some previous partial update left some unused bytes at that position, or immediately before or after it. (Or, in the future, if we insert padding to allow values to grow. Padding will not be added in this worklog, though.)

For example:

   UPDATE t SET col1 = JSON_SET(col1, '$[1].name', 'Orange'),
                col1 = JSON_SET(col1, '$[1].price', 22)

Here all updates of col1 are on the form col1 = JSON_SET(col1, ...), so partial update can be attempted if col1's type is JSON.

The above statement can alternatively be written like one of the following statements, and still be performed as partial update:

   UPDATE t SET col1 = JSON_SET(col1,
                                '$[1].name', 'Orange',
                                '$[1].price', 22)
   UPDATE t SET col1 = JSON_SET(JSON_SET(col1, '$[1].name', 'Orange'),
                                '$[1].price', 22)

Partial update will be performed on all the rows in t where col1 contains values at the paths '$[1].name' and '$[1].price', and where the values at those paths are big enough to contain the string 'Orange' and the integer 22, respectively.

If a row contains the following value in col1

   [ {...}, { "name": "Strawberry", "price": 10, ... }, ... ]

the JSON document can be partially updated. 'Orange' is shorter than 'Strawberry', so it fits as a replacement, and both 22 and 10 are stored as 16-bit unsigned integers in the JSON document.

If a row contains the following value in col1

   [ {...}, { "name": "Apple", "price": 10, ... }, ... ]

partial update is skipped for this row, as 'Orange' is longer than 'Apple', so that there is not enough room to write the replacement string. (Except if there is spare room in the document after a previous partial update of the value at that path, that is.)

When to enable partial update?

Since the performance benefit from partial update is expected to come from reduced I/O in the storage engine and row-based replication, partial update in the optimizer should only be enabled if the storage engine supports partial update, or if row-based replication with partial update is enabled.

TBD: Should we limit partial update to documents above a certain threshold size? Should we disable partial update if more than a certain percentage of the document is changed in one operation? These questions can only be answered after we have run performance tests with the SE support included, so this worklog will not add any such thresholds.

Functional requirements

F-1: When an UPDATE statement updates existing elements of JSON columns with JSON_SET and/or JSON_REPLACE, and that update can be performed in-place in the binary representation of the document without increasing the size of the document or moving data around inside the document, the optimizer should perform that operation in-place in memory, and provide handler::ha_update_row() with a list of binary diffs that describe the changes.

F-2: EXPLAIN FORMAT=JSON should tell if an UPDATE statement will attempt to perform partial update, and which columns are eligible for partial update.

Non-functional requirements

NF-1: There should be no performance regressions if partial update isn't supported or used by the storage engine.

Data structures for representing partial updates

The partial updates will be represented as a vector of binary differences. Each difference from the original value is represented by a tuple called Binary_diff. The tuple contains information about which parts of a column that have been changed.

/**
  Class that represents a single change to a column value.
*/
class Binary_diff
{
  /// The offset of the start of the change.
  size_t m_offset;
  /// The size of the portion that is to be replaced.
  size_t m_length;
public:
  /// @return the offset of the changed data
  size_t offset() const;
  /// @return the length of the changed data
  size_t length() const;
};

The Binary_diff objects don't contain the new data for the changed section. The new data will be available in table->record[0], which can be accessed through the column's corresponding Field object. The new data for a section can be found by combining the offset/length information from the Binary_diff object with the data from the Field object.

The size of the replacement data is always the same as the size of the replaced data, so that none of the data that comes after the changed section will have to be moved.

The Binary_diff objects are collected in a Binary_diff_vector (a Mem_root_array stored on the execution mem_root), and there is one vector per column that is eligible for partial update. The binary diffs are stored in sorted order in the vector (sorted on increasing offsets). None of the diffs in the vector refer to overlapping or adjacent sections of the BLOB.

An empty Binary_diff_vector means that the partial update of the column is a no-op.

Contents


sql/field.h and sql/field.cc

Add a new function Field_json::get_binary() which returns a read-only pointer to the binary representation of a JSON column.

sql/item.h

Add a new virtual function to the Item class which checks if an expression can be used for partial update of a given column, and if so, mark the expression as used in partial update.

 /**
   Mark this expression as used in partial update, if it is on the right
   form. For example, the expression `JSON_REPLACE(col, '$.foo', 'bar')`
   can be used to partially update the column `foo`.
   @param field   the target column of the update
   @retval false  if successfully marked for partial update
   @retval true   if this expression cannot be used for partial update
                  of @a field
 */
 virtual bool mark_for_partial_update(Field *field);

sql/item_json_func.h and sql/item_json_func.cc

Override Item::mark_for_partial_update() in the Item_func_json_set_replace class, and mark the expression for partial update if its input is the same as the target column, or if it is a sequence of nested JSON_SET/JSON_REPLACE calls around the target column.

Add code to Item_func_json_set_replace::val_json() so that it first attempts to update the document in-place using the new Json_wrapper::attempt_partial_update() function described in the high-level specification. If the attempt to update the document in place fails, it falls back to updating the document using DOMs as before, and it will also disable partial updates of that column for the current row.

sql/json_binary.h and sql/json_binary.cc

Add the new functions described in the high-level specification.

sql/json_dom.h and sql/json_dom.cc

Add the new Json_wrapper::attempt_partial_update() function described in the high-level specification.

Add some more helper functions to the Json_wrapper class to make it possible to inspect more of its internals needed for partial update (is_dom(), temporal_field_type() and decimal_binary_size()).

Make seek_no_ellipsis()/Json_wrapper::seek() able to seek on a prefix of a path, which is useful for finding the parent of the value that is being replaced in a partial update.

sql/sql_update.cc

Add a new function prepare_partial_update(), which gets called from Sql_cmd_update::prepare_inner(). It checks if the conditions for partial update are fulfilled, and marks the eligible columns for partial update. The columns/expressions are only marked as /potential/ partial updates, since the ability to perform a partial update also depends on the data in the column, which is not known until execution.

Set up execution data structures for collecting binary diffs by calling TABLE::setup_partial_update() from Sql_cmd_update::update_single_table() and Query_result_update::optimize().

In Sql_cmd_update::update_single_table() and Query_result_update::send_data(), make sure the Binary_diff_vector objects in the TABLE are cleared before processing each new row, so that they don't contain information about the previous row.

sql/table.h and sql/table.cc

Add the new Binary_diff and Binary_diff_vector classes described in the high-level specification.

Add new functions to the TABLE class as described in the high-level specification.

New member variables added to the TABLE class:

 /**
   Bitmap that tells which columns are eligible for partial update in an
   update statement.
   The bitmap is lazily allocated in the TABLE's mem_root when
   #mark_column_for_partial_update() is called.
 */
 MY_BITMAP *m_partial_update_columns;
 /**
   Object which contains execution time state used for partial update
   of JSON columns.
   It is allocated in the execution mem_root by #setup_partial_update() if
   there are columns that have been marked as eligible for partial update.
 */
 Partial_update_info *m_partial_update_info;

Partial_update_info is a new struct that holds a bitmap of columns that are temporarily disabled (for use by TABLE::disable_partial_update_for_current_row()), a list of Binary_diff_vector objects for the columns that can be partially updated, and a String buffer that can be used for storing the partially updated column value while performing the update.

sql/opt_explain.cc, sql/opt_explain.cc, sql/opt_explain_json.cc

Add code to register information about partial update columns in the output from EXPLAIN FORMAT=JSON.

sql/handler.h

Add a new table flag, HA_BLOB_PARTIAL_UPDATE.

storage/innobase/handler/ha_innodb.cc

Make InnoDB report that it supports partial update by setting the HA_BLOB_PARTIAL_UPDATE table flag. This is not correct at the moment, but is done to ease testing until the InnoDB support has been pushed.

unittest/gunit/json_binary-t.cc

Add test cases for the new interfaces in the json_binary module.

unittest/gunit/json_dom-t.cc

Add test cases for the new interfaces in the Json_wrapper class.

mysql-test/suite/json/inc/json_functions.inc

Add test cases that exercise partial update from the SQL level.

Examples of binary changes

Given the JSON document [ "abc", "def" ], which is serialized like this in a JSON column:

        0x02 - type: small JSON array
        0x02 - number of elements (low byte)
        0x00 - number of elements (high byte)
        0x12 - number of bytes (low byte)
        0x00 - number of bytes (high byte)
        0x0C - type of element 0 (string)
        0x0A - offset of element 0 (low byte)
        0x00 - offset of element 0 (high byte)
        0x0C - type of element 1 (string)
        0x0E - offset of element 1 (low byte)
        0x00 - offset of element 1 (high byte)
        0x03 - length of element 0
        'a'
        'b'  - content of element 0
        'c'
        0x03 - length of element 1
        'd'
        'e'  - content of element 1
        'f'

Let's change element 0 from "abc" to "XY" using the following statement:

UPDATE t SET j = JSON_SET(j, '$[0]', 'XY')

Since we're replacing one string with a shorter one, we can just overwrite the length byte with the new length, and the beginning of the original string data. Since the original string "abc" is longer than the new string "XY", we'll have a free byte at the end of the string. This byte is left as is ('c'). The resulting binary representation looks like this:

        0x02 - type: small JSON array
        0x02 - number of elements (low byte)
        0x00 - number of elements (high byte)
        0x12 - number of bytes (low byte)
        0x00 - number of bytes (high byte)
        0x0C - type of element 0 (string)
        0x0A - offset of element 0 (low byte)
        0x00 - offset of element 0 (high byte)
        0x0C - type of element 1 (string)
        0x0E - offset of element 1 (low byte)
        0x00 - offset of element 1 (high byte)
CHANGED 0x02 - length of element 0
CHANGED 'X'
CHANGED 'Y'  - content of element 0
(free)  'c'
        0x03 - length of element 1
        'd'
        'e'  - content of element 1
        'f'

This change will be represented as one binary diff that covers the three changed bytes.

Let's now change element 1 from "def" to "XYZW":

UPDATE t SET j = JSON_SET(j, '$[1]', 'XYZW')

Since the new string is one byte longer than the original string, we cannot simply overwrite the old one. But we can reuse the free byte from the previous update, which is immediately preceding the original value.

To make use of this, we need to change the offset of element 1 to point to the free byte. Then we can overwrite the free byte and the original string data with the new length and string contents. Resulting binary representation:

        0x02 - type: small JSON array
        0x02 - number of elements (low byte)
        0x00 - number of elements (high byte)
        0x12 - number of bytes (low byte)
        0x00 - number of bytes (high byte)
        0x0C - type of element 0 (string)
        0x0A - offset of element 0 (low byte)
        0x00 - offset of element 0 (high byte)
        0x0C - type of element 1 (string)
CHANGED 0x0D - offset of element 1 (low byte)
        0x00 - offset of element 1 (high byte)
        0x02 - length of element 0
        'X'  - content of element 0
        'Y'  - content of element 0
CHANGED 0x04 - length of element 1
CHANGED 'X' 
CHANGED 'Y'
CHANGED 'Z'  - content of element 1
CHANGED 'W'

This change will be represented as two binary diffs. One diff for changing the offset, and one for changing the contents of the string.

Then let's replace the string in element 1 with a small number:

UPDATE t SET j = JSON_SET(j, '$[1]', 456)

This will change the type of element 1 from string to int16. Such small numbers are inlined in the value entry, where we normally store the offset to the value. The offset section of the value entry is therefore changed to hold the number 456. The length and contents of the original value ("XYZW") are not touched, but they are now unused and free to be reused. Resulting binary representation:

        0x02 - type: small JSON array
        0x02 - number of elements (low byte)
        0x00 - number of elements (high byte)
        0x12 - number of bytes (low byte)
        0x00 - number of bytes (high byte)
        0x0C - type of element 0 (string)
        0x0A - offset of element 0 (low byte)
        0x00 - offset of element 0 (high byte)
CHANGED 0x05 - type of element 1 (int16)
CHANGED 0xC8 - value of element 1 (low byte)
CHANGED 0x01 - value of element 1 (high byte)
        0x02 - length of element 0
        'X'  - content of element 0
        'Y'  - content of element 0
(free)  0x04 - length of element 1
(free)  'X' 
(free)  'Y'
(free)  'Z'  - content of element 1
(free)  'W'

The change is represented as one binary diff that changes the value entry (type and inlined value).