WL#11452: Support for BLOBs in temptable engine
Affects: Server-8.0
—
Status: Complete
The purpose of this task is to add support for blob-storage columns (blob, text, json, geometry, ...) in temptables. Currently temptable engine do not support storing such columns so disk engine (InnoDB or MyISAM) must be used to store temporary tables. Extending the temptable engine could speed up queries that uses temporary tables which contains blob data.
Functional Requirements: 1. It must be possible to create tables with all types of columns stored as blobs in temptables engine (BLOB, TEXT, JSON, geometry). 2. Defined limits (configuration) for memory usage must be respected. 3. All requirements from WL#8117 must remain valid. Non-Functional requirements: 1. Performance impact for tables without blob columns must be minimized. 2. Performance for tables with blob columns should be better than for tables using disk storage. Notes: 1. Support for spatial (rtree) indexes is not part of this task. 2. Support for partial BLOB update is not part of this task. 3. Temptable implementation assumes it is not required to copy the blob data when reading rows (blobs data is returned in rows as pointers). The returned pointers are pointing to data stored in table and remain valid until the row is removed (by remove or update). As there should be no concurrent handlers to same table it should be safe.
High-Level Specification 1. Design from WL#8117 is a base for this changes. 2. Handler must report in table_flags() that blobs are supported. 3. Columns with a BLOB-storage scheme (BLOB, JSON, ...) in temptable rows will be stored the same way as VARCHAR columns. 4. Proper handling of all BLOB storage lengths will be added - VARCHAR only support 1 and 2 byte length, while blobs could have length encoded on 1-4 bytes. 5. Server code to be modified to allow using the temptables for tables with blobs. 6. There is no need to copy the BLOB data when read from table. - Rationale: -- Pointers to be stored in mysql row must only remain valid until next call to handler. -- Temptable row data is stored in memory already (nothing is flushed etc.). -- Calls that read data do not remove the data. - Reference: -- "For blob columns (see Field_blob), the record buffer stores length of the data, following by memory pointer to the blob data. The pointer is owned by the storage engine and is valid until the next operation." 7. If temptable engine is set as the engine used for in-memory temporary tables (internal-tmp-mem-storage-engine) it will be used for tables with blobs wherever possible - there are limitations (big results, disk storage requested by caller, initialization stage, etc. - see setup_tmp_table_handler() for details).
Elements to be done to support BLOBs in temptable engine: 1. Extending Column class to hold required information for BLOB columns. 2. Copying BLOBs from mysql_row to temptable::Row. - Proper reading of all allowed length sizes (1-4 bytes). - Deep copy (move to temptable row). 3. Copying BLOBs from temptable::Row to mysql_row. - Proper writing of all allowed length sizes (1-4 bytes). - Shallow copy (write address to mysql rows). - No deep copy needed at the moment (to be double checked during tests). 4. Getting BLOB data address from MySQL row. - Used e.g. when searching for a row to get cell contents for comparison. 4. Extending Row class to (for mysql rows): - Construct cells with proper pointer to BLOB data. - Handle BLOB columns correctly (during copying). 5. Change the handler to report proper table flags (HA_NO_BLOBS). 6. Change the server code to properly detect blobs support and use temptable engine whenever possible when creating temporary tables (setup_tmp_table_handler). 7. Table update code needs to be modified to hold previous row data valid (shallow copy for BLOB data - see HLS) until the update is finished.
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.