WL#8985: InnoDB: Refactor compressed BLOB code to facilitate partial fetch/update
Affects: Server-8.0
Status: Complete

Introductory Note:
==================

The main worklog is WL#8960. Currently the optimizer must fetch a complete BLOB even if it needs only a few bytes. Likewise, the optimizer must update a complete BLOB even if only a small portion of it is modified. The main worklog WL#8960 will change this and make it possible to fetch or modify a small portion of a BLOB. This worklog, WL#8985, is the first step towards WL#8960.

Description:
============

This worklog refactors the compressed BLOB code. The idea is to provide a C++ interface to the compressed BLOB functionality. This will help to implement WL#8960 (InnoDB: Partial Fetch and Update of BLOB).

The changes done in this worklog are as follows:

include/lob.h - a new file containing the BLOB interface.
btr/lob.cc - a new file containing the BLOB implementation.
btr_blob_context_t - holds context information for a BLOB insert operation.
blob_delete_context_t - holds context information for a BLOB delete operation.
zblob_writer_t - a class to insert the full compressed BLOB.
zblob_reader_t - a class to fetch the full compressed BLOB or a prefix of it.
zblob_delete_t - a class to delete/free a compressed BLOB.

There is no functionality change in this worklog.

Insert Operation:
=================

The compressed BLOB is currently stored as a single zlib stream spread over as many BLOB pages as necessary, forming a singly linked list. The first compressed BLOB page is of type FIL_PAGE_TYPE_ZBLOB and subsequent compressed BLOB pages are of type FIL_PAGE_TYPE_ZBLOB2. Because of this storage format, random access to an individual compressed BLOB page is not possible. Currently we only insert full BLOBs, so this format has been sufficient. However, to support partial fetch and update of BLOBs (refer to WL#8960), this format will need to change.

Previously, the insert was done by btr_store_big_rec_extern_fields() directly. That function used if conditions to check whether the BLOB was compressed or uncompressed, and performed the operations itself. This worklog moves the functionality of inserting a compressed BLOB into the class zblob_writer_t, which is placed in the new files include/lob.h and btr/lob.cc. This isolation makes the BLOB functionality easier to extend. Note that this refactoring converts C-style functions into C++ classes and member functions, which will make it easier to introduce new functionality.

Fetch Operation:
================

Currently we fetch either the full BLOB or a prefix of it. Previously this operation was done by btr_copy_zblob_prefix(), but the functionality has now been moved to zblob_reader_t. Since the compressed BLOB is a single stream, we cannot extract data from an arbitrary BLOB page or from the middle of the BLOB. We can only extract the full BLOB or a prefix of it.

The zblob_reader_t is used to read a single BLOB. Its input is a BLOB reference object.

How does row_ext_cache_fill() work?

The function row_ext_cache_fill() is used to obtain a prefix of an externally stored column or BLOB data. It does this by calling the function btr_copy_externally_stored_field_prefix(), which previously called btr_copy_zblob_prefix() for a compressed BLOB. With this change, it internally uses zblob_reader_t instead.

Delete Operation:
=================

The compressed BLOB delete operation is now done by the zblob_delete_t class. It was previously done as part of btr_free_externally_stored_field(). The code for the compressed BLOB is isolated and put in the zblob_delete_t class. The context information necessary to carry out the delete operation is available in the blob_delete_context_t class. The delete operation works by repeatedly deleting the first page of the singly linked list of BLOB pages and updating the BLOB reference in the clustered index record.
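
The delete loop described above (free the first page of the chain, then update the reference, and repeat) can be illustrated with a small stand-alone sketch. This is not the InnoDB code: Page, BlobRef, PageStore, NO_PAGE and free_blob_chain() are hypothetical names invented for this illustration, and the in-memory map merely stands in for BLOB pages in a tablespace.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; none of these names are InnoDB's.
constexpr std::uint32_t NO_PAGE = 0xFFFFFFFF;

struct Page {
  std::uint32_t next;  // page number of the next BLOB page, or NO_PAGE
};

struct BlobRef {
  std::uint32_t first_page;  // stands in for the reference in the clustered index record
};

using PageStore = std::unordered_map<std::uint32_t, Page>;

// Frees the chain head-first. After every iteration the reference points at
// the remaining tail of the list. Returns the number of pages freed.
std::size_t free_blob_chain(PageStore &pages, BlobRef &ref) {
  std::size_t freed = 0;
  while (ref.first_page != NO_PAGE) {
    auto it = pages.find(ref.first_page);
    if (it == pages.end()) break;  // reference points at a missing page: stop
    const std::uint32_t next = it->second.next;
    pages.erase(it);        // "delete the first page"
    ref.first_page = next;  // "update the BLOB reference"
    ++freed;
  }
  return freed;
}
```

In this sketch the reference is advanced immediately after each page is freed, so it never names a page that has already been removed.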
There are no functional requirements for this worklog. No new functionality is added, and no existing functionality is modified.

Non-functional requirements:

1. The code refactoring must not adversely affect the performance of BLOB operations. No performance gain is expected either.
2. Provide a convenient C++ interface to operate on BLOB data.
The changes done in this worklog are as follows:

include/lob.h - a new file containing BLOB interface.
btr/lob.cc - a new file containing BLOB implementation.
btr_blob_context_t - hold context information for a BLOB insert operation.
blob_delete_context_t - context information for BLOB delete operation.
zblob_writer_t - a class to insert the full compressed BLOB.
zblob_reader_t - a class to fetch the full compressed BLOB or a prefix of it.
zblob_delete_t - a class to delete/free compressed BLOB.
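
To give a feel for the shape of a class-based writer/reader interface, here is a minimal sketch under stated assumptions. It does not reproduce the contents of include/lob.h: the names Pages, Page, BlobRef, Writer and Reader are hypothetical, compression is omitted (the payload is stored as plain bytes), and pages live in an in-memory map. It only models a BLOB stored as one sequential stream over a singly linked list of pages, which is why a prefix can be read but every read has to start from the first page.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

namespace sketch {  // everything in here is hypothetical, for illustration only

constexpr std::uint32_t NO_PAGE = 0xFFFFFFFF;

enum class PageType { ZBLOB, ZBLOB2 };  // first page vs. subsequent pages

struct Page {
  PageType type;
  std::string payload;  // slice of the stream (not compressed in this sketch)
  std::uint32_t next;   // next page in the chain, or NO_PAGE
};

struct Pages {
  std::unordered_map<std::uint32_t, Page> map;
  std::uint32_t next_free = 1;
};

struct BlobRef {
  std::uint32_t first_page = NO_PAGE;
  std::size_t length = 0;
};

// Loosely analogous in role to a "writer" class: stores the full stream.
class Writer {
 public:
  Writer(Pages &pages, std::size_t payload_size)
      : pages_(pages), payload_size_(payload_size == 0 ? 1 : payload_size) {}

  BlobRef write(const std::string &data) {
    BlobRef ref;
    ref.length = data.size();
    std::uint32_t prev = NO_PAGE;
    for (std::size_t off = 0; off < data.size(); off += payload_size_) {
      const std::uint32_t no = pages_.next_free++;
      const PageType type = (prev == NO_PAGE) ? PageType::ZBLOB : PageType::ZBLOB2;
      pages_.map[no] = Page{type, data.substr(off, payload_size_), NO_PAGE};
      if (prev == NO_PAGE) {
        ref.first_page = no;        // head of the chain
      } else {
        pages_.map[prev].next = no; // link the previous page to this one
      }
      prev = no;
    }
    return ref;
  }

 private:
  Pages &pages_;
  std::size_t payload_size_;
};

// Loosely analogous in role to a "reader" class: input is a BLOB reference.
class Reader {
 public:
  Reader(const Pages &pages, BlobRef ref) : pages_(pages), ref_(ref) {}

  // Returns at most `len` leading bytes; always walks from the first page.
  std::string fetch_prefix(std::size_t len) const {
    std::string out;
    std::uint32_t no = ref_.first_page;
    while (no != NO_PAGE && out.size() < len) {
      const Page &p = pages_.map.at(no);
      out.append(p.payload, 0, len - out.size());
      no = p.next;
    }
    return out;
  }

  std::string fetch_all() const { return fetch_prefix(ref_.length); }

 private:
  const Pages &pages_;
  BlobRef ref_;
};

}  // namespace sketch
```

A caller would construct a Writer over a Pages store, call write() to obtain a BlobRef, and hand that reference to a Reader to fetch the whole value or a prefix. Keeping each operation in its own class, with the reference as the only shared state, is the design idea the sketch is meant to convey.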
        Copyright (c) 2000, 2025, Oracle Corporation and/or its affiliates. All rights reserved.