WL#7740: InnoDB GIS: Enhance Check Table for InnoDB Spatial index
Status: Complete — Priority: Medium
We currently do only minimal checks on InnoDB spatial indexes. More work is needed on the R-tree index when it comes to consistency checking. The notable work items are:
1. When checking the parent/child entry relation, the parent entry contains an MBR (minimum bounding rectangle) that covers all entries in the child page. It is NOT the first entry of the child page. The check should therefore verify that this MBR covers all items in the child page, rather than assert that it is identical to the first entry of the child.
2. There is no ordering relationship between rows in neighboring pages. The last entry of the left page can compare either way with the first entry of the right page, so the B-tree ordering check between sibling pages does not apply.
3. There is no ordering relationship between rows in the parent and the layout of the child pages (how they are linked). As you walk the entries in the parent pages, the child pages they point to might not be directly linked to each other.
4. A leaf-level entry does not contain the data itself. It contains an MBR that covers the data in the primary index, plus the primary key. This needs to be checked. We also need to check that no two NON-DELETED index entries point to the same PK entry.
5. For checking the index entry count: the leaf-level records of a spatial index are not ordered, so we cannot do a consistent read of the leaf-level records as for a B-tree, because that consistent read needs to store and restore the cursor based on record ordering. Instead we can use the current R-tree search mechanism to scan the R-tree. The plan is to build a search key whose MBR contains all MBRs in the R-tree and run an R-tree search with it, which gives the R-tree entry count.
6. For the minimum record flag REC_INFO_MIN_REC_FLAG, we still need to keep the current check, which verifies that the flag is set on the first record of each non-leaf level. This is already done in the existing B-tree code.
7. In this worklog we also want to optimize how non-leaf nodes of the R-tree are stored. Currently, as in a B-tree, each non-leaf record is stored in this format: key (which is the MBR for an R-tree) + PK (the PK from the first record of the child page) + page_no. We do not actually need to store the PK, since it is never used: every search needs only MBR + page_no to find the right leaf page. Removing the PK field from non-leaf page records saves space, especially for variable-length PKs, and also means we no longer need to handle recursive modification of the PK field.
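Items 5 and 7 above can be illustrated with a small in-memory model. This is only a sketch and not InnoDB code: `Node`, the `pages` dictionary, `count_by_search` and the tuple layouts are hypothetical names introduced here. It assumes non-leaf entries carry only an MBR and a child page number (no PK), and that delete-marked leaf entries are skipped when counting.

```python
from dataclasses import dataclass, field

# An MBR is modelled as a tuple (xmin, ymin, xmax, ymax).

def mbr_union(a, b):
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def mbr_intersects(a, b):
    """True if rectangles a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    is_leaf: bool
    # Leaf entries:     (mbr, pk, deleted)
    # Non-leaf entries: (mbr, child_page_no)   <- no PK stored (item 7)
    entries: list = field(default_factory=list)

def count_by_search(pages, root_page_no, search_mbr):
    """Count non-deleted leaf entries whose MBR intersects search_mbr.

    The traversal follows only (mbr, page_no) pairs, so it never needs
    a PK in the non-leaf entries and never relies on record ordering.
    """
    count = 0
    stack = [root_page_no]
    while stack:
        node = pages[stack.pop()]
        if node.is_leaf:
            for mbr, _pk, deleted in node.entries:
                if not deleted and mbr_intersects(mbr, search_mbr):
                    count += 1
        else:
            for mbr, child_page_no in node.entries:
                if mbr_intersects(mbr, search_mbr):
                    stack.append(child_page_no)
    return count

# A tiny two-level tree: page 1 is the root, pages 2 and 3 are leaves.
leaf_a = Node(True, [((0, 0, 1, 1), 10, False), ((2, 2, 3, 3), 11, False)])
leaf_b = Node(True, [((5, 5, 6, 6), 12, False), ((7, 7, 9, 9), 13, True)])
root = Node(False, [((0, 0, 3, 3), 2), ((5, 5, 9, 9), 3)])
pages = {1: root, 2: leaf_a, 3: leaf_b}

# Search key covering every MBR in the tree: the union of the root's MBRs.
cover = root.entries[0][0]
for mbr, _page_no in root.entries[1:]:
    cover = mbr_union(cover, mbr)

total = count_by_search(pages, 1, cover)
```

Because the search key covers the whole tree, every page is visited once and `total` is the number of live leaf entries, which can then be compared with the clustered index count.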
Functional requirements:
F-1: Check whether a spatial index is valid: the R-tree structure is valid and the entry count matches the clustered index.
Non-functional requirements:
NF-1: Implicit requirements: no new SQL needed, works on all platforms, does not break replication, backup, partitioning, FK, or any other existing features.
NF-2: No change in semantics expected.
1: To check whether a spatial index is valid, we only need to modify the existing btr_validate* functions, such as btr_validate_spatial_index and btr_validate_level. We do not need to add or modify any existing interfaces, only add check logic for spatial indexes.
2: To remove the PK from non-leaf page records, we do not need to add any new interfaces either. We only need to modify the code in rtr_index_build_node_ptr and handle the change specially in the related places.
The code changes include:
1: Modify btr_validate_level to add the following checks for spatial indexes.
   a: The MBR of an upper-level R-tree record contains all MBRs in its child page.
   b: No two NON-DELETED index entries point to the same PK entry.
2: Modify row_count_rtree_recs to use the counting logic described in the HLD.
3: Modify rtr_index_build_node_ptr to stop building the PK field for non-leaf page records, and handle this specially in the related places.
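Checks a and b under item 1 can be sketched as two small predicates. This is an illustrative model only, not the logic in btr_validate_level: the function names and the (xmin, ymin, xmax, ymax) and (mbr, pk, deleted) tuple layouts are assumptions made for the example.

```python
def mbr_contains(outer, inner):
    """True if rectangle `outer` fully covers rectangle `inner`.

    Rectangles are (xmin, ymin, xmax, ymax) tuples.
    """
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def parent_covers_child(parent_mbr, child_mbrs):
    """Check a: the parent entry's MBR contains every MBR in the child page."""
    return all(mbr_contains(parent_mbr, m) for m in child_mbrs)

def duplicate_pks(leaf_entries):
    """Check b: return the PKs referenced by more than one non-deleted entry.

    Each leaf entry is (mbr, pk, deleted). Delete-marked entries are
    ignored, so a deleted entry and a live entry may share a PK.
    """
    seen = set()
    dups = set()
    for _mbr, pk, deleted in leaf_entries:
        if deleted:
            continue
        if pk in seen:
            dups.add(pk)
        seen.add(pk)
    return sorted(dups)

# Example data for both checks.
child_mbrs = [(0, 0, 1, 1), (2, 2, 3, 3)]
ok_parent = parent_covers_child((0, 0, 3, 3), child_mbrs)
bad_parent = parent_covers_child((0, 0, 2, 2), child_mbrs)

leaf_entries = [
    ((0, 0, 1, 1), 10, False),
    ((0, 0, 1, 1), 10, True),   # deleted copy of PK 10: allowed
    ((2, 2, 3, 3), 11, False),
    ((4, 4, 5, 5), 11, False),  # second live entry for PK 11: violation
]
dups = duplicate_pks(leaf_entries)
```

Note that check a tests containment rather than equality with the first child entry, which is the difference from the B-tree parent/child check described in the HLD.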