WL#7280: WKB geometry container
Status: Complete
Implement WKB containers that conform to the Boost.Range concept, so that they can serve as adapters between existing WKB geometry data and Boost.Geometry (BG) algorithms. The aim is to avoid conversions between WKB-encoded byte strings and Boost.Geometry objects, which is how the existing worklogs #7220/#7221/#7236 currently handle the interaction between BG geometries and MySQL geometries. Such conversions can be expensive, especially because the number of geometry objects can be huge.

We can achieve this by implementing STL-like containers that use the WKB-encoded byte string as container data, together with iterators that traverse that byte string. These containers then serve to implement our own versions of the point/linestring/polygon/multipoint/multilinestring/multipolygon classes, whose objects take a WKB pointer to obtain their data. Through the standard iterator interface, Boost.Geometry algorithms can then use these objects to do GIS calculations.

Because Boost.Range needs random access iterators, and the default Boost.Geometry models inherit from std::vector, I decided to implement a simplified vector (Gis_wkb_vector) whose iterator type models the random access iterator concept. The container object is created from a WKB pointer; the caller can then create iterators with begin/end/rbegin/rend and do random access through them. The iterators can be readable and writable, and the vector has a minimal set of required methods.

User Documentation
==================
None required.
I-1: New files: two code files are added, gis_wkb_vector.h and gis_bg_traits.h.
I-2: No new syntax.
I-3: No new commands.
I-4: No new tools.
I-5: No impact on existing functionality.
I-6: Interface change: the Geometry class and its child classes in spatial.h now have another set of member functions, which Boost Geometry calls indirectly via the traits class templates defined in gis_bg_traits.h.
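The indirection in I-6 can be illustrated with a small, self-contained sketch. It does not use Boost.Geometry itself: the `mini_bg` namespace, its `access` traits template and the `MyPoint` class are all hypothetical stand-ins, chosen only to show how a generic algorithm can reach a class's member functions through a traits specialization.

```cpp
#include <cstddef>

// Hypothetical stand-in for a traits-based geometry library interface.
namespace mini_bg {
namespace traits {
// Primary template is left undefined: a type must be adapted explicitly.
template <typename Point, std::size_t Dim>
struct access;
}  // namespace traits

// Generic code talks only to the traits, never to the class directly.
template <std::size_t Dim, typename Point>
double get(const Point &p) {
  return traits::access<Point, Dim>::get(p);
}

template <std::size_t Dim, typename Point>
void set(Point &p, double v) {
  traits::access<Point, Dim>::set(p, v);
}

// A generic "algorithm" written purely against get<>().
template <typename Point>
double dot(const Point &a, const Point &b) {
  return get<0>(a) * get<0>(b) + get<1>(a) * get<1>(b);
}
}  // namespace mini_bg

// Hypothetical geometry class exposing member accessors.
class MyPoint {
 public:
  MyPoint(double x, double y) {
    m_xy[0] = x;
    m_xy[1] = y;
  }
  template <std::size_t K>
  double get() const { return m_xy[K]; }
  template <std::size_t K>
  void set(double v) { m_xy[K] = v; }

 private:
  double m_xy[2];
};

// The adapter: forwards the library's traits calls to the member functions.
namespace mini_bg {
namespace traits {
template <std::size_t Dim>
struct access<MyPoint, Dim> {
  static double get(const MyPoint &p) { return p.get<Dim>(); }
  static void set(MyPoint &p, double v) { p.set<Dim>(v); }
};
}  // namespace traits
}  // namespace mini_bg
```

With this arrangement the geometry class only gains member functions, and all library-specific glue lives in the traits specializations, which mirrors the split between spatial.h and gis_bg_traits.h described in I-1 and I-6.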
We can achieve this by implementing STL-like containers that use the WKB-encoded byte string as container data, together with iterators that traverse that byte string. These containers then serve to implement our own versions of the point/linestring/polygon/multipoint/multilinestring/multipolygon classes, whose objects take a WKB pointer to obtain their data. Through the standard iterator interface, Boost.Geometry (BG) algorithms can then use these objects to do GIS calculations.

Because Boost.Range needs random access iterators, and the default Boost.Geometry models inherit from std::vector, I decided to implement a simplified vector (Gis_wkb_vector) whose iterator type models the random access iterator concept. The container object is created from a WKB pointer; the caller can then create iterators with begin/end/rbegin/rend and do random access through them. The iterators can be readable and writable, and the vector has a minimal set of required methods.

Gis_wkb_vector derives from Geometry. It represents a geometry composed of multiple Geometry objects of a single component type, so it is the base class for linestring, multilinestring, multipolygon and multipoint, whose component types are point, linestring, polygon and point respectively. Accordingly, the Gis_line_string, Gis_multi_line_string, Gis_multi_polygon and Gis_multi_point classes now derive from Gis_wkb_vector instead of Geometry.

Gis_polygon derives from Geometry as before, because BG requires access to the outer ring and the inner rings separately. A polygon therefore can't be seen as a sequence of rings; it is seen as an outer ring plus a sequence of inner rings.

Gis_geometry_collection can't derive from Gis_wkb_vector because its components can be of any of the 7 geometry types, so there is no proper 'component type' for it. Fortunately, BG doesn't use the geometry collection type at all.
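The idea of a container that reads directly from WKB bytes can be sketched as follows. This is a minimal read-only illustration, not the code in gis_wkb_vector.h: `PointView`, `PointIterator`, `LineStringView`, `path_length` and `make_linestring_wkb` are hypothetical names, and the sketch assumes 2D points, a standard WKB linestring layout (byte order, type, point count, points) and a host whose byte order matches the buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iterator>
#include <vector>

// Hypothetical names; these are not the classes from gis_wkb_vector.h.
const std::ptrdiff_t kPointSize = 2 * sizeof(double);

// A point that reads its coordinates from 16 bytes in a borrowed buffer.
class PointView {
 public:
  explicit PointView(const unsigned char *p) : m_ptr(p) {}
  double x() const { return read(0); }
  double y() const { return read(sizeof(double)); }

 private:
  double read(std::size_t offset) const {
    double v;
    std::memcpy(&v, m_ptr + offset, sizeof v);  // WKB data may be unaligned
    return v;
  }
  const unsigned char *m_ptr;
};

// Random access iterator that advances one 16-byte point at a time.
class PointIterator {
 public:
  typedef std::random_access_iterator_tag iterator_category;
  typedef PointView value_type;
  typedef std::ptrdiff_t difference_type;
  typedef const PointView *pointer;
  typedef PointView reference;  // a proxy returned by value

  explicit PointIterator(const unsigned char *p = nullptr) : m_ptr(p) {}
  reference operator*() const { return PointView(m_ptr); }
  reference operator[](difference_type n) const { return *(*this + n); }
  PointIterator &operator++() { m_ptr += kPointSize; return *this; }
  PointIterator &operator--() { m_ptr -= kPointSize; return *this; }
  PointIterator &operator+=(difference_type n) { m_ptr += n * kPointSize; return *this; }
  PointIterator operator+(difference_type n) const { PointIterator t(*this); return t += n; }
  difference_type operator-(const PointIterator &o) const { return (m_ptr - o.m_ptr) / kPointSize; }
  bool operator==(const PointIterator &o) const { return m_ptr == o.m_ptr; }
  bool operator!=(const PointIterator &o) const { return m_ptr != o.m_ptr; }
  bool operator<(const PointIterator &o) const { return m_ptr < o.m_ptr; }

 private:
  const unsigned char *m_ptr;
};

// Container view over a complete WKB linestring; no bytes are copied.
class LineStringView {
 public:
  // Layout assumed: byte order (1), geometry type (4), point count (4), points.
  explicit LineStringView(const unsigned char *wkb) : m_points(wkb + 9) {
    std::uint32_t n;
    std::memcpy(&n, wkb + 5, sizeof n);
    m_size = n;
  }
  PointIterator begin() const { return PointIterator(m_points); }
  PointIterator end() const { return begin() + static_cast<std::ptrdiff_t>(m_size); }
  std::size_t size() const { return m_size; }
  PointView operator[](std::size_t i) const {
    assert(i < m_size);
    return begin()[static_cast<std::ptrdiff_t>(i)];
  }

 private:
  const unsigned char *m_points;
  std::size_t m_size;
};

// A generic algorithm that needs only iterators, as a library would.
template <typename Range>
double path_length(const Range &r) {
  double total = 0;
  for (auto it = r.begin(); it != r.end() && it + 1 != r.end(); ++it)
    total += std::hypot(it[1].x() - (*it).x(), it[1].y() - (*it).y());
  return total;
}

// Helper for trying the sketch: encode (x, y) pairs as a WKB linestring.
std::vector<unsigned char> make_linestring_wkb(const std::vector<double> &xy) {
  std::vector<unsigned char> buf(9 + xy.size() * sizeof(double));
  const std::uint32_t type = 2, n = static_cast<std::uint32_t>(xy.size() / 2);
  buf[0] = 1;  // byte-order marker: 1 means little-endian
  std::memcpy(&buf[1], &type, 4);
  std::memcpy(&buf[5], &n, 4);
  if (!xy.empty()) std::memcpy(&buf[9], xy.data(), xy.size() * sizeof(double));
  return buf;
}
```

The iterator returns a small proxy object by value instead of a reference to a stored point, which is one way to avoid materializing point objects at all. The design described in this worklog differs in that it stores parsed component objects in a vector, so this sketch shows only the zero-copy aspect.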
When a Gis_wkb_vector object is created, the WKB byte string is parsed and its internal components are stored in the created object, recursively where possible. The WKB memory is used as is; no copy or conversion is done. For example, when the WKB is a multilinestring, after parsing the Gis_multi_line_string object stores a vector of Gis_line_string objects, each of which in turn stores a vector of Gis_point objects. Every Geometry object refers to its own WKB memory, and all of these pointers point to various parts of the input WKB buffer. Because the geometry structure is known once the Gis_multi_line_string object is created, subsequent accesses to components are fast, and BG does a lot of such accesses in its algorithms.

When the WKB is a polygon, it is parsed so that the Gis_polygon object holds an outer ring (Gis_polygon_ring) object and a Gis_polygon_inner_rings object. The former holds a vector of Gis_point components; the latter has a vector of Gis_polygon_ring objects, each of which has a vector of Gis_point objects. BG uses the outer ring and the inner rings separately and in different ways. Gis_polygon_ring and Gis_polygon_inner_rings are used only as BG adapter code, never in other ways. We can't simply use Gis_line_string or Gis_multi_line_string for this purpose: there would be type conflicts, and the WKB structure isn't the same either (for inner rings).

Update support
==============
We suppose that updating a geometry can happen in the following ways:
1. Create an empty geometry, then append components to it. The geometry must be a topmost one. A complex geometry such as a multilinestring can be seen as a tree of geometry components: the multilinestring is the topmost geometry, i.e. the root of the tree; its linestrings are the next layer of nodes; and their points are the third layer. Only the root owns the WKB buffer; the other components point somewhere into the buffer and can only read the data.
   Polygons are used only by getting their exterior ring or inner rings and then working on those rings; a polygon is never used as a whole.
2. Assign through an iterator (*itr = value). Each geometry's m_owner can be used to track the topmost memory owner and to do the reallocation needed to accommodate the value. This is not supported for now, but will be if needed. So far Boost Geometry uses geometry assignment only for point objects, so only Geometry and Gis_point have operator=; no other class needs it, and thus there is no need for reallocation.
3. Call resize() to append some objects at the end, then assign or append values to the added objects using push_back. Objects added this way are out of line (unless the object is a point), and the user needs to call reassemble() to make them inline, i.e. stored in their owner's memory.

Use sql_alloc to allocate memory for the WKB data as well as for the component vector (an std::vector), including the vector's inner allocator. When a Geometry object is created from existing WKB data, the WKB data is used directly without any copy or conversion; when it is created by updating in one of the three ways above, the object holds its own memory, and that memory is freed when the object is destroyed.
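The difference between out-of-line and inline components, and what a reassemble step does, can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `OwnerGeometry` and its members are hypothetical names, std::vector stands in for sql_alloc'ed memory, and opaque byte chunks stand in for the WKB of each component.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a topmost geometry that owns its buffer.
class OwnerGeometry {
 public:
  // Append a component. Its bytes live in the component itself, i.e.
  // out of line, until reassemble() is called.
  void push_back(const std::vector<unsigned char> &bytes) {
    Component c;
    c.is_inline = false;
    c.offset = 0;
    c.length = bytes.size();
    c.own = bytes;
    m_components.push_back(c);
  }

  // Copy every component into one fresh contiguous buffer and make all
  // components refer to that buffer by offset.
  void reassemble() {
    std::vector<unsigned char> fresh;
    for (Component &c : m_components) {
      const unsigned char *src = data_of(c);  // read before changing state
      const std::size_t new_offset = fresh.size();
      fresh.insert(fresh.end(), src, src + c.length);
      c.offset = new_offset;
      c.is_inline = true;
      c.own.clear();
    }
    m_buffer.swap(fresh);  // the owner now holds all component bytes
  }

  std::size_t size() const { return m_components.size(); }
  bool is_inline(std::size_t i) const { return m_components[i].is_inline; }
  std::size_t length(std::size_t i) const { return m_components[i].length; }
  const unsigned char *data(std::size_t i) const {
    return data_of(m_components[i]);
  }
  const std::vector<unsigned char> &buffer() const { return m_buffer; }

 private:
  struct Component {
    bool is_inline;                  // true: bytes are in the owner's buffer
    std::size_t offset;              // position in the owner's buffer
    std::size_t length;              // number of bytes in this component
    std::vector<unsigned char> own;  // private storage while out of line
  };

  const unsigned char *data_of(const Component &c) const {
    return c.is_inline ? m_buffer.data() + c.offset : c.own.data();
  }

  std::vector<unsigned char> m_buffer;  // owned only by the topmost geometry
  std::vector<Component> m_components;
};
```

Inline components are tracked by offset here, so replacing the owner's buffer never leaves a dangling pointer. Whether the real classes track components by pointer or by offset is not stated in this worklog, so that choice is an assumption of the sketch.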
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.