WL#6455: InnoDB: GEOMETRY datatypes support
Status: Complete — Priority: Medium
In order to implement GIS indexing in InnoDB, InnoDB needs to support GEOMETRY datatypes, so that InnoDB stores them properly and understands them. Currently, InnoDB store all geometry types in BLOB. We can continue to use BLOB (other than point) as underlying datatype, but the column needs to be marked as proper datatypes. Since we use BLOB as underlying datatype, we still can use the prefix index on the geometry data column. Also, for the same reason, nothing needs to be done for recovery. Requirements: 1) Add new InnoDB internal data type DATA_GEOMETRY for geometry data. 2: All MySQL Geometry datatypes shall be mapped properly to DATA_GEOMETRY datatype, from DATA_BOLB. 3) An prefix index still can be created on all Geometry datatype columns. And the Index column shall have capability to do acurate select like below . Ex: SELECT AsText(g) FROM t_wl6455 WHERE g = POINT(2,2);
This worklog allows InnoDB to be able to "undertand" and access of current GIS datatypes supported by MySQL. InnoDB will not redefind GIS datatype definitions already defined by MySQL, instead, we will include the necessary .h file and re-use them, except for a generic InnoDB GEOMETRY datatype. In the first phase of our InnoDB GIS effort, we will just support existing MySQL GIS datatypes (Point, Line_string..). These are standard datatype as described in OGC standard (http://www.opengeospatial.org/standards/sfa). Most of them are in the standard Well-Known Binary(WKB) format (for more information, please see attached file "GIS_types"). MySQL server follows the standard define closely, the related classes and structures are defined in sql/spatial.h; Changes in InnoDB: 1) A new generic "Geometry" datatype In order to handle the geometry data in InnoDB, we need to separate geometry data with other data, so that we can handle it separately. Thus, a new data type DATA_GEOMETRY will be added in InnoDB. 2) The storage format Then, we need to store them in InnoDB. As the HLD described, since most of the geometry datatypes are not fixed, we still use BLOB as underlying datatype. So, when geometry data need to be stored in InnoDB, it will be treated like BLOB data. Also, when we need to read geometry data from InnoDB, it will be handled like BLOB data too. And in the first phase, we will keep it as WKB format. (Please find the detail information of WKB format in attached file WKB_format"). We may consider some other ways to store geo data for saving space in the future phase. 3) The interpretation of data After that, for understanding what's the geometry data meaning, we need to interpret it to what InnoDB can understand. Currently, we can just reuse the classes defined in sql/spatial.h. Though these classes, we can interpret WKB format data to some class instances, and use their member functions to operate geometry data. Testing: Before the changes, all MySQL geometry data type(Point, LineString..) map to InnoDB internal data type DATA_BLOB. After we add the new innoDB internal data type DATA_GEOMETRY, they will map to DATA_GEOMETRY. So, we need to test all possible operations on geometry data. For example: DDL: CREATE TABLE t_wl6455 ( i INT, g GEOMETRY NOT NULL) ENGINE=InnoDB; CREATE INDEX i_p ON t_wl6455 (g(10)); (Since the prefix index is still worked on geometry data, we can use prefix index to do accurate select like below). Select: SELECT AsText(g) FROM t_wl6455 WHERE g = POINT(2,2); SELECT AsText(g) FROM t1 where ST_Within(g, GeomFromText('POLYGON((5 1, 7 1, 7 7, 5 7, 3 3, 5 3, 5 1))')); DML: INSERT INTO t_wl6455 VALUES(1, POINT(1,1)); DELETE ... UPDATE .. .. Crash/recovery: After some operations done on geometry data, crash the server. Then restart the server to recover the geometry data.
In the first phase of our InnoDB GIS effort, we will just support existing MySQL GIS datatypes, they are 1) point 2) line_string 3) polygon 4) multi_point 5) multi_polygon 6) multi_line_string 7) geometry_collection 8) geometry The related classes and structures can be seen in sql/spatial.h and sql/spatial.cc; Classes defined in sql/spatial.h: Gis_point Gis_line_string Gis_polygon Gis_multi_point Gis_multi_polygon Gis_multi_line_string Gis_geometry_collection Gis_geometry For support all these geometry types, we need to do the following step: 1) Add new data type DATA_GEOMETRY: The datatype definition is in storage/innobase/include/data0type.h, we can add the new data type DATA_GEOMETRY in it. Accordingly, we need to modify some places, like storage/innobase/include/data0type.ic ... 2) Store geometry data: Following two functions are the interfaces of datatype exchange between InnoDB and MySQL: row_mysql_store_col_in_innobase_format () ====> MySQL to InnoDB row_sel_field_store_in_mysql_format () ====> InnoDB to MySQL We need to special handle the DATA_GEOMETRY in this two functions. And as described in HLD, we still use BLOB as underlying datatype, so we can just handle it like BLOB. Also, we need to modify some other places accordingly. 3)Interpret geometry data: sql/spatial.h and sql/spatial.cc defined some classes for geometry data. We can include it directly and use them to interpret geometry data. As described in HLD, we still can use the prefix index on the geometry data column. And since no need to create a new log data type, nothing needs to be done for recovery.
Copyright (c) 2000, 2017, Oracle Corporation and/or its affiliates. All rights reserved.